Desktop Developer

Create a .NET Agent

Save time by creating an extensible framework for a .NET agent. The framework offers a dynamically configurable job scheduler and notification service.

Technology Toolbox: VB.NET, C#, XML, .NET Framework 1.1

You've probably thought about creating a Windows service if you've ever needed to create a continuously running job. Maybe you've also considered ways to monitor critical systems and alert someone if anything is not as it should be. Imagine receiving a message on your cell phone if your Exchange server is running out of free space, or an e-mail if last night's batch job hasn't finished yet. Or perhaps you have several tasks that need to be scheduled, but you would like more control than what Windows Task Scheduler provides. Setting up an agent that combines a job scheduler and a notification service offers a wide range of useful applications.

I'll show you how to create an extensible framework for a .NET agent, as well as implementations and examples for key components. My agent framework contains several base classes that work together (see Figure 1). The manager oversees a collection of jobs. Each job has schedulers, notifiers, and a worker. The manager is responsible for checking each of the jobs' schedulers periodically. When a job is scheduled, an event kicks off the job's worker. The worker comes back with a result, and notifications may be sent out based on that result.

To illustrate, let's say you want to create a job that transmits an updated file via FTP every hour of every weeknight. You want to send an e-mail notification only if a transmission fails. First, you create a worker class that can transmit a file via FTP. You can use the scheduler and notification classes I've included in the code samples to do the rest. Add a new job to the agent configuration file, specifying a scheduler, the worker, and a notifier. You may supply multiple schedulers and notifiers for a single job, but only one worker.

I'll also show you how to create a class loader that reads jobs from an XML configuration file. In order to do this, you process an XML document and instantiate objects dynamically. This means you can add new worker classes, for example, simply by providing their assembly DLLs and adding references to them in the configuration file. The loader uses reflection to load the properties and fields of the classes from the configuration file. The loader is mainly generic and may be reused in a wide variety of other applications.

Finally, I'll also show you how to communicate with the agent while it's running. I'll provide a rudimentary example of .NET Remoting with a monitor application that queries the agent for a list of jobs and each job's last known result.

See How AgentManager Works
The AgentManager class is the heart of the system: Its function is to check jobs continuously to see if they're scheduled. A timer is used as a "heartbeat," and the manager checks each job's schedules with every tick of the timer. Of course, this means that the manager isn't really running continuously—if it were, it would probably use 100 percent of your CPU time. You may adjust the granularity of the timer to suit your needs; checking job schedules every second or even every minute is adequate for many purposes. A frequency of even a fraction of a second uses little CPU time compared to a continuous loop.

The AgentManager class's most important public methods are Start and Stop. The Start method gets things going, checking to make sure the class isn't running already. If the manager hasn't been initialized already, the Start method calls the Init method. Most importantly, the Start method starts the timer. With every lapse of the timer, the manager calls the CheckSchedules method of each job. This is all the manager needs to do; if it is time for a job to run, CheckSchedules handles the rest.

The manager also contains a Stop method. The Stop method's main task is to stop the timer. The .NET Framework's timer is multithreaded, so the Elapsed event may still be raised after the timer's Stop method is called. You can use the .NET Framework's ManualResetEvent class to signal across threads that the manager has been requested to stop. Call the ManualResetEvent's Set method in the manager's Stop method to indicate that Stop has been called. In the Timer_Elapsed method, use the WaitOne method to determine if Stop was called already. WaitOne accepts a timeout of zero milliseconds, so you can determine immediately whether the signal has been set (see Listing 1). You can read more about the ManualResetEvent in the .NET Framework documentation.
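The stop signal described above can be sketched roughly as follows. This is not the article's Listing 1; the class name HeartbeatSketch, the one-second interval, and the choice to call Set before stopping the timer are assumptions made for illustration.

```vbnet
Imports System
Imports System.Threading
Imports System.Timers

' Hypothetical stand-in for the manager; only the stop-signal logic is shown.
Public Class HeartbeatSketch
    ' Fully qualified because System.Threading also defines a Timer class.
    Private ReadOnly _timer As New System.Timers.Timer(1000) ' assumed one-second heartbeat
    Private ReadOnly _stopRequested As New ManualResetEvent(False)

    Public Sub New()
        AddHandler _timer.Elapsed, AddressOf Timer_Elapsed
    End Sub

    Public Sub Start()
        _stopRequested.Reset()   ' clear any earlier stop signal
        _timer.Start()
    End Sub

    ' Stop is a VB keyword, so the method name needs brackets.
    Public Sub [Stop]()
        _stopRequested.Set()     ' signal first so that late ticks can see it
        _timer.Stop()
    End Sub

    Private Sub Timer_Elapsed(ByVal sender As Object, ByVal e As ElapsedEventArgs)
        ' A zero timeout makes WaitOne return immediately:
        ' True if Stop has signaled, False otherwise.
        If _stopRequested.WaitOne(0, False) Then Return

        ' ... check each job's schedules here ...
    End Sub
End Class
```

Because Elapsed runs on a thread-pool thread, a tick that was already queued when Stop was called simply sees the signal and returns without doing any work.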

The AgentManager also contains the virtual methods OnInit, OnStart, and OnStop. These may be overridden in derived classes, and they will be called by the AgentManager's Init, Start, and Stop methods, respectively.

Notice that I've also included some tracing. You can define a TraceSwitch, which gets its value from the app.config file:

Private Shared _managerSwitch As _
	New TraceSwitch("AgentManager", _
	"AgentManager TraceSwitch")

Trace.WriteLineIf is used in order to log information at different levels of detail. For example, every heartbeat will be logged if the switch is set to TraceVerbose:

Trace.WriteLineIf( _
	_managerSwitch.TraceVerbose, _
	String.Format( _
	"{0:yyyy-MM-dd HH:mm:ss.ffff} " & _
	"{1} Manager {2}", DateTime.Now, _
	AppDomain.GetCurrentThreadId(), _
	"HEARTBEAT"))

You can turn logging off or change the level of detail being logged for any given switch simply by changing the app.config file. You can use as many switches as you want so you can adjust logging in specific functional areas in order to troubleshoot better.
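As an illustration, an app.config entry for the AgentManager switch might look like the fragment below. The switch name matches the TraceSwitch declared earlier; the value shown is only an example.

```xml
<configuration>
  <system.diagnostics>
    <switches>
      <!-- TraceLevel values: 0 = Off, 1 = Error, 2 = Warning, 3 = Info, 4 = Verbose -->
      <add name="AgentManager" value="4" />
    </switches>
  </system.diagnostics>
</configuration>
```

With the value set to 4, TraceVerbose is true and every heartbeat is logged; lowering it to 1 keeps only error-level messages.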

Use AgentJob and Related Classes
Recall that the manager contains a collection of jobs. Each job must be an AgentJob or a class that derives from it. Like AgentManager, AgentJob is not abstract and is fully functional. Jobs have a name and contain a collection of schedulers, a collection of notifiers, and a worker. The only public method is CheckSchedules, which calls the CheckSchedules method of the schedulers collection. The most important method in AgentJob is Scheduler_OnScheduled, which is called when a scheduler determines that a job should be run.

The Scheduler_OnScheduled method calls the worker's Run method, which returns a WorkerResult. In most cases, the Scheduler_OnScheduled method then calls RequestNotification for each of the notifiers, passing the result. Each notifier takes the result into account when determining what to do. For example, you might send a notification to a cell phone only if the result is a critical error. In any case, this is not the AgentJob class's concern; the job simply sends the current result to RequestNotification, and the notifier decides what to do next.

Before calling RequestNotification, the worker's result is passed to the CheckForIgnorableException method. Why would you want to ignore an exception? Perhaps you've created a job that checks a remote server's free disk space every 10 minutes. After running this job for a few weeks, you begin to notice that you get intermittent exceptions due to networking issues beyond your control. You don't want to receive notifications about these exceptions unless they persist for a long period of time; in most cases, you know they will clear up quickly. Each job has a list of "ignorable exceptions." The manager also has a global list that is applied to all jobs. You can specify how many consecutive exceptions should be ignored, and how long to wait before retrying the job (for example, retry in 30 seconds, but stop after five exceptions in a row).

For collections of jobs, the AgentJobCollection class derives from CollectionBase and implements Add, Remove, and the indexer.

You must derive from the abstract base class AgentScheduler to create a scheduler. It defines a single event, Scheduled, which the FireScheduled method raises when a job is scheduled to run. CheckSchedule is an abstract method that must be implemented by the derived class; this method does the work to determine whether a job is scheduled to run. CheckSchedule is responsible for calling FireScheduled. The derived class must also implement another abstract method called RequestRescheduling. This method is called when the job wants to reschedule into the future, typically after the worker encounters an ignorable exception.

All the action takes place in the AgentWorker, but it's an abstract class, so derived classes are created in order to perform actions. AgentWorker defines some descriptive properties, but the most important part is the abstract Run method. The Run method, which returns a WorkerResult, must be implemented in derived worker classes. The WorkerResult class includes several properties that contain information about the work that was done, including the WorkerResultStatus, which is an enumeration defining Ok, Warning, Critical, and Exception. The WorkerResult also contains a numeric value for state. The actual value stored in the state is arbitrary, although there is one predefined value, STATE_EXCEPTION, that indicates that an exception was encountered.
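To make the worker contract concrete, here is a minimal sketch of the pieces just described. The member names on WorkerResult (Status, State, ShortMessage, DetailedMessage), the value of STATE_EXCEPTION, and the FileSizeWorker class are all assumptions for illustration; the sample code's actual definitions may differ.

```vbnet
Imports System

Public Enum WorkerResultStatus
    Ok
    Warning
    Critical
    Exception
End Enum

Public Class WorkerResult
    ' Assumed value; the article does not say which number is used.
    Public Const STATE_EXCEPTION As Integer = -1

    Public Status As WorkerResultStatus
    Public State As Integer
    Public ShortMessage As String
    Public DetailedMessage As String
End Class

Public MustInherit Class AgentWorker
    Public MustOverride Function Run() As WorkerResult
End Class

' Hypothetical worker: reports Critical when a file grows past a size limit.
Public Class FileSizeWorker
    Inherits AgentWorker

    Public FilePath As String
    Public MaxBytes As Long = 1048576

    Public Overrides Function Run() As WorkerResult
        Dim result As New WorkerResult()
        Try
            Dim info As New System.IO.FileInfo(FilePath)
            Dim size As Long = info.Length   ' throws if the file does not exist
            If size <= MaxBytes Then
                result.Status = WorkerResultStatus.Ok
                result.State = 0
                result.ShortMessage = "File size is acceptable"
            Else
                result.Status = WorkerResultStatus.Critical
                result.State = 1
                result.ShortMessage = "File is too large"
            End If
            result.DetailedMessage = FilePath & " is " & size.ToString() & " bytes"
        Catch ex As System.Exception
            result.Status = WorkerResultStatus.Exception
            result.State = WorkerResult.STATE_EXCEPTION
            result.ShortMessage = ex.Message
            result.DetailedMessage = ex.ToString()
        End Try
        Return result
    End Function
End Class
```

The worker never decides who gets notified; it only reports a status and a state, which keeps it reusable across jobs with different notifiers.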

The Run method should return different state values depending on what happened. The notifiers keep track of the previous state value, so a change in the state value indicates a potential reason to notify someone. For example, a worker that monitors Web site response time might use a state of zero to indicate acceptable response time, and 1 to indicate response time is too slow. A change to state 1 might also include a status of Critical and cause a notification to be sent. The state would change back to zero if response time improves. Normally, no one would be notified when the state is zero, but a notification is sent because the previous state was 1. If you were paged because response time was poor, you would want to be paged again when it is back to normal.

The AgentNotifier class's RequestNotification method is called by the job after the worker returns a result. AgentNotifier is an abstract class where the Notify method needs to be implemented by derived classes. For example, the Notify method might send an e-mail message or a page, or perhaps add an entry to a database table. The class has several properties that configure the behavior of the notifier. The RequestNotification method takes into account the new worker result in addition to past notifications when determining whether to notify. The result is that you may limit the frequency with which notifications are sent and limit the number of notifications sent in a given time period. If the notifier determines that it really is time to notify, then it calls the FireNotify method, which in turn calls the derived class's implementation of Notify (see Table 1, which reviews the important aspects of the agent's base classes).
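The decision logic inside RequestNotification might look something like the sketch below. It is a simplified guess, not the sample code: the MinimumInterval setting, the private field names, and the exact rule (notify on any state change, or on a repeated problem once the interval has passed) are assumptions. A minimal WorkerResult is repeated so the block compiles on its own.

```vbnet
Imports System

Public Enum WorkerResultStatus
    Ok
    Warning
    Critical
    Exception
End Enum

' Minimal stand-in so the sketch compiles on its own.
Public Class WorkerResult
    Public Status As WorkerResultStatus
    Public State As Integer
    Public ShortMessage As String
End Class

Public MustInherit Class AgentNotifier
    ' Hypothetical setting: minimum time between repeat notifications.
    Public MinimumInterval As TimeSpan = TimeSpan.FromMinutes(30)

    Private _hasPrevious As Boolean = False
    Private _previousState As Integer
    Private _lastNotified As DateTime = DateTime.MinValue

    Public Sub RequestNotification(ByVal result As WorkerResult)
        Dim stateChanged As Boolean = _hasPrevious AndAlso result.State <> _previousState
        Dim isProblem As Boolean = result.Status <> WorkerResultStatus.Ok
        Dim sinceLast As TimeSpan = DateTime.Now.Subtract(_lastNotified)
        Dim intervalElapsed As Boolean = TimeSpan.Compare(sinceLast, MinimumInterval) >= 0

        ' Remember this state for the next call.
        _previousState = result.State
        _hasPrevious = True

        ' A state change always notifies (including a return to normal);
        ' an unchanged problem notifies again only after the interval.
        If stateChanged OrElse (isProblem AndAlso intervalElapsed) Then
            _lastNotified = DateTime.Now
            FireNotify(result)
        End If
    End Sub

    Protected Sub FireNotify(ByVal result As WorkerResult)
        Notify(result)
    End Sub

    ' Derived classes send the e-mail, page, or database entry here.
    Protected MustOverride Sub Notify(ByVal result As WorkerResult)
End Class
```

Under these rules, a job that goes from state 0 to state 1 triggers a notification, repeated state 1 results stay quiet until the interval passes, and the return to state 0 triggers one more notification.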

The AgentIgnorableException class defines an ignorable exception and it does not need to be extended. An "ignorable exception" is basically a set of properties. Either the message or the exception type name must be specified in order to match to a real exception. The other properties determine the number of consecutive times an exception may be ignored and how long to wait before retrying the job. The collection class adds an additional Find method, which takes an Exception as a parameter and looks for a matching ignorable exception in the collection.
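A rough sketch of the matching logic follows. The field names, the defaults, and the matching rules (exact match on the full type name, substring match on the message) are assumptions; the sample code may match differently.

```vbnet
Imports System
Imports System.Collections

Public Class AgentIgnorableException
    ' Field names and defaults below are assumptions for illustration.
    Public ExceptionTypeName As String
    Public Message As String
    Public MaxConsecutive As Integer = 5
    Public RetryAfter As TimeSpan = TimeSpan.FromSeconds(30)

    Private Shared Function HasText(ByVal value As String) As Boolean
        Return Not value Is Nothing AndAlso value.Length > 0
    End Function

    Public Function Matches(ByVal ex As Exception) As Boolean
        ' At least one criterion must be specified.
        If Not HasText(ExceptionTypeName) AndAlso Not HasText(Message) Then Return False

        If HasText(ExceptionTypeName) Then
            If ex.GetType().FullName <> ExceptionTypeName Then Return False
        End If
        If HasText(Message) Then
            If ex.Message.IndexOf(Message) < 0 Then Return False
        End If
        Return True
    End Function
End Class

Public Class AgentIgnorableExceptionCollection
    Inherits CollectionBase

    Public Sub Add(ByVal item As AgentIgnorableException)
        List.Add(item)
    End Sub

    ' Returns the first matching entry, or Nothing if the exception is not ignorable.
    Public Function Find(ByVal ex As Exception) As AgentIgnorableException
        For Each item As AgentIgnorableException In List
            If item.Matches(ex) Then Return item
        Next
        Return Nothing
    End Function
End Class
```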

Use a Dynamic XML Object Loader
A convenient way to configure the agent while avoiding recompilations is with an XML configuration file. You can create a relatively generic class loader, DynamicXmlObjectLoader, which is used to load an agent's manager and all of its jobs, along with their schedulers, notifiers, and workers (see Listing 2 for an example XML configuration file). In the file, you specify an element with an attribute named "type" that specifies the name of the System.Type that the element corresponds to. The loader creates an instance of that type dynamically. The type name can be fully qualified, so you can add new assemblies with new types without ever recompiling the agent. This means you can create new jobs with new types of workers simply by adding the assemblies and changing the configuration file.

Once the loader has created an object of the specified type, it then reads all of the attributes and sub-elements from the XML file. Each attribute and sub-element must correspond to either a property or a field of the object. For example, if you had a scheduler class that has TimeOfDay as a property, then you would have a TimeOfDay attribute in the XML file (such as 'TimeOfDay="9:00:00"'). You use attributes for native types or other simple types that you can convert from a string, such as numbers, enums, and so on. You use sub-elements where another object must be instantiated; you load sub-elements recursively, so the object tree might be deep.

In the SetMember method, the loader uses reflection to determine what target type the property or field is expecting. I've added special handling for TimeSpans, Enums, and Arrays. For other types, the loader simply calls the .NET Framework's Convert.ChangeType to coerce the string value in the XML file into the target type. TimeSpan.Parse is used for TimeSpans. Enums are specified by name and parsed into the correct value. If the target type is an array, it is assumed that the elements are comma-delimited.
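The conversion rules above can be sketched as shown below. This is not Listing 3: it omits the CollectionBase case, and details such as case-insensitive enum parsing, trimming array elements, and using the invariant culture are my assumptions.

```vbnet
Imports System
Imports System.Globalization
Imports System.Reflection

Public Module LoaderSketch
    ' Coerces a string from the XML file into the type the member expects.
    Public Function ConvertStringValueToTargetType( _
        ByVal value As String, ByVal targetType As Type) As Object

        If targetType Is GetType(TimeSpan) Then
            Return TimeSpan.Parse(value)
        ElseIf targetType.IsEnum Then
            Return [Enum].Parse(targetType, value, True)
        ElseIf targetType.IsArray Then
            ' Comma-delimited elements, each converted recursively.
            Dim elementType As Type = targetType.GetElementType()
            Dim parts() As String = value.Split(","c)
            Dim result As Array = Array.CreateInstance(elementType, parts.Length)
            For i As Integer = 0 To parts.Length - 1
                result.SetValue( _
                    ConvertStringValueToTargetType(parts(i).Trim(), elementType), i)
            Next
            Return result
        Else
            Return Convert.ChangeType(value, targetType, CultureInfo.InvariantCulture)
        End If
    End Function

    ' Sets a public property or field by name, using reflection to find its type.
    Public Sub SetMember(ByVal target As Object, ByVal name As String, ByVal value As String)
        Dim targetType As Type = target.GetType()

        Dim prop As PropertyInfo = targetType.GetProperty(name)
        If Not prop Is Nothing Then
            prop.SetValue(target, _
                ConvertStringValueToTargetType(value, prop.PropertyType), Nothing)
            Return
        End If

        Dim fld As FieldInfo = targetType.GetField(name)
        If Not fld Is Nothing Then
            fld.SetValue(target, ConvertStringValueToTargetType(value, fld.FieldType))
            Return
        End If

        Throw New ArgumentException("No public property or field named " & name)
    End Sub
End Module
```

For example, calling SetMember(scheduler, "TimeOfDay", "9:00:00") on an object with a TimeSpan property named TimeOfDay would route the string through TimeSpan.Parse before assigning it.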

There is one final special case: If the target type derives from CollectionBase, it is assumed that instead of setting a property, you should add an item to a collection with the specified name. Use the XML element Jobs to add a new job to the Jobs collection. You use reflection to find and invoke the Add method for the collection (see Listing 3 for the implementation of SetMember). You can add support for additional cases by modifying the SetMember and ConvertStringValueToTargetType methods.

DynamicXmlObjectLoader does not contain any references to the base agent classes; the classes to be loaded are only referenced in the XML file. This makes the loader suitable for re-use in a wide variety of applications (anywhere there's a need to load classes dynamically from a configuration file).

Create a Fixed Interval Scheduler
AgentScheduler is an abstract class, so you need a concrete implementation before you can schedule anything. A useful one is FixedIntervalScheduler, which has the DaysOfWeek, Frequency, StartTime, and EndTime properties. If a job needs to check that an external data provider's link is up, you might want to set the frequency to every five minutes from 8 a.m. to 6 p.m. on weekdays. Frequency, StartTime, and EndTime are specified using a text format that TimeSpan can parse, such as "18:00:00" for 6 p.m., or "00:10:00" for every 10 minutes. I created an enumeration called ScheduleDaysOfWeek for specifying the days of the week; you may separate multiple days by commas. The enumeration uses the FlagsAttribute, and in addition to each day, it includes Everyday, Weekdays, and Weekends, which are combinations of the appropriate days of the week.
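A flags enumeration along these lines might look like the sketch below. The member names come from the description above, but the numeric values and the Includes helper are assumptions made for illustration.

```vbnet
Imports System

<Flags()> _
Public Enum ScheduleDaysOfWeek
    ' One bit per day; the specific values are assumed.
    Sunday = 1
    Monday = 2
    Tuesday = 4
    Wednesday = 8
    Thursday = 16
    Friday = 32
    Saturday = 64
    Weekdays = Monday Or Tuesday Or Wednesday Or Thursday Or Friday
    Weekends = Saturday Or Sunday
    Everyday = Weekdays Or Weekends
End Enum

Public Module DaySketch
    ' Hypothetical helper: does the configured set include the given day?
    Public Function Includes( _
        ByVal days As ScheduleDaysOfWeek, ByVal day As DayOfWeek) As Boolean

        ' DayOfWeek.Sunday is 0 through Saturday at 6, so shift 1 left by that amount.
        Dim flag As ScheduleDaysOfWeek = CType(1 << CInt(day), ScheduleDaysOfWeek)
        Return (days And flag) = flag
    End Function
End Module
```

Because Enum.Parse accepts comma-separated names, a configuration value such as "Monday, Wednesday" parses directly into a combined ScheduleDaysOfWeek value.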

The CheckSchedule method implements the abstract method from the base class. First, it determines whether the current date and time fall into the boundaries defined by DaysOfWeek, StartTime, and EndTime. Then it looks at the previously scheduled date/time and adds the frequency time span (or the rescheduled time span, if it is smaller). If this next scheduled date/time is now, or has passed already, then it fires another scheduled event by calling FireScheduled.
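The core of that check can be sketched as follows. This simplified version leaves out the day-of-week test and rescheduling, takes the current time as a parameter so it is easy to exercise, and uses a stand-alone class rather than deriving from AgentScheduler; all of those are assumptions.

```vbnet
Imports System

Public Class FixedIntervalSketch
    Public Event Scheduled As EventHandler

    Public Frequency As TimeSpan = TimeSpan.FromMinutes(5)
    Public StartTime As TimeSpan = New TimeSpan(8, 0, 0)
    Public EndTime As TimeSpan = New TimeSpan(18, 0, 0)

    Private _lastScheduled As DateTime = DateTime.MinValue

    Public Sub CheckSchedule(ByVal currentTime As DateTime)
        ' Outside the daily window: nothing to do.
        Dim timeOfDay As TimeSpan = currentTime.TimeOfDay
        If TimeSpan.Compare(timeOfDay, StartTime) < 0 Then Return
        If TimeSpan.Compare(timeOfDay, EndTime) > 0 Then Return

        ' The next due time is the last scheduled time plus the frequency.
        ' On the first check there is no previous time, so the job is due at once.
        Dim nextDue As DateTime = DateTime.MinValue
        If _lastScheduled <> DateTime.MinValue Then
            nextDue = _lastScheduled.Add(Frequency)
        End If

        If currentTime >= nextDue Then
            _lastScheduled = currentTime
            RaiseEvent Scheduled(Me, EventArgs.Empty)
        End If
    End Sub
End Class
```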

You can also create a basic implementation of AgentNotifier called SmtpNotifier. This notifier sends e-mail notifications via SMTP. You can configure five properties: From, To, Subject, Body, and SmtpServer. The Subject and Body properties are format strings, where the first parameter is the short message (for the subject) and the second parameter is a detailed message (for the body), both of which are in the WorkerResult. The Notify method is an override of the abstract method from AgentNotifier and it sends the e-mail message. Notifications for different jobs, or even for different results of the same job, may be sent to different people. The notifiers simply need to be specified in the XML configuration file (see Listing 2).
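To show how the Subject and Body format strings work, the short example below applies String.Format with the short message as {0} and the detailed message as {1}. The format strings and messages are made up for illustration, and the code only prints the text rather than sending mail.

```vbnet
Imports System

Public Module SubjectBodySketch
    Public Sub Main()
        ' Hypothetical values that would normally come from the configuration file.
        Dim subjectFormat As String = "[Agent] {0}"
        Dim bodyFormat As String = "Summary: {0}" & Environment.NewLine & "Details: {1}"

        ' Hypothetical values that would normally come from the WorkerResult.
        Dim shortMessage As String = "FTP transfer failed"
        Dim detailedMessage As String = "The connection timed out."

        ' Both arguments are passed to both formats; each uses what it needs.
        Console.WriteLine(String.Format(subjectFormat, shortMessage, detailedMessage))
        Console.WriteLine(String.Format(bodyFormat, shortMessage, detailedMessage))
    End Sub
End Module
```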

Work With Several Workers
The sample worker classes I've created all derive from the AgentWorker class and override the abstract Run method. Each worker has its own set of public properties or fields you can use to configure the worker in the configuration file. Properties with default values are optional in the configuration file.

The HttpWorker checks the response time of the specified URL. You can configure the thresholds for the Warning and Critical response times. The HttpWorker class defines four constants, which represent different states in the WorkerResult. STATE_OK is used when the response time is below the warning threshold. STATE_WARNING_TIME and STATE_CRITICAL_TIME indicate that the response of the URL is too slow. STATE_CRITICAL_ERROR indicates there's a problem, such as the Web site being unreachable. The important aspect is that the numerical value for each state is different.

A change in state helps the notifiers determine whether a notification should be sent. If the job is running every five minutes, it might make sense to send out notifications only when the situation changes. The Run method of the HttpWorker uses the HttpWebRequest and HttpWebResponse classes from System.Net in order to check the response time of the URL. The date and time are recorded just before and just after the request's GetResponse method is called. The difference between the two is the response time, which is compared with the threshold properties to determine the status and state of the result.
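The timing technique can be sketched like this. The module and function names, and the string labels returned by Classify, are assumptions; the real HttpWorker returns a WorkerResult with its own state constants.

```vbnet
Imports System
Imports System.Net

Public Module ResponseTimeSketch
    ' Measures how long it takes to get a response from the URL.
    Public Function MeasureResponseTime(ByVal url As String) As TimeSpan
        Dim request As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)

        Dim started As DateTime = DateTime.Now
        Dim response As HttpWebResponse = CType(request.GetResponse(), HttpWebResponse)
        Dim finished As DateTime = DateTime.Now

        response.Close()
        Return finished.Subtract(started)
    End Function

    ' Compares the elapsed time with the two thresholds.
    Public Function Classify(ByVal elapsed As TimeSpan, _
        ByVal warningThreshold As TimeSpan, _
        ByVal criticalThreshold As TimeSpan) As String

        If TimeSpan.Compare(elapsed, criticalThreshold) >= 0 Then Return "Critical"
        If TimeSpan.Compare(elapsed, warningThreshold) >= 0 Then Return "Warning"
        Return "Ok"
    End Function
End Module
```

GetResponse throws a WebException if the site is unreachable, which is where a worker would report its error state instead of a response time.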

The NetworkDiskSpaceWorker is a useful worker for monitoring free drive space on machines that are part of your domain. It uses classes from the System.Management namespace to search for drives on the specified machine and query the amount of space used on each drive. You can set the warning and critical thresholds as a number of gigabytes and as a percentage of free space. This is useful for alerting a system administrator before a resource runs out of space.
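A query of this kind might look like the sketch below, which reads the WMI Win32_LogicalDisk class through System.Management. It is not the NetworkDiskSpaceWorker itself: the module name is hypothetical, the code only prints the figures, and it assumes the caller has permission to query the remote machine.

```vbnet
Imports System
Imports System.Management

Public Module DiskSpaceSketch
    Public Sub ReportFreeSpace(ByVal machineName As String)
        Dim scope As New ManagementScope("\\" & machineName & "\root\cimv2")
        ' DriveType 3 restricts the query to local fixed disks.
        Dim query As New ObjectQuery( _
            "SELECT DeviceID, FreeSpace, Size FROM Win32_LogicalDisk WHERE DriveType = 3")
        Dim searcher As New ManagementObjectSearcher(scope, query)

        For Each disk As ManagementObject In searcher.Get()
            Dim freeBytes As Double = Convert.ToDouble(disk("FreeSpace"))
            Dim totalBytes As Double = Convert.ToDouble(disk("Size"))

            Dim freeGb As Double = freeBytes / (1024.0 * 1024.0 * 1024.0)
            Dim percentFree As Double = 0
            If totalBytes > 0 Then percentFree = 100.0 * freeBytes / totalBytes

            Console.WriteLine("{0} {1:F1} GB free ({2:F0}%)", _
                disk("DeviceID"), freeGb, percentFree)
        Next
    End Sub
End Module
```

A worker would compare freeGb and percentFree with its warning and critical thresholds rather than printing them.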

I also included the LongRunningProcessWorker, which looks for a process on the specified computer and determines how long the process has been running. This is useful for monitoring processes that might hang periodically. For example, if you have scheduled tasks that run Excel on a server in an unattended environment, you might want to monitor for an excel.exe process that has been running for too long. This worker also uses the System.Management namespace.

I've included a number of other workers in the sample code, including FtpDownloadWorker, SqlLogShippingWorker, and SqlReplicationWorker. The latter two workers monitor the status and latency of SQL Server processes, so you can receive an alert if your fail-over database is getting stale.

Make the Agent Into a Service
An AgentManager is simply an object with a Start and Stop method. You can use it from many application types. I've provided a small Console application for testing, but when you deploy the agent, it's best if it is a service because you can have it start automatically. You need to follow a few steps in Visual Studio .NET in order to create a service and be able to install it.

Start by creating a new service project (either in VB.NET or C#). The new service project starts up in designer view. Right-click anywhere in the blank part of the designer window, and choose Add Installer. This adds a project installer to your project, which allows you to install it using the InstallUtil program (see "Installing and Uninstalling Services" in the Visual Studio .NET help).

Next, add references to the AgentBase and AgentClasses assemblies (containing the AgentManager and the DynamicXmlObjectLoader classes) to the service project. Close the ProjectInstaller window and open the source code for Service1. Declare an AgentManager in the class. In the OnStart method, load the manager from the configuration file and call its Start method. In the OnStop method, call the manager's Stop method (see Listing 4).

You might need other applications to be able to communicate with your agent. For example, you might want to add a page to an ASP.NET application that lists all of the loaded jobs and their current statuses. You can accomplish this easily with .NET Remoting. I created a simple example by extending the AgentManager class and creating a new manager called RemotingManager. RemotingManager works together with another class called ManagerDataSetReportGenerator, which is the class that will be used for remoting.

Conceptually, remoting has a server and a client. The client and server must share knowledge of a common interface in order to communicate. In order to avoid having to reference a new interface from both programs, I used a generic interface, IObjectHandle, which is built into the .NET Framework. I chose this interface because it contains a single method called Unwrap, which takes no parameters and returns an object. The Unwrap method in ManagerDataSetReportGenerator builds a data set from the RemotingManager's list of jobs and then returns the XML version of the data set as a string.
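The server side of this arrangement can be sketched as shown here. The class name and the hard-coded row are placeholders; the real ManagerDataSetReportGenerator builds its data set from the RemotingManager's jobs.

```vbnet
Imports System
Imports System.Data
Imports System.Runtime.Remoting

' Hypothetical stand-in for ManagerDataSetReportGenerator.
Public Class ReportGeneratorSketch
    Inherits MarshalByRefObject
    Implements IObjectHandle

    ' The only member of IObjectHandle: no parameters, returns an Object.
    Public Function Unwrap() As Object Implements IObjectHandle.Unwrap
        Dim report As New DataSet("AgentReport")
        Dim jobs As DataTable = report.Tables.Add("Job")
        jobs.Columns.Add("Name", GetType(String))
        jobs.Columns.Add("LastStatus", GetType(String))

        ' Placeholder row; a real implementation would loop over the manager's jobs.
        jobs.Rows.Add(New Object() {"NightlyFtp", "Ok"})

        ' Returning a string keeps the contract simple for any client.
        Return report.GetXml()
    End Function
End Class
```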

Remoting needs to be configured when an application starts, so the RemotingManager's constructor configures remoting from the application's config file (for example, app.config). The RemotingManager also overrides OnStart and OnStop in order to set and clear a static reference to itself in the ManagerDataSetReportGenerator class (see Listing 5). The app.config file for the Windows service, the console application, or whatever program is hosting the agent, contains the <system.runtime.remoting> section for the remoting configuration (in addition to the trace switches discussed earlier).

AgentMonitorApp is a remoting client that periodically retrieves the current status of the jobs from the running agent. To see it in action, run the sample AgentTestApp and AgentMonitorApp at the same time. You'll see that AgentMonitorApp refreshes the list every 10 seconds. Like the agent, the monitor's remoting configuration is in the app.config file (see Listing 6). The monitor uses RemotingServices.Connect to connect to the agent. This returns an object that implements the IObjectHandle interface. The monitor then calls the Unwrap method, which returns the XML for the data set. AgentMonitorApp might reside on a different physical machine than the agent; the URL, which is also in the app.config file, simply needs to point to the location of the agent.
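The client call can be sketched in a few lines. The module and function names are hypothetical, and the sketch assumes the remoting configuration has already been loaded and that the caller supplies the agent's URL.

```vbnet
Imports System
Imports System.Runtime.Remoting

Public Module MonitorSketch
    ' Connects to the agent and returns the report XML as a string.
    Public Function FetchReportXml(ByVal agentUrl As String) As String
        ' Connect returns a proxy typed only as Object, so cast to the shared interface.
        Dim proxy As Object = RemotingServices.Connect(GetType(IObjectHandle), agentUrl)
        Dim handle As IObjectHandle = CType(proxy, IObjectHandle)

        ' Unwrap runs on the agent; only the resulting string crosses the wire.
        Return CStr(handle.Unwrap())
    End Function
End Module
```

The returned XML can then be loaded into a DataSet with ReadXml and bound to a grid or list in the monitor.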

The concept of job scheduling dates back as far as computing systems themselves and plays an integral role in most software solutions. Windows Task Scheduler is a good example, and even SQL Server has a system of jobs and alerts. However, the .NET Framework provides all of the tools necessary to create your own versatile job scheduler and notification service to fit your organization's needs. The concepts I've demonstrated, as well as the downloadable example code, provide a working starting point for your .NET agent.
