C# Corner
The Azure Factor
How factoring out common patterns in your Azure worker roles can improve development.
Every Azure worker role you write will share some common code blocks, because worker roles follow a few well-known patterns. That makes it easy to fall into the trap of copying the code from your first application and modifying it to create your second. I look at copy/paste/modify reuse as an opportunity to create a better library. You need to factor out the common parts from the parts that aren't common. Sometimes that's easy. Sometimes, as with worker roles, you'll need to reach into your bag of tricks and use Action<>, Func<>, LINQ queries and extension methods to create the library you want. In this article, I'll take a sample Azure application and show you how to create reusable algorithms that you can work with in all your worker roles.
The sample download is a simplified version of an app I use to keep track of my library of Web links. There's an Azure Web role that displays the library of links. You can see a description of the site and the URL, and you click on the link to visit the site in a new window. Of course, you can also add new links to the library.
One consistent problem I have is that some old Web links don't work anymore. I always had stale links in my favorites folder. I just didn't take the time to keep visiting all my links, and remove the ones that no longer worked. Azure has an architecture that makes it easier to avoid that problem. An Azure worker role can periodically attempt to download all the links. It marks any links that don't work as possibly bad. The worker role class is shown here:
public class WorkerRole : RoleEntryPoint
{
    public override void Start()
    {
        // This is a sample worker implementation.
        // Replace with your logic.
        RoleManager.WriteToLog("Information",
            "Worker Process entry point called");
        while (true)
        {
            Thread.Sleep(TimeSpan.FromSeconds(30));
            RoleManager.WriteToLog("Information",
                "Checking Link Database");
            // Look at every link: build a request and
            // mark those that don't work.
            StorageAccountInfo accountInfo =
                StorageAccountInfo.GetAccountInfoFromConfiguration(
                    "TableStorageEndpoint");
            LinkDataServiceContext context =
                new LinkDataServiceContext(accountInfo);
            foreach (var weblink in context.WebLinkDataCollection)
            {
                if ((!weblink.LinkValid)
                    || (DateTime.Now - weblink.LastVisit >
                        TimeSpan.FromDays(1)))
                    CheckLink(weblink);
                context.SaveChanges();
            }
        }
    }

    private void CheckLink(WebLinkData weblink)
    {
        WebClient client = new WebClient();
        try
        {
            var answer = client.DownloadData(weblink.Url);
            RoleManager.WriteToLog("Information",
                string.Format("link {0} is valid",
                    weblink.Url));
            weblink.LastVisit = DateTime.Now;
            weblink.LinkValid = true;
        }
        catch (WebException)
        {
            RoleManager.WriteToLog("Information",
                string.Format("link {0} FAIL",
                    weblink.Url));
            weblink.LinkValid = false;
        }
    }

    public override RoleStatus GetHealthStatus()
    {
        // pretty simple, we're healthy.
        return RoleStatus.Healthy;
    }
}
This is the idiomatic code you'll see in many Azure applications. The Start() method is an infinite loop that sleeps for a while, then does some work, then goes back to sleep. The work being done is written in a very imperative style. The result is code that's hard to maintain and enhance, for three reasons. First, the actual logic for this application is mixed in with boilerplate code that's common to almost every worker role you'll create. Second, the code emphasizes how a worker role is coded, when you want it to show what your worker role does. Finally, you know that when you need to create a new worker role, you'll copy this code, paste it into the new worker role and modify it to suit.
The goal of modifying this class is to separate the parts that are specific to this worker role from the code you'll use in any worker role. To make it easier to integrate into any Azure application, you can design the API using extension methods on the Microsoft.ServiceHosting.ServiceRuntime.RoleEntryPoint class. The first step is to create a method that can handle the common functionality in your Start() method. The common actions are to sleep for some period, perform some action, and then determine whether to continue processing requests. That functionality can be easily represented by this extension method:
public static void RunWorker(
    this RoleEntryPoint role,
    TimeSpan sleepTime,
    Func<bool> method)
{
    do
    {
        Thread.Sleep(sleepTime);
    } while (method());
}
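As a quick check of the control flow, here is a minimal, self-contained sketch of RunWorker in use. It assumes a stand-in base class, RoleEntryPointStub, and a DemoRole type (both hypothetical, not part of the Azure SDK) so that it builds without the SDK. The point it illustrates is that the delegate's return value decides whether the loop runs again.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the SDK's RoleEntryPoint, so the sketch builds alone.
public abstract class RoleEntryPointStub { }

public sealed class DemoRole : RoleEntryPointStub { }

public static class WorkerExtensions
{
    // Same shape as RunWorker above: sleep, do the work, repeat while it returns true.
    public static void RunWorker(this RoleEntryPointStub role,
        TimeSpan sleepTime, Func<bool> method)
    {
        do
        {
            Thread.Sleep(sleepTime);
        } while (method());
    }
}

public static class RunWorkerDemo
{
    // Runs the loop until the work delegate has executed maxPasses times.
    public static int CountPasses(int maxPasses)
    {
        int passes = 0;
        new DemoRole().RunWorker(TimeSpan.FromMilliseconds(10), () =>
        {
            passes++;
            return passes < maxPasses; // returning false ends the loop
        });
        return passes;
    }

    public static void Main()
    {
        Console.WriteLine("passes: " + CountPasses(3));
    }
}
```

A real worker role would normally return true on every pass, as the link checker does. Returning false is mostly useful for tests and for an orderly shutdown.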
Next, you should look at what's being done at each step. This worker role does its work using items located in Azure Table Storage. That's going to be a common pattern: run some query against an Azure Table, process some elements, and then save the changes back to that same table. That's another useful extension method. Azure Table items are accessed from a Service Context as an IQueryable<T>, for some T. An extension method to take some action against items in a table and then save those changes looks like this:
public static void ProcessTableData<T>(
    this IQueryable<T> sequence,
    Action<T> action,
    TableStorageDataServiceContext context)
{
    sequence.ForAll(action);
    context.SaveChanges();
}

public static void ForAll<T>(
    this IQueryable<T> sequence,
    Action<T> action)
{
    foreach (T item in sequence)
        action(item);
}
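To see the query-then-process pattern without a storage account, here is a rough sketch that runs it against an in-memory list exposed through AsQueryable(). WebLinkStub and ContextStub are hypothetical stand-ins for the table entity and the service context; only the shape of the call chain is meant to match the methods above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the table entity and the service context.
public class WebLinkStub
{
    public string Url { get; set; }
    public bool LinkValid { get; set; }
    public DateTime LastVisit { get; set; }
}

public class ContextStub
{
    public int SaveCount { get; private set; }
    public void SaveChanges() { SaveCount++; }
}

public static class TableExtensions
{
    public static void ProcessTableData<T>(this IQueryable<T> sequence,
        Action<T> action, ContextStub context)
    {
        foreach (T item in sequence)
            action(item);
        context.SaveChanges(); // one save after the whole batch
    }
}

public static class TableDemo
{
    // Returns the URLs the filter selected, in table order.
    public static List<string> Run(ContextStub context)
    {
        DateTime now = DateTime.Now;
        DateTime cutoff = now.AddDays(-1);
        IQueryable<WebLinkStub> table = new List<WebLinkStub>
        {
            new WebLinkStub { Url = "http://example.com/fresh", LinkValid = true, LastVisit = now },
            new WebLinkStub { Url = "http://example.com/stale", LinkValid = true, LastVisit = now.AddDays(-3) },
            new WebLinkStub { Url = "http://example.com/bad", LinkValid = false, LastVisit = now }
        }.AsQueryable();

        var visited = new List<string>();
        // The caller owns the filter; the extension method owns the loop and the save.
        table.Where(link => !link.LinkValid || link.LastVisit < cutoff)
             .ProcessTableData(link => visited.Add(link.Url), context);
        return visited;
    }

    public static void Main()
    {
        var context = new ContextStub();
        Console.WriteLine(string.Join(", ", Run(context)));
        Console.WriteLine("saves: " + context.SaveCount);
    }
}
```

Only the stale and the invalid links reach the action, and SaveChanges() runs once for the batch rather than once per item.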
ForAll is a helper method that provides a simple interface to take action against all items in a sequence. ProcessTableData isn't a complicated method, because you can form your query in the calling code. Any LINQ code would work to filter the table (where, take, skip, orderby, or any other LINQ method). In fact, that's what I'm doing in my Start() method. The refactored Start() method looks like this:
public override void Start()
{
    RoleManager.WriteToLog("Information",
        "Worker Process entry point called");
    StorageAccountInfo accountInfo =
        StorageAccountInfo.GetAccountInfoFromConfiguration(
            "TableStorageEndpoint");
    LinkDataServiceContext context =
        new LinkDataServiceContext(accountInfo);
    this.RunWorker(TimeSpan.FromSeconds(30), () =>
    {
        RoleManager.WriteToLog("Information",
            "Checking Link Database");
        context.WebLinkDataCollection
            .Where((link) =>
                (!link.LinkValid) ||
                DateTime.Now - link.LastVisit >
                    TimeSpan.FromDays(1))
            .ProcessTableData((item) =>
                CheckLink(item), context);
        return true;
    });
}
The other common pattern for a worker role is to pull messages off a queue and process each message. For example, one Azure sample has this code:
// retrieve messages and write them to the
// development fabric log
while (true)
{
    Thread.Sleep(10000);
    if (queue.DoesQueueExist())
    {
        Message msg = queue.GetMessage();
        if (msg != null)
        {
            RoleManager.WriteToLog("Information",
                string.Format(
                    "Message '{0}' processed.",
                    msg.ContentAsString()));
            queue.DeleteMessage(msg);
        }
    }
}
That's hardly reusable, and you know hundreds of developers have already copied this method many times. Just like with the table storage, you can create an extension method that abstracts away the common portions of the code, and use lambda expressions for the specific bits of work. Here's the extension method that contains the processing that's common to every queue:
public static void ProcessQueue(
    this MessageQueue queue,
    TimeSpan timeout,
    Func<Message, bool> action)
{
    bool more = true;
    while (more)
    {
        Thread.Sleep(timeout);
        if (queue.DoesQueueExist())
        {
            Message msg = queue.GetMessage();
            if (msg != null)
            {
                more = action(msg);
                queue.DeleteMessage(msg);
            }
        }
    }
}
And here's how you'd use it in the sample shown earlier:
queue.ProcessQueue(TimeSpan.FromSeconds(10),
    (msg) =>
    {
        RoleManager.WriteToLog("Information",
            string.Format(
                "Message '{0}' processed.",
                msg.ContentAsString()));
        return true;
    });
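If you want to exercise ProcessQueue without the storage service, one option is an in-memory stand-in. The sketch below assumes two hypothetical types, MessageStub and QueueStub, that mimic only the members the extension method calls. A real Azure queue hides a retrieved message for a visibility timeout instead of leaving it at the head, so treat this as an approximation.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical in-memory stand-ins for the sample's Message and MessageQueue.
public sealed class MessageStub
{
    private readonly string content;
    public MessageStub(string content) { this.content = content; }
    public string ContentAsString() { return content; }
}

public sealed class QueueStub
{
    private readonly List<MessageStub> messages = new List<MessageStub>();
    public int Count { get { return messages.Count; } }
    public void PutMessage(string content) { messages.Add(new MessageStub(content)); }
    public bool DoesQueueExist() { return true; }
    // Returns the oldest message without removing it, or null when empty.
    public MessageStub GetMessage() { return messages.Count > 0 ? messages[0] : null; }
    public void DeleteMessage(MessageStub msg) { messages.Remove(msg); }
}

public static class QueueExtensions
{
    // Same loop as ProcessQueue above, written against the stand-in types.
    public static void ProcessQueue(this QueueStub queue, TimeSpan timeout,
        Func<MessageStub, bool> action)
    {
        bool more = true;
        while (more)
        {
            Thread.Sleep(timeout);
            if (queue.DoesQueueExist())
            {
                MessageStub msg = queue.GetMessage();
                if (msg != null)
                {
                    more = action(msg);
                    queue.DeleteMessage(msg);
                }
            }
        }
    }
}

public static class QueueDemo
{
    // Processes messages until the handler sees "stop"; returns what it saw.
    public static List<string> Drain(QueueStub queue)
    {
        var seen = new List<string>();
        queue.ProcessQueue(TimeSpan.FromMilliseconds(10), msg =>
        {
            seen.Add(msg.ContentAsString());
            return msg.ContentAsString() != "stop"; // false ends the loop
        });
        return seen;
    }

    public static void Main()
    {
        var queue = new QueueStub();
        queue.PutMessage("first");
        queue.PutMessage("second");
        queue.PutMessage("stop");
        queue.PutMessage("left behind");
        Console.WriteLine(string.Join(", ", Drain(queue)));
        Console.WriteLine("remaining: " + queue.Count);
    }
}
```

The handler sees the first three messages, the "stop" message is still deleted after it's handled, and the fourth message stays on the queue.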
You may be thinking that these changes don't really add any value. In one sense, that's true. The sample has the exact same functionality now that it had before I started. But that's not the same as saying these changes didn't produce value. These methods have factored out common algorithms you'll use in every worker role you create. That result in itself has value: These extension methods have been fully tested on the first use. Copy/paste/modify reuse often introduces bugs, especially in the modify phase. A common source of those bugs is the Azure Queue processing. Notice how the extension methods get the message, process it and then finally delete it. Often, developers make the mistake of deleting the message as they get it, rather than after processing. That can create the kind of bugs that only show up very rarely, when your worker process fails to process the message. That's the kind of change that causes bugs months later.
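The ordering problem is easy to reproduce in miniature. In this sketch a plain Queue<string> stands in for the Azure queue, with Peek() playing the role of GetMessage() and Dequeue() the role of DeleteMessage(). Those mappings and the DeleteOrderDemo class are assumptions made for illustration, and the handler is rigged to fail every time.

```csharp
using System;
using System.Collections.Generic;

public static class DeleteOrderDemo
{
    // Process first, delete second: a failure leaves the message on the queue.
    public static void ProcessThenDelete(Queue<string> queue, Action<string> action)
    {
        if (queue.Count == 0) return;
        string msg = queue.Peek();    // stands in for GetMessage()
        action(msg);
        queue.Dequeue();              // stands in for DeleteMessage(msg)
    }

    // Delete first, process second: a failure loses the message.
    public static void DeleteThenProcess(Queue<string> queue, Action<string> action)
    {
        if (queue.Count == 0) return;
        string msg = queue.Dequeue();
        action(msg);
    }

    // Runs one failing pass and reports how many messages survive it.
    public static int Remaining(bool deleteFirst)
    {
        var queue = new Queue<string>();
        queue.Enqueue("check http://example.com");
        Action<string> failing = msg =>
        {
            throw new InvalidOperationException("worker failed");
        };
        try
        {
            if (deleteFirst) DeleteThenProcess(queue, failing);
            else ProcessThenDelete(queue, failing);
        }
        catch (InvalidOperationException)
        {
            // Simulated crash part-way through processing.
        }
        return queue.Count;
    }

    public static void Main()
    {
        Console.WriteLine("process-then-delete keeps " + Remaining(false) + " message(s)");
        Console.WriteLine("delete-then-process keeps " + Remaining(true) + " message(s)");
    }
}
```

With process-then-delete the failed message is still there for the next pass. With delete-then-process it's gone, and nothing in the logs will tell you which work item was lost.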
Azure is very new, so it's hard to believe we know the best practices for Azure development. We know many of our ideas on what's best will change. That's why I'm adamant about factoring out common algorithms. When you find a better way to solve a particular problem, you only have one location to fix. If you've duplicated (or worse, duplicated and modified) code, you have to make the same fixes in many places. That increases your chance for errors and omissions. C# gives you a wealth of tools to factor out these common algorithms, and you end up being able to see your specific algorithms more clearly.
About the Author
Bill Wagner, author of Effective C#, has been a commercial software developer for the past 20 years. He is a Microsoft Regional Director and a Visual C# MVP. His interests include the C# language, the .NET Framework and software design. Reach Bill at [email protected].