Practical .NET

Exploiting the ConcurrentDictionary in Asynchronous Applications

The ConcurrentDictionary provides the most efficient (and safest) way to share named values between asynchronous processes with several powerful methods. But the best advice might be to avoid ever needing them.

In an earlier column I introduced the ConcurrentDictionary object, which allows you to share data between asynchronous processes. In that column, I showed how to use the basic TryAdd and TryGetValue methods. Those methods work well, provided you have a simple application with one process adding or removing items and all other processes reading items in the dictionary (the producer/consumers pattern).

The goal for both the TryAdd and TryGetValue methods is to ensure that if one process reads data before another process added it, an exception isn't raised. Instead, the methods return a False value that allows your code to deal with the problem.

But I also pointed out that TryAdd and TryGetValue won't be sufficient in applications where there are multiple producers updating the dictionary. For example, using a TryAdd to add an item to the dictionary in a producer process might return False, indicating that the item already is present in the dictionary. It might make sense, therefore, to follow that TryAdd with a TryGetValue to retrieve that existing item, using code like this:

If Not dictCustomer.TryAdd("A123", cust) Then
  If dictCustomer.TryGetValue("A123", cust) Then
    '...working with the cust variable
  End If
End If   

However, in an asynchronous application with multiple producers, nothing prevents some other producer from removing the item between the first and second lines of that code in the first producer. As a result, the TryGetValue call in the first producer can also return False: the code inside the two nested If blocks never runs, and the cust variable is left set to Nothing.
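One way to cope with that gap is to treat a False from TryGetValue as a normal outcome and simply try again. What follows is a minimal sketch, not code from the earlier column: the Customer class and the AddOrFetch helper are hypothetical names I've made up for illustration.

```vbnet
Imports System
Imports System.Collections.Concurrent

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Property Id As String
    Public Property FirstName As String
End Class

Public Module TryAddRetryDemo
    ' Hypothetical helper: keeps trying until it either adds the candidate
    ' or reads whatever another producer stored under the same key.
    Public Function AddOrFetch(dict As ConcurrentDictionary(Of String, Customer),
                               key As String,
                               candidate As Customer) As Customer
        Do
            If dict.TryAdd(key, candidate) Then
                Return candidate          ' our item went in
            End If
            Dim existing As Customer = Nothing
            If dict.TryGetValue(key, existing) Then
                Return existing           ' another producer's item is there
            End If
            ' The item vanished between the two calls; go around again.
        Loop
    End Function

    Public Sub Main()
        Dim dictCustomer As New ConcurrentDictionary(Of String, Customer)()
        Dim first = AddOrFetch(dictCustomer, "A123", New Customer With {.Id = "A123"})
        Dim second = AddOrFetch(dictCustomer, "A123", New Customer With {.Id = "A123"})
        Console.WriteLine(first Is second)   ' True: both calls return the stored object
    End Sub
End Module
```

A loop like this is, in effect, a hand-rolled version of what the GetOrAdd method does for you.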

The good news is there are more sophisticated tools available with the ConcurrentDictionary to handle these problems. The bad news is you might be well advised not to use them.

Ensuring an Object Is Available
For example, you can retrieve values from the ConcurrentDictionary using either the GetOrAdd or the AddOrUpdate methods.

GetOrAdd accepts two parameters: a key and an object. If there's a matching key in the dictionary, the value for the key is returned. However, if the key isn't found in the dictionary then the object in the second parameter is added to the dictionary. Either way, the method doesn't fail.

This example uses GetOrAdd to either retrieve the current value for the key A123 or add a customer to the dictionary under the key "A123":

Dim cust As Customer
cust = New Customer
cust.Id = "A123"
cust.FirstName = Me.FirstName.Text
cust = dictCustomer.GetOrAdd("A123", cust)

Either way, the cust variable will be set to the value that was in the dictionary or was added to it.

A more useful version of GetOrAdd accepts a reference to a method in its second parameter and works in a similar way to AddOrUpdate. This code will add the object returned by GetCustomer if there's no item in the dictionary for the key "A123":

cust = dictCustomer.GetOrAdd("A123", Function(cid) GetCustomer(cid))

(Ed. note: Thanks to Richard @arthurdent69 for the suggested improvement; see the Comment section of this article.)
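To see when the method passed to GetOrAdd actually runs, here's a small single-threaded sketch. The Customer class, the GetCustomer body and the LookupCount counter are all assumptions of mine, added only so the example can be run on its own.

```vbnet
Imports System
Imports System.Collections.Concurrent

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Property Id As String
    Public Property FirstName As String
End Class

Public Module GetOrAddDemo
    ' Counts how often GetCustomer is called (illustration only).
    Public LookupCount As Integer = 0

    ' Hypothetical stand-in for the article's GetCustomer method.
    Public Function GetCustomer(cid As String) As Customer
        LookupCount += 1
        Return New Customer With {.Id = cid, .FirstName = "Loaded"}
    End Function

    Public Sub Main()
        Dim dictCustomer As New ConcurrentDictionary(Of String, Customer)()

        ' Key is missing: the method runs and its result is stored.
        Dim a = dictCustomer.GetOrAdd("A123", Function(cid) GetCustomer(cid))
        ' Key is present: the stored item comes back and the method is skipped.
        Dim b = dictCustomer.GetOrAdd("A123", Function(cid) GetCustomer(cid))

        Console.WriteLine(a Is b)        ' True
        Console.WriteLine(LookupCount)   ' 1
    End Sub
End Module
```

The payoff of this version is that an expensive lookup, such as a trip to the database, is skipped whenever the key is already present.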

AddOrUpdate accepts three parameters: a key, an object and a reference to a method. If the key in the first parameter isn't found, then the object in the second parameter is added to the dictionary; if the key already exists, the method is executed (it's passed the key and the item currently in the dictionary) to generate the object that replaces the existing item. Either way, AddOrUpdate returns the object that ends up in the dictionary.

In this example (and assuming that the method GetCustomer returns a Customer object), this code will either add the value in cust under the key "A123" or, if there's already an item with the key "A123" in the dictionary, call the GetCustomer method and replace the existing item with the object the GetCustomer method returns:

cust = dictCustomer.AddOrUpdate("A123", cust, Function(cid, existing) GetCustomer(cid))

(Ed. note: Thanks to Richard @arthurdent69 for the suggested improvement; see the Comment section of this article.)
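Here's a runnable sketch that exercises both paths through AddOrUpdate as the .NET documentation describes them: the second parameter is stored when the key is missing, and the method (which receives the key and the existing item) supplies the replacement when the key is present. The Customer class, the GetCustomer body and the Run helper are hypothetical, written just for this illustration.

```vbnet
Imports System
Imports System.Collections.Concurrent

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Property Id As String
    Public Property FirstName As String
End Class

Public Module AddOrUpdateDemo
    ' Hypothetical stand-in for the article's GetCustomer method.
    Public Function GetCustomer(cid As String) As Customer
        Return New Customer With {.Id = cid, .FirstName = "Reloaded"}
    End Function

    ' Calls AddOrUpdate twice for the same key and returns the first name
    ' that came back from each call.
    Public Function Run() As String()
        Dim dictCustomer As New ConcurrentDictionary(Of String, Customer)()
        Dim cust As New Customer With {.Id = "A123", .FirstName = "Peter"}

        ' Key is missing: cust (the second parameter) is added as-is.
        Dim added = dictCustomer.AddOrUpdate(
            "A123", cust, Function(cid, existing) GetCustomer(cid))

        ' Key is present: the method runs and its result replaces the entry.
        Dim updated = dictCustomer.AddOrUpdate(
            "A123", cust, Function(cid, existing) GetCustomer(cid))

        Return {added.FirstName, updated.FirstName}
    End Function

    Public Sub Main()
        For Each firstName In Run()
            Console.WriteLine(firstName)   ' Peter, then Reloaded
        Next
    End Sub
End Module
```

Because the method is handed the existing item, it can also build the new value from the old one rather than ignoring it as this sketch does.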

Avoiding Assumptions
It's tempting to think of asynchronous processes as executing one after the other, each finishing before the next begins. GetOrAdd doesn't work that way. If multiple GetOrAdd calls for the same key run simultaneously, exactly one value will be added to the dictionary under that key. However, each call may generate its own value and the dictionary keeps only one of them, so not every value generated by a GetOrAdd call will necessarily have been added to the dictionary.

Here's another case: One process calls the GetOrAdd method but finds no item. Because no item is present, the GetOrAdd method begins to create/add an item. However, at the same time, another process calls GetOrAdd and also finds no item in the dictionary (the first process hasn't finished adding its item). Because no item was found, the second process also adds an item to the dictionary, completing before the first process does. In this case, it's entirely possible that the first process will, in fact, get the item added by the second process rather than the item the first process created.

Hopefully, in either case you won't care. The key is to avoid making assumptions about what item is in the dictionary when using GetOrAdd (it may not be the one just created) or about what process led to the item you just retrieved.
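If you'd like to watch this happen, the following sketch fires off a batch of simultaneous GetOrAdd calls for one key. The Customer class, the CreateCustomer method, the FactoryCalls counter and the RaceForKey helper are hypothetical names of mine. The number of times the method runs will vary from run to run, so I haven't shown an expected count.

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Linq
Imports System.Threading
Imports System.Threading.Tasks

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Property Id As String
End Class

Public Module GetOrAddRaceDemo
    ' Counts how often the method passed to GetOrAdd runs (illustration only).
    Public FactoryCalls As Integer = 0

    ' Hypothetical method that creates a customer and records the call.
    Public Function CreateCustomer(cid As String) As Customer
        Interlocked.Increment(FactoryCalls)
        Return New Customer With {.Id = cid}
    End Function

    ' Runs many GetOrAdd calls at once and returns what each caller received.
    Public Function RaceForKey(dict As ConcurrentDictionary(Of String, Customer),
                               callers As Integer) As Customer()
        Dim results(callers - 1) As Customer
        Parallel.For(0, callers,
                     Sub(i As Integer)
                         results(i) = dict.GetOrAdd("A123", Function(cid) CreateCustomer(cid))
                     End Sub)
        Return results
    End Function

    Public Sub Main()
        Dim dictCustomer As New ConcurrentDictionary(Of String, Customer)()
        Dim results = RaceForKey(dictCustomer, 100)

        ' Every caller ends up holding the one object that was stored...
        Console.WriteLine(results.All(Function(c) c Is dictCustomer("A123")))   ' True
        ' ...but the method may have run more than once along the way.
        Console.WriteLine("Method calls: " & FactoryCalls)
    End Sub
End Module
```

If the method has side effects (writing to a database, say), those extra calls matter even though their results are thrown away.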

You must also avoid making assumptions about the consistency of the collection from one statement to another. For example, the ConcurrentDictionary has a ContainsKey method that returns True if the key exists (and False if it does not). However, between the time you call ContainsKey and your next code statement, some other process might add or remove that key from your ConcurrentDictionary. Taking action based on the result of ContainsKey is risky, at best.
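The sketch below contrasts the check-then-act pattern with a single TryGetValue call. RiskyLookup and SaferLookup are hypothetical helper names and the Customer class is a stand-in; the point is only that the second version never leaves a gap between checking and fetching.

```vbnet
Imports System
Imports System.Collections.Concurrent

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Property Id As String
End Class

Public Module ContainsKeyDemo
    ' Risky: another process can remove the key after ContainsKey returns True,
    ' in which case the indexer on the next line throws a KeyNotFoundException.
    Public Function RiskyLookup(dict As ConcurrentDictionary(Of String, Customer),
                                key As String) As Customer
        If dict.ContainsKey(key) Then
            Return dict(key)
        End If
        Return Nothing
    End Function

    ' Safer: one call both checks for the key and fetches the item.
    Public Function SaferLookup(dict As ConcurrentDictionary(Of String, Customer),
                                key As String) As Customer
        Dim found As Customer = Nothing
        If dict.TryGetValue(key, found) Then
            Return found
        End If
        Return Nothing
    End Function

    Public Sub Main()
        Dim dictCustomer As New ConcurrentDictionary(Of String, Customer)()
        dictCustomer.TryAdd("A123", New Customer With {.Id = "A123"})

        Console.WriteLine(SaferLookup(dictCustomer, "A123").Id)       ' A123
        Console.WriteLine(SaferLookup(dictCustomer, "B456") Is Nothing) ' True
    End Sub
End Module
```

Even with the safer version, the item you get back is only a snapshot: it might be removed or replaced a moment after TryGetValue returns.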

In fact, using some of the ConcurrentDictionary's methods and being able to predict the result requires some careful design. For example, the ConcurrentDictionary has a TryRemove method that removes an item from the collection. The method returns True if it finds and successfully removes the item; False if the method doesn't find the matching key. If TryRemove does find an item to remove, it returns that item in the second parameter you pass to the method.

Here's an example of the TryRemove method removing the item with a key of A123. If the method finds an item to remove, the method will return True and the cust variable will be filled with the removed item:

If dictCustomer.TryRemove("A123", cust) Then
  '...work with the removed Customer object in the cust variable
End If

However, if one process calls TryRemove while another process simultaneously calls TryAdd with the same key, there's no way to predict whether the item will be in the collection after both methods complete. TryRemove really makes sense only if there's one process responsible for maintaining the list of items in the collection -- which means no other process can use TryAdd or GetOrAdd (TryGetValue would be safe).

It might be worthwhile to consider just setting the value of a dictionary entry to null rather than counting on removing it. You can't directly add Nothing/null to a ConcurrentDictionary but you can add a variable of the right type that's set to Nothing/null. This code works, for example:

cust = Nothing
dictCustomer.GetOrAdd("A123", cust)
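Keep in mind that GetOrAdd only stores the value when the key isn't already there, so it won't blank out an existing entry. One way to overwrite an entry is the dictionary's indexer, which I'm using here as an assumption about what you'd want rather than as the only option. In this sketch, MarkRemoved and TryGetLive are hypothetical helpers and Customer is a stand-in class.

```vbnet
Imports System
Imports System.Collections.Concurrent

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Property Id As String
End Class

Public Module NullEntryDemo
    ' Hypothetical helper: blanks out an entry without removing its key.
    Public Sub MarkRemoved(dict As ConcurrentDictionary(Of String, Customer),
                           key As String)
        Dim empty As Customer = Nothing
        dict(key) = empty   ' the indexer overwrites whatever is stored
    End Sub

    ' Readers must treat a missing key and a Nothing value the same way.
    Public Function TryGetLive(dict As ConcurrentDictionary(Of String, Customer),
                               key As String,
                               ByRef cust As Customer) As Boolean
        Return dict.TryGetValue(key, cust) AndAlso cust IsNot Nothing
    End Function

    Public Sub Main()
        Dim dictCustomer As New ConcurrentDictionary(Of String, Customer)()
        dictCustomer.TryAdd("A123", New Customer With {.Id = "A123"})

        Dim cust As Customer = Nothing
        Console.WriteLine(TryGetLive(dictCustomer, "A123", cust))   ' True

        MarkRemoved(dictCustomer, "A123")
        Console.WriteLine(dictCustomer.ContainsKey("A123"))         ' True: key is still there
        Console.WriteLine(TryGetLive(dictCustomer, "A123", cust))   ' False: value is Nothing
    End Sub
End Module
```

The cost of this approach is that every reader now has to check for Nothing as well as for a missing key.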

If this all sounds a little scary … well, it should. If your application requires that multiple processes have all the flexibility they need to add, update and remove objects from the ConcurrentDictionary, then you'll need to be careful about what assumptions you make about the values you've retrieved. Rather than take advantage of these methods, you might be better advised to keep things simple and use the producer/consumer model where only one process updates the dictionary while other processes read it.

If you decide on an architecture that's more flexible than producer/consumer, these are the tools you'll need to integrate the ConcurrentDictionary into your application. Just don't call me if you find you have a bug.

About the Author

Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.
