In-Depth

Priority Queues with C#

A priority queue assigns a priority to each element. Knowing how to build them is important in solving many coding problems.

A priority queue is a data structure that holds information that has some sort of priority value. When an item is removed from a priority queue, it's always the item with the highest priority. Priority queues are used in many important computer algorithms, in particular graph-based shortest-path algorithms.

Somewhat surprisingly, the Microsoft .NET Framework doesn't contain a priority queue class. One of the main reasons for this omission is that in most programming scenarios that require a priority queue, you must heavily customize features of the queue to meet your particular needs. Although there are a handful of C# priority queue implementations available on the Web, many of them are either copyrighted, inflexible or contain subtle bugs.

In this article I'll present a C# implementation of a priority queue. I'll explain the code in detail so you'll be able to modify the code presented here as necessary. In addition to being a practical addition to your personal skill set, several of the techniques used to create a priority queue can be used in other coding scenarios. I'll also present a highly effective technique to test custom priority-queue implementations.

To get a feel for exactly what a priority queue is and where I'm headed in this article, take a look at the demo run in the screenshot in Figure 1. The items in the priority queue represent employee objects that have a last name, such as "Baker," and a priority value between 1.0 and 10.0.


[Click on image for larger view.]
Figure 1. The result of the priority queue demo.

When working with priority queues, you must be careful to be consistent with the interpretation of the meaning of the priority values. In this example, and in the majority of algorithms that use priority queues, numerically smaller values of the priority variable represent conceptually higher priorities (which at first might seem backward). So, in this example, an employee with a priority value of 1.0 has a higher priority than an employee with a priority value of 2.0.

After the demo program creates six dummy employees, they're added into the priority queue, and the queue is displayed. Notice that the employee with the highest priority (Aiden, 1.0) ends up in the front of the queue. Also notice that the queue isn't fully sorted, but appears to be semi-ordered in some way.

Behind the scenes, the priority queue is using a structure called a binary heap, which is a semi-ordered structure, as I'll explain shortly. Next, the demo program removes the front employee from the queue. After that removal, the queue is displayed and now the employee with the new highest priority (Baker, 2.0) is at the front of the queue. Then, after Baker is removed, the resulting queue has (Chung, 3.0) at the front.

The demo program concludes by running a set of automated tests. I'll explain exactly what those tests are, and why they're important when creating a custom priority queue or other custom data structure.

This article assumes you have intermediate C# programming skills. I'll present all of the code that generated the screenshot in Figure 1.

Items with a Priority
The overall structure of the demo program shown in Figure 1 is presented in Listing 1. For clarity, I embedded the PriorityQueue class definition in a demo Console Application program. In many situations you'll want to place your priority queue code in a separate class library, to create a DLL that can be used by the code that needs the queue. I also embedded the Employee class definition directly in the demo program.

When working with a priority queue, you'll almost always have a class that has some sort of priority; the Employee class in this example. Notice that class Employee inherits from the IComparable interface, which means it must implement a CompareTo method. This will be true in general, because items in a priority queue must be compared against each other during add and remove operations to reorder the queue.

The PriorityQueue class is generic, so it can be used with any item of type T that has some kind of priority field. This means that the definition has a dependency on the System.Collectons.Generic namespace, as shown at the top of Listing 1. In some situations, you might want to define a priority queue that works on a single, specific type. But because the code logic in a priority queue implementation doesn't depend on the items in the queue, in most cases a generic design is a reasonable approach.

The Employee class definition is shown in Listing 2. For simplicity, I defined the lastName and priority fields with public scope; in most cases your fields will be private, but this doesn't affect the priority queue. Here I named the field that defines an employee's priority as priority. In many situations, an object's priority is implied. For example, in a graph-based shortest-path algorithm, items have a distance field that acts as the priority. Priority fields are almost always either type int or type double.

The required CompareTo method can be coded in several ways, but I prefer the approach in Listing 2. As mentioned earlier, in this example (and in most cases) lower-priority numeric values mean higher priority. In the rare cases where the opposite is true, there are two main ways to handle the situation. First, you can modify the CompareTo method to operate in reverse by changing the two inequality operators. This is the method I recommend. Second, you can modify the code logic in the priority queue add and remove operations. This approach tends to be more error-prone, so I advise against it in most situations.

In some situations, the items you're dealing with might not have a single field that determines the item's priority. For example, an employee's priority might depend on two fields such as years of service and job category. In these situations you must code the CompareTo method to take all relevant fields into account.

Understanding Binary Heaps
A simplistic way to implement a priority queue would be to use an array or a List collection for storage. After each addition operation (at the end of the list), the list could be sorted so that the highest priority (lowest value) is at the front (index [0]) of the list. Likewise, after a removal (of the front item), the list could be re-sorted so that the item with the new highest priority is at the front of the list. However, most algorithms that require a priority queue have lots of data and perform many additions, removals or both. Constantly sorting a list can lead to unacceptable performance.

An alternative to using a List collection for priority queue storage is to use a binary heap. A binary heap is a form of conceptual storage (meaning many different implementations are possible), as shown in Figure 2.


[Click on image for larger view.]
Figure 2. Binary heap storage.

Binary heaps can significantly improve the performance of large priority queues. A min binary heap is ordered storage that has the lowest priority value in the root node. The heap is populated top-to-bottom and left-to-right, so each node that isn't in the bottom row has two child nodes, except for possibly one node, which will have only a left child.

Each child node has a priority value greater than or equal to the priority value of the parent node. There are two things going on here: the fact that the highest priority node is at the root of a binary heap means it can always be found immediately. Second, the heap ordering of the rest of the nodes enables quick updates after a new node is added or the root node is removed.

A binary heap can be implemented using pointers or references. But a binary heap can also be efficiently implemented using a List collection, as shown in Figure 2. The trick is that for any parent node in the list at index [pi], the two child nodes are located at indexes [2 * pi + 1] and [2 * pi + 2].

For example, in Figure 2 the node with priority 3.0 is at index [1]. Its left child is at index [2 * 1 + 1] = [3] and its right child is at index [2 * 1 + 2] = [4]. Very clever! For a given child node at index [ci], its parent is located at index [(ci - 1) / 2].

Adding an Item to a Priority Queue
The PriorityQueue class definition begins as:

public class PriorityQueue <T> where T : IComparable <T>
{
  private List <T> data;

  public PriorityQueue()
  {
    this.data = new List <T>();
  }

The first line of the class can be loosely interpreted as, "This is the definition of a PriorityQueue that works with items of type T (to be specified later), where that type T is required to have a CompareTo method defined." The items are stored in a generic List named data. The PriorityQueue constructor instantiates the List collection.

The method to add an item to a priority queue is surprisingly short (see Listing 3).

The method is named Enqueue. Because the term enqueue is often associated with a pure first-in, first-out ordinary queue data structure, some developers prefer to name this method Add.

If you scan through the code you'll see that none of the logic depends on what the data items are; the logic only requires a CompareTo method consistent with your definition of priority. In spite of its shortness, method Enqueue is quite tricky. The item to add is added at the end of the data List. The while loop starts by placing child index ci at the end of the list, then checking the priority value at the parent index pi. If the child and parent are in order with regard to a binary heap (child priority value is greater than or equal to the parent priority), then the method is done. Otherwise, the child node bubbles up until it's in the correct position. Because the value of index ci is halved in each iteration, the maximum number of operations is O(lg n), which is much better than the O(n lg n) for a simple implementation that doesn't use a binary heap.

Removing an Item
The method to remove an item from a priority queue is shown in Listing 4. The method assumes that the priority queue isn't empty. You might want to modify the code to throw an exception if the queue is empty. The variable li is the last index. The front item is removed by making a copy of it, then overwriting the item at location [0] with the item in the last location. Index ci is initially set to the left child of the parent at index [0].

The Dequeue logic has to compare the priority value of the smaller of the two children, which accounts for the first call to CompareTo. If the parent and the child are out of order, the child node trickles down the binary heap until the parent and child nodes are in order. Because the value of index ci doubles each time through the loop in Dequeue, removing an item is O(lg n).

Miscellaneous Priority Queue Methods
Calling the priority queue is straightforward. The code in the demo program that adds Employee items to the priority queue is:

Employee e1 = new Employee("Aiden", 1.0);
Employee e2 = new Employee("Baker", 2.0);
// And so on for e3, e4, e5, e6
Console.WriteLine("Adding " + e5.ToString() + " to priority queue");
pq.Enqueue(e5);
Console.WriteLine("Adding " + e3.ToString() + " to priority queue");
pq.Enqueue(e3);
// And so on for e6, e4, e1, e2

Depending on the requirements of the algorithm that uses a priority queue, you'll likely want to add some methods to supplement Enqueue and Dequeue. To display the priority queue, you can write a simple ToString method along the lines of:

public override string ToString()
{
  string s = "";
  for (int i = 0; i  < data.Count; ++i)
    s += data[i].ToString() + " ";
  s += "count = " + data.Count;
  return s;
}

The ToString method can be called like so:

Console.WriteLine("Priory queue is: ");
Console.WriteLine(pq.ToString());

To check if a priority queue is empty or not, you'll likely want to code a Count method or an IsEmpty method. For example:

public int Count()
{
  return data.Count;
}

Another common utility function is a Peek method that returns the front item in the priority queue without removing the item:

public T Peek()
{
  T frontItem = data[0];
  return frontItem;
}

Calling these methods could look like:

PriorityQueue <Employee > pq = new PriortyQueue <Employee >();
// Enqueue some Employee items
if (pq.Count()  > 0)
{
  Employee e = pq.Peek();
  Console.WriteLine("The front employee is " + e.ToString());
}

Testing a Custom Data Structure
When creating a custom data structure like the priority queue described in this article, research suggests that a good way to test the custom data structure is to define a method that verifies the state of the data structure, then use a random input and output approach.

The idea is best explained by a concrete example. After any item is added or removed from the priority queue, some pretty tricky code is invoked. But after every operation, the binary heap property (the parent priority value must be less than or equal to the priority value of the child nodes) of the data List collection should still hold.

Consider the method in Listing 5 to check a priority queue's state invariant.

The IsConsistent method walks through each node stored in the data List collection. The index of the left child (lci) and the right child (rci) are computed. Then the priority values of the children (if they exist) are compared with the priority value of the parent and checked to make sure the parent priority value is less than the children's priority values.

An alternative approach would be to iterate through the data List, interpreting each node as a child node, and computing and checking the priority value of the parent node.

With a state consistency method in hand, a simple but highly effective test method can be written as shown in Listing 6.

The test method instantiates a Random object and a priority queue. The method loops a specified number of times (typically some large number like 1,000,000 or more). At each iteration, a random decision to enqueue or dequeue is made by generating an integer between 0 and 1 inclusive.

If the decision is made to add an item, a random Employee item is generated that has a last name such as "123man" and a random priority value between 1.0 and 100.0, and the random Employee is added to the queue.

If the decision is to remove an item, the test method checks to see if that's possible (count > 0) and then does so. After every enqueue or dequeue operation, the state of the underlying data List is checked to verify the binary heap property still holds. Calling the test method could look like:

TestPriorityQueue(1000000);

Research has shown that this approach to testing a custom data structure is extremely effective at uncovering difficult-to-detect bugs.

comments powered by Disqus

Reader Comments:

Fri, Feb 1, 2013 Nelly viagra online :] cipro ppbks generic cialis :PP

http://www.bestmedicinefored.com/ DOT :] http://www.treatmentforinfections.com/ DOT ppbks http://www.edgenericmeds.net/ DOT :PP

Thu, Jan 31, 2013 Kaeden car insurance quotes csn cheap auto insurance vxy order viagra :-]

http://www.quotesfromtopinsurers.com/ DOT csn http://www.onlinecheapautoinsurance.net/ DOT vxy http://edmedsinfo.com/ DOT :-]

Thu, Nov 15, 2012 Author Redmond, WA

(Author to Matt) Good observation: Yes, the SortedSet, available in .NET 4.0 and higher, is an alternative storage approach for a priority queue. The MSDN documentation isn't completely clear but I believe a SortedSet is implemented as a binary tree, so my previous comments about using a SortedDictionary apply. You mention that a SortedSet cannot have duplicate values, but I think this may not be true. Regardless, you make a good point.

Wed, Nov 14, 2012 Matt Richland, WA

Although the solution is intriguing, it seems overly complex. I coded up a test solution using "SortedSet data" and found it to run ~7% faster than the binary heap. For example, running 10 million enqueue/dequeue operations takes 8.65 seconds with the binary heap vs. 8.07 seconds with the SortedSet. The upside is that the Enqueue function is simply "data.Add(item);" while Dequeue is just three steps, "T item = data.First(); data.Remove(item); return item;". I guess the one advantage to the binary heap is that the same object can be added twice, while SortedSets do not allow duplicates.

Wed, Nov 14, 2012 Author Redmond, WA

(Author to Jenda) I understand your point about using a lambda expression vs using the IComparable interface, but think you're being a bit too harsh. I use both constructions quite often and tend to prefer the IComparable pattern from just a style perspective -- but most of my colleagues prefer lambda expressions. One question I have, that I haven't seen addressed, is how IComparable and lambda expressions compare: does the compiler generate similar IL code?

Wed, Nov 14, 2012 Author Redmond, WA

Your design is one I haven't seen before and seems very efficient for those situations where priority can take on a value from a fixed set (say 0 through 99).

Wed, Nov 14, 2012 Jenda

IComparable? Lovely. Lambdas had been in the language for how long? Why do I mention lambdas? Because in real world objects seldom have just a single sinsible ordering! With your IComparable nonsense how do I enlist objects of that class in structures with different ordering? All structures that attempt to keep the objects in some sort of ordering should allow their users to specify BY WHAT to order! IComparable was acceptable (though annoying) back when C# had no support for lambdas and taking a reference to a method took lots of handwaving, it should be forgottent now apart from the builtins and an ocassional simple single-use class.

Wed, Nov 14, 2012 Michael Rice Addison, TX

I've implemented priority queues using a fixed array of queues, where each location in the fixed array represents the ordinal value of the "priority" (i.e., a hash of "priority" to int). The priority queue DS maintains the index into the array where the current highest priority item is located, and this is updated during enqueue and dequeue operations. The key to this algorithm is to make the item "priority" hash function as efficient as possible. A simple data type, such as int or double, is obviously the best choice. Another benefit of this method is deterministic control of order of items with the same priority. If also done some implementations of this where the queued items at certain priorities were not simple FIFO, but rather used a LRU or MRU ordering. This implementation has worked exceptionally well for me and has been very durable and robust.

Wed, Nov 14, 2012 Author

Using a SortedDictionary for storage is in fact a very good option and one that I experimented with. But the approach does have three possible drawbacks. First, under the covers, a SortedDictionary is just a binary search tree and these can become unbalanced which typically leads to greatly degraded performance. Binary heaps are always balanced by definition. Second, if using a SortedDictionary, you lose direct access to the storage implementation which means customization can be more difficult. And third, the SortedDictionary is not available in all languages (I don't think JavaScript has one for example) and I try to write code so that it can be easily refactored to other languages. But, again, you are correct, a SortedDictionary can work well in many scenarios.

Tue, Nov 13, 2012 Mike Denver

Why not use a sorted dictionary (or sorted heap) class to implement your priority queue. Inserting is simply adding the new element (and priority) to the dictionary/heap. Removing the first item is as easy as creating as using the ienumerator interface of the base class and returning only the first element from the enumerated list. These two classes are documented to have the same big-O performance as the best priority queue algorithms and without the developer writing a lot of code. I implemented a PriorityQueue(of iComparable) in about one page of VB.Net code using the sorted dictionary.

Tue, Nov 13, 2012 Author Redmond, WA

I also recommend "On Some Reliability Estimation Problems in Random and Partition Testing" by Tsoukalas, Duran, and Ntafos in The Proceedings of the 1991 International Symposium on Software Reliability Engineering, pp. 194-201. Old but good foundation.

Tue, Nov 13, 2012

(From the author). I can't comment too much about the testing technique in general because most of the work was done at Microsoft Research (where I work) and has not yet been approved for release by the Legal department. What I can say is that the technique was explored against a wide range of custom DSs in conjunction with work with the Pex Project. See http://research.microsoft.com/en-us/projects/pex/. Sorry I can't be more specific but in short, the technique was effective against all DSs studied.

Mon, Nov 12, 2012 James R. Twine United States

Thanks for the reference. But please read my original post in its entirety. I asked how this kind of testing can apply to *other* types of data structures, as implied by the statement "research has shown that this approach to testing a custom data structure is extremely effective at uncovering difficult-to-detect bugs." The reference you posted specifically mentions priority queues in its title. Its abstract also implies that the testing approach specifically targets a "custom implementation of a random access priority queue." My original question still stands. What evidence is available to support that this approach works for other data structures? Thanks!

Fri, Nov 9, 2012 Alexandre Grenier Montréal

Thanks Doc. Excellent article! I hope readers understand this is simplified to show the basics, as optimisations can be quite complex, such as the usage of a red-black left-leaning binary heap. You've managed to distil it to the very basic. I'm not sure why poster KP is saying the ToString() ± O(n^2)... Also thanks for citing your research reference, in case we missed the point ;)

Fri, Nov 9, 2012 James McCaffrey Redmond, WA

Research on testing complex data structures - "Testing a High Performance, Random Access Priority Queue: A Case Study", Proceedings of the 2011 International Conference on Information Technology New Generations, pp. 280-285.

Fri, Nov 9, 2012 James McCaffrey Redmond, WA USA

(A few comments from the author). Yes, the StringBuilder class is more efficient than string concatenation in the ToString method, if for some reason you are calling ToString many thousands of times. For normal usage of ToString there is no measurable improvement. And yes, Dequeue will fail on an empty priqueue -- I removed all error checking for clarity.

Wed, Nov 7, 2012 Mapař CZECH REPUBLIC

ToString() - It's possible to simply use StringBuilder. Dequeue must to fail for empty queue "T frontItem = data[0]". Other possible implementation is list sorted by priority (created by Enqueue). The sorting on Dequeue can be useful, when priorities are time variable.

Tue, Nov 6, 2012 kp

Your ToString() method works in O(N^2). It's pretty bad :)

Tue, Nov 6, 2012 James R. Twine United States

"Research has shown that this approach to testing a custom data structure is extremely effective at uncovering difficult-to-detect bugs." What or which research? Sources? References? Or is it all "original research?" I can see how it would be a good test for *this* particular data structure, but all custom data structures?

Add Your Comments Now:

Your Name:(optional)
Your Email:(optional)
Your Location:(optional)
Comment:
Please type the letters/numbers you see above

.NET Insight

Sign up for our newsletter.

I agree to this site's Privacy Policy.