Getting Started

Sort in .NET

Sorting provides users with logical views of information. Learn how to implement and customize the sorting features your collections provide.

Technology Toolbox: VB.NET, Visual Studio .NET 2003

Sorting is one of the most commonly used algorithms in any application. Whether you're sorting data in arrays, data tables, or controls, sorting provides users with a logical view of information that's easy for them to digest.

Both ADO.NET and the majority of list controls support sorting natively by default, and a plethora of articles and quickstart examples show you just how this works. I'll show you how to implement and customize sorting for lists containing various data types.

Several specific collections in the .NET Framework implement the Sort() method, including Array and ArrayList. You get native support for the Sort() method if you define your collection class as ArrayList or inherit from the ArrayList class:

myArrayList.Sort()

You need to implement your own comparator and sorting logic if you base your collection off classes such as CollectionBase, Hashtable, DictionaryBase, or Queue. Many developers inherit their collection classes from either the Array or ArrayList classes if they need sort functionality.

The Sort() method uses a standard quicksort algorithm to sort data in collections. The quicksort algorithm works by a method frequently called "divide and conquer." It selects an element from the list randomly and breaks the list into two pieces—those whose elements are smaller than a given number, and those whose elements are larger than that number. It then applies this logic to each piece recursively until the list is completely sorted.

For example, take this list of numbers—5, 8, 2, 9, 1, 4, 7, 3—and use quicksort to sort them (see Table 1). First, select a number from the list at random (let's say 4), then partition the list into two pieces. The left side contains elements less than 4, and the right side contains elements greater than 4 (these are placed in their same order—you don't do any subsorting here).

Now, perform the same action recursively on the left and then the right side until you end up with no more elements to partition on any side. Table 1 describes the detail of how to quicksort the entire list of numbers to get the end product of (1, 2, 3, 4, 5, 7, 8).

Random Number Determines Speed
As you can see, the speed of the quicksort depends a lot on the random number it selects to break the list into two pieces each time. Performance relates directly to the number of comparisons made. Always selecting a number that yields the same number of elements in the left and right pieces yields the least overall amount of comparisons. Always selecting a number that yields zero or one elements on one side and the remainder of elements on the other side yields the most overall number of comparisons.

There are variations on the quicksort algorithm that attempt to condition the value selected randomly for partitioning based on information on the contents of the list (without rescanning). Computing the expected number of comparisons (a random value called Xn, where n is the number of elements in the list) is outside the scope of this article, but many good combinatorial algorithm books out there can show you how this works.

As with all sorting algorithms, quicksort has strengths and weaknesses. It works best with large amounts of data in a list. The smaller the amount of data, the closer the speed of a quicksort will be to a linear sort. I've found the .NET implementation of the quicksort algorithm to be extremely fast in most cases. For example, sorting 500,000 GUIDs on my 866-MHz PIII system ran 2.5 seconds on average.

Now that you've seen how the .NET Framework uses quicksort, I'll show you how to use sorting in your application. Here's a simple example of how to sort a set of strings stored in an ArrayList collection:

Dim myVanHalenCollection as new ArrayList
myVanHalenCollection (0).Add("Sammy")
myVanHalenCollection (1).Add("Eddie")
myVanHalenCollection (2).Add("Alex")
myVanHalenCollection (3).Add("Mark")
myVanHalenCollection.Sort()

As you might've guessed, the ArrayList ends up looking like this:

{("Alex"), ("Eddie"), ("Mark"), ("Sammy")}

If multiple elements in the list are equal, they will be placed in their same order within the newly sorted list. This is an important fact to note when sorting key-value pairs.

The .NET Framework also contains a SortedList class that sorts data as you add it. It uses key-value pairs to store data, and is useful when you want to presort data for viewing and then be able to select a value from the key selected. A common use for SortedList is sorting states:

Dim myStates as new SortedList
myStates.Add("CA", "CALIFORNIA")
myStates.Add("AZ", "ARIZONA")
myStates.Add("AK", "ALASKA")
myStates.Add("ME", "MAINE")
myStates.Add("MA", "MASSACHUSETTS")

At the end, myStates looks like this:

{ ("AK", "ALASKA"), ("AZ", "ARIZONA"),
   ("CA", "CALIFORNIA"), ("MA",
   "MASSACHUSETTS"), ("ME", "MAINE") }

You can also implement key-value sorting by creating two equal-sized arrays, one holding the key and the other holding the value. When you load the arrays, the indices should match up a key with its value (see Listing 1).

Once loaded, an overloaded version of Sort() can be called to sort both arrays, either by key or by value. Calling Sort() like this sorts both arrays by the product code (key):

Array.Sort(SortedProductCodes, _
   SortedProductNames)

You can sort both arrays by the product name (value) by inverting the parameters:

Array.Sort(SortedProductNames, _
   SortedProductCodes)

The final product still matches up a key to its value through the array index, making this a useful way to manage key-value pairs.

You can't sort multidimensional arrays with the Sort() method (doing so will yield an unhandled exception). If you want to sort multidimensional arrays, either build a jagged array (an array of arrays) or sort each array dimension individually using one of the overloaded Sort() methods that takes position and length as parameters. (I'll discuss these overloaded Sort() methods later in the article.)

Implement Your Own Comparator
The Sort() method requires that all objects within the collection be either of a primitive data type (Int32, Int16, String) or given a class that implements the iComparer interface. You'll get an unhandled exception error if you call the Sort() method that meets neither of these conditions (see Figure 1).

Classes use the iComparable interface to define a default comparator implementation. Primitive data types already implement the iComparable interface for you (which is why they're already sortable). However, you'll need to implement the iComparable interface yourself if you define a new data type that's not inherited from one of the primitive data types.

You use the iComparer interface to define an overriding comparator implementation for the Sort() method. This gives you an alternative way to compare elements—you don't have to use the default comparator defined by the data type. Both of these interfaces use the CompareTo() method as the comparator, which provides the .NET Framework with a mechanism to compare the current object against another object of the same type (remember, comparisons are the heart of quicksort):

Function CompareTo(ByVal obj As Object) _
   As Integer

CompareTo() must return a number less than zero if the current object is less than object "obj." It returns zero if the two objects are equal, and it returns a number larger than zero if the current object is greater than object "obj." Keep in mind that your performance will suffer if your CompareTo method is not efficient.

You can sort collection data easily using the Sort() method, but let's say you want more control. For example, you want to sort information in descending order. To do this, you need a class that implements the iComparer interface to override the default comparator.

The MyReverserClass implements the iComparer interface (see Listing 2). The sole purpose of the MyReverser class is to provide a CompareTo() method that renders a sort in descending (reverse) order when called by the Sort() method. Notice that the comparator within the MyReverserClass class merely makes a call to the default CompareTo() method of .NET's CaseInsensitiveComparer class (one of several predefined .NET comparator classes available) and reverses the comparison parameters, yielding an inverted sort. The .NET Framework also provides a way to sort only a portion of an array using an overloaded version of the Sort() method:

Sort(myArray, iStart, iCount)

Sort a Subset of myArray
Calling this overloaded method sorts myArray from the element at position iStart through the number of elements defined in iCount. All other elements in the array remain unsorted. The ArrayList doesn't support this method signature, but it does allow you to obtain the same results by using this overloaded method:

myArrayList.Sort(iStart, iCount, nothing)

The third parameter is expected to be an object that contains an implemented iComparer interface. Passing in nothing (or null) tells the method to use the default comparator for the data type of elements within the list.

You can download an example program called Sorting that contains a Simple Sorting form showing you how to select contiguous pieces of a list in a listbox control and then sort only those elements. The entries that aren't sorted retain their original position in the list, while the other elements are sorted in the contiguous range specified. Take a look at an example of how to sort sections of an ArrayList based on what the user selects from the listbox (see Listing 3).

In this example, you load data into an unsorted ArrayList and then copy it into the unsorted listbox. The program sorts the data based on whether users press the Sort or Reverse Sort button. If they select the Sort button, the unsorted ArrayList is copied to a new ArrayList and the default Sort() method is called. If they select a subset of entries in the listbox, the default Sort() method is called where the values of iStart and iCount equal the start of the entries in the listbox to sort and the number of entries to sort:

' If all the values are selected, sort the 
' whole thing
If (lstUnsortedEmployees.SelectedItems.Count _
   = lstUnsortedEmployees.Items.Count) Then
   ' Sort the values using default 
   ' interface designed for strings
   SortedEmployeeArray.Sort()
Else
   ' Otherwise, just sort the selected 
   ' values and append the rest
   ' Passing Nothing forces it to use the 
   ' default comparator
   SortedEmployeeArray.Sort _
      (0, lstUnsortedEmployees.SelectedItems.Count, _
      Nothing)
End If

The same logic is performed if users select the Reverse button, but instead of using the default Sort() method, one of the overloaded Sort() methods is used and is passed the mySortReverser object (which is an instantiation of MySortReverserClass) so that the elements selected are sorted in descending order:

Dim mySortReverser As New _
   mySortReverserClass
' If all the values are selected, sort the 
' whole thing
If (lstUnsortedEmployees.SelectedItems.Count _
   = lstUnsortedEmployees.Items.Count) Then
   ' Sort the values using default 
   ' interface designed for strings
   SortedEmployeeArray.Sort(mySortReverser)
Else
   ' Otherwise, just sort the selected 
   ' values and append the rest
   SortedEmployeeArray.Sort _
      (0, lstUnsortedEmployees.SelectedItems.Count, _
      mySortReverser)
End If

Build Your Own Class
Sorting Arrays and ArrayLists is simple, but let's say you want to build your own class and provide sort capabilities. The easiest way is to capitalize on the .NET Framework infrastructure built for you already and have your class inherit from a class that supports the Sort() method (such as an ArrayList):

Public Class Employee
   Implements IComparable
   ' Employee Class Definition Goes Here
End Class
Public Class EmployeeCollection
   Inherits System.Collections.ArrayList
End Class

With this structure, you can now create a collection of Employees and sort them. Remember, the iComparable interface is implemented at the element class level and not within the class definition of the collection. Consider the code for an Employee class and Employee collection, including the implementation of the iComparable interface within the Employee class (see Listing 4).

The Advanced Sorting form within the Sorting program reads employee data from an XML file and, just as the earlier examples do, puts it into one unsorted listbox for users to select with another listbox containing the sorted results after they press the Sort button (see Figure 2).

Sorting is a breeze once the Employee class implements the iComparable interface. However, notice that this example also allows you to sort by LastName and FirstName, as well as change the direction of the sort (to ascending or descending). You do this by implementing an overriding comparator built from a class that implements the iComparer interface just like the reverse sorting example:

Public Class CustomEmployeeSort
   Implements IComparer
   ' Custom Sort Definition Goes Here
End Class

By implementing a custom comparator from the iComparer interface, an object instantiated off of this class is passed to the Sort() method, and it uses your overriding comparator logic instead of the default logic attached to the element type in the collection. The full code for the CustomEmployeeSort class provides the overriding comparator (see Listing 5).

The new comparator checks what the sort parameters are and performs a comparison of the objects based on those parameters. Notice that the types of the objects passed in are type-checked so the caller can throw and catch your own ArgumentException instead of relying on the obscure .NET exceptions.

The New() method of the CustomEmployeeSort class is overridden so that when a new object is instantiated, the desired sort parameters are chosen. The code for creating a comparator that sorts Employees by LastName in descending order looks like this:

Dim mySort As CustomEmployeeSort
mySort = New CustomEmployeeSort _
   (CustomEmployeeSort.SortPossibilities.LastNameDescending) 

You can also develop your own collections class from scratch by inheriting from the CollectionsBase class, then adding your own comparator and sorting logic. The best way to do this is to implement the iBindingList interface, which allows you to implement a wealth of methods to enable collection sorting.

In addition, you should think about implementing the iList interface in any collection that you plan to bind to a control's data source. Such binding requires the implementation of the iList interface (binding objects that don't implement the iList interface will yield an unhandled exception). Fortunately, collection classes such as Array and ArrayList support this interface already (which is why it can be such a boon to developers to inherit from these classes).

My suggestion is to look at the collection classes already available before you try and roll your own collection. Rocky Lhotka's MSDN article, "Sorting the Unsortable Collection," talks exclusively about implementing sorting for collection classes that don't support the Sort() method natively.

As you can see, sorting is easy to implement once you understand how the interfaces work. The .NET Framework provides a lot of the underpinnings for you, and it's relatively simple to extend them to customize your own comparator logic (the key to quicksort). Understanding the differences between iComparable and iComparer is crucial to understanding how to implement your own comparators for sorting. The .NET Framework provides much of the structure for you, so you should look to inherit from these classes before deciding to reinvent the wheel.

comments powered by Disqus

Featured

Subscribe on YouTube