The Data Science Lab

Lightweight Mathematical Combinations Using C#

After previously discussing permutations, Dr. James McCaffrey of Microsoft Research uses step-by-step examples and full code presentations to explore combinations.

A zero-based mathematical (n, k) combination is a subset of k integers from 0 to n-1. For example, if n = 5 and k = 3 there are 10 combinations:

0 1 2
0 1 3
0 1 4
0 2 3
0 2 4
0 3 4
1 2 3
1 2 4
1 3 4
2 3 4

It's usual to list combinations in what's called lexicographical order. If each combination is interpreted as an ordinary integer, the combinations are listed from smallest (12) to largest (234).

The number of (n,k) combinations is n! / (k! * (n-k)!) where n! is factorial(n) = n * (n-1) * (n-2) * . . 1. The function is often called Choose(). For example, Choose(5, 3) = 5! / (3! * (5-3)!) = 120 / (6 * 2) = 120 / 12 = 10.

The number of combinations gets very, very large as n and k increase. For example, Choose(500, 100) =

204,169,423,593,847,671,561,387,240,724,193,094,
030,165,460,090,325,932,474,143,661,505,758,330,
944,415,515,994,945,349,100,113,181,687,417,345

which is significantly larger than the estimated number of atoms in the Universe.

To deal with combinations using the C# language it's necessary to use the BigInteger data type which can handle arbitrarily large values.

Mathematical combinations are related to, but quite different from, mathematical permutations. A zero-based mathematical permutation of order n is a rearrangement of the integers from 0 to n-1. For example, one permutation of order n = 5 is (2, 0, 1, 4, 3).

Figure 1: Combinations Using C# in Action
[Click on image for larger view.]Figure 1: Combinations Using C# in Action

A good way to see where this article is headed is to take a look at the screenshot of a demo program in Figure 1. The demo illustrates how to create and display an (n,k) combination, compute Choose(n, k) using the BigInteger type, display all (n,k) combinations using a Successor() function, and compute a specific combination element directly.

This article assumes you have intermediate or better programming skill with the C# language but doesn't assume you know anything about mathematical combinations. The complete demo code is presented in this article and is also available in the accompanying download. All normal error checking has been removed to keep the main ideas as clear as possible.

To create the demo program I used Visual Studio 2022 (the free Community edition) with .NET 6 (in spite of the new "de-proved" Console template). However, the demo has no significant dependencies so any relatively recent versions of VS and .NET can be used.

Defining and Displaying a Combination
The simplest possible way to define a mathematical combination is as an array of integer. The demo program implements a MakeComb() function as:

static int[] MakeComb(int n, int k)
{
  int[] result = new int[k];
  for (int i = 0; i < k; i++)
    result[i] = i;
  return result;
}

The MakeComb() function can be called like so:

int n = 5; int k = 3;
int[] c = MakeComb(n, k); 
Console.WriteLine("Initial Combination(n=5, k=3) is: ");
ShowComb(c);

The ShowComb() function is defined as:

static void ShowComb(int[] comb)
{
  int n = comb.Length;
  for (int i = 0; i < n; ++i)
    Console.Write(comb[i] + " ");
  Console.WriteLine("");
}

An alternative design is to define a dedicated Combination class, but in most scenarios using an integer array is simple and effective.

Defining a Choose Function
The demo program uses the BigInteger type to define a Choose() function as presented in Listing 1. The function uses several math tricks for efficiency. The first trick is based on the idea that Choose(n, k) = Choose(n, n-k), for example, Choose(100, 97) has the same value as Choose(100, 3). This trick is useful because smaller values of k are easier to compute.

The second trick is best explained by a concrete example. Computing Choose() using the basic definition is inefficient because the n! and k! factorial terms can be very large. An alternate definition of Choose() looks like this:

Choose(100, 3) = (100 * 99 * 98) / (3 * 2 * 1)
               = 50 * 33 * 98
               = 161,700

In words, the denominator is k! (which, by using the first trick can be small) and the numerator is k terms in the form n * (n-1) * (n-2) * . . . which is much smaller than n!

Listing 1: A Choose() Function

using System.Numerics; 
static BigInteger Choose(int n, int k)
{
  // number combinations
  if (n < 0 || k < 0)
    throw new Exception("Negative argument in Choose()");
  if (n < k) return 0; // special
  if (n == k) return 1; // short-circuit

  int delta, iMax;

  if (k < n - k) // ex: Choose(100,3)
  {
    delta = n - k; iMax = k;
  }
  else           // ex: Choose(100,97)
  {
    delta = k; iMax = n - k;
  }

  BigInteger ans = delta + 1;
  for (int i = 2; i <= iMax; ++i)
    ans = (ans * (delta + i)) / i;

  return ans;
}

The BigInteger type is defined in the System.Numerics namespace. The demo code defines Choose(n, k) as 0 when n < k as a special case.

The demo program calls the Choose() function like this:

BigInteger numCombs = Choose(5,3);
Console.WriteLine("Number of n=5, k=3 combs is: ");
Console.WriteLine(numCombs);

numCombs = Choose(100, 10);
Console.WriteLine("Number ways choose 10 from 100: ");
Console.WriteLine(numCombs.ToString("N0"));

Because the Choose() function result can be very large, it's often helpful to use the "N0" formatting specification to place commas in the output.

Defining a Successor Function
Defining a function that returns the successor combination to a given combination is a well-known problem. The demo program defines a Successor() function in Listing 2.

Listing 2: The Combination Successor() Function

static int[] Successor(int[] comb, int n, int k)
{
  if (IsLast(comb, n, k) == true)
    return null;

  int[] result = new int[k];  // make copy
  for (int i = 0; i < k; ++i)
    result[i] = comb[i];

  int idx = k - 1;
  while (idx > 0 && result[idx] == n - k + idx)
    --idx;
      
  ++result[idx];

  for (int j = idx; j < k - 1; ++j)
    result[j + 1] = result[j] + 1;

  return result;
}

Suppose n = 10 and k = 6 and a current combination is (0, 2, 3, 4, 8, 9). The successor combination is (0, 2, 3, 5, 6, 7).

The Successor() function first finds the left-most index (idx) of the value to increment. In this example idx = [3]. The value at [3] is 4 and it is incremented giving an intermediate result of (0, 2, 3, 5, 8, 9). Then the algorithm sequentially increments the tail values to the right of idx giving the final successor result of (0, 2, 3, 5, 6, 7).

With a Successor() function in hand, it's possible to iterate through all combinations with code like:

int[] c = MakeComb(5, 3);  // [0,1,2]
Console.WriteLine("All n=5, k=3 combinations: ");
while (c != null)
{
  ShowComb(c);
  c = Successor(c, n, k);
}

The demo Successor() function returns null if the current combination is the last combination. For example, for n = 8 and k = 3, the last combination in lexicographical order is (5, 6, 7). The demo program defines an IsLast() function as:

static bool IsLast(int[] comb, int n, int k)
{
  // is comb(8,3) like [5,6,7] ?
  if (comb[0] == n - k)
    return true;
  else
    return false;
}

The last (n, k) combination will have the form (n-k), (n-k+1), . . (n-1). Because of lexicographical ordering, it's enough to check the first value in the combination to see if it is n-k because that condition is true only for the last combination.

The mth Lexicographical Combination Element Directly
Suppose you want a specific combination element. For example, all 10 n = 5, k = 3 combinations are:

[0] 0 1 2
[1] 0 1 3
[2] 0 1 4
[3] 0 2 3
[4] 0 2 4
[5] 0 3 4
[6] 1 2 3
[7] 1 2 4
[8] 1 3 4
[9] 2 3 4

You want a function named Element that accepts an index, m, and returns the specified combination, for example, for n = 5 and k = 3, Element(m=5) returns (0, 3, 4).

Because there are only Choose(5, 3) = 10 possible combinations you could start with the initial (0, 1, 2) combination and then call the Successor() function inside a for-loop with five iterations and end up at the desired combination.

But this simple approach only works for very small values of n and k. Somewhat surprisingly, it is possible to write a function that directly returns a specified combination. The demo program defines an Element() function in Listing 3.

Listing 3: Directly Computing the mth Combination Element

static int[] Element(BigInteger m, int n, int k)
{
  // compute element [m] using the combinadic
  BigInteger maxM = Choose(n, k) - 1;

  if (m > maxM)
    throw new Exception("m value is too large in Element");

  int[] ans = new int[k];

  int a = n;
  int b = k;
  BigInteger x = maxM - m; // x is the "dual" of m

  for (int i = 0; i < k; ++i)
  {
    ans[i] = LargestV(a, b, x); // see helper below    
    x = x - Choose(ans[i], b);
    a = ans[i];
    b = b - 1;
  }

  for (int i = 0; i < k; ++i)
    ans[i] = (n - 1) - ans[i];

  return ans;
}

static int LargestV(int a, int b, BigInteger x)
{
  int v = a - 1;
  while (Choose(v, b) > x)
    --v;
  return v;
}

The Element() function calls a helper LargestV() function. The Element() function can be called like this:

int n = 100;
int k = 10;
BigInteger m = BigInteger.Parse("9999999999");
int[] c = Element(m, n, k);
Console.WriteLine("Combination [" + m + "] " +
  " of C(n=100,k=10): ");
ShowComb(c);

The algorithm for the Element() function is beautiful but quite complicated. The algorithm is based on expressing an integer as a "combinadic," which is a linear combination of Choose() function values. You can find a detailed explanation of the algorithm in my blog post.

Wrapping Up
The examples in this article use zero-based combinations. This is convenient because in many practical uses of combinations, combination values map to array indices. In a pure mathematical context, one-based combinations are more common.

An idea that's closely related to mathematical combinations has the rather odd name of "Stirling numbers of the second kind," usually written as S(n, k). The function returns the number of ways to partition n items into k subsets. For example, suppose n = 4 and k = 2, then S(4, 2) = 7. If the n items are (0, 1, 2, 3) then the 7 partitions are: (0 | 123), (1 | 023), (2 | 013), (3 | 012), (01 | 23), (02 | 13), and (03 | 12). It's possible to implement the S(n, k) function using the ideas presented in this article.

About the Author

Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products including Azure and Bing. James can be reached at [email protected].

comments powered by Disqus

Featured

  • IDE Irony: Coding Errors Cause 'Critical' Vulnerability in Visual Studio

    In a larger-than-normal Patch Tuesday, Microsoft warned of a "critical" vulnerability in Visual Studio that should be fixed immediately if automatic patching isn't enabled, ironically caused by coding errors.

  • Building Blazor Applications

    A trio of Blazor experts will conduct a full-day workshop for devs to learn everything about the tech a a March developer conference in Las Vegas keynoted by Microsoft execs and featuring many Microsoft devs.

  • Gradient Boosting Regression Using C#

    Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the gradient boosting regression technique, where the goal is to predict a single numeric value. Compared to existing library implementations of gradient boosting regression, a from-scratch implementation allows much easier customization and integration with other .NET systems.

  • Microsoft Execs to Tackle AI and Cloud in Dev Conference Keynotes

    AI unsurprisingly is all over keynotes that Microsoft execs will helm to kick off the Visual Studio Live! developer conference in Las Vegas, March 10-14, which the company described as "a must-attend event."

  • Copilot Agentic AI Dev Environment Opens Up to All

    Microsoft removed waitlist restrictions for some of its most advanced GenAI tech, Copilot Workspace, recently made available as a technical preview.

Subscribe on YouTube