The Data Science Lab

How to Invert a Machine Learning Matrix Using C#

Inverting a matrix is one of the most common tasks in data science and machine learning. In this article I explain why inverting a matrix is very difficult and present code that you can use as-is, or as a starting point for custom matrix inversion scenarios. Specifically, this article presents an implementation of matrix inversion using Crout's decomposition.

There are many different techniques to invert a matrix. The Wikipedia article on matrix inversion lists 10 categories of techniques, and each category has many variations. The fact that there are so many different ways to invert a matrix is an indirect indication of how difficult the problem is. Briefly, relatively simple matrix inversion techniques such as using cofactors and adjugates only work well for small matrices (roughly 10 x 10 or smaller). For larger matrices you should write code that involves a complex technique called matrix decomposition.

Figure 1: Matrix Decomposition Demo Program
[Click on image for larger view.] Figure 1: Matrix Decomposition Demo Program

A good way to see where this article is headed is to take a look at the screenshot of a demo program in Figure 1. The demo begins by creating a 4 x 4 matrix:

3.0  7.0  2.0  5.0
4.0  0.0  1.0  1.0
1.0  6.0  3.0  0.0
2.0  8.0  4.0  3.0

Next, the demo computes the determinant of the matrix. If the determinant of a matrix is 0, then the matrix doesn't have an inverse. Because the determinant is -115.0 the matrix does have an inverse which the demo computes as:

 0.1565   0.2348   0.2696  -0.3391
 0.2696  -0.0957   0.4087  -0.4174
-0.5913   0.1130  -0.5739   0.9478
-0.0348  -0.0522  -0.5043   0.4087

The demo concludes by performing matrix multiplication on the original matrix and the inverse matrix giving a result matrix of:

1.0  0.0  0.0  0.0
0.0  1.0  0.0  0.0
0.0  0.0  1.0  0.0
0.0  0.0  0.0  1.0

Because the result is the identity matrix (1s on the main diagonal, 0s elsewhere), the inverse matrix is correct.

This article assumes you have intermediate or better skill with C# and a basic familiarity with matrices but doesn’t assume you know anything about matrix inversion using decomposition. The code for demo program is a bit too long to present in its entirety in this article but the complete code is available in the associated file download.

Understanding the Inverse of a Matrix
In regular arithmetic, the inverse of a number z is a number that when multiplied by z gives 1. For example, if z = 4, the inverse of z is 1/4 = 0.25 because 4 * (1/4) = 1.

Matrix inversion extends this idea. The inverse of an n x n (called a square matrix because the number of rows equals the number of columns) matrix m is a matrix inv such that m * inv = I where I is the identity matrix (1s on the diagonal, 0s elsewhere).

Not all matrices have an inverse. As it turns out, there is a scalar ("single number") value called the determinant of a matrix. If the determinant of a matrix is 0, then the matrix does not have an inverse.

Suppose a 3 x 3 matrix is:

1.0  4.0  0.0
3.0  2.0  5.0
7.0  8.0  6.0

The determinant of the matrix is calculated as:

+1.0 * [(2.0)(6.0) - (5.0)(8.0)]
-4.0 * [(3.0)(6.0) - (5.0)(7.0)]
+0.0 * [(3.0)(8.0) - (2.0)(7.0)]

= (1.0 * -28.0) - (4.0 * -17.0) + (0.0 * 10.0) 
= -28.0 + 68.0 + 0.0
= 40.0

Every square matrix has a determinant. For matrices with shapes larger than 3 x 3, calculating the determinant in an efficient way is surprisingly difficult.

Matrix Representation and Matrix Multiplication
In C-family programming languages, the most common way to represent a matrix is as an array-of-arrays. For example:

double[][] m = new double[3][];
for (int i = 0; i < 3; ++i)  // 3 rows
  m[i] = new double[4];  // 4 columns

The code creates a 3 x 4 matrix where each of the 12 cells is implicitly initialized to a 0.0 value. The demo program defines a function to create a matrix as:

static double[][] MatCreate(int rows, int cols)
{
  double[][] result = new double[rows][];
  for (int i = 0; i < rows; ++i)
    result[i] = new double[cols];
  return result;
}

The function can be used like:

double[][] m = MatCreate(3, 4);
m[0][0] = 1.0;
// etc.
m[2][3] = 12.0;

Representing a matrix as an array-of-arrays is illustrated in the top image in Figure 2. Unlike most programming languages, C# supports a true two-dimensional matrix type using special syntax, for example:

double[,] m = new double[3,4]; // 3 rows, 4 cols
m[0, 0] = 1.0;
m[0, 1] = 2.0;
. . .
m[2, 3] = 12.0;

In addition to the fact that this type of matrix is not supported by most other programming languages, there are several technical disadvantages of using this special syntax approach and I don't recommend it.

Figure 2: Matrix Representation and Multiplication
[Click on image for larger view.] Figure 2: Matrix Representation and Multiplication

Another way to represent a matrix is to use a program-defined class along the lines of:

public class MyMatrix
{
  private int nRows;
  // etc.
  private double[][] data;
  // etc., etc., etc.
}

In my opinion, in most (but not all) scenarios using a program-defined class to define a matrix is unnecessarily complex and an example of OOP run amok.

In order to understand matrix inversion, you must understand matrix multiplication. Matrix multiplication is perhaps best explained by example. Take a look at the example in the bottom image in Figure 2. The value at cell [r][c] of the result matrix is the sum of the products of the values in row r of the first matrix and the corresponding values in column c of the second matrix.

When finding the inverse of a matrix you only work with square matrices, but matrix multiplication can be applied to matrices with different shapes. In these situations the matrices must be what's called conformable. If matrix A has shape a x n and matrix B has shape n x b, the result of multiplication has shape a x b. The number of columns in the first matrix must equal the number of rows in the second matrix.

The demo program implements matrix multiplication with method MatProduct defined (with some minor edits) as:

static double[][] MatProduct(double[][] matA,
  double[][] matB)
{
  int aRows = matA.Length;
  int aCols = matA[0].Length;
  int bRows = matB.Length;
  double[][] result = MatCreate(aRows, bCols);
  for (int i = 0; i < aRows; ++i) // row of A
    for (int j = 0; j < bCols; ++j) // col of B
      for (int k = 0; k < aCols; ++k)
        result[i][j] += matA[i][k] * matB[k][j];
  return result;
}

The demo uses a brute force approach, but because the calculation of each cell in the result matrix is independent, matrix multiplication could be performed in parallel using the Parallel.For method from the .NET Task Parallel Library.

The Demo Program
To create the demo program, I launched Visual Studio 2019. I used the Community (free) edition but any relatively recent version of Visual Studio will work fine. From the main Visual Studio start window I selected the "Create a new project" option. Next, I selected C# from the Language dropdown control and Console from the Project Type dropdown, and then picked the "Console App (.NET Core)" item.

The code presented in this article will run as a .NET Core console application or as a .NET Framework application. Many of the newer Microsoft technologies, such as the ML.NET code library, specifically target .NET Core so it makes sense to develop most new C# machine learning code in that environment.

I entered "MatrixInverse" as the Project Name, specified C:\VSM on my local machine as the Location (you can use any convenient directory), and checked the "Place solution and project in the same directory" box.

After the template code loaded into Visual Studio, at the top of the editor window I removed all using statements to unneeded namespaces, leaving just the reference to the top level System namespace. The demo needs no other assemblies and uses no external code libraries.

In the Solution Explorer window, I renamed file Program.cs to the more descriptive MatrixInverseProgram.cs and then in the editor window I renamed class Program to class MatrixInverseProgram to match the file name. The structure of the demo program, with a few minor edits to save space, is shown in Listing 1.

Listing 1. Matrix Inversion Demo Program Structure

using System;
namespace MatrixInverse
{
  class MatrixInverseProgram
  {
    static void Main(string[] args)
    {
      Console.WriteLine("Begin inverse demo");
      double[][] m = new double[4][];
      m[0] = new double[] { 3, 7, 2, 5 };
      m[1] = new double[] { 4, 0, 1, 1 };
      m[2] = new double[] { 1, 6, 3, 0 };
      m[3] = new double[] { 2, 8, 4, 3 };

      Console.WriteLine("Original matrix m is ");
      MatShow(m, 4, 8);

      double d = MatDeterminant(m);
      if (Math.Abs(d) < 1.0e-5)
        Console.WriteLine("Matrix has no inverse");
      else
        Console.WriteLine("Det(m) = " +
          d.ToString("F4"));

      double[][] inv = MatInverse(m);
      Console.WriteLine("Inverse matrix inv is ");
      MatShow(inv, 4, 8);

      double[][] prod = MatProduct(m, inv);
      Console.WriteLine("product of m * inv is ");
      MatShow(prod, 1, 6);

      Console.WriteLine("End demo");
      Console.ReadLine();
    } // Main

    static double[][] MatInverse(double[][] m) { . . }
    
    static int MatDecompose(double[][] m,
      out double[][] lum, out int[] perm) { . . }
    
    static double[] Reduce(double[][] luMatrix,
      double[] b) { . . {

    static double MatDeterminant(double[][] m) { . . }
    
    static double[][] MatCreate(int rows,
      int cols) { . . }
    
    static double[][] MatProduct(double[][] matA,
      double[][] matB) { . . }

    static void MatShow(double[][] m,
      int dec, int wid) { . . }

  } // Program
} //ns

All of the program logic is contained in the Main() method. The demo uses static methods defined in the Program class. An alternative is to define a simple container class named something like MatLib and then call the functions like double[][] m = MatLib.MatCreate(4, 4).

The demo begins by creating a 4 x 4 matrix m:

double[][] m = new double[4][];
m[0] = new double[] { 3, 7, 2, 5 };
m[1] = new double[] { 4, 0, 1, 1 };
m[2] = new double[] { 1, 6, 3, 0 };
m[3] = new double[] { 2, 8, 4, 3 };

In many scenarios you'd want to define a function to load a matrix into memory along the lines of:

string fn = "..\\data.txt";
double[][] m = MatLoad(fn, 4, 4, '\t');

The demo displays the matrix like so:

Console.WriteLine("Original matrix m is ");
MatShow(m, 4, 8);

The parameters of MatShow() specify the number of digits after the decimal and the total width per value, with blank space padding used. An alternative approach is to define a function that returns a string representation like:

string s = MatAsString(m, 4, 8);
Console.WriteLine(s);

Next, the demo program computes and displays the value of the determinant:

double d = MatDeterminant(m);
if (Math.Abs(d) < 1.0e-5)
  Console.WriteLine("Matrix has no inverse");
else
  Console.WriteLine("Det(m) = " +
    d.ToString("F4"));

Computing the determinant of a matrix in an efficient way is very difficult as you'll see shortly. Behind the scenes, the MatDeterminant() function uses the complex matrix decomposition technique. Next, the demo computes and displays the inverse:

double[][] inv = MatInverse(m);
Console.WriteLine("Inverse matrix inv is ");
MatShow(inv, 4, 8);

Function MatInverse() calls functions MatCreate(), MatDecompose(), and Reduce(). Function MatDecompose() implements Crout's algorithm and is the heart of the demo. Function Reduce() accepts the results of MatDecompose() and uses them to compute the inverse matrix final result.

The demo concludes by performing matrix multiplication to verify that the inverse matrix is correct:

double[][] prod = MatProduct(m, inv);
Console.WriteLine("product of m * inv is ");
MatShow(prod, 1, 6);

The correctness of the inverse is determined by visually noting that the product of the source matrix and its inverse is the identity matrix. An alternative technique is to programmatically verify the inverse by defining methods to create an identity matrix and compare two matrices:

double[][] iden = MatIdentity(4, 4); // 1s on diag
bool good = MatEqual(prod, iden, 1.0e-5);
if (good == true)
  Console.WriteLine("correct inverse");
else
  Console.WriteLine("bad inverse"):

Because the values in each matrix are type double, you don't want to compare cell values directly, so you specify a tolerance such as 1.0e-5 for how close two values must be in order to be considered equal.

Understanding Matrix Decomposition
The demo program defines several methods that illustrate what matrix decomposition is. See the screenshot in Figure 3.

In regular arithmetic you can decompose a number z into two values a and b so that the product a * b equals z. For example if z = 30, one decomposition is a = 5 and b = 6. Notice that there are many possible decompositions of a number. Matrix decomposition extends this idea by finding two matrices that when multiplied together give a source matrix.

Figure 3: Matrix Decomposition Demo
[Click on image for larger view.] Figure 3: Matrix Decomposition Demo

Recall the original demo matrix is:

3.0  7.0  2.0  5.0
4.0  0.0  1.0  1.0
1.0  6.0  3.0  0.0
2.0  8.0  4.0  3.0

The computed decomposition of the matrix is:

4.0000   0.0000   1.0000   1.0000
0.5000   8.0000   3.5000   2.5000
0.7500   0.8750  -1.8125   2.0625
0.2500   0.7500  -0.0690  -1.9828

The decomposition is called an LU ("lower-upper") decomposition because the decomposition combines the two result matrices of the decomposition into a single matrix. Specifically, the lower part is:

1.0000  0.0000  0.0000  0.0000
0.5000  1.0000  0.0000  0.0000
0.7500  0.8750  1.0000  0.0000
0.2500  0.7500 -0.0690  1.0000

and the upper part is:

4.0000  0.0000  1.0000  1.0000
0.0000  8.0000  3.5000  2.5000
0.0000  0.0000 -1.8125  2.0625
0.0000  0.0000  0.0000 -1.9828

The lower part has 1s on its diagonal and all the values below the diagonal of the combined decomposition, with 0s above the diagonal. The upper part has the diagonal elements of the combined decomposition and the values above the diagonal, with 0s below the diagonal.

There are several different types of matrix decomposition. The demo uses what is called Crout's algorithm. A common alternative that works slightly differently is called Doolittle's algorithm.

If you refer to the screenshot in Figure 3, the code multiplies the lower and upper parts of the decomposition giving:

4.0  0.0  1.0  1.0
2.0  8.0  4.0  3.0
3.0  7.0  2.0  5.0
1.0  6.0  3.0  0.0

which is not quite the same as the original matrix:

3.0  7.0  2.0  5.0
4.0  0.0  1.0  1.0
1.0  6.0  3.0  0.0
2.0  8.0  4.0  3.0

The rows of the result of lower * upper are permuted relative to the source matrix. The permutation of the rows is stored in an integer array named perm with values (1, 3, 0, 2). This means row 0 of the permuted LU product is row 1 of the original matrix, row 1 of permuted LU product is row 3 of the original matrix, and so on.

Although it's not at all obvious, the inverse of a matrix can be computed from the upper and lower decompositions. Somewhat surprisingly, it's easier to compute a decomposition of a source matrix and then use the decomposition to compute the inverse of the source matrix, than it is to compute the inverse directly.

The code that generated the screenshot in Figure 3 starts with:
double[][] m = new double[4][];
m[0] = new double[] { 3, 7, 2, 5 };
m[1] = new double[] { 4, 0, 1, 1 };
m[2] = new double[] { 1, 6, 3, 0 };
m[3] = new double[] { 2, 8, 4, 3 };

The decomposition is computed by these statements:

double[][] lum;
int[] perm;
int toggle = MatDecompose(m, out lum, out perm);
Console.WriteLine("decomposition of m is");
MatShow(lum, 4, 8);

The MatDecompose() function returns three values. The explicit return is an integer value that is either +1 or -1 depending on whether the number of rows permutations of the decomposition is even or odd. The explicit return value is not needed by function MatInverse() and isn't used by the code that illustrates decomposition. The explicit return value is needed by function MatDeterminant() which also calls MatDecompose().

The second return value of MatDecompose() is an out-parameter lum which is an array-of-arrays style matrix that stores the values of the lower-upper Crout's decomposition. The third return value of MatDecompose() is another out-parameter perm which is an integer array that stores the row permutation information.

The demo code continues by extracting the lower and upper parts of the decomposition:

double[][] lower = ExtractLower(lum);
double[][] upper = ExtractUpper(lum);
Console.WriteLine("lower part of LUM is");
MatShow(lower, 4, 8);
Console.WriteLine("upper part of LUM is");
MatShow(upper, 4, 8);

Functions ExtractLower() and ExtractUpper() are purely helper functions to illustrate matrix decomposition and aren't needed in non-demo scenarios.

To summarize, function MatInverse() calls MatCreate() to set up the result matrix inverse. Function MatInverse() calls function MatDecompose() which returns three values. The explicit return is not needed but the decomposition matrix return value is used by helper function Reduce() and the permutation array return value is needed too in order to compute the result inverse matrix. Inverting a matrix is not easy.

The MatInverse() Function Implementation
The implementation of function MatInverse() is presented in Listing 2. The function has all error checking removed to keep the main ideas as clear as possible. In a production scenario you'd have to add a significant amount of error checking that corresponds to the context in which you're computing the matrix inverse.

Listing 2. The MatInverse() Function

static double[][] MatInverse(double[][] m)
{
  // assumes determinant is not 0
  // that is, the matrix does have an inverse
  int n = m.Length;
  double[][] result = MatCreate(n, n); // make copy
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j)
      result[i][j] = m[i][j];

  double[][] lum; // combined lower & upper
  int[] perm;  // out parameter
  MatDecompose(m, out lum, out perm); 

  double[] b = new double[n];
  for (int i = 0; i < n; ++i)
  {
    for (int j = 0; j < n; ++j)
      if (i == perm[j])
        b[j] = 1.0;
      else
        b[j] = 0.0;

    double[] x = Reduce(lum, b); // 
    for (int j = 0; j < n; ++j)
      result[j][i] = x[j];
  }
  return result;
}

The implementation is deceptively short because most of the work is done by functions MatDecompose() and Reduce().

Implementing the MatDeterminant() Function
A surprising side effect of matrix decomposition is that the determinant of a matrix can be computed by multiplying the values of the diagonal of the combined lower-upper decomposition matrix and the parity return value of the decomposition. Quite remarkable! The demo implementation is:

static double MatDeterminant(double[][] m)
{
  double[][] lum;
  int[] perm;
  double result = MatDecompose(m, out lum,
    out perm);  // impl. cast
  for (int i = 0; i < lum.Length; ++i)
    result *= lum[i][i];
  return result;
}

Functions MatInverse() and MatDeterminant() both use matrix decomposition and you often want to call MatDeterminant() before calling MatInverse() to see if the source matrix does have a matrix (the determinant is non-zero). Therefore, one strategy is to combine the calls like so:

static double[][] MatInverse(double[][] m)
{
  // check that m is square
  // compute decomposition of m
  // use decomposition to compute determinant
  // if determinant is 0 exit else
  //   use decomposition to compute inverse
}

Wrapping Up
Matrix inversion is used by dozens of machine learning algorithms and techniques. Examples include iterated Newton-Raphson optimization (for logistic regression binary classification), computing Mahalanobis distance (for k-nearest neighbors classification), and solving systems of linear equations.

One of the difficulties for engineers who are new to machine learning is that when encountering a new technique, that new technique may involve a sub-topic, such as matrix inversion, that must be learned. The consequence is that there isn't one clear path for gaining mastery of machine learning techniques. Every person has a different optimal learning path which depends on their background knowledge of techniques like matrix inversion.

comments powered by Disqus

Featured

Subscribe on YouTube