The Data Science Lab


Data Anomaly Detection Using a Neural Autoencoder with C#

Dr. James McCaffrey of Microsoft Research tackles the process of examining a set of source data to find data items that are different in some way from the majority of the source items.

Just for Fun: A Five-Card Poker Library Using C#

Chances are, if you've had many coding interviews, you've been presented with a poker problem. Here's a great take from Dr. James McCaffrey of Microsoft Research.

The t-SNE Data Visualization Technique from Scratch Using C#

Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step example of a machine learning technique to visualize high-dimensional data.

Data Clustering Using a Self-Organizing Map (SOM) with C#

Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on a technique for visualizing and clustering data.

Principal Component Analysis from Scratch Using Singular Value Decomposition with C#

Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on a classical ML technique that transforms a dataset into one with fewer columns, useful for creating a graph of data that has more than two columns, for example.

Matrix Inverse from Scratch Using SVD Decomposition with C#

Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on an implementation of the technique that emphasizes simplicity and ease-of-modification over robustness and performance.

Machine Learning

Principal Component Analysis (PCA) from Scratch Using the Classical Technique with C#

Transforming a dataset into one with fewer columns is more complicated than it might seem, explains Dr. James McCaffrey of Microsoft Research in this full-code, step-by-step machine learning tutorial.

Matrix Inverse from Scratch Using QR Decomposition with C#

Dr. James McCaffrey of Microsoft Research guides you through a full-code, step-by-step tutorial on "one of the most important operations in machine learning."

Spectral Data Clustering from Scratch Using C#

Spectral clustering is quite complex, but it can reveal patterns in data that aren't revealed by other clustering techniques.

K-Means Data Clustering from Scratch Using C#

K-means is comparatively simple and works well with large datasets, but it assumes clusters are circular/spherical in shape, so it can only find simple cluster geometries.

DBSCAN Data Clustering from Scratch Using C#

Unlike many other clustering techniques, DBSCAN does not require you to explicitly specify how many data clusters to use, explains Dr. James McCaffrey of Microsoft Research in this full-code, step-by-step machine learning tutorial.

Gaussian Mixture Model Data Clustering from Scratch Using C#

Dr. James McCaffrey of Microsoft Research explains GMM clustering in a full-code, step-by-step tutorial, noting that his data scientist colleagues have different opinions about the complicated technique.

Neural Network Regression from Scratch Using C#

Compared to other regression techniques, a well-tuned neural network regression system can produce the most accurate prediction model, says Dr. James McCaffrey of Microsoft Research in presenting this full-code, step-by-step tutorial.

Decision Tree Regression from Scratch Using C#

Dr. James McCaffrey of Microsoft Research says the technique is easy to tune, works well with small datasets and produces highly interpretable predictions, but it also has trade-offs.

Gaussian Process Regression from Scratch Using C#

GPR works well with small datasets and generates a metric of confidence of a predicted result, but it's moderately complex and the results are not easily interpretable, says Dr. James McCaffrey of Microsoft Research in this full-code tutorial.
