The Data Science Lab


Gradient Boosting Regression Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the gradient boosting regression technique, where the goal is to predict a single numeric value. Compared to existing library implementations of gradient boosting regression, a from-scratch implementation allows much easier customization and integration with other .NET systems.

Random Forest Regression and Bagging Regression Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the random forest regression technique (and a variant called bagging regression), where the goal is to predict a single numeric value. The demo program uses C#, but it can be easily refactored to other C-family languages.

AdaBoost Regression Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the AdaBoost.R2 algorithm for regression problems (where the goal is to predict a single numeric value). The implementation follows the original source research paper closely, so you can use it as a guide for customization for specific scenarios.

Decision Tree Regression from Scratch Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of decision tree regression using the C# language. Unlike most implementations, this one does not use recursion or pointers, which makes the code easy to understand and modify.

Simple k-NN Regression Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of k-nearest neighbors regression to predict a single numeric value. Compared to other machine learning regression techniques, k-NN regression is often slightly less accurate, but is very simple to implement and customize, and the results are highly interpretable.

DBSCAN Clustering and Anomaly Detection Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of data clustering and anomaly detection using the DBSCAN (Density Based Spatial Clustering of Applications with Noise) algorithm. Compared to other anomaly detection systems based on data clustering, DBSCAN can find significantly different types of anomalies.

Winnow Classification Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the Winnow classification technique. Winnow classification is used for a very specific scenario where the target variable to predict is binary and all the predictor variables are also binary.

Implementing k-NN Classification Using C#

Dr. James McCaffrey of Microsoft Research presents a full demo of k-nearest neighbors classification on mixed numeric and categorical data. Compared to other classification techniques, k-NN is easy to implement, supports numeric and categorical predictor variables, and is highly interpretable.

Logistic Regression with Batch SGD Training and Weight Decay Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end program that explains how to perform binary classification (predicting a variable with two possible discrete values) using logistic regression, where the prediction model is trained using batch stochastic gradient descent with weight decay.

AdaBoost Binary Classification Using C#

Dr. James McCaffrey from Microsoft Research presents a C# program that illustrates using the AdaBoost algorithm to perform binary classification for spam detection. Compared to other classification algorithms, AdaBoost is powerful and works well with small datasets, but is sometimes susceptible to model overfitting.

Artificial Immune Systems for Intrusion Detection Using C#

Dr. James McCaffrey from Microsoft Research presents a demonstration program that models biological immune systems to identify network intrusion threats. The demo illustrates challenges with artificial immune systems as well as promising new approaches.

Black White Wave IMage

Data Anomaly Detection Using LightGBM

Dr. James McCaffrey from Microsoft Research presents a complete program that uses the Python language LightGBM system to create a custom autoencoder for data anomaly detection. You can easily adapt the demo program for your own anomaly detection scenarios.

Data Dimensionality Reduction Using a Neural Autoencoder with C#

Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on creating an approximation of a dataset that has fewer columns.

Binary Classification Using LightGBM

Dr. James McCaffrey from Microsoft Research presents a full-code, step-by-step tutorial on using the LightGBM tree-based system to perform binary classification (predicting a discrete variable that has exactly two possible values).

Nearest Centroid Classification for Numeric Data Using C#

Here's a complete end-to-end demo of what Dr. James McCaffrey of Microsoft Research says is arguably the simplest possible classification technique.

Subscribe on YouTube