The Data Science Lab


Simple k-NN Regression Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of k-nearest neighbors regression to predict a single numeric value. Compared to other machine learning regression techniques, k-NN regression is often slightly less accurate, but is very simple to implement and customize, and the results are highly interpretable.

DBSCAN Clustering and Anomaly Detection Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of data clustering and anomaly detection using the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm. Compared to other anomaly detection systems based on data clustering, DBSCAN can find significantly different types of anomalies.

Winnow Classification Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the Winnow classification technique. Winnow classification is used for a very specific scenario where the target variable to predict is binary and all the predictor variables are also binary.

Implementing k-NN Classification Using C#

Dr. James McCaffrey of Microsoft Research presents a full demo of k-nearest neighbors classification on mixed numeric and categorical data. Compared to other classification techniques, k-NN is easy to implement, supports numeric and categorical predictor variables, and is highly interpretable.

Logistic Regression with Batch SGD Training and Weight Decay Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end program that explains how to perform binary classification (predicting a variable with two possible discrete values) using logistic regression, where the prediction model is trained using batch stochastic gradient descent with weight decay.

AdaBoost Binary Classification Using C#

Dr. James McCaffrey from Microsoft Research presents a C# program that illustrates using the AdaBoost algorithm to perform binary classification for spam detection. Compared to other classification algorithms, AdaBoost is powerful and works well with small datasets, but is sometimes susceptible to model overfitting.

Artificial Immune Systems for Intrusion Detection Using C#

Dr. James McCaffrey from Microsoft Research presents a demonstration program that models biological immune systems to identify network intrusion threats. The demo illustrates challenges with artificial immune systems as well as promising new approaches.

Data Anomaly Detection Using LightGBM

Dr. James McCaffrey from Microsoft Research presents a complete Python program that uses the LightGBM system to create a custom autoencoder for data anomaly detection. You can easily adapt the demo program for your own anomaly detection scenarios.

Data Dimensionality Reduction Using a Neural Autoencoder with C#

Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on using a neural autoencoder to create an approximation of a dataset, where the approximation has fewer columns than the original.

Binary Classification Using LightGBM

Dr. James McCaffrey from Microsoft Research presents a full-code, step-by-step tutorial on using the LightGBM tree-based system to perform binary classification (predicting a discrete variable that has exactly two possible values).

Nearest Centroid Classification for Numeric Data Using C#

Here's a complete end-to-end demo of what Dr. James McCaffrey of Microsoft Research says is arguably the simplest possible classification technique.

Regression Using LightGBM

Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on this powerful machine learning technique used to predict a single numeric value.

Clustering Mixed Categorical and Numeric Data Using k-Means with C#

Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on a "very tricky" machine learning technique.

Multi-Class Classification Using LightGBM

Dr. James McCaffrey of Microsoft Research provides a full-code, step-by-step machine learning tutorial on how to use the LightGBM system to perform multi-class classification using Python and the scikit-learn library.

Data Anomaly Detection Using a Neural Autoencoder with C#

Dr. James McCaffrey of Microsoft Research tackles the process of examining a set of source data to find data items that are different in some way from the majority of the source items.
