The Data Science Lab


K-Means Data Clustering from Scratch Using C#

K-means is comparatively simple and works well with large datasets, but it assumes clusters are roughly spherical, so it can find only simple cluster geometries.

DBSCAN Data Clustering from Scratch Using C#

Compared to other clustering techniques, DBSCAN does not require you to explicitly specify how many data clusters to use, explains Dr. James McCaffrey of Microsoft Research in this full-code, step-by-step machine learning tutorial.

Gaussian Mixture Model Data Clustering from Scratch Using C#

Dr. James McCaffrey of Microsoft Research explains GMM clustering in a full-code, step-by-step tutorial, noting that his data scientist colleagues have different opinions about the complicated technique.

Neural Network Regression from Scratch Using C#

Compared to other regression techniques, a well-tuned neural network regression system can produce the most accurate prediction model, says Dr. James McCaffrey of Microsoft Research in presenting this full-code, step-by-step tutorial.

Decision Tree Regression from Scratch Using C#

Dr. James McCaffrey of Microsoft Research says the technique is easy to tune, works well with small datasets and produces highly interpretable predictions, but it also has trade-offs.

Gaussian Process Regression from Scratch Using C#

GPR works well with small datasets and generates a metric of confidence of a predicted result, but it's moderately complex and the results are not easily interpretable, says Dr. James McCaffrey of Microsoft Research in this full-code tutorial.

Weighted k-Nearest Neighbors Regression Using C#

The main advantages of KNNR are simplicity and interpretability, says Dr. James McCaffrey of Microsoft Research in presenting this full-code, step-by-step tutorial.

Kernel Ridge Regression Using C#

KRR is especially useful when there is limited training data, says Dr. James McCaffrey of Microsoft Research in this full-code, step-by-step tutorial.

Linear Ridge Regression Using C#

Implementing LRR from scratch is harder than using a library like scikit-learn, but it helps you customize your code, makes it easier to integrate with other systems, and gives you a complete understanding of how LRR works.

Gaussian Process Regression Using the scikit Library

Dr. James McCaffrey of Microsoft Research offers a full-code, step-by-step tutorial for this technique, which is especially useful when there is limited training data.

Regression Using scikit Kernel Ridge Regression

Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on this regression technique, which is especially useful when there is limited training data.

Binary Classification Using a scikit Neural Network

Machine learning with neural networks is sometimes said to be part art and part science. Dr. James McCaffrey of Microsoft Research teaches both with a full-code, step-by-step tutorial.

Gaussian Naive Bayes Classification Using the scikit Library

Dr. James McCaffrey of Microsoft Research says the main advantage of using Gaussian naive Bayes classification compared to other techniques like decision trees or neural networks is that you don't have to fine-tune model parameters.

Classification Using the scikit k-Nearest Neighbors Module

Dr. James McCaffrey of Microsoft Research uses a full-code, step-by-step demo to predict the species of a wheat seed based on seven predictor variables such as seed length, width and perimeter.

Regression Using a scikit MLPRegressor Neural Network

Dr. James McCaffrey of Microsoft Research uses a full-code, step-by-step demo to show how to predict a person's annual income based on their sex, age, state of residence and political leaning.
