The Data Science Lab


Data Prep for Machine Learning: Missing Data

Turning his attention to the extremely time-consuming task of machine learning data preparation, Dr. James McCaffrey of Microsoft Research explains how to examine data files and how to identify and deal with missing data.

Working With PyTorch Tensors

Dr. James McCaffrey of Microsoft Research presents the fundamental concepts of tensors necessary to establish a solid foundation for learning how to create PyTorch neural networks, based on his teaching many PyTorch training classes at work.

Getting Started with PyTorch 1.5 on Windows

Dr. James McCaffrey of Microsoft Research uses a complete demo program, samples and screenshots to explain how to install the Python language and the PyTorch library on Windows, and how to create and run a minimal, but complete, neural network classifier.

Clustering Non-Numeric Data Using C#

Clustering non-numeric -- or categorical -- data is surprisingly difficult, but it's explained here by resident data scientist Dr. James McCaffrey of Microsoft Research, who provides all the code you need for a complete system using an algorithm based on a metric called category utility (CU), a measure of how much information you gain by clustering.

Data Clustering with K-Means++ Using C#

Dr. James McCaffrey of Microsoft Research explains the k-means++ technique for data clustering, the process of grouping data items so that similar items are in the same cluster. The clusters can then be examined by a person to see if any interesting patterns have emerged, or used by software systems such as anomaly detection.

How to Do Kernel Logistic Regression Using C#

Dr. James McCaffrey of Microsoft Research uses code samples, a full C# program and screenshots to detail the ins and outs of kernel logistic regression, a machine learning technique that extends regular logistic regression -- used for binary classification -- to deal with data that is not linearly separable.

How to Invert a Machine Learning Matrix Using C#

VSM Senior Technical Editor Dr. James McCaffrey, of Microsoft Research, explains why inverting a matrix -- one of the more common tasks in data science and machine learning -- is difficult and presents code that you can use as-is, or as a starting point for custom matrix inversion scenarios.

How to Train a Machine Learning Radial Basis Function Network Using C#

A radial basis function network (RBF network) is a software system that's similar to a single hidden layer neural network, explains Dr. James McCaffrey of Microsoft Research, who uses a full C# code sample and screenshots to show how to train an RBF network classifier.

How to Create a Radial Basis Function Network Using C#

Dr. James McCaffrey of Microsoft Research explains how to design a radial basis function (RBF) network -- a software system similar to a single hidden layer neural network -- and describes how an RBF network computes its output.

How to Do Machine Learning Evolutionary Optimization Using C#

Resident data scientist Dr. James McCaffrey of Microsoft Research turns his attention to evolutionary optimization, using a full code download, screenshots and graphics to explain this machine learning technique used to train many types of models by modeling the biological processes of natural selection, evolution, and mutation.

How to Do Multi-Class Logistic Regression Using C#

Dr. James McCaffrey of Microsoft Research uses a full code program, examples and graphics to explain multi-class logistic regression, an extension technique that allows you to predict a class that can be one of three or more possible values, such as predicting the political leaning of a person (conservative, moderate, liberal) based on age, sex, annual income and so on.

How to Create a Machine Learning Decision Tree Classifier Using C#

After earlier explaining how to compute disorder and split data in his exploration of machine learning decision tree classifiers, resident data scientist Dr. James McCaffrey of Microsoft Research now shows how to use the splitting and disorder code to create a working decision tree classifier.

How to Compute Disorder for Machine Learning Decision Trees Using C#

Using a decision tree classifier from a machine learning library is often awkward because it usually must be customized and library decision trees have many complex supporting functions, says resident data scientist Dr. James McCaffrey, so when he needs a decision tree classifier, he always creates one from scratch. Here's how.

How to Do Machine Learning Perceptron Classification Using C#

Dr. James McCaffrey of Microsoft Research uses code samples and screenshots to explain perceptron classification, a machine learning technique that can be used for predicting whether a person is male or female based on numeric predictors such as age, height, weight, and so on. It's mostly useful to provide a baseline result for comparison with more powerful ML techniques such as logistic regression and k-nearest neighbors.

How to Do Naive Bayes with Numeric Data Using C#

Dr. James McCaffrey of Microsoft Research uses a full code sample and screenshots to demonstrate how to create a naive Bayes classification system when the predictor values are numeric, using the C# language without any special code libraries.
