The Data Science Lab


Multi-Class Classification Using a scikit Neural Network

Dr. James McCaffrey of Microsoft Research says a neural network model is arguably the most powerful multi-class classification technique.

Multi-Class Classification Using a scikit Decision Tree

Decision trees are useful for relatively small datasets that have a relatively simple underlying structure, and when the trained model must be easily interpretable, explains Dr. James McCaffrey of Microsoft Research, who provides step-by-step instructions and full source code.

Naive Bayes Classification Using the scikit Library

Dr. James McCaffrey of Microsoft Research shows how to predict a person's sex based on their job type, eye color and country of residence.

Binary Classification Using a scikit Decision Tree

Dr. James McCaffrey of Microsoft Research says decision trees are useful for relatively small datasets and when the trained model must be easily interpretable, but they often don't work well with large datasets and can be susceptible to model overfitting.

Logistic Regression Using the scikit Library

Dr. James McCaffrey of Microsoft Research says the main advantage of scikit is that it's easy to use (even though most classes have many constructor parameters).

Logistic Regression from Scratch Using Raw Python

The fundamental technique has been studied for decades, producing a huge body of information and many variations that make it hard to tell what is essential and what is not.

Multi-Class Classification Accuracy by Class Using PyTorch

Dr. James McCaffrey of Microsoft Research: When multi-class data is skewed toward one or more classes, it's very important to analyze accuracy by class.

The Traveling Salesman Problem Using an Evolutionary Algorithm with C#

Dr. James McCaffrey of Microsoft Research uses full code samples to detail an evolutionary algorithm technique that apparently hasn't been published before.

Simple Numerical Optimization Using an Evolutionary Algorithm with C#

Dr. James McCaffrey of Microsoft Research says that when quantum computing becomes generally available, evolutionary algorithms for training huge neural networks could become a very important and common technique.

Regression Using PyTorch New Best Practices, Part 2: Training, Accuracy, Predictions

Dr. James McCaffrey of Microsoft Research updates regression techniques and best practices guidance based on experience over the past two years, reflecting rapid advancements in machine learning with deep neural techniques.

Regression Using PyTorch, Part 1: New Best Practices

Machine learning with deep neural techniques has advanced quickly, so Dr. James McCaffrey of Microsoft Research updates regression techniques and best practices guidance based on experience over the past two years.

Binary Classification Using New PyTorch Best Practices, Part 2: Training, Accuracy, Predictions

Dr. James McCaffrey of Microsoft Research explains how to train a network, compute its accuracy, use it to make predictions and save it for use by other programs.

Binary Classification Using PyTorch, Part 1: New Best Practices

Because machine learning with deep neural techniques has advanced quickly, our resident data scientist updates binary classification techniques and best practices based on experience over the past two years.

Multi-Class Classification Using New PyTorch Best Practices, Part 2: Training, Accuracy, Predictions

Following new best practices, Dr. James McCaffrey of Microsoft Research revisits multi-class classification for when the variable to predict has three or more possible values.

Multi-Class Classification Using PyTorch, Part 1: New Best Practices

Dr. James McCaffrey of Microsoft Research updates previous tutorials with new, cutting-edge deep neural machine learning techniques.

ANOVA Using C#

One use case for the analysis of variance statistics technique is asking if student performances are the same in three classrooms taught by the same teacher but with different textbooks, says Dr. James McCaffrey of Microsoft Research.

The LogBeta and LogGamma Functions Using C#

With no built-in functions for classical statistics analyses in the .NET library, Dr. James McCaffrey of Microsoft Research explains how to roll your own from scratch.

Lightweight Mathematical Combinations Using C#

After previously discussing permutations, Dr. James McCaffrey of Microsoft Research uses step-by-step examples and full code presentations to explore combinations.

Lightweight Mathematical Permutations Using C#

Get ready to use the BigInteger data type as Dr. James McCaffrey of Microsoft Research demonstrates zero-based mathematical permutations with C#.

Runs Testing Using C# Simulation

Dr. James McCaffrey of Microsoft Research uses a full code program for a step-by-step explanation of this machine learning technique that indicates if patterns are random.

Probit Regression Using C#

Dr. James McCaffrey of Microsoft Research explains the classical machine learning technique typically used for binary classification -- predicting an outcome that can only be one of two discrete values.

Weighted k-NN Classification Using C#

Dr. James McCaffrey of Microsoft Research explains the machine learning technique, which can be used to predict a person's happiness score from their income and education, for example.

Naive Bayes Classification Using C#

Dr. James McCaffrey of Microsoft Research presents a full step-by-step example with all code to predict a person's optimism score from their occupation, eye color and country.

CIFAR-10 Image Classification Using PyTorch

CIFAR-10 problems analyze crude 32 x 32 color images to predict which of 10 classes an image belongs to. Here, Dr. James McCaffrey of Microsoft Research shows how to create a PyTorch image classification system for the CIFAR-10 dataset.

Preparing CIFAR Image Data for PyTorch

CIFAR-10 problems analyze crude 32 x 32 color images to predict which of 10 classes an image belongs to. Here, Dr. James McCaffrey of Microsoft Research explains how to get the raw source CIFAR-10 data, convert it from binary to text and save it as a text file that can be used to train a PyTorch neural network classifier.
