The Data Science Lab


Multi-Class Classification Using a scikit Decision Tree

Decision trees are useful for relatively small datasets that have a relatively simple underlying structure, and when the trained model must be easily interpretable, explains Dr. James McCaffrey of Microsoft Research, who provides step-by-step instructions and full source code.

Naive Bayes Classification Using the scikit Library

Dr. James McCaffrey of Microsoft Research shows how to predict a person's sex based on their job type, eye color and country of residence.

Binary Classification Using a scikit Decision Tree

Dr. James McCaffrey of Microsoft Research says decision trees are useful for relatively small datasets and when the trained model must be easily interpretable, but often don't work well with large datasets and can be susceptible to model overfitting.

Logistic Regression Using the scikit Library

Dr. James McCaffrey of Microsoft Research says the main advantage of scikit is that it's easy to use (even though most classes have many constructor parameters).

Logistic Regression from Scratch Using Raw Python

The fundamental technique has been studied for decades, producing a huge body of information and many variations, which makes it hard to tell what is essential and what is not.

Multi-Class Classification Accuracy by Class Using PyTorch

Dr. James McCaffrey of Microsoft Research: When multi-class data is skewed toward one or more classes, it's very important to analyze accuracy by class.

The Traveling Salesman Problem Using an Evolutionary Algorithm with C#

Dr. James McCaffrey of Microsoft Research uses full code samples to detail an evolutionary algorithm technique that apparently hasn't been published before.

Simple Numerical Optimization Using an Evolutionary Algorithm with C#

Dr. James McCaffrey of Microsoft Research says that when quantum computing becomes generally available, evolutionary algorithms for training huge neural networks could become a very important and common technique.

Regression Using PyTorch New Best Practices, Part 2: Training, Accuracy, Predictions

Dr. James McCaffrey of Microsoft Research updates regression techniques and best practices guidance based on experience over the past two years, reflecting rapid advancements in machine learning with deep neural techniques.

Regression Using PyTorch, Part 1: New Best Practices

Machine learning with deep neural techniques has advanced quickly, so Dr. James McCaffrey of Microsoft Research updates regression techniques and best practices guidance based on experience over the past two years.

Binary Classification Using New PyTorch Best Practices, Part 2: Training, Accuracy, Predictions

Dr. James McCaffrey of Microsoft Research explains how to train a network, compute its accuracy, use it to make predictions and save it for use by other programs.

Binary Classification Using PyTorch, Part 1: New Best Practices

Because machine learning with deep neural techniques has advanced quickly, our resident data scientist updates binary classification techniques and best practices based on experience over the past two years.

Multi-Class Classification Using New PyTorch Best Practices, Part 2: Training, Accuracy, Predictions

Following new best practices, Dr. James McCaffrey of Microsoft Research revisits multi-class classification for when the variable to predict has three or more possible values.

Multi-Class Classification Using PyTorch, Part 1: New Best Practices

Dr. James McCaffrey of Microsoft Research updates previous tutorials with new, cutting-edge deep neural machine learning techniques.

ANOVA Using C#

One use case for the analysis of variance statistical technique is asking whether student performance is the same in three classrooms taught by the same teacher but with different textbooks, says Dr. James McCaffrey of Microsoft Research.
