Dr. James McCaffrey of Microsoft Research provides a code-driven tutorial on PUL problems, which often occur with security or medical data in cases like training a machine learning model to predict if a hospital patient has a disease or not.
- By James McCaffrey
- 05/21/2021
Generating synthetic data is useful when you have imbalanced training data for a particular class, for example, generating synthetic females in a dataset of employees that has many males but few females.
- By James McCaffrey
- 05/07/2021
Dr. James McCaffrey of Microsoft Research provides full code and step-by-step examples of anomaly detection, used to find items in a dataset that are different from the majority for tasks like detecting credit card fraud.
- By James McCaffrey
- 04/13/2021
When training data won't fit into machine memory, a streaming data loader using an internal memory buffer can help. Dr. James McCaffrey of Microsoft Research shows how.
- By James McCaffrey
- 04/01/2021
Dr. James McCaffrey of Microsoft Research explains how to evaluate, save and use a trained regression model, used to predict a single numeric value such as the annual revenue of a new restaurant based on variables such as menu prices, number of tables, location and so on.
- By James McCaffrey
- 03/12/2021
The goal of a regression problem is to predict a single numeric value, for example, predicting the annual revenue of a new restaurant based on variables such as menu prices, number of tables, location and so on.
- By James McCaffrey
- 03/03/2021
Dr. James McCaffrey of Microsoft Research presents the second of four machine learning articles that detail a complete end-to-end production-quality example of neural regression using PyTorch.
- By James McCaffrey
- 02/11/2021
Dr. James McCaffrey of Microsoft Research presents the first in a series of four machine learning articles that detail a complete end-to-end production-quality example of neural regression using PyTorch.
- By James McCaffrey
- 02/02/2021
Dr. James McCaffrey of Microsoft Research continues his four-part series on multi-class classification, designed to predict a value that can be one of three or more possible discrete values, by explaining model accuracy.
- By James McCaffrey
- 01/25/2021
Dr. James McCaffrey of Microsoft Research continues his four-part series on multi-class classification, designed to predict a value that can be one of three or more possible discrete values, by explaining neural network training.
- By James McCaffrey
- 01/04/2021
Dr. James McCaffrey of Microsoft Research explains how to define a network in installment No. 2 of his four-part series that will present a complete end-to-end production-quality example of multi-class classification using a PyTorch neural network.
- By James McCaffrey
- 12/15/2020
Dr. James McCaffrey of Microsoft Research kicks off a four-part series on multi-class classification, designed to predict a value that can be one of three or more possible discrete values.
- By James McCaffrey
- 12/04/2020
In the final article of a four-part series on binary classification using PyTorch, Dr. James McCaffrey of Microsoft Research shows how to evaluate the accuracy of a trained model, save a model to file, and use a model to make predictions.
- By James McCaffrey
- 11/24/2020
Dr. James McCaffrey of Microsoft Research continues his examination of creating a PyTorch neural network binary classifier through six steps, here addressing step No. 4: training the network.
- By James McCaffrey
- 11/04/2020
Dr. James McCaffrey of Microsoft Research tackles how to define a network in the second of a series of four articles that present a complete end-to-end production-quality example of binary classification using a PyTorch neural network, including a full Python code sample and data files.
- By James McCaffrey
- 10/14/2020
Dr. James McCaffrey of Microsoft Research kicks off a series of four articles that present a complete end-to-end production-quality example of binary classification using a PyTorch neural network, including a full Python code sample and data files.
- By James McCaffrey
- 10/05/2020
Dr. James McCaffrey of Microsoft Research provides a full code sample and screenshots to explain how to create and use PyTorch Dataset and DataLoader objects, used to serve up training or test data in order to train a PyTorch neural network.
- By James McCaffrey
- 09/10/2020
Dr. James McCaffrey of Microsoft Research explains how to programmatically split a file of data into a training file and a test file, for use in a machine learning neural network for scenarios like predicting voting behavior from a file containing data about people such as sex, age, income and so on.
- By James McCaffrey
- 09/01/2020
Dr. James McCaffrey of Microsoft Research uses a full code program and screenshots to explain how to programmatically encode categorical data for use with a machine learning prediction model such as a neural network classification or regression system.
- By James McCaffrey
- 08/12/2020
Dr. James McCaffrey of Microsoft Research uses a full code sample and screenshots to show how to programmatically normalize numeric data for use in a machine learning system such as a deep neural network classifier or clustering algorithm.
- By James McCaffrey
- 08/04/2020
After previously detailing how to examine data files and how to identify and deal with missing data, Dr. James McCaffrey of Microsoft Research now uses a full code sample and step-by-step directions to deal with outlier data
- By James McCaffrey
- 07/14/2020
Turning his attention to the extremely time-consuming task of machine learning data preparation, Dr. James McCaffrey of Microsoft Research explains how to examine data files and how to identify and deal with missing data.
- By James McCaffrey
- 07/06/2020
Dr. James McCaffrey of Microsoft Research presents the fundamental concepts of tensors necessary to establish a solid foundation for learning how to create PyTorch neural networks, based on his teaching many PyTorch training classes at work.
- By James McCaffrey
- 06/15/2020
Dr. James McCaffrey of Microsoft Research uses a complete demo program, samples and screenshots to explains how to install the Python language and the PyTorch library on Windows, and how to create and run a minimal, but complete, neural network classifier.
- By James McCaffrey
- 06/08/2020
Clustering non-numeric -- or categorial -- data is surprisingly difficult, but it's explained here by resident data scientist Dr. James McCaffrey of Microsoft Research, who provides all the code you need for a complete system using an algorithm based on a metric called category utility (CU), a measure how much information you gain by clustering.
- By James McCaffrey
- 06/03/2020