Using a decision tree classifier from a machine learning library is often awkward, says resident data scientist Dr. James McCaffrey, because the classifier usually must be customized and library decision trees come with many complex supporting functions. So when he needs a decision tree classifier, he always creates one from scratch. Here's how.
- By James McCaffrey
- 01/21/2020
Dr. James McCaffrey of Microsoft Research uses code samples and screenshots to explain perceptron classification, a machine learning technique that can be used for predicting if a person is male or female based on numeric predictors such as age, height, weight, and so on. The technique is mostly useful as a baseline result for comparison with more powerful ML techniques such as logistic regression and k-nearest neighbors.
- By James McCaffrey
- 01/07/2020
Dr. James McCaffrey of Microsoft Research uses a full code sample and screenshots to demonstrate how to create a naive Bayes classification system when the predictor values are numeric, using the C# language without any special code libraries.
- By James McCaffrey
- 11/12/2019
Here's a hands-on tutorial from bona fide data scientist Dr. James McCaffrey of Microsoft Research to get you up to speed with machine learning development using C#, complete with code listings and graphics.
- By James McCaffrey
- 11/07/2019
Microsoft Research's Dr. James McCaffrey shows how to perform binary classification with logistic regression using the Microsoft ML.NET code library. The goal of binary classification is to predict a value that can be one of just two discrete possibilities, for example, predicting if a person is male or female.
- By James McCaffrey
- 10/18/2019
Dr. James McCaffrey provides hands-on examples to introduce ML.NET, a code library for machine learning prediction models, and AutoML, which automatically examines different ML algorithms, finds the best one, and creates a Visual Studio project with the C# code backing the best model, along with C# code that shows how to use the trained model to make a prediction.
- By James McCaffrey
- 09/30/2019
Microsoft Research data scientist Dr. James McCaffrey explains what neural network Glorot initialization is and why it's the default technique for weight initialization.
- By James McCaffrey
- 09/05/2019
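As a rough illustration of the Glorot initialization mentioned in the entry above, here is a minimal Python sketch of the uniform variant. It is not taken from the article: the function name `glorot_uniform`, the list-of-lists weight layout, and the seed are assumptions made for this example. It uses the commonly stated limit of sqrt(6 / (fan_in + fan_out)).

```python
import math
import random

def glorot_uniform(fan_in, fan_out, rng):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, +limit).

    Hypothetical helper for illustration; limit follows the usual
    Glorot uniform rule sqrt(6 / (fan_in + fan_out)).
    """
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

# Example: weights connecting a 4-node layer to a 3-node layer.
rng = random.Random(0)  # fixed seed so runs are repeatable
weights = glorot_uniform(4, 3, rng)
for row in weights:
    print(["%.4f" % w for w in row])
```

The point of the design is that the range of the initial weights shrinks as the connected layers get wider, which tends to keep signal magnitudes from growing or vanishing as they pass through the network.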
Data scientist Dr. James McCaffrey begins a series presenting and explaining the most common modern techniques used for neural networks, an area where the default techniques have seen many small but significant changes over the past couple of years.
- By James McCaffrey
- 07/29/2019
Suppose you have three different Internet advertising strategies and you want to determine which of them is the best as quickly as possible. Or suppose you work for a medical company and you want to determine which of three new drugs is the most effective. Resident data scientist Dr. James McCaffrey shows how Thompson Sampling can help.
- By James McCaffrey
- 07/25/2019
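The three-strategies scenario in the entry above can be sketched with a Bernoulli version of Thompson Sampling. This is a minimal illustration under stated assumptions, not the article's code: the success rates, the trial count, the seed, and the function name `thompson_sampling` are all made up for the example, and each option's success probability is modeled with a Beta posterior starting from a uniform Beta(1, 1) prior.

```python
import random

def thompson_sampling(true_rates, n_trials, rng):
    """Simulate Bernoulli Thompson Sampling; return per-arm wins and losses.

    true_rates holds hypothetical success probabilities that the
    algorithm does not see directly; it only observes win/loss outcomes.
    """
    n_arms = len(true_rates)
    wins = [0] * n_arms
    losses = [0] * n_arms
    for _ in range(n_trials):
        # Draw one plausible success rate per arm from its Beta posterior.
        samples = [rng.betavariate(wins[i] + 1, losses[i] + 1)
                   for i in range(n_arms)]
        arm = samples.index(max(samples))  # play the arm with the best draw
        if rng.random() < true_rates[arm]:  # simulated outcome
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses

rng = random.Random(1)  # fixed seed so runs are repeatable
wins, losses = thompson_sampling([0.3, 0.5, 0.7], 1000, rng)
pulls = [w + l for w, l in zip(wins, losses)]
print("pulls per strategy:", pulls)
```

Because each trial plays the option whose sampled rate is highest, the better options tend to be tried more often as evidence accumulates, while weaker options still get occasional trials early on.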
Dr. James McCaffrey of Microsoft Research uses Python code samples and screenshots to explain naive Bayes classification, a machine learning technique used to predict the class of an item based on two or more categorical predictor variables, such as predicting the gender (0 = male, 1 = female) of a person based on occupation, eye color and nationality.
- By James McCaffrey
- 05/14/2019
Need to predict the political party affiliation (democrat, republican, independent) of a person based on their age, annual income, gender, years of education and so on? Our resident data scientist Dr. James McCaffrey shows a technique that can help with that and much more -- with code!
- By James McCaffrey
- 04/10/2019
Our resident doctor of data science this month tackles anomaly detection, using code samples and screenshots to explain the process of finding rare items in a dataset, such as discovering fraudulent login events or fake news items.
- By James McCaffrey
- 03/04/2019
The Data Science Doctor delves into support vector machines, software systems that can perform binary classification such as creating a model to predict the gender of a person based on their age, annual income, height and weight.
- By James McCaffrey
- 03/04/2019
Dr. James McCaffrey of Microsoft Research uses a full project code sample and screenshots to detail how to use Python to work with self-organizing maps (SOM), which let you investigate the structure of a set of data.
- By James McCaffrey
- 01/15/2019
The Data Science Doctor explains how to use the reinforcement learning branch of machine learning with the Q-learning approach, providing code on how to solve a maze problem for an easy-to-understand example.
- By James McCaffrey
- 10/19/2018
Our resident data scientist provides a hands-on example on how to make a prediction that can be one of just two possible values, which requires a different set of techniques than classification problems where the value to predict can be one of three or more possible values.
- By James McCaffrey
- 08/30/2018
The Data Science Doctor provides a hands-on tutorial, complete with code samples, to explain one of the most common methods for image classification, the deep neural network, which can be used, for example, to identify a photograph of an animal as a "dog" or "cat" or "monkey."
- By James McCaffrey
- 06/25/2018
The data science doctor explains everything you need to know about clustering data, the process of grouping items so those in a group (cluster) are similar and items in different groups are dissimilar.
- By James McCaffrey
- 04/30/2018
Our Data Science Lab guru explains how to implement the k-means technique for data clustering, or cluster analysis, which is the process of grouping data items so that similar items belong to the same group/cluster.
- By James McCaffrey
- 03/27/2018
Go hands-on with data scientist Dr. James McCaffrey as he explains neural network dropout, a technique that can be used during training to reduce the likelihood of model overfitting.
- By James McCaffrey
- 02/26/2018
Learn how to do time series regression using a neural network, with "rolling window" data, coded from scratch, using Python.
- By James McCaffrey
- 02/02/2018
The data doctor continues his exploration of Python-based machine learning techniques, explaining binary classification using logistic regression, which he likes for its simplicity.
- By James McCaffrey
- 01/08/2018
The data science doctor continues his exploration of techniques used to reduce the likelihood of model overfitting, caused by training a neural network for too many iterations.
- By James McCaffrey
- 12/05/2017
Our resident data scientist explains how to train neural networks with two popular variations of the back-propagation technique: batch and online.
- By James McCaffrey
- 10/31/2017
Our data science expert continues his exploration of neural network programming, explaining how regularization addresses the problem of model overfitting, caused by network overtraining.
- By James McCaffrey
- 10/05/2017