Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of data clustering and anomaly detection using the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm. Compared to other anomaly detection systems based on data clustering, DBSCAN can find significantly different types of anomalies.
- By James McCaffrey
- 11/06/2024
Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the Winnow classification technique. Winnow classification is used for a very specific scenario where the target variable to predict is binary and all the predictor variables are also binary.
- By James McCaffrey
- 10/15/2024
Dr. James McCaffrey of Microsoft Research presents a full demo of k-nearest neighbors classification on mixed numeric and categorical data. Compared to other classification techniques, k-NN is easy to implement, supports numeric and categorical predictor variables, and is highly interpretable.
- By James McCaffrey
- 10/01/2024
Dr. James McCaffrey from Microsoft Research presents a complete end-to-end program that explains how to perform binary classification (predicting a variable with two possible discrete values) using logistic regression, where the prediction model is trained using batch stochastic gradient descent with weight decay.
- By James McCaffrey
- 09/16/2024
Dr. James McCaffrey from Microsoft Research presents a C# program that illustrates using the AdaBoost algorithm to perform binary classification for spam detection. Compared to other classification algorithms, AdaBoost is powerful and works well with small datasets, but is sometimes susceptible to model overfitting.
- By James McCaffrey
- 09/03/2024
Dr. James McCaffrey from Microsoft Research presents a demonstration program that models biological immune systems to identify network intrusion threats. The demo illustrates challenges with artificial immune systems as well as promising new approaches.
- By James McCaffrey
- 08/15/2024
Dr. James McCaffrey from Microsoft Research presents a complete program that uses the LightGBM system in Python to create a custom autoencoder for data anomaly detection. You can easily adapt the demo program for your own anomaly detection scenarios.
- By James McCaffrey
- 08/02/2024
Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on creating an approximation of a dataset that has fewer columns.
- By James McCaffrey
- 07/15/2024
Dr. James McCaffrey from Microsoft Research presents a full-code, step-by-step tutorial on using the LightGBM tree-based system to perform binary classification (predicting a discrete variable that has exactly two possible values).
- By James McCaffrey
- 07/01/2024
Here's a complete end-to-end demo of what Dr. James McCaffrey of Microsoft Research says is arguably the simplest possible classification technique.
- By James McCaffrey
- 06/17/2024
Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on this powerful machine learning technique used to predict a single numeric value.
- By James McCaffrey
- 06/05/2024
Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on a "very tricky" machine learning technique.
- By James McCaffrey
- 05/15/2024
Dr. James McCaffrey of Microsoft Research provides a full-code, step-by-step machine learning tutorial on how to use the LightGBM system to perform multi-class classification using Python and the scikit-learn library.
- By James McCaffrey
- 05/02/2024
Dr. James McCaffrey of Microsoft Research tackles the process of examining a set of source data to find data items that are different in some way from the majority of the source items.
- By James McCaffrey
- 04/15/2024
Chances are if you've had many coding interviews you've been presented with a poker problem. Here's a great take from Dr. James McCaffrey of Microsoft Research.
- By James McCaffrey
- 04/04/2024
Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step example of a machine learning technique to visualize high-dimensional data.
- By James McCaffrey
- 03/15/2024
Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on a technique for visualizing and clustering data.
- By James McCaffrey
- 03/01/2024
Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on a classical ML technique that transforms a dataset into one with fewer columns, useful for creating a graph of data that has more than two columns, for example.
- By James McCaffrey
- 02/16/2024
Dr. James McCaffrey of Microsoft Research presents a full-code, step-by-step tutorial on an implementation of the technique that emphasizes simplicity and ease of modification over robustness and performance.
- By James McCaffrey
- 02/01/2024
Transforming a dataset into one with fewer columns is more complicated than it might seem, explains Dr. James McCaffrey of Microsoft Research in this full-code, step-by-step machine learning tutorial.
- By James McCaffrey
- 01/17/2024
Dr. James McCaffrey of Microsoft Research guides you through a full-code, step-by-step tutorial on "one of the most important operations in machine learning."
- By James McCaffrey
- 01/03/2024
Spectral clustering is quite complex, but it can reveal patterns in data that aren't revealed by other clustering techniques.
- By James McCaffrey
- 12/18/2023
K-means is comparatively simple and works well with large datasets, but it assumes clusters are circular/spherical in shape, so it can only find simple cluster geometries.
- By James McCaffrey
- 12/01/2023
Compared to other clustering techniques, DBSCAN does not require you to explicitly specify how many data clusters to use, explains Dr. James McCaffrey of Microsoft Research in this full-code, step-by-step machine learning tutorial.
- By James McCaffrey
- 11/15/2023
Dr. James McCaffrey of Microsoft Research explains GMM clustering in a full-code, step-by-step tutorial, noting that his data scientist colleagues have different opinions about the complicated technique.
- By James McCaffrey
- 11/01/2023