The Data Science Lab


AdaBoost.R2 Regression Using C#

AdaBoost.R2 regression builds an ensemble of decision trees, trains each tree on reweighted data, and combines the trees' predictions using a weighted median. This article also shows how parameter choices affect accuracy and overfitting.
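The weighted median step can be sketched in a few lines of JavaScript. This is a minimal illustration, not the article's code: the function name `weightedMedian` and the demo predictions and tree weights are hypothetical, and the rule used (return the smallest prediction whose cumulative weight reaches half of the total) is one common way to define a weighted median.

```javascript
// Weighted median: the smallest prediction whose cumulative weight
// reaches at least half of the total weight.
// (Hypothetical helper, not taken from the article.)
function weightedMedian(predictions, weights) {
  // Pair each prediction with its weight, then sort by prediction value.
  const pairs = predictions.map((p, i) => ({ p: p, w: weights[i] }));
  pairs.sort((a, b) => a.p - b.p);

  const total = pairs.reduce((sum, pair) => sum + pair.w, 0);
  let cumulative = 0;
  for (const pair of pairs) {
    cumulative += pair.w;
    if (cumulative >= 0.5 * total) return pair.p;
  }
  return pairs[pairs.length - 1].p; // fallback for rounding edge cases
}

// Made-up predictions from three trees, with made-up tree weights.
const demoPreds = [0.40, 0.90, 0.55];
const demoWeights = [1.0, 0.5, 2.0];
console.log(weightedMedian(demoPreds, demoWeights)); // 0.55
```

In this sketch a tree with a larger weight pulls the combined prediction toward its own output, and a single extreme prediction with a small weight has little influence.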

Quadratic Regression with Pseudo-Inverse Training Using C#

Dr. James McCaffrey presents a complete end-to-end demonstration of quadratic regression, implemented from scratch, with pseudo-inverse training, using the C# language. Compared to standard linear regression, quadratic regression is better able to handle data with a non-linear structure and interactions between predictor variables. Compared to other types of training, pseudo-inverse does not require any parameters that must be determined by trial and error.
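To make the idea concrete, here is a minimal JavaScript sketch of how a quadratic model can extend a linear one: each input vector is expanded with squared terms and pairwise interaction terms, and the prediction is a weighted sum of the expanded terms plus a bias. The function names and the ordering of the terms are assumptions for illustration and may differ from the article's demo.

```javascript
// Expand [x0, x1, ...] into linear terms, squared terms, and
// pairwise interaction terms. (Term ordering is an assumption.)
function expandQuadratic(x) {
  const terms = [...x];                       // linear terms
  for (const v of x) terms.push(v * v);       // squared terms
  for (let i = 0; i < x.length; i++) {
    for (let j = i + 1; j < x.length; j++) {
      terms.push(x[i] * x[j]);                // interaction terms
    }
  }
  return terms;
}

// Prediction is a weighted sum of the expanded terms plus a bias.
function predictQuadratic(x, weights, bias) {
  const terms = expandQuadratic(x);
  let sum = bias;
  for (let i = 0; i < terms.length; i++) sum += weights[i] * terms[i];
  return sum;
}

console.log(expandQuadratic([2, 3])); // [ 2, 3, 4, 9, 6 ]
```

Because the model is still linear in its weights, the same training techniques used for linear regression can be applied to the expanded terms.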

Decision Tree Regression from Scratch Using JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of decision tree regression from scratch using JavaScript. The goal of decision tree regression is to predict a single numeric value. For simplicity and better maintenance, the demo implementation uses list storage instead of pointers. For better customization and interpretability, the implementation uses list iteration instead of recursion or a stack algorithm.

Machine Learning

Random Forest Regression Using C#

Dr. James McCaffrey presents a complete end-to-end example of random forest regression to predict a single numeric value, implemented using C#. A random forest is a collection of basic decision tree regressors that have been trained on different subsets of the source training data. The technique reduces model overfitting to give more accurate predictions on new, previously unseen data.

Quadratic Regression with SGD Training Using JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of quadratic regression, with SGD training, implemented from scratch, using JavaScript. Compared to standard linear regression, quadratic regression is better able to handle data with a non-linear structure, and data with interactions between predictor variables.

Decision Tree Regression from Scratch Without Pointers or Recursion Using C#

Dr. James McCaffrey presents a complete end-to-end demonstration of decision tree regression from scratch using the C# language. The goal of decision tree regression is to predict a single numeric value. For simplicity, the demo implementation avoids pointers (references), and for better maintainability and customization, it avoids recursion.

Linear Regression with Pseudo-Inverse Training Using JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of linear regression with pseudo-inverse training implemented using JavaScript. Compared to other training techniques, such as stochastic gradient descent, pseudo-inverse training does not require any parameters and so it is especially simple to use.

Quadratic Regression with SGD Training Using C#

Dr. James McCaffrey presents a complete end-to-end demonstration of quadratic regression, implemented from scratch, with SGD training, using C#. Compared to standard linear regression, quadratic regression is better able to handle data with a non-linear structure, and data with interactions between predictor variables.

Kernel Ridge Regression with Cholesky Inverse Training Using JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of the kernel ridge regression technique to predict a single numeric value, implemented using JavaScript. The demo trains the model using kernel matrix inverse (Cholesky decomposition). There is no single best machine learning regression technique, but when kernel ridge regression prediction works, it is often highly accurate.

Linear Regression with Pseudo-Inverse Training Using C#

Dr. James McCaffrey presents a complete end-to-end demonstration of linear regression using pseudo-inverse training. Compared to other training techniques, such as stochastic gradient descent, pseudo-inverse training does not require any parameters and so it is especially simple to use.

Anomaly Detection Using K-Means Clustering with JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of anomaly detection using k-means data clustering, implemented with JavaScript. Compared to other anomaly detection techniques, k-means anomaly detection is simple to implement, simple to interpret, and simple to customize.
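As a rough illustration of the idea, the JavaScript sketch below scores each data item by its distance to the nearest cluster centroid and flags the item with the largest distance. It assumes the centroids have already been computed by k-means; the function names and the scoring rule are illustrative assumptions rather than the article's exact approach.

```javascript
// Euclidean distance between two numeric vectors of equal length.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Anomaly score for each item: distance to its nearest centroid.
// (Assumes centroids were already found by k-means clustering.)
function anomalyScores(data, centroids) {
  return data.map((item) =>
    Math.min(...centroids.map((c) => euclidean(item, c))));
}

// Index of the item that lies farthest from every centroid.
function mostAnomalousIndex(data, centroids) {
  const scores = anomalyScores(data, centroids);
  return scores.indexOf(Math.max(...scores));
}

// Made-up data: two tight groups and one item between them.
const demoData = [[0, 0], [1, 0], [10, 10], [5, 5]];
const demoCentroids = [[0.5, 0], [10, 10]];
console.log(mostAnomalousIndex(demoData, demoCentroids)); // 3
```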

Decision Tree Regression from Scratch with Pointers Using C#

Dr. James McCaffrey presents a complete end-to-end demonstration of decision tree regression from scratch using the C# language. The goal of decision tree regression is to predict a single numeric value. The demo implementation uses pointers (references) for efficiency but avoids recursion to improve maintainability and customization.

ANOVA Using JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of ANOVA (analysis of variance) using JavaScript. ANOVA is a classical statistics technique where the goal is to determine if the unknown means (averages) of three or more groups are likely to all be equal or not, based on the variances of samples from the groups.
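The core calculation can be sketched as follows. This minimal JavaScript example computes only the one-way ANOVA F-statistic (between-group mean square divided by within-group mean square). Converting F to a p-value requires the F distribution and is omitted here. The function name and the sample data are hypothetical.

```javascript
// One-way ANOVA F-statistic for an array of groups (arrays of numbers).
function anovaF(groups) {
  const k = groups.length;                                 // number of groups
  const n = groups.reduce((sum, g) => sum + g.length, 0);  // total items
  const mean = (arr) => arr.reduce((s, v) => s + v, 0) / arr.length;
  const grandMean = mean(groups.flat());

  let ssBetween = 0; // variability of group means around the grand mean
  let ssWithin = 0;  // variability of items around their own group mean
  for (const g of groups) {
    const m = mean(g);
    ssBetween += g.length * (m - grandMean) ** 2;
    for (const v of g) ssWithin += (v - m) ** 2;
  }

  const msBetween = ssBetween / (k - 1); // degrees of freedom: k - 1
  const msWithin = ssWithin / (n - k);   // degrees of freedom: n - k
  return msBetween / msWithin;
}

// Made-up samples from three groups; F is approximately 13 here.
console.log(anovaF([[1, 2, 3], [2, 3, 4], [5, 6, 7]]));
```

A large F value suggests that at least one group mean differs from the others, while a value near 1 is consistent with all means being equal.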

Tsetlin Machine Binary Classification Using C#

Dr. James McCaffrey presents a complete end-to-end demonstration of Tsetlin Machine binary classification using the C# language. Tsetlin Machine models have characteristics of propositional logic, rule-based systems, and finite state automata. Tsetlin Machine systems require predictor values to be binary encoded. As a result, the systems are very flexible and each operation is computationally efficient, but the total number of operations makes them computationally expensive.

Linear Regression with Two-Way Interactions Using JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of linear regression with two-way interactions between predictor variables. Standard linear regression predicts a single numeric value based only on a linear combination of predictor values. Linear regression with interactions between predictor variables can handle more complex data while retaining a high level of model interpretability.

Kernel Ridge Regression with Cholesky Inverse Training Using C#

Dr. James McCaffrey presents a complete end-to-end demonstration of the kernel ridge regression technique to predict a single numeric value. The demo uses the kernel matrix inverse (Cholesky decomposition) technique for model training. There is no single best machine learning regression technique, but when kernel ridge regression prediction works, it is often highly accurate.

Kernel Ridge Regression with Stochastic Gradient Descent Training Using JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of the kernel ridge regression technique to predict a single numeric value. The demo uses stochastic gradient descent, one of two possible training techniques. There is no single best machine learning regression technique, but when kernel ridge regression prediction works, it is often very accurate.

Computing the Determinant of a Matrix Using Gaussian Elimination to Row Echelon Form with C#

Dr. James McCaffrey presents a complete end-to-end demonstration of computing the determinant of a matrix using the C# language. In machine learning scenarios, the determinant of a matrix is typically computed during model training to determine whether the matrix has an inverse.
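The technique can be sketched briefly. The example below is written in JavaScript rather than the article's C#, and it is an illustration under stated assumptions: it reduces a copy of the matrix to row echelon form with partial pivoting, flips the sign on each row swap, and multiplies the diagonal entries. The function name and the `1e-12` zero threshold are arbitrary choices, not values from the article.

```javascript
// Determinant via Gaussian elimination to row echelon form.
// Works on a copy, so the input matrix is not modified.
function determinant(matrix) {
  const n = matrix.length;
  const a = matrix.map((row) => [...row]);
  let sign = 1;

  for (let col = 0; col < n; col++) {
    // Partial pivoting: pick the row with the largest value in this column.
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
    }
    // A (near) zero pivot means the matrix is singular.
    if (Math.abs(a[pivot][col]) < 1e-12) return 0;

    // Swapping two rows flips the sign of the determinant.
    if (pivot !== col) {
      const tmp = a[pivot];
      a[pivot] = a[col];
      a[col] = tmp;
      sign = -sign;
    }

    // Eliminate the entries below the pivot.
    for (let r = col + 1; r < n; r++) {
      const factor = a[r][col] / a[col][col];
      for (let c = col; c < n; c++) a[r][c] -= factor * a[col][c];
    }
  }

  // The determinant is the signed product of the diagonal entries.
  let det = sign;
  for (let i = 0; i < n; i++) det *= a[i][i];
  return det;
}

console.log(determinant([[1, 2], [3, 4]])); // approximately -2
```

A result of 0 (within the threshold) indicates that the matrix does not have an inverse.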

Implementing k-Nearest Neighbors Regression Using JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of k-nearest neighbors regression using JavaScript. There are many machine learning regression techniques, but k-nearest neighbors is especially simple to implement and the results are highly interpretable.
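A minimal sketch shows why the technique is simple to implement: find the k training items closest to the input and average their target values. The function name, the use of Euclidean distance, and the unweighted average are assumptions for illustration, and the training data is made up.

```javascript
// k-nearest neighbors regression: average the target values of the
// k training items closest (Euclidean distance) to the input x.
function knnPredict(trainX, trainY, x, k) {
  const neighbors = trainX.map((row, i) => {
    let sum = 0;
    for (let j = 0; j < row.length; j++) sum += (row[j] - x[j]) ** 2;
    return { dist: Math.sqrt(sum), y: trainY[i] };
  });
  neighbors.sort((a, b) => a.dist - b.dist); // closest first

  let total = 0;
  for (let i = 0; i < k; i++) total += neighbors[i].y;
  return total / k;
}

// Made-up training data with one predictor variable.
const demoX = [[0], [1], [2], [10]];
const demoY = [0, 1, 2, 10];
console.log(knnPredict(demoX, demoY, [1.2], 2)); // 1.5
```

The prediction is easy to interpret because the k neighbors that produced it can be listed directly.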

Kernel Ridge Regression with Stochastic Gradient Descent Training Using C#

Dr. James McCaffrey presents a complete end-to-end demonstration of the kernel ridge regression technique to predict a single numeric value. The demo uses stochastic gradient descent, one of two possible training techniques. There is no single best machine learning regression technique. When kernel ridge regression prediction works, it is often highly accurate.

Linear Regression Using JavaScript

Dr. James McCaffrey presents a complete end-to-end demonstration of linear regression using JavaScript. Linear regression is the simplest machine learning technique to predict a single numeric value, and a good way to establish baseline results for comparison with other more sophisticated regression techniques.

Matrix Inverse Using Cayley-Hamilton with C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of computing a matrix inverse using the Cayley-Hamilton technique. Compared to other matrix inverse algorithms, Cayley-Hamilton is very simple and, as a nice side effect, gives you the matrix determinant. However, Cayley-Hamilton is not suitable for use with large matrices.

Linear Support Vector Regression Using C# with Particle Swarm Training

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the linear support vector regression (linear SVR) technique, where the goal is to predict a single numeric value. A linear SVR model uses an unusual error/loss function and cannot be trained using standard techniques, and so particle swarm optimization training is used.

Matrix Inverse Using Newton Iteration with C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of computing a matrix inverse using the Newton iteration algorithm. Compared to other algorithms, Newton iteration is simple and easy to customize, but the technique is relatively slow.
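The iteration itself is short. The sketch below is in JavaScript rather than the article's C#, and it rests on assumptions: it repeatedly applies the update X = X(2I - AX), starting from the transpose of A scaled by the product of its 1-norm and infinity-norm, which is one common starting guess and may not be the one the article uses. The function names, iteration limit, and tolerance are hypothetical, and a singular input is not detected.

```javascript
// Plain matrix multiplication for arrays-of-arrays.
function matMul(a, b) {
  const result = a.map(() => new Array(b[0].length).fill(0));
  for (let i = 0; i < a.length; i++)
    for (let j = 0; j < b[0].length; j++)
      for (let k = 0; k < b.length; k++)
        result[i][j] += a[i][k] * b[k][j];
  return result;
}

// Newton iteration for the inverse of a square, nonsingular matrix.
function newtonInverse(a, maxIter = 100, tol = 1e-12) {
  const n = a.length;
  // Matrix 1-norm (max column sum) and infinity-norm (max row sum).
  let norm1 = 0;
  let normInf = 0;
  for (let i = 0; i < n; i++) {
    let rowSum = 0;
    let colSum = 0;
    for (let j = 0; j < n; j++) {
      rowSum += Math.abs(a[i][j]);
      colSum += Math.abs(a[j][i]);
    }
    normInf = Math.max(normInf, rowSum);
    norm1 = Math.max(norm1, colSum);
  }
  // Starting guess (an assumption): scaled transpose of A.
  let x = a.map((row, i) => row.map((_, j) => a[j][i] / (norm1 * normInf)));

  for (let iter = 0; iter < maxIter; iter++) {
    const ax = matMul(a, x);
    let residual = 0; // largest entry of |I - AX|
    const correction = ax.map((row, i) => row.map((v, j) => {
      const identity = i === j ? 1 : 0;
      residual = Math.max(residual, Math.abs(identity - v));
      return 2 * identity - v; // entry of (2I - AX)
    }));
    if (residual < tol) break; // AX is close enough to the identity
    x = matMul(x, correction);
  }
  return x;
}

// The exact inverse of this matrix is [[0.6, -0.7], [-0.2, 0.4]].
console.log(newtonInverse([[4, 7], [2, 6]]));
```

Each pass costs two matrix multiplications, which is consistent with the technique being relatively slow compared to decomposition-based algorithms.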

Linear Regression with Two-Way Interactions Using C#

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of linear regression with two-way interactions between predictor variables. Compared to standard linear regression, which predicts a single numeric value based only on a linear combination of predictor values, linear regression with interactions can handle more complex data while retaining a high level of model interpretability.
