Dr. James McCaffrey of Microsoft Research explains how to programmatically split a file of data into a training file and a test file for use with a machine learning neural network, in scenarios such as predicting voting behavior from a file containing data about people, such as sex, age, income and so on.
- By James McCaffrey
- 09/01/2020
Dr. James McCaffrey of Microsoft Research uses a complete program and screenshots to explain how to programmatically encode categorical data for use with a machine learning prediction model such as a neural network classification or regression system.
- By James McCaffrey
- 08/12/2020
Dr. James McCaffrey of Microsoft Research uses a full code sample and screenshots to show how to programmatically normalize numeric data for use in a machine learning system such as a deep neural network classifier or clustering algorithm.
- By James McCaffrey
- 08/04/2020
Microsoft shipped the seventh preview of Entity Framework Core 5.0, boosting its data access technology with a factory to create DbContext instances and more.
After previously detailing how to examine data files and how to identify and deal with missing data, Dr. James McCaffrey of Microsoft Research now uses a full code sample and step-by-step directions to deal with outlier data.
- By James McCaffrey
- 07/14/2020
Turning his attention to the extremely time-consuming task of machine learning data preparation, Dr. James McCaffrey of Microsoft Research explains how to examine data files and how to identify and deal with missing data.
- By James McCaffrey
- 07/06/2020
Microsoft today announced the fifth previews of .NET 5.0 and Entity Framework Core 5.0 en route to a November general release date, though not all of the planned functionality will be finalized by then because of the COVID-19 pandemic.
Clustering non-numeric -- or categorical -- data is surprisingly difficult, but it's explained here by resident data scientist Dr. James McCaffrey of Microsoft Research, who provides all the code you need for a complete system using an algorithm based on a metric called category utility (CU), a measure of how much information you gain by clustering.
- By James McCaffrey
- 06/03/2020
This week sees several significant additions to the Visual Studio Code ecosystem: an update to the Python extension; the popular open source MongoDB database; and AI-powered JavaScript code completions from Kite.
Dr. James McCaffrey of Microsoft Research explains the k-means++ technique for data clustering, the process of grouping data items so that similar items are in the same cluster. The resulting clusters can be examined by a human to see if any interesting patterns have emerged, or used by software systems such as anomaly detection.
- By James McCaffrey
- 05/06/2020
Microsoft recently beefed up the .NET and Java SDKs for Azure Cosmos DB, a globally distributed, multi-model database service that helps users and developers elastically and independently scale throughput and storage across Azure regions with a click of a button.
VSM Senior Technical Editor Dr. James McCaffrey, of Microsoft Research, explains why inverting a matrix -- one of the more common tasks in data science and machine learning -- is difficult and presents code that you can use as-is, or as a starting point for custom matrix inversion scenarios.
- By James McCaffrey
- 04/07/2020
Microsoft engineer Sam Xu says "it’s time to move OData to .NET 5" and in a new blog post he shows how to do just that.
In announcing today's second preview of the big, unifying .NET 5 that's going GA in November, Microsoft revealed the next-gen platform is already handling 50 percent of the traffic to the company's main .NET website.
A radial basis function network (RBF network) is a software system that's similar to a single hidden layer neural network, explains Dr. James McCaffrey of Microsoft Research, who uses a full C# code sample and screenshots to show how to train an RBF network classifier.
- By James McCaffrey
- 03/24/2020