ML.NET 3.0 Boosts Deep Learning, Data Processing for .NET-Based AI Apps

Microsoft shipped ML.NET 3.0, enhancing deep learning and data processing scenarios in the company's machine learning framework that lets devs create AI-infused apps completely within the .NET ecosystem.

The ability for devs to use C# and F# instead of the usual data science stalwarts -- Python and R -- is a main selling point of the open source ML.NET framework, created to help developers build custom ML models and integrate them into apps. That's done with tools such as a command-line interface (CLI) and Model Builder, or by creating constructs like the large language models (LLMs) that power ChatGPT and Microsoft's ubiquitous "Copilot" AI assistants.

ML.NET comes in a NuGet package that has been downloaded more than 5.3 million times.

ML.NET (source: Microsoft).

In announcing ML.NET 3.0 yesterday (Nov. 27), Microsoft emphasized two main points of interest: deep learning and data processing.

Deep Learning
This ML subset uses artificial neural networks loosely based on human brain behaviors in order to "learn" from inputs such as large amounts of data, even unstructured data.

Microsoft said deep learning scenarios were substantially expanded in the v3.0 release with new capabilities in three areas: object detection, named entity recognition and question answering.

Object detection in ML.NET 3.0 is an advanced form of image classification that not only categorizes entities within images but also locates them, making it ideal for scenarios with images containing multiple objects of different types. In v3.0, the object detection capabilities are boosted via integrations with TorchSharp and ONNX models, with Microsoft specifically noting TorchSharp-powered object detection APIs. The company said those represent a significant step in leveraging deep learning techniques within the ML.NET framework.

In discussing advanced neural network architecture, Microsoft explained that the underlying technology of the object detection API includes techniques developed at Microsoft Research, utilizing a transformer-based neural network architecture. This approach is indicative of modern trends in deep learning, particularly in computer vision, the company said.

TorchSharp is also instrumental in the enhancements to named entity recognition (NER) and question answering (QA), two common ML areas that are part of the natural language processing (NLP) space. Both scenarios are unlocked in ML.NET 3.0 by building on the TorchSharp RoBERTa text classification features introduced previously.

"Both the NER and QA trainers are included in the Microsoft.ML.TorchSharp 3.0.0 package and the Microsoft.ML.TorchSharp namespace," Microsoft said.

Data Processing
Here, scenarios are improved via many enhancements and bug fixes to DataFrame -- a structure for storing and manipulating data -- along with new IDataView interoperability features.

"The important steps of loading, inspecting, transforming, and visualizing your data are much more powerful," Microsoft said.

Specific items of note include:

  • Enhanced IDataView <-> DataFrame conversions: Added support for String and VBuffer column types, with String values handled as ReadOnlyMemory<char> and VBuffer supporting all backing primitives.
  • Increased column data capacity: Columns can now store more than 2 GB of data, removing the previous limitation.
  • Apache Arrow integration: Recognizes Apache Arrow Date64 column data.
  • Expanded data loading capabilities: Includes import and export functionality for SQL databases using ADO.NET. Also, data can be loaded from any IEnumerable collection and exported to System.Data.DataTable.
  • Appending data between DataFrames: Allows appending data from one DataFrame to another when column names match, easing constraints on column ordering.
  • Handling of duplicate column names: Enhancements in DataFrame.LoadCsv to manage duplicate column names, offering options to rename them.
  • Improved arithmetic performance and null value handling: Optimizations in column cloning, binary comparison scenarios, and arithmetic operations.
  • Debugger enhancements: Better readability for columns with long names in the debugger.

Microsoft also noted new tensor primitive integrations that don't affect development tasks directly but do provide notable performance improvements. AutoML, which automates the process of applying machine learning to data, was also enhanced, providing a boost to associated experiences in Model Builder and the ML.NET CLI.

Much more about all of the above and other changes can be found in the release notes.

The dev team is now working on plans for .NET 9 and ML.NET 4.0, though Model Builder and the ML.NET CLI are expected to be updated much sooner in order to consume the ML.NET 3.0 release.

"We know we will continue expanding deep learning scenarios and integrations, and we know we will keep making enhancements to DataFrame," Microsoft said. "We will keep expanding the APIs available in System.Numerics.Tensors and integrating them into ML.NET. Stay tuned for more detailed ML.NET 4.0 plans."

About the Author

David Ramel is an editor and writer at Converge 360.
