AI Toolkit for Visual Studio Code Debuts at Build 2024

On day one of the AI-heavy Microsoft Build 2024 developer conference, Microsoft announced the AI Toolkit for Visual Studio Code, which lets developers explore, try, fine-tune and integrate state-of-the-art AI models into applications.

Those models come from Hugging Face and Azure AI Studio, the latter of which reached general availability today, sporting new models both big and small. Azure AI Studio provides more than 1,600 cloud-hosted models from providers such as Meta, Mistral, Microsoft and OpenAI, highlighted by the addition of new small language models (SLMs) in Microsoft's Phi-3 family, which are open models rather than the more common proprietary kind. Hugging Face, meanwhile, provides access to more than 400,000 models.

The AI Toolkit for VS Code extension in the Visual Studio Code Marketplace actually shows a release date of December 2023, as it's an evolution of the former Windows AI Studio extension. It has been installed almost 4,000 times.

"The AI Toolkit for Visual Studio Code is our response to user feedback on Windows AI Studio that they need a cross-platform developer experience that simplifies the process of experimenting with new models in their applications," Microsoft said in a May 21 blog post.

[Figure: AI Toolkit for VS Code (source: Microsoft).]

The tool's overview states developers can use it to:

  • Find a supported model and download it locally
  • Test model inference in the Playground
  • Fine-tune a model locally or remotely
  • Deploy fine-tuned models to the cloud

Feature highlights include:

  • AI Toolkit Model Catalog: The tool includes a curated catalog of models from Azure AI Studio, optimized for local use, making it easier to discover and try models in a specific development environment. The catalog currently supports models running on Windows and Linux, both on CPU and GPU, with macOS-optimized models coming soon.
  • Model Playground for Local Experimentation: The kit features a model playground for local or cloud experimentation with small language models. This tool allows developers to explore and understand different models' capabilities in a controlled environment.
    [Figure: AI Toolkit for VS Code Model Catalog (source: Microsoft).]
  • Advanced Fine-Tuning Techniques: To make fine-tuning for specific use cases more efficient and effective, the toolkit supports these techniques:
    • Parameter-Efficient Fine-Tuning (PEFT)
    • Quantized Low-Rank Adaptation (QLoRA)
    • Flash Attention 2
  • Evaluations and Insights: It supports model and prompt evaluations, offering insights into model performance through integration with Azure AI Studio and the prompt flow SDKs. That means developers can create custom evaluations to measure metrics like accuracy, coherence and safety for their applications.
  • Deployment Capabilities: It enables seamless deployment of fine-tuned models to Azure Container Apps (ACA), Azure Kubernetes Service (AKS) or Azure AI Studio. Currently, these deployment capabilities are designed for development environments, with enhancements for production deployments in progress.
  • High-Performance Inferencing on Windows: This functionality utilizes ONNX Runtime and DirectML to efficiently run inference pipelines on a wide range of GPUs. This setup leverages DirectML to access various accelerators on the Windows platform, including GPUs, NPUs and CPUs to ensure optimized performance.
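To give a rough sense of why the parameter-efficient techniques listed above matter, here is a minimal back-of-the-envelope sketch of the low-rank adaptation idea underlying QLoRA. It is not taken from the toolkit, and the article does not describe how the toolkit implements fine-tuning. The function name `lora_param_counts` and the 4096 x 4096 layer size are assumptions chosen only for illustration. The idea: rather than updating a full weight matrix, train two small matrices of rank r whose product has the same shape.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, adapter) trainable-parameter counts for one weight matrix.

    full:    every entry of the d_out x d_in matrix is trainable.
    adapter: only two low-rank factors are trainable, with shapes
             (d_out x rank) and (rank x d_in).
    """
    full = d_in * d_out
    adapter = rank * (d_in + d_out)
    return full, adapter


# Hypothetical layer size, chosen only for illustration.
full, adapter = lora_param_counts(d_in=4096, d_out=4096, rank=8)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"rank-8 adapter:   {adapter:,} trainable parameters "
      f"({adapter / full:.2%} of full)")
```

QLoRA applies the same low-rank adapters on top of a base model whose frozen weights are stored in quantized form, which further reduces the memory needed during training.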

A new Playground experience provided by the update runs optimized models for Windows, which Microsoft said allows developers to test the capabilities of models within VS Code and start building fine-tuning projects for their specific needs.
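For readers curious how the ONNX Runtime and DirectML pairing mentioned in the feature list typically looks in application code, the following is a minimal sketch, not the toolkit's own implementation. It assumes the standard ONNX Runtime execution-provider names `DmlExecutionProvider` and `CPUExecutionProvider`; the helper `pick_providers` and the file name `model.onnx` are hypothetical. The helper is pure Python so it runs without ONNX Runtime installed, and the commented lines show where the real library calls would go.

```python
# Preference order: try DirectML first, then fall back to CPU.
PREFERRED_ORDER = ["DmlExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available: list[str]) -> list[str]:
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in PREFERRED_ORDER if name in available]
    if not chosen:
        raise RuntimeError(f"No supported execution provider in {available}")
    return chosen


# With ONNX Runtime installed (for DirectML, the onnxruntime-directml package),
# the helper would typically be used like this:
#
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   session = ort.InferenceSession("model.onnx", providers=providers)  # hypothetical path

# Simulated provider lists, so the sketch runs without any hardware or packages.
print(pick_providers(["CPUExecutionProvider", "DmlExecutionProvider"]))
print(pick_providers(["CPUExecutionProvider"]))
```

Listing the CPU provider last keeps the same code working on machines without a DirectML-capable device.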

Microsoft published a "Welcome to AI Toolkit for VS Code" guide on GitHub to help developers get started with the extension. More advanced guidance comes in the "Fine-Tuning models remotely" and "Inferencing with the fine-tuned model" documentation.

The three-day in-person (Seattle) and online Microsoft Build 2024 developer conference runs through May 23.

About the Author

David Ramel is an editor and writer for Converge360.
