News

Microsoft Going Small in Semantic Kernel AI Dev Tooling

In revealing future plans for its Semantic Kernel AI dev tooling, Microsoft said it's going small -- at least in terms of AI model size.

While highlighting increased small model support, Microsoft's Matthew Bolanos yesterday said: "In addition to helping you improve performance, we also want to help you control costs. Not all tasks require the power of something like GPT-4o. Many times, a local model or small language model (SLM) will do. To help developers offload tasks to smaller, cheaper models, we'll be providing connectors to both ONNX runtime and Azure's Model as a Service."

Semantic Kernel is a lightweight, open-source development kit designed to help developers build AI agents and integrate advanced AI models into their applications.
[Figure: Semantic Kernel Step by Step (source: Microsoft).]
Microsoft provided a sneak peek into its immediate plans for the kit in yesterday's blog post, "What's coming next? Summer / Fall roadmap for Semantic Kernel."
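The post itself doesn't show code, but the basic shape of the kit is easy to sketch in its Python SDK: register a model service with a kernel, then invoke prompts against it. A minimal sketch, assuming an OPENAI_API_KEY in the environment (exact method signatures vary somewhat across SDK versions):

```python
import asyncio

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion

async def main() -> None:
    kernel = Kernel()
    # Register a chat-completion service; the API key is read from OPENAI_API_KEY.
    kernel.add_service(OpenAIChatCompletion(service_id="chat", ai_model_id="gpt-4o"))
    # Run a one-off prompt through the registered service.
    result = await kernel.invoke_prompt(prompt="Summarize Semantic Kernel in one sentence.")
    print(result)

asyncio.run(main())
```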

Since the debut of ChatGPT and other large language models (LLMs), the industry has increasingly focused on smaller, more specialized small language models (SLMs), which can be less expensive and easier to use. For more on that, see the article, "LLMs vs SLMs: When to Go Big or Small in Enterprise AI."

Regarding the new SLM functionality, Bolanos said, "Together with these two connectors, you can start saving money by only using the models that you need to complete a user's request."
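The post doesn't spell out how that routing would look in code, but the idea is to send routine requests to a cheap local SLM and reserve the big model for requests that actually need it. A minimal sketch, with stub callables standing in for the real connectors (the model choices in the comments are illustrative assumptions, not part of the announcement):

```python
from typing import Callable

# Stubs standing in for real connectors; in practice these might be, say,
# a Phi-3 model on the ONNX runtime and GPT-4o via Azure.
def small_local_model(prompt: str) -> str:
    return f"[SLM] handled: {prompt}"

def gpt_4o(prompt: str) -> str:
    return f"[GPT-4o] handled: {prompt}"

def route(prompt: str, needs_heavy_reasoning: bool) -> Callable[[str], str]:
    """Offload cheap tasks to the SLM; reserve the expensive model for hard ones."""
    return gpt_4o if needs_heavy_reasoning else small_local_model

# A routine request goes to the small model; only complex work pays GPT-4o prices.
handler = route("Classify this support ticket.", needs_heavy_reasoning=False)
print(handler("Classify this support ticket."))
```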

Along with small model support, he cited:

  • Enhancing enterprise requirements: Bolanos said customers appreciate Semantic Kernel for its enterprise features, such as filters and OpenTelemetry metrics. To enhance its utility, the dev team aims to help users leverage these features for better performance and reliability in AI apps. For instance, to address slow AI applications, the team will simplify diagnosing slow LLM calls ("we want to make it easier for you to diagnose why your LLM calls are slow") and introduce techniques such as semantic caching to speed up responses (a sketch of the caching idea appears after this list).
  • Improved memory connectors: The kit's memory connectors will soon be upgraded to let devs use custom models to read and write data in vector databases. This means that instead of getting untyped records back from a product search, the new connectors will return the same data type that was put in, such as a product. This enhancement will improve type safety and data retrieval from vector databases, making semantic searching more efficient and reliable, Bolanos said (a typed-retrieval sketch also follows this list).
  • Automation with agents: The dev team will collaborate with the rest of Microsoft to enable the orchestration of multiple AI agents to complete business processes. AI agents perform better when given specific steps, and Semantic Kernel will simplify defining these processes across all three of the team's SDKs.
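Semantic caching, mentioned in the first bullet above, isn't detailed in the post, but the general technique is well established: key cached responses by prompt embeddings rather than exact strings, so near-duplicate questions can be answered from the cache without another model call. A minimal sketch, where the embed function is assumed to be any embedding model supplied by the caller:

```python
import math
from typing import Callable

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Caches LLM responses keyed by prompt embedding, not exact prompt text."""

    def __init__(self, embed: Callable[[str], list[float]], threshold: float = 0.9):
        self.embed = embed            # assumption: any embedding model fits here
        self.threshold = threshold    # minimum similarity to count as a hit
        self.entries: list[tuple[list[float], str]] = []

    def get(self, prompt: str) -> str | None:
        query = self.embed(prompt)
        scored = [(cosine(query, vec), resp) for vec, resp in self.entries]
        if scored:
            best_score, best_resp = max(scored, key=lambda s: s[0])
            if best_score >= self.threshold:
                return best_resp      # cache hit: skip the LLM call entirely
        return None

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((self.embed(prompt), response))
```

On a hit, the application returns the cached answer and skips the LLM call, which is where the latency and cost savings come from.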
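The typed-record upgrade in the second bullet can likewise be pictured without committing to Semantic Kernel's final API: the store is parameterized on the developer's own model, so a product search hands back Product objects rather than untyped records. A generic sketch of the concept (hypothetical types, not the kit's actual connector interface):

```python
from dataclasses import dataclass, field
from typing import Generic, TypeVar

T = TypeVar("T")

@dataclass
class Product:
    id: str
    name: str
    embedding: list[float] = field(default_factory=list)

class TypedVectorStore(Generic[T]):
    """In-memory stand-in for a vector database that preserves record types."""

    def __init__(self) -> None:
        self._records: dict[str, T] = {}

    def upsert(self, key: str, record: T) -> None:
        self._records[key] = record

    def get(self, key: str) -> T | None:
        # Returns the same type that was written: a Product in, a Product out.
        return self._records.get(key)

store: TypedVectorStore[Product] = TypedVectorStore()
store.upsert("p1", Product(id="p1", name="Widget"))
result = store.get("p1")   # type checkers see Product | None, not a raw record
```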

Bolanos pointed to a bucketized view of the team's work on GitHub for those wanting more information, such as when features will go live and what else is being worked on.

About the Author

David Ramel is an editor and writer at Converge 360.
