News
Steve Sanderson Previews AI App Dev: Small Models, Agents and a Blazor Voice Assistant
Blazor creator Steve Sanderson presented a keynote at the recent NDC London 2025 conference where he previewed the future of .NET application development with smaller AI models and autonomous agents, and showcased a new Blazor voice assistant project that pairs real-time speech with function calling.
In a keynote video titled "The past, present, and future of AI for application developers" published on YouTube last week, Sanderson, who works on ASP.NET at Microsoft, set up the future by reviewing the past, and that future is centered on agents.
Agentic AI
Sanderson concluded his keynote by introducing agentic AI as a key architectural direction for the future. Unlike single-prompt completions, agentic systems are autonomous loops: they can call functions, reason about goals, gather context, and iterate toward a solution.
"One of the main patterns that people often talk about at the moment is agents or agentic systems and this is the idea that instead of just making a single call to an AI service and saying, 'just give me one answer at a time,' how about we give it the ability to do a whole series of tasks by itself," he said.
This is particularly compelling for developers working on complex tools, DevOps automation, or workflow-heavy enterprise apps. He illustrated the concept with a software upgrade agent: one that could research a new framework version, scan a codebase, suggest changes, and even propose pull requests. Though still early in practical adoption, this kind of multi-step, self-directed AI behavior is something Microsoft is watching closely -- and it's likely to influence future APIs and dev tooling.
"And it's an area that a lot of people are pushing towards at the moment right now, that's maybe a little bit further ahead."
The Importance of Small-Model Choices
Sanderson urged developers to recognize that they don't always need massive LLMs like GPT-4 to build useful AI-powered applications. For many .NET scenarios -- such as intent classification, structured data extraction, or even lightweight natural language understanding -- small language models can be more than sufficient. Their lightweight nature allows them to be run on a developer's own servers -- or even in-browser -- which unlocks benefits around latency, cost, and data privacy. This is especially relevant to Visual Studio developers working on internal business apps or resource-constrained deployments, where offloading to a cloud LLM isn't ideal. Sanderson showed a real-world zero-shot classification example using a Hugging Face model in a browser, reinforcing that small models aren't just theoretical -- they're deployable today.
'We've Been Relying on Using Some Pretty Big Machinery to Make This Work' (source: NDC London/YouTube).
"So far we've been relying on using some pretty big machinery to make this work, so you'll notice up here up at the top here we're using GPT-4o mini, which is, despite having the min in its name, still a very large model," Sanderson said. "Okay, and if you were dependent on this sort of thing as being a really core low-level part of your product you might think can we make it smaller. Can we make this something that's quicker and cheaper to run that runs on our servers just directly. And so that brings us onto this subject of small language models as well as large language models, and for many scenarios you actually can use a much smaller model than something like this and small enough that you can just run it on your server. In fact although you probably shouldn't, you could even run it directly in your end user's browser if you want to, so I'm going to show you an example of doing that -- not that I think you should, but just to make the point that these things are small enough that if you can do that, you should believe that you could run it on your server as well."
The Blazor Voice Assistant Demo
One of the demos in the keynote was a real-time, voice-driven assistant built with Blazor Server and OpenAI's real-time API. The assistant went beyond speech recognition: it could understand user intent, manipulate UI data models, call backend functions, and even transform input content based on tone, formatting, and emoji requests. The demo showed how naturally the pieces fit inside the .NET ecosystem, with JSON models, function calling, Blazor component binding, and OpenAI integration all working together smoothly. Sanderson even demonstrated how the assistant could dynamically restructure list items, rewrite content with tone changes, and respond to conversational input -- without requiring any bespoke UI logic. It served as a powerful vision for the kind of next-gen user experiences .NET devs can build today.
"So that's an example of how you can create an assistant that's built in to your application [that] knows about the workflow that the user is going through and helps them with that."
And More
Watch Sanderson's 58-minute keynote for more, including:
- Historical AI Foundations: Sanderson traced AI's origins from Alan Turing and Eliza to modern language models, demonstrating how early systems relied on simple string manipulation.
- Live GPT-2 Model Training: He trained a mini GPT-2-style model from scratch on NDC session data using a laptop, showing how transformer models learn patterns and syntax over time.
- Function Calling with AI: Demonstrated how to get LLMs to call external functions (like a weather API) through prompt engineering and structured tool-calling patterns.
- Retrieval-Augmented Generation (RAG): Previewed a document-based AI assistant that allows users to query local PDFs using natural language with citations, powered by OpenAI and a vector store.
- Structured Data Extraction in .NET: Used Microsoft.Extensions.AI to extract and deserialize structured information from freeform real estate listings into typed C# objects (a rough sketch of the idea follows this list).
- Vision-Based AI with Images: Analyzed traffic cam images using a vision model to detect congestion, hazards, and camera malfunctions, with alerts triggered via function calls.
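As a rough illustration of the structured-extraction item above, the sketch below asks a model for output matching a C# type and deserializes the result. The Listing record is invented for the example (Sanderson's demo used real estate listings, but his types weren't shown), and the typed-response helper's name has changed across Microsoft.Extensions.AI versions.

```csharp
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical shape for a property listing.
public record Listing(string Address, int Bedrooms, decimal Price, bool HasGarden);

public static class ListingExtractor
{
    public static async Task<Listing?> ExtractAsync(IChatClient client, string freeformText)
    {
        // Asks the model for output matching the Listing schema and deserializes it
        // into a typed C# object. (Earlier previews named this helper CompleteAsync<T>.)
        var response = await client.GetResponseAsync<Listing>(
            "Extract the property details from this listing:\n" + freeformText);

        return response.Result;
    }
}
```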
"So in summary what have we done, we started off with the Eliza chat system and how it was just a bunch of string replacements," he said in conclusion. "We went through Markov modeling for language through GPT-2, and we trained a system and got a sense of how long that took. We went through a whole bunch of business process automation with structured data extraction and image models. We then looked at small language models. We've ended up looking at a realtime voice-based assistant, so I hope that something along the way has been at least a little bit interesting or useful to you."
About the Author
David Ramel is an editor and writer at Converge 360.