Azure AI Foundry Gets 'Computer-Using Agent' for Autonomous GUI Interaction -- Visual Studio Magazine

By David Ramel
03/13/2025

Microsoft is expanding agentic AI functionality in its Azure AI Foundry platform, furthering one of the hottest areas of development right now.

The company this week announced two new features for Azure AI Foundry, its all-in-one platform for building AI apps and agents (formerly called AI Studio): a Responses API and a Computer-Using Agent (CUA).

The Responses API simplifies AI application development by providing a unified interface for retrieval, reasoning, and execution, while the CUA autonomously interacts with computer systems to execute tasks, bridging the gap between AI and real-world application control.

The CUA is described as a specialized AI model in Azure OpenAI Service that enables AI to interact with GUIs, navigate applications, and automate multi-step tasks via natural language instructions, a step up from automation tools that rely on predefined scripts or API-based integrations.

The tech is based on OpenAI's Computer-Using Agent announced in January, when the Microsoft partner touted "the flexibility to perform digital tasks without using OS- or web-specific APIs."

Computer-Using Agent (CUA) *(source: OpenAI).*

Microsoft on Tuesday detailed these unique abilities of the offering:

Autonomous UI navigation: Can open applications, click buttons, fill out forms, and navigate multi-page workflows.
Dynamic adaptation: Interprets UI changes and adjusts actions accordingly, reducing reliance on rigid automation scripts.
Cross-application task execution: Operates across web-based and desktop applications, integrating disparate systems without API dependencies.
Natural language command interface: Users can describe a task in plain language, and CUA determines the correct UI interactions to execute.
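The difference between a fixed script and an agent that adapts to the UI can be illustrated with a small observe-act loop. The following is only a minimal sketch under stated assumptions: `FakeForm`, `Action`, `scripted_policy`, and `run_agent` are hypothetical names invented for this illustration, the "UI" is an in-memory form rather than a real screen, and the policy is a hand-written stand-in for the model. It is not the CUA API.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One UI action proposed by the policy (hypothetical shape)."""
    kind: str          # "type", "click", or "done"
    target: str = ""   # name of the UI element to act on
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakeForm:
    """Stand-in for a real GUI: a form with named fields and a submit button."""
    fields: dict = field(default_factory=lambda: {"name": "", "email": ""})
    submitted: bool = False

    def observe(self) -> dict:
        # A real agent would look at a screenshot; here we return plain state.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def scripted_policy(task: dict, observation: dict) -> Action:
    """Stand-in for the model: choose the next action from the current state."""
    for name, value in task.items():
        if observation["fields"].get(name) != value:
            return Action("type", target=name, text=value)
    if not observation["submitted"]:
        return Action("click", target="submit")
    return Action("done")


def run_agent(ui: FakeForm, policy, task: dict, max_steps: int = 10) -> list:
    """Observe, ask the policy for an action, apply it, repeat until done."""
    history = []
    for _ in range(max_steps):
        action = policy(task, ui.observe())
        if action.kind == "done":
            break
        ui.apply(action)
        history.append(action)
    return history
```

Because each action is chosen from the latest observation rather than from a predefined sequence, a field that is already filled in is simply skipped. The `max_steps` cap is a design choice of this sketch to keep a misbehaving policy from looping forever.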

The Responses API fits into the scheme by providing a structured response format that allows AI to interact with multiple tools while maintaining context across interactions, supporting:

Tool calling in one simple API call: Now, developers can seamlessly integrate AI tools, making execution more efficient.
Computer use: Use the computer use tool within the Responses API to drive automation and execute software interactions.
File search: Interact with enterprise data dynamically and extract relevant information.
Function calling: Develop and invoke custom functions to enhance AI capabilities.
Chaining responses into conversations: Keep track of interactions by linking responses together using unique response IDs, ensuring continuity in AI-driven dialogues.
Enterprise-grade data privacy: Built with Azure's trusted security and compliance standards, ensuring data protection for organizations.
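The idea of chaining responses through unique response IDs can be sketched without touching any real service. This is a minimal illustration, assuming an in-memory store; `Response`, `ResponseStore`, and the `previous_response_id` field as used here are names chosen for this sketch and should not be read as the actual Responses API surface.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """A stored response that may point back at the one it continues."""
    id: str
    text: str
    previous_response_id: Optional[str] = None


class ResponseStore:
    """Hypothetical in-memory store that links responses by ID."""

    def __init__(self) -> None:
        self._responses = {}
        self._counter = itertools.count(1)

    def create(self, text: str, previous_response_id: Optional[str] = None) -> Response:
        # Reject links to IDs that were never issued by this store.
        if previous_response_id is not None and previous_response_id not in self._responses:
            raise KeyError(f"unknown response id: {previous_response_id}")
        response = Response(
            id=f"resp_{next(self._counter)}",
            text=text,
            previous_response_id=previous_response_id,
        )
        self._responses[response.id] = response
        return response

    def conversation(self, response_id: str) -> list:
        """Walk the chain backwards from response_id, return texts oldest first."""
        chain = []
        current = response_id
        while current is not None:
            response = self._responses[current]
            chain.append(response.text)
            current = response.previous_response_id
        return list(reversed(chain))
```

Since every response records only its predecessor, two follow-ups can branch from the same earlier response, and each branch reconstructs its own history by walking back through the IDs.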

An accompanying video from Marco Casalaina, VP of products for CoreAI and an AI Futurist at Microsoft, shows the new tooling in action. He used the two new features to demonstrate the automation of a routine task on a Linux virtual machine, where the AI autonomously navigates a website to download a shipment PDF, extracts and retains key information, inputs it into another site, and prompts for human confirmation before final submission.
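The human-confirmation step in the demo can be thought of as a gate placed in front of irreversible actions. The sketch below shows one way such a gate might look; `run_with_confirmation`, the step names, and the `confirm` callback are hypothetical and are not taken from Microsoft's demo code.

```python
from typing import Callable, Iterable, List, Tuple


def run_with_confirmation(
    steps: Iterable[Tuple[str, bool]],
    confirm: Callable[[str], bool],
) -> List[str]:
    """Run steps in order; ask `confirm` before any step marked irreversible.

    Each step is a (name, irreversible) pair. Execution stops at the first
    irreversible step the human declines, and the names of the steps that
    ran are returned.
    """
    executed = []
    for name, irreversible in steps:
        if irreversible and not confirm(name):
            break  # the human declined, so nothing further is attempted
        executed.append(name)
    return executed


# Steps loosely modeled on the workflow described above.
WORKFLOW = [
    ("download shipment PDF", False),
    ("extract key information", False),
    ("fill in form on second site", False),
    ("submit form", True),
]
```

In an interactive setting, `confirm` could wrap a prompt such as `lambda name: input(f"Proceed with {name}? [y/N] ") == "y"`, while automated tests can pass a function that always approves or always declines.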

"As you can see, these tools offer some amazing possibilities for automating workflows and enhancing productivity across various industries," Casalaina said. "Azure AI Foundry continues to push the boundaries of what's possible with AI-driven automation, and we're excited to see how you'll innovate with these powerful tools."

Microsoft said developers can immediately start building with CUA, while enterprises will soon gain access to the Responses API and CUA in Azure OpenAI Service. The company also plans to integrate CUA automation into Windows 365 and Azure Virtual Desktop for deployment on Cloud PCs and VMs with enterprise-grade security and compliance.

On the security front, Microsoft hinted at the challenges that come with increased AI autonomy, which in popular culture often leads to doomsday, AI-kills-humanity scenarios.

"As AI systems become more autonomous, ensuring security, reliability, and alignment with human intent is critical," the company said. "The CUA model is one of the first agentic AI models capable of directly interacting with software environments, bringing new challenges in misuse prevention, unintended actions, and adversarial risks. To address these, Microsoft and OpenAI have implemented a multi-layered safety approach spanning the model, system, and deployment levels."

About the Author

David Ramel is an editor and writer at Converge 360.
