You Can Now Apply for Early-Stage AI Agent 'Computer Use' in Copilot Studio
On the way to autonomous AI, Microsoft announced an early access research preview of "computer use" for Copilot Studio wherein AI agents visually interact with any app or website -- clicking, typing, and navigating like a human.
Copilot Studio is Microsoft's low-code tool for building, customizing, and deploying AI-powered agents that automate tasks across apps and workflows. It's integrated with the Power Platform and enables both business users and professional developers to create agents that work as standalone copilots, inside Power Platform apps, or embedded in other applications like Microsoft Teams or websites.
Microsoft has been steadily expanding the capabilities of Copilot Studio and agentic AI generally. It recently introduced deep reasoning capabilities for agents, added support for the trending Model Context Protocol (MCP) and took agent flows to general availability.
Now comes the ability to use websites and desktop applications as tools to complete a task.
"With computer use, agents can now interact with any system that has a graphical user interface!" said an April 15 post authored by Charles Lamanna, an exec for Business & Industry Copilot at Microsoft.
"Computer use enables agents to interact with websites and desktop apps by clicking buttons, selecting menus, and typing into fields on the screen," he said. "This allows agents to handle tasks even when there is no API available to connect to the system directly. If a person can use the app, the agent can too."
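To make the idea concrete, here is a minimal sketch of the observe-decide-act loop that this style of UI automation implies. It is a toy simulation, not the Copilot Studio API: the names `Element`, `ToyScreen`, and `run_agent` are hypothetical, and the "screen" is a plain list of objects rather than a real screenshot.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One on-screen control in a toy UI (hypothetical; not a Copilot Studio type)."""
    role: str   # "textbox" or "button"
    label: str
    value: str = ""


@dataclass
class ToyScreen:
    """Stand-in for a real GUI: a flat list of elements plus the submitted data."""
    elements: list
    submitted: dict = field(default_factory=dict)

    def observe(self):
        # A real agent would receive a screenshot; here we return the elements.
        return list(self.elements)

    def type_into(self, label, text):
        for el in self.elements:
            if el.role == "textbox" and el.label == label:
                el.value = text
                return
        raise LookupError(label)

    def click(self, label):
        for el in self.elements:
            if el.role == "button" and el.label == label:
                # Clicking any button submits the current textbox values.
                self.submitted = {
                    e.label: e.value for e in self.elements if e.role == "textbox"
                }
                return
        raise LookupError(label)


def run_agent(screen, task, max_steps=10):
    """Observe, pick one action, act, and log the reasoning; repeat until done."""
    history = []
    for _ in range(max_steps):
        seen = screen.observe()
        # Fill the first empty textbox that the task has a value for.
        todo = next(
            (e for e in seen
             if e.role == "textbox" and not e.value and e.label in task),
            None,
        )
        if todo is not None:
            screen.type_into(todo.label, task[todo.label])
            history.append(("type", todo.label, f"'{todo.label}' is empty"))
            continue
        # All fields are filled: click whichever button is on screen now,
        # rather than relying on a hard-coded label.
        button = next((e for e in seen if e.role == "button"), None)
        if button is None:
            break
        screen.click(button.label)
        history.append(("click", button.label, "all fields filled"))
        break
    return history


# Example run: the button is labeled "Send", which the agent never hard-codes.
screen = ToyScreen([
    Element("textbox", "Invoice #"),
    Element("textbox", "Amount"),
    Element("button", "Send"),
])
log = run_agent(screen, {"Invoice #": "INV-001", "Amount": "120.00"})
```

Because the loop re-reads the screen on every step and chooses the button by role, renaming "Send" to "Submit" would not break this toy agent. The returned `log` plays the part of an activity history, pairing each action with a short reason.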
Highlights of the announcement include:
- Computer Use Feature (Early Access Preview): Enables Copilot Studio agents to interact directly with websites and desktop applications using UI actions (clicking, typing, navigating), even without APIs.
- UI Adaptability: Agents can adapt to UI changes in real time using built-in reasoning to maintain workflow continuity.
- Cross-Browser & Desktop Support: Supports automation across desktop apps and web browsers including Edge, Chrome, and Firefox.
- Hosted Infrastructure: Runs on Microsoft-hosted infrastructure, so enterprise data stays within Microsoft Cloud boundaries and is not used to train Frontier models.
- Key Use Cases Highlighted:
  - Automated data entry into internal systems
  - Market research automation from web sources
  - Invoice data extraction and accounting system integration
- Enhanced RPA (Robotic Process Automation): Smarter, more robust automation compared to traditional RPA -- capable of reasoning, reacting, and making decisions in dynamic UIs.
- No-Code UI Programming: Users can describe tasks in natural language and watch real-time automation with reasoning traces and UI preview.
- Full Activity Visibility: Viewable history of computer use activities, including screenshots and reasoning chains.
- More to Come at Microsoft Build 2025: Additional demos and announcements expected at the event in May 2025.
Users can apply for the preview here. Note that you will need a preview environment hosted in the U.S., and you must provide your Tenant ID and Environment ID when applying.
About the Author
David Ramel is an editor and writer at Converge 360.