VS Code Curbs Token Use Ahead of Copilot's Controversial Usage-Based Billing Switch

Microsoft released VS Code 1.118 on April 29 with a notable emphasis on token efficiency improvements, just two days after GitHub announced that Copilot is moving to usage-based billing effective June 1, 2026.

The 1.118 release notes acknowledge the connection directly, stating: "To help you get the most value out of your plan, we have been working on several initiatives to improve token efficiency without hindering the quality of the agent." The broader context was captured in the Visual Studio Magazine article, "Devs Sound Off on Usage-Based Copilot Pricing Change: 'You Will Get Less but Pay the Same Price'," as developers reacted to the coming billing shift.

That reaction has intensified quickly. The GitHub community FAQ thread had 70 comments when we wrote about the announcement earlier this week; as of April 30, it had climbed to 237 comments and 319 replies, a sharp increase that underscores how closely developers are watching token costs and plan value ahead of June 1.

Prompt caching in the update achieves more than 93% cache reuse per request in active agent sessions. Each agent turn passes significant context to the model -- system prompt, tool definitions, conversation history, file contents. Without caching, that repeated context is billed as new input tokens each time. With cache breakpoints now placed at stable boundaries -- end of system prompt, end of tools, end of the most recent tool turn, and conversation turn boundaries -- the bulk of each request is billed at cached pricing. For Anthropic models, cached tokens cost roughly one-tenth as much as new input tokens, a meaningful difference in long, multi-turn agent sessions.
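
To put those two figures together, here is a rough back-of-the-envelope sketch rather than an official pricing formula. It assumes a flat 93% cache hit rate (the article says "more than 93%") and a cached price of one-tenth the normal input price, and it ignores any extra charge a provider may apply when writing to the cache. The function name `relative_input_cost` is made up for illustration.

```python
def relative_input_cost(cache_hit_rate: float, cached_price_ratio: float) -> float:
    """Return input-token cost as a fraction of the fully uncached cost.

    cache_hit_rate: share of input tokens served from cache (0.0 to 1.0).
    cached_price_ratio: price of a cached token relative to a new input token.
    Simplifying assumption: no surcharge for cache writes.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    uncached_share = 1.0 - cache_hit_rate
    # Uncached tokens pay full price; cached tokens pay the discounted price.
    return uncached_share + cache_hit_rate * cached_price_ratio


# Figures quoted in the article: 93% reuse, cached tokens about 10x cheaper.
ratio = relative_input_cost(0.93, 0.10)
print(f"Input cost vs. no caching: {ratio:.3f}")  # prints 0.163
```

Under those assumptions, input-token spend per request drops to roughly 16% of what it would be with no caching. Actual bills depend on the model, the session and provider pricing details not covered here.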

Tool Search and Subagents
A second efficiency push addresses the agent's toolset. The new tool search tool splits VS Code's agent toolset into a compact core of roughly 30 always-available tools -- covering about 88% of actual tool calls -- and a larger deferred set whose schemas are not loaded into the model's context until explicitly requested. Microsoft says this alone delivers up to 20% token savings. The feature is already the default for Anthropic models (Claude Sonnet 4.5+ and Opus 4.5+) and is rolling out to OpenAI's GPT-5.4 and GPT-5.5 via the Responses API, though GPT users must first enable the github.copilot.chat.responsesApi.toolSearchTool.enabled setting.
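
The core/deferred split can be pictured with a small toy model. This is only an illustrative sketch, not VS Code's implementation: the `ToolRegistry` class, its methods and the two tool names are all hypothetical, and the substring match stands in for whatever search the real tool performs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Tool:
    name: str
    description: str
    schema: dict  # full JSON-style schema; the expensive part to send


@dataclass
class ToolRegistry:
    core: Dict[str, Tool] = field(default_factory=dict)      # always in context
    deferred: Dict[str, Tool] = field(default_factory=dict)  # schema withheld
    loaded: Dict[str, Tool] = field(default_factory=dict)    # deferred, now loaded

    def context_tools(self) -> List[str]:
        """Names of tools whose schemas are currently sent to the model."""
        return sorted({**self.core, **self.loaded})

    def search(self, query: str) -> List[str]:
        """Find deferred tools by a naive substring match (an assumption)."""
        q = query.lower()
        return sorted(
            name for name, tool in self.deferred.items()
            if q in name.lower() or q in tool.description.lower()
        )

    def load(self, name: str) -> Tool:
        """Bring a deferred tool's schema into context on explicit request."""
        tool = self.deferred[name]
        self.loaded[name] = tool
        return tool


# Hypothetical tools, for illustration only.
registry = ToolRegistry(
    core={"read_file": Tool("read_file", "Read a file from the workspace", {})},
    deferred={"run_notebook_cell": Tool("run_notebook_cell", "Execute a notebook cell", {})},
)
print(registry.context_tools())     # only the core tool is in context
print(registry.search("notebook"))  # the model looks up a deferred tool
registry.load("run_notebook_cell")
print(registry.context_tools())     # the deferred schema is now included
```

The point of the design is that schemas for rarely used tools cost nothing until the model asks for them, while the common case is served by the small core set.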

Two new specialized agentic tools -- a search tool and an execution tool -- extend the efficiency story further. Rather than routing all codebase exploration and terminal command execution through the main frontier model, VS Code hands these tasks off to smaller, purpose-built models that cost significantly less to run. After more than a month of flighting, Microsoft reports token savings of up to 20% from this approach as well.
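
As a loose illustration of that handoff, the routing decision might look something like the following. The task categories, the model labels and the `pick_model` helper are all invented for this sketch; the release notes do not describe how VS Code selects a model internally.

```python
from enum import Enum


class TaskKind(Enum):
    SEARCH = "search"    # codebase exploration
    EXECUTE = "execute"  # terminal command execution
    REASON = "reason"    # everything else the agent does


# Placeholder labels, not real model identifiers.
MAIN_MODEL = "frontier-model"
SUBAGENT_MODELS = {
    TaskKind.SEARCH: "small-search-model",
    TaskKind.EXECUTE: "small-exec-model",
}


def pick_model(kind: TaskKind) -> str:
    """Send search and execution work to a cheaper model; keep the rest on the main one."""
    return SUBAGENT_MODELS.get(kind, MAIN_MODEL)


for kind in TaskKind:
    print(f"{kind.value}: {pick_model(kind)}")
```

Only the two task types named in the article are diverted here; anything else falls through to the main model by default.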

Other 1.118 Changes in Brief
Outside the token-and-billing story, 1.118 continues Microsoft's relatively new weekly release cadence with a broad set of updates. Highlights include expanded Agent experience features in Insiders, semantic indexing now available in all workspaces (not just GitHub/ADO repos), GitHub text search across repos and orgs, and trust/security additions such as tighter sandbox default read permissions.

About the Author

David Ramel is an editor and writer at Converge 360.
