
VS Code Curbs Token Use Ahead of Copilot's Controversial Usage-Based Billing Switch

Microsoft released VS Code 1.118 on April 29 with a notable emphasis on token efficiency, just two days after GitHub announced that Copilot will move to usage-based billing on June 1, 2026.

The 1.118 release notes acknowledge the connection directly, stating: "To help you get the most value out of your plan, we have been working on several initiatives to improve token efficiency without hindering the quality of the agent." The broader context was captured in the Visual Studio Magazine article, "Devs Sound Off on Usage-Based Copilot Pricing Change: 'You Will Get Less but Pay the Same Price'," as developers reacted to the coming billing shift.

That reaction has intensified quickly. The GitHub community FAQ thread had 70 comments when we wrote about the announcement earlier this week; as of April 30, it had climbed to 237 comments and 319 replies, a sharp increase that underscores how closely developers are watching token costs and plan value ahead of June 1.

The headline change is prompt caching, which Microsoft says achieves more than 93% cache reuse per request in active agent sessions. Each agent turn passes significant context to the model -- system prompt, tool definitions, conversation history, file contents. Without caching, that repeated context is billed as new input tokens on every turn. With cache breakpoints now placed at stable boundaries -- end of system prompt, end of tools, end of the most recent tool turn, and conversation turn boundaries -- the bulk of each request is billed at cached pricing. For Anthropic models, cached tokens cost roughly one-tenth as much as new input tokens, a meaningful difference in long, multi-turn agent sessions.
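
To put those two figures together, here is a rough back-of-the-envelope sketch. The 93% reuse rate and the roughly one-tenth cached price come from the release notes as described above; the 100,000-token context and the $3.00-per-million-token input price are made-up illustration values, and the function name is hypothetical. The sketch ignores output tokens and any surcharge a provider may apply when writing to the cache.

```python
def input_cost_usd(context_tokens, cache_hit_ratio, price_per_mtok,
                   cached_price_factor=0.1):
    """Estimate the input-token cost of one agent turn.

    cache_hit_ratio     -- share of the context served from cache (0.0 to 1.0)
    price_per_mtok      -- assumed price in USD per million new input tokens
    cached_price_factor -- cached price as a fraction of the new-token price
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    cached_tokens = context_tokens * cache_hit_ratio
    new_tokens = context_tokens - cached_tokens
    cost = (new_tokens * price_per_mtok
            + cached_tokens * price_per_mtok * cached_price_factor)
    return cost / 1_000_000


# Illustration values only: 100,000 tokens of context at $3.00 per million.
uncached = input_cost_usd(100_000, 0.0, 3.00)
cached = input_cost_usd(100_000, 0.93, 3.00)

print(f"without caching: ${uncached:.4f} per turn")
print(f"with 93% reuse:  ${cached:.4f} per turn")
print(f"input cost reduction: {1 - cached / uncached:.1%}")
```

Under these assumptions the input cost per turn drops by roughly 84%, and the percentage is the same whatever base price is plugged in, since it depends only on the reuse rate and the cached-price factor.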

Tool Search and Subagents
A second efficiency push addresses the agent's toolset. The new tool search tool splits VS Code's agent toolset into a compact core of roughly 30 always-available tools -- covering about 88% of actual tool calls -- and a larger deferred set whose schemas are not loaded into the model's context until explicitly requested. Microsoft says this alone delivers up to 20% token savings. The feature is already the default for Anthropic models (Claude Sonnet 4.5+ and Opus 4.5+) and is rolling out to OpenAI's GPT-5.4 and GPT-5.5 via the Responses API -- though GPT users must first enable the github.copilot.chat.responsesApi.toolSearchTool.enabled setting.
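
The core-versus-deferred idea can be sketched in a few lines. This is only an illustration of the general pattern, not VS Code's implementation: the class names, tool names, token counts and the simple keyword match are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    schema_tokens: int  # made-up size of the tool's schema, in tokens


class ToolRegistry:
    """Keeps core tool schemas in context and loads deferred ones on demand."""

    def __init__(self, core, deferred):
        self.core = list(core)          # always sent to the model
        self.deferred = list(deferred)  # schemas held back until requested
        self.loaded = []                # deferred tools pulled in so far

    def context_tokens(self):
        """Tokens spent on tool schemas in the current request."""
        return sum(t.schema_tokens for t in self.core + self.loaded)

    def search(self, query):
        """Naive keyword match standing in for the tool search tool."""
        words = query.lower().split()
        matches = [
            t for t in self.deferred
            if t not in self.loaded
            and any(w in t.name.lower() or w in t.description.lower()
                    for w in words)
        ]
        self.loaded.extend(matches)
        return matches


# Hypothetical tools; real tool names and schema sizes will differ.
registry = ToolRegistry(
    core=[
        Tool("read_file", "Read a file from the workspace", 120),
        Tool("run_in_terminal", "Run a shell command", 150),
    ],
    deferred=[
        Tool("edit_notebook", "Edit a Jupyter notebook cell", 300),
        Tool("create_pull_request", "Open a GitHub pull request", 280),
    ],
)

tokens_before = registry.context_tokens()
found = registry.search("notebook")
tokens_after = registry.context_tokens()
print(tokens_before, [t.name for t in found], tokens_after)
```

The saving comes from the deferred schemas never entering the context in sessions that do not need them; only the tools a search actually returns add to the per-request token count.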

Two new specialized subagents -- one for search and one for execution -- extend the efficiency story further. Rather than routing all codebase exploration and terminal command execution through the main frontier model, VS Code hands these tasks off to smaller, purpose-built models that cost significantly less to run. After more than a month of flighting, Microsoft reports token savings of up to 20% from this approach as well.
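
The handoff described above amounts to a routing decision per task. A minimal sketch, assuming placeholder model names and task labels that do not come from the release notes:

```python
# Placeholder identifiers for illustration; not actual model names.
MAIN_MODEL = "frontier-model"
SUBAGENT_MODELS = {
    "search": "small-search-model",    # codebase exploration
    "execute": "small-execute-model",  # terminal command execution
}


def pick_model(task_kind):
    """Send search and execution work to a smaller model, all else to the main one."""
    return SUBAGENT_MODELS.get(task_kind, MAIN_MODEL)


for kind in ("plan", "search", "execute", "edit"):
    print(f"{kind:8s} -> {pick_model(kind)}")
```

Planning and editing stay on the main model in this sketch; only the two task kinds named in the article are delegated.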

Other 1.118 Changes in Brief
Outside the token-and-billing story, 1.118 continues Microsoft's relatively new weekly release cadence with a broad set of updates. Highlights include expanded Agent experience features in Insiders, semantic indexing now available in all workspaces (not just GitHub and Azure DevOps repos), GitHub text search across repos and orgs, and trust/security additions such as tighter sandbox default read permissions.

About the Author

David Ramel is an editor and writer at Converge 360.
