In-Depth
Slammed by Copilot Usage-Based Billing on Day 1, Facing $180 Bill for June
Of all the days for GitHub Copilot Chat to go bonkers on me in VS Code it was yesterday, the start of the controversial new usage-based plan.
If you haven't heard of it, check out "Devs Sound Off on Usage-Based Copilot Pricing Change: 'You Will Get Less, but Pay the Same Price'."
Basically the new usage-based billing plan replaces premium-request counting with monthly GitHub AI Credits, meters usage by token consumption, and lets paid users or organizations buy additional credits after included allowances are exhausted.
Mine are pretty much exhausted on day one because Copilot lost its mind. I'm not even a coder, I'm a journalist who uses it for hands-on articles about agentic AI features, paying for GitHub Copilot Pro personally because it also helps me write, edit and format articles. I have a huge SKILL.md file that provides instructions to format articles from different freelance writers according to their content, style and so on.
Yesterday, the first of the month, I got a monster code-heavy tutorial for our Data Science Lab. My agent has a nifty Quality Assurance feature at the end of the formatting pass where it would flag possible problems with spelling, grammar, style formatting on so on. It described the potential problem, offered a fix and showed me the diff, giving me the interactive option to apply it or not.
For some reason that conked out yesterday and I spent hours trying to get Copilot to fix things and go back to the way they were, which entailed countless terminal commands, edits to Python scripts and other settings files and so on. I had no idea what it was doing, but nothing worked and I tried everything. I tested it on my personal PC and it worked but I couldn't get my work PC to synchronize whatever needed to be synchronized.
I finally got it to act somewhat like it used to after a nightmarish experience, after which today I checked my account and found a huge amount of usage, with Microsoft Copilot predicting I was on track for a $180 bill for this month if I continued this pace.
Slightly Concerning (source: Microsoft Copilot).
Actually it was none of those things. Just plain usage trying to solve my problem. Here's what my GitHub Billing and Licensing showed:
[Click on image for larger view.] AI Usage (source: GitHub).
In VS Code I asked Copilot what the hell had happened to cause a possible $180 bill.
Short answer: most of that likely came from one heavy editing session, not many small chats.
What your local history shows for yesterday (June 1):
1. 2 sessions total.
2. 40 turns total.
3. One session did most of the work:
• Session c7a1cb19-66e1-4a5b-84fb-0552fd9ce947
• 32 turns
• 31 files touched
4. You also had very large prompt payloads:
• Largest user message was 24,786 characters.
Why that can burn credits quickly:
1. Large pasted context + attachments increase input tokens.
2. Multi-turn editing/QA loops resend growing conversation context each turn.
3. Formatting workflows that read/edit/check files repeatedly are token-intensive
I asked it for suggestion to avoid a huge bill and it said:
Based on your actual usage yesterday, here are the highest-impact ways to reduce credits:
Compact long chats earlier
You had a 32-turn session with 0 checkpoints (no compaction). In long editing sessions, use /compact around every 10-15 turns, or when switching to a new article section. This cuts repeated context re-sends.
Split work into smaller sessions
Most of the usage came from one heavy session. Start a fresh chat per article (or per phase: format, QA, social posts) instead of keeping one long thread alive.
Avoid large raw pastes in prompts
Your largest user message was ~24,786 characters. That is expensive input. Prefer: “Edit the open file” + specific section instructions, or targeted excerpts only.
Reduce back-and-forth micro-turns
Usage scales with turn count. Batch requests into one prompt, for example:
“Format article, fill summary/takeaways, generate social posts, then run QA and show top 3 actionable fixes.”
Use a lighter model for routine passes
For repetitive editorial cleanup/QA, use a cheaper model tier and reserve premium models for complex rewrites or strategy work.
Further advice included:
1. Start a new chat per article phase.
Use separate chats for formatting, QA, and social copy so context does not bloat across unrelated steps.
2. Compact every 10-15 turns in long sessions.
If a session goes long, run `/compact` before switching tasks or files.
3. Stop pasting full articles into prompts.
Point me to the file and section instead. Large pasted prompts were a major cost driver for you.
4. Batch requests into one instruction block.
Instead of many micro-prompts, use one: “Format + summary + takeaways + social + run QA + show top 3 actionable fixes.”
5. Keep QA cycles short and decisive.
When QA issues are shown, apply/skip in batches (`Apply 1,2,3` or `Skip 1,2,3`) to reduce back-and-forth turns.
6. Use lighter models for routine editorial passes.
Reserve premium models for hard rewrites, strategy, or complex analysis.
7. Reuse a standard editorial prompt template.
A consistent template reduces clarification turns and repeated context setup.
8. Avoid reopening solved threads.
When a piece is finalized, start a fresh chat for the next article instead of continuing the old one.
9. Limit scope per session.
Target one file at a time unless cross-file edits are necessary.
10. Turn on sync-based usage visibility.
Enable `chat.sessionSync.enabled` so we can analyze true token-level usage and tune your workflow with exact data, not proxies.
It also offered to draft a "default low-cost article workflow prompt" so I can paste it at the start of every new article session.
Low-Cost Editorial Workflow Prompt
Task:
Format the open article for CMS with minimal token usage.
Rules:
1. Work only on the currently open file.
2. Do not ask exploratory questions unless blocked.
3. Make one consolidated edit pass, then one QA pass.
4. Do not propose optional rewrites unless I ask.
5. Keep responses brief: status + exact changes only.
6. In QA, show only actionable issues (ignore false positives like product names/technical terms).
7. Present QA issues in batches of 3 max.
8. Wait for Apply/Skip commands before changing QA items.
9. If no actionable issues remain, say QA complete.
Required output order:
1. Headline
2. Byline
3. Summary (one sentence)
4. Key Takeaways (3 bullets)
5. Body with CMS subhead format
6. Social posts at end (X + LinkedIn)
Cost controls:
1. Use existing file context, no full-file re-paste.
2. Keep each progress update to 1-2 short sentences.
3. Avoid repeated scans unless file changed.
4. Stop after completion and wait for next command.
After formatting:
Run one QA check and return:
1. Applied fixes summary
2. Remaining actionable issues only
3. Ready for Apply/Skip
I'll try this advice and report back. Meanwhile, beware.
About the Author
David Ramel is an editor and writer at Converge 360.