The Bill That Didn't Match
My application said I had spent $12.
I had been building a project on top of a Gemini model, and like any careful engineer I had instrumented it. Every call my app made to the model was metered, logged, and totted up. Month to date: $12. I trusted that number the way you trust a gauge on your own dashboard — it was my code counting my calls.
Then the real bill arrived: $32.

The hunt
A near-threefold surprise is the kind of thing that ruins an afternoon. My first assumption was that my own accounting had a hole in it — a code path I’d forgotten to meter, a retry loop double-charging, a model I’d priced wrong. So I went hunting through my own books, looking for the missing twenty dollars somewhere in my app.
It wasn’t there. My app’s numbers were correct. They were just not the whole story.
The missing spend was me. While building, I had been poking at the same model from Google’s AI Studio — running test prompts, trying queries by hand, checking how the thing behaved. And I had been doing all of it with the same API key.
That was the moment the penny dropped. My app could only ever count what flowed through my app. Every prompt I ran somewhere else against that key was real money against the same account — and completely invisible to my careful little meter.
The blind spot is the design
So I went looking at how everyone else solves this. The tools fall into two camps, and they share the same flaw.
Some hand you a library wrapper: import their SDK, route your calls through it, and it tallies what passes by. Others stand up a proxy: point your traffic at their endpoint and they meter the stream.
Both work — right up until something doesn’t go through them. The wrapper only sees the calls you remembered to wrap. The proxy only sees the traffic you remembered to route. The afternoon I’d just lost was a wrapper’s blind spot in miniature: the AI Studio queries never touched my instrumented code, so they were never counted. An intermediary can only measure what you remember to push through it. The one bit of spend you forget about is, by definition, the bit it can’t see.
Going to the source
Then a simpler thought: the provider already knows.
Every single time you call a model — from your app, from a notebook, from a web console at two in the morning — the provider meters it. They have to; it’s how they bill you. There is exactly one ledger that is guaranteed complete, and it isn’t yours. It’s theirs.
So why count downstream at all? Why build a meter that can be bypassed, when the unbypassable meter already exists at the source?
That is CostCompass. Instead of asking you to route your work through one more intermediary, it goes to the source of truth and reads the spend back from the provider directly. There is nothing to wrap, nothing to route, nothing to remember. The number it shows you is the number on the bill, because it comes from the same place the bill does. If $32 was spent, you see $32 — including the afternoon you spent in a console you forgot to instrument.
One thing left to solve
Reading from the source means proving to each provider that you are you — which means credentials. And I did not want to be one more company holding a pile of everyone’s API keys on a server.
How CostCompass handles that — keeping the keys without the server ever being able to read them — is a story for another entry.
Keep your bearings.