
See where your OpenRouter spend actually goes

OpenRouter routes one API key across dozens of model vendors and draws it all from a single prepaid balance. That convenience buries the question that matters — which model spent what. Here is how the billing works, and how to turn OpenRouter's own per-model figures into a running total and a forecast beside everything else you run.

By Joubert Berger Published June 7, 2026

OpenRouter's pitch is one endpoint for every model: point a single key at it and your calls fan out to Anthropic, OpenAI, Google, and a long list of open-weight hosts, all paid from one prepaid balance. The catch is in what you give up to get it. When every model draws on the same balance, the falling number tells you the level and nothing about which model is pulling it down.

OpenRouter already keeps the figures that answer that, model by model. The work left is pulling them into a running total and a forecast before the balance runs low.

An antique almanac engraving: a single inflow channel pours into a central rotating distributor valve that divides the one stream into a fan of graduated measuring jugs of unequal fill, the rotating arm picked out in copper.
One stream in, many jugs out — and the arm in the middle decides which one fills.

How does OpenRouter bill for usage?

OpenRouter runs on a prepaid balance. You add credit to your account, and every call deducts from it as you go. There’s no monthly invoice and no seat to reason about, just a balance that falls. Behind that one balance sits a router that forwards each request to whichever model you asked for, across many providers at once, and bills them all to the same account.

Most calls are metered the way token APIs are: by the input tokens you send and the output tokens the model generates back, with output often the pricier side. The per-token rate depends on which model the request was routed to, and that can be anything from a cheap open-weight model to a flagship reasoning model. Some models add their own units on top, such as a per-request charge, an image, or a web-search call, but for most text work it’s the input and output tokens that move the bill. A blended bill across a dozen models is the normal case here.

On the inference itself, OpenRouter charges pass-through rates: you pay what the underlying provider charges, with no markup added to the per-token price. Where OpenRouter makes its money is around the edges of that: a small percentage fee when you buy credits, and a bring-your-own-key fee if you route high volumes through your own provider keys rather than OpenRouter’s. The credit-purchase fee in particular is charged at top-up and sits outside the per-model usage feed, so it’s a line that doesn’t show up in any model’s row. (Exact rates and fees live on OpenRouter’s pricing page and have changed before, so treat any figure you’ve memorized as provisional.)

One mechanism works in your favor, carried straight through from the upstream providers. OpenRouter passes provider prompt-caching pricing through on the models that support it: when a later call reuses a prompt prefix the model has already seen, that portion bills at the cheaper cached rate. Some providers cache automatically; others want you to mark the reused prefix explicitly, so it’s worth checking how the model you route to handles it. Keep your stable prefixes consistent across calls — long system prompts and reference material especially — and a slice of your input cost moves to the cheaper tier.

So a single day of OpenRouter usage can draw on several different prices at once:

| What you’re paying for | How OpenRouter handles it |
|---|---|
| Input / output tokens | The routed model’s own per-token rate, pass-through |
| Cached input | The cheaper cached rate on supported models, passed through |
| Credit purchases | A small percentage fee when you top up, outside per-model usage |
| Bring-your-own-key routing | A percentage fee above a free monthly threshold |
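
To make the table concrete, here is a minimal sketch of how one day’s usage could add up across two routed models. Every rate, token count, and the fee percentage below is a made-up placeholder rather than a real OpenRouter price, and the names `Rates`, `call_cost`, and `top_up_fee` are hypothetical helpers invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Placeholder prices in USD per million tokens (not real rates)."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: float


def call_cost(rates, input_tokens, output_tokens, cached_tokens=0):
    """Cost of a batch of calls; cached_tokens is the slice of input
    that bills at the cheaper cached rate."""
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * rates.input_per_m
        + cached_tokens * rates.cached_input_per_m
        + output_tokens * rates.output_per_m
    )
    return total / 1_000_000


def top_up_fee(amount, fee_pct):
    """Fee charged when buying credits; fee_pct is an assumed placeholder."""
    return amount * fee_pct


# Two hypothetical models with very different price points.
flagship = Rates(input_per_m=3.00, output_per_m=15.00, cached_input_per_m=0.30)
budget = Rates(input_per_m=0.20, output_per_m=0.60, cached_input_per_m=0.20)

# (rates, input tokens, output tokens, cached input tokens) for one day.
day = [
    (flagship, 400_000, 50_000, 300_000),
    (budget, 2_000_000, 500_000, 0),
]

usage_total = sum(call_cost(*entry) for entry in day)
print(f"usage for the day: ${usage_total:.2f}")
print(f"fee on a $100 top-up at an assumed 5%: ${top_up_fee(100.0, 0.05):.2f}")
```

The usage figure is the part a per-model feed can itemize. The top-up fee is computed separately here because, as described above, it is charged at purchase and belongs to no model’s row.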

Why is OpenRouter spend hard to keep ahead of?

The routing that makes OpenRouter convenient is also why the spend resists a quick glance. For a solo developer running it next to three or four other providers, the routing layer is usually the part that surprises you most. The difficulty comes down to what one shared balance hides:

  • One balance, many models. A falling balance tells you the level, not which model pulled it down or how fast. By the time the number looks low enough to notice, the routing that drained it is already behind you.
  • Routing can shift the blend silently. Change which model handles a job, or let a fallback route send traffic to a pricier model when your first choice is busy, and the bill moves while your request volume sits flat.
  • The fee sits off to the side. Inference is pass-through and itemized by model. The credit-purchase fee isn’t attributed to anything in the usage feed, so it leaves your balance with no row of its own.
  • OpenRouter is rarely your only provider. Even though it fronts many models, you likely also call Anthropic for Claude directly, build on OpenAI, run Gemini, or reach for DeepSeek, and you pay for inference hosts and serving on top. The OpenRouter balance is one slice of a bill you otherwise stitch together by hand from a stack of dashboards.

A horizontal bar chart of month-to-date OpenRouter spend broken down by routed model — a Claude model, a GPT model, Gemini, and an open-weight model — each with its own cost.
Month-to-date OpenRouter spend rolled up by routed model, summed from the per-endpoint rows OpenRouter's activity feed reports.

How can you reduce your OpenRouter bill?

Tracking shows you where the balance goes. A few levers slow the drain:

  • Right-size the route. OpenRouter’s whole point is that you can send the easy, high-volume calls to a cheap model and reserve a flagship for the work that needs it. The per-token spread across models is wide, so that one routing choice usually moves the bill more than anything else you can do.
  • Watch the per-model split. A ranked list catches a creeping shift toward a pricier model within days. The balance alone shows it only once you’re nearly dry.
  • Keep stable prefixes consistent. On models that support caching, long system prompts and reference documents that don’t change land on OpenRouter’s pass-through cached rate, which costs a fraction of full input.
  • Keep outputs tight. Output is usually the pricier side. Trimming verbose responses and capping how much the model generates cuts the part of the bill that grows fastest, across whichever model the request routes to.
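
A rough sketch of why the routing choice dominates: the same request volume, priced under two different splits between a flagship and a cheap model. The per-call costs are invented placeholders and `blended_cost` is a hypothetical helper, so read the result as an illustration of the shape of the effect, not a prediction for your bill.

```python
def blended_cost(requests, flagship_share, flagship_cost_per_call, budget_cost_per_call):
    """Total cost when a fraction of calls goes to the flagship model
    and the rest goes to the cheaper one."""
    flagship_calls = requests * flagship_share
    budget_calls = requests - flagship_calls
    return flagship_calls * flagship_cost_per_call + budget_calls * budget_cost_per_call


REQUESTS = 10_000          # same volume in both scenarios
FLAGSHIP_PER_CALL = 0.02   # placeholder USD per call
BUDGET_PER_CALL = 0.001    # placeholder USD per call

everything_on_flagship = blended_cost(REQUESTS, 1.0, FLAGSHIP_PER_CALL, BUDGET_PER_CALL)
right_sized = blended_cost(REQUESTS, 0.2, FLAGSHIP_PER_CALL, BUDGET_PER_CALL)

print(f"all calls on the flagship: ${everything_on_flagship:.2f}")
print(f"20% flagship, 80% budget:  ${right_sized:.2f}")
```

Run the same function in the other direction and you get the silent blend shift described earlier: volume stays flat, the share on the pricier model creeps up, and the total moves with it.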

How do you forecast next month’s OpenRouter bill?

Forecasting a prepaid balance comes down to a burn rate. Take your spend over the last several days, turn it into a daily average, and project that across a full month.

CostCompass does this for you. It scales your trailing seven-day burn rate to the number of days in next month and adds any fixed subscriptions you’ve entered. The result is one forward number for what next month costs if the current pace holds. The seven-day window is long enough that one heavy day doesn’t throw the average, and short enough to pick up a recent change, like a route that started leaning on a pricier model. OpenRouter reports cost exactly, so that projection rides on real figures rather than estimates. The same engine runs across every provider you’ve connected, so OpenRouter folds into one whole-stack number.
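
The arithmetic is short enough to sketch. This is a minimal version of the burn-rate projection as described above (seven-day average, scaled to the length of next month, plus fixed subscriptions). The function names and the sample figures are assumptions for illustration, not CostCompass’s actual code.

```python
import calendar
from datetime import date


def days_in_next_month(today):
    """Number of days in the calendar month after today's month."""
    if today.month == 12:
        year, month = today.year + 1, 1
    else:
        year, month = today.year, today.month + 1
    return calendar.monthrange(year, month)[1]


def forecast_next_month(last_seven_days, today, fixed_subscriptions=0.0):
    """Project next month's cost from the trailing seven daily totals."""
    if len(last_seven_days) != 7:
        raise ValueError("expected exactly seven daily totals")
    daily_burn = sum(last_seven_days) / 7
    return daily_burn * days_in_next_month(today) + fixed_subscriptions


# Hypothetical daily spend in USD for the last seven completed days.
recent_spend = [2.0, 3.0, 2.5, 4.0, 3.5, 3.0, 3.0]
projection = forecast_next_month(recent_spend, date(2026, 6, 7), fixed_subscriptions=20.0)
print(f"projected cost for next month: ${projection:.2f}")
```

With these sample numbers the daily burn is $3.00, July has 31 days, and the $20 of fixed subscriptions is added once on top.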

How does CostCompass track your OpenRouter costs?

CostCompass connects to OpenRouter with a Provisioning key and reads two things: your prepaid balance and your activity feed. The feed is itemized one day at a time by the endpoint each call was routed to, with input and output tokens for each row. Because one model can be served by several providers, CostCompass rolls those endpoint rows up per model. Each row already carries the cost OpenRouter worked out from the tokens its upstream providers processed, so CostCompass records that figure as reported and doesn’t recompute it. The per-model number you see is the sum of what OpenRouter charged against your balance.
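
The roll-up itself is a group-and-sum. The sketch below assumes each activity row is a dictionary with `model`, `provider`, `cost`, `input_tokens`, and `output_tokens` keys. Those field names and the sample rows are hypothetical stand-ins, so check them against what the feed actually returns before relying on them.

```python
from collections import defaultdict


def roll_up_by_model(rows):
    """Sum per-endpoint rows into per-model totals, ranked by cost.

    The reported cost is added up as-is; nothing is recomputed from
    token counts or a pricing table.
    """
    totals = defaultdict(lambda: {"cost": 0.0, "input_tokens": 0, "output_tokens": 0})
    for row in rows:
        bucket = totals[row["model"]]
        bucket["cost"] += row["cost"]
        bucket["input_tokens"] += row["input_tokens"]
        bucket["output_tokens"] += row["output_tokens"]
    return sorted(totals.items(), key=lambda item: item[1]["cost"], reverse=True)


# Hypothetical rows: the same model served by two providers, plus a cheaper one.
sample_rows = [
    {"model": "vendor-a/flagship", "provider": "host-1", "cost": 4.10,
     "input_tokens": 900_000, "output_tokens": 120_000},
    {"model": "open/small", "provider": "host-3", "cost": 0.75,
     "input_tokens": 2_500_000, "output_tokens": 600_000},
    {"model": "vendor-a/flagship", "provider": "host-2", "cost": 1.40,
     "input_tokens": 300_000, "output_tokens": 40_000},
]

ranked = roll_up_by_model(sample_rows)
for model, total in ranked:
    print(f"{model}: ${total['cost']:.2f}")
```

Grouping on the model name rather than the endpoint is what collapses the two flagship rows into one line of the ranked list.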

OpenRouter issues two kinds of key, and only one of them works here:

| Key type | What it does | Works with CostCompass |
|---|---|---|
| Inference key | Authenticates your API calls to route models | No |
| Provisioning key | Reads account activity, usage, and credit balance | Yes |

You make a Provisioning key in the OpenRouter dashboard under Provisioning keys. Paste an inference key by mistake and CostCompass can’t read your activity, so it flags the key as the wrong type.

That gives you the breakdown a single balance can’t: which model spent what, day by day, ranked. When you first connect, CostCompass pulls about the last month of completed days. OpenRouter’s activity feed reaches back roughly thirty days and leaves out the day still in progress, so your history starts a month back rather than at your account’s first call, and each refresh pulls in any newly completed days. The credit-purchase fee is the one piece no usage feed itemizes, so it has no model to be attributed to and sits outside the per-model running total, which CostCompass builds from your usage. CostCompass also reads your prepaid balance, though, and that balance reflects every drop, the fee included.
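
As a sketch of that window: the first connect covers the completed days before today, and a later refresh only needs the days not already stored. The thirty-day lookback is taken from the "roughly thirty days" above and is an approximation, the function names are hypothetical, and which clock defines "today" is an open assumption here.

```python
from datetime import date, timedelta


def completed_days(today, lookback_days=30):
    """Dates the feed would cover: the lookback window ending yesterday.
    Today is left out because it is still in progress."""
    return [today - timedelta(days=n) for n in range(lookback_days, 0, -1)]


def days_to_fetch(today, already_stored, lookback_days=30):
    """Completed days in the window that have not been stored yet."""
    return [d for d in completed_days(today, lookback_days) if d not in already_stored]


first_connect = completed_days(date(2026, 6, 7))
print(first_connect[0], "through", first_connect[-1])

# Two days later, everything from the first pull is already stored.
next_refresh = days_to_fetch(date(2026, 6, 9), set(first_connect))
print("new days on refresh:", [d.isoformat() for d in next_refresh])
```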

The CostCompass dashboard showing month-to-date spend across providers with a forecast and burn rate.
Month-to-date spend across every connected provider, with a forecast and burn rate, rolled up for you in a single view.

Two things make that practical for a solo developer. Your key is encrypted in your browser before anything is stored. The Provisioning key you paste is sealed with your vault password on your own device, and only the sealed version is ever saved, a blob CostCompass can’t open because the vault password stays with you. When it’s time to read your activity, your browser unseals the key and passes it to a relay that uses it for that one call to OpenRouter and is built not to log or keep it. What we hold at rest is locked ciphertext. The key passes through the relay for the length of that call, but it is never stored with us in usable form.

And OpenRouter doesn’t sit alone. The same dashboard rolls its running total up with every other AI and compute provider you’ve connected, from Claude and OpenAI to Gemini and your hosting, into one figure with a forecast. How you reach that cross-provider number is its own decision, whether you do it by hand or read straight from each billing API; the ways to track AI costs across providers lay the options out side by side.

Getting started takes three steps:

  1. In OpenRouter, create a Provisioning key, the kind that can read your account activity rather than the inference key you make calls with.
  2. Paste it into CostCompass. It’s encrypted in your browser before it’s stored, so the server only ever stores ciphertext.
  3. Click Refresh. CostCompass reads your per-model activity and balance, and from there your running total, forecast, and by-model breakdown roll up, OpenRouter included in the whole-stack number.

Frequently asked questions

What kind of OpenRouter key does CostCompass need?
A Provisioning key, not a regular inference key. You create one in the OpenRouter dashboard under Provisioning keys. It's the credential that can read your account activity and credit balance, which the ordinary key you make API calls with can't do. Paste an inference key by mistake and CostCompass will tell you. One Provisioning key is everything it needs. There's no per-model setup to do.
Does CostCompass store my OpenRouter key?
Not in any form we can read. Your Provisioning key is encrypted in your browser with your vault password before it leaves your device, and what CostCompass stores is only the resulting ciphertext — an opaque blob it can't decrypt, since your vault password never leaves your browser. When it's time to read your activity and credit balance, the key is unsealed in your browser and forwarded to OpenRouter through a relay that holds it only for the length of that one call and is built not to log it. So the request does pass the key through our infrastructure, but the plaintext is never written to our database or our logs. What we keep at rest is ciphertext that's useless without your vault password.
Does CostCompass break OpenRouter spend down by model?
Yes. OpenRouter's activity feed records usage one day at a time, itemized by the endpoint each call was routed to. Because the same model can be served by more than one provider, CostCompass adds those endpoint rows back up per model (a Claude model, a GPT model, Gemini, an open-weight model), with the input and output tokens behind each. The spend you'd otherwise see only as a falling balance becomes a ranked list of where it went.
Is the OpenRouter cost exact or estimated?
Exact, as OpenRouter reports it. Many token APIs hand back counts and leave the costing to you. OpenRouter takes the tokens its upstream providers processed, works out the cost itself, and returns that figure for each routed endpoint, one day at a time. CostCompass records it as reported and doesn't recompute it from a pricing table, so there's no estimate to drift. The number you see is the one OpenRouter charged your balance.
Can it forecast next month's OpenRouter bill?
Yes. CostCompass takes your trailing seven-day burn rate, scales it to the length of next month, then adds any fixed subscriptions you've entered. The result is one forward number for what next month costs if the current pace holds. The same forecasting engine runs across every provider you've connected, so OpenRouter folds into the whole-stack projection.
Why use CostCompass instead of OpenRouter's activity page?
OpenRouter's activity page shows what you've already spent through OpenRouter, after the fact, for OpenRouter alone. It won't tell you how fast your balance is falling or where your spend is heading, and it says nothing about the rest of your bill. CostCompass turns that same per-model usage into a running total and a forecast, and rolls OpenRouter up with Claude, OpenAI, Gemini, and your hosting into one view. When you click Refresh, the total and the forecast update for you, so there's no number to go read and no slope to work out yourself.

About the author

Joubert Berger builds CostCompass, a spend-intelligence dashboard that pulls usage from AI and compute providers into one month-to-date total, a forecast, and a per-provider breakdown. This guide reflects how CostCompass reads each provider's own usage API — see the security model for how your keys are handled.

Stop watching one balance fall for a dozen models

Connect OpenRouter once and turn its per-model usage into a running total, a forecast, and a breakdown by routed model, rolled up with every other provider in one click.