Claude cost tracking

Know what Claude is actually costing you

Anthropic bills Claude by model and by token, and the invoice arrives a month too late to act on. Here is how the billing works, and how to keep a running total and a forecast in front of you.

By Joubert Berger Updated June 15, 2026

The invoice from Anthropic tells you one number: what you spent on Claude last month. It won't tell you whether that number is about to double, which models are driving it, or how it stacks up against everything else you run. If you're a solo developer or a small team shipping on top of Claude, that figure arrives after the month is already over.

This guide covers how Anthropic bills for Claude, why keeping a running total is harder than it sounds, and how to see where your spend is heading before the invoice lands.

An antique almanac engraving: a jug pours coin-like tokens into a hopper that splits the single stream into four collecting pans of unequal fill — one pan picked out in copper — while a wax-sealed letter lies unopened nearby. — One spend, split across four token meters — totalled only by an invoice still sealed.

How does Claude API billing work?

Claude costs usually fall into two buckets, and a complete picture needs both. The first is metered usage: pay-as-you-go API calls, priced by the token and varying by model. The second is a flat-rate subscription: Claude Pro and Max for the chat app, Team and Enterprise seats for organizations. Developers shipping on the API live mostly in the first. The two are reported in different places, which is the first reason a single running total is hard to keep.

Start with metered usage. At its simplest, a request meters two things. There are the input tokens you send: your prompt, system instructions, and any conversation history. And there are the output tokens Claude generates in reply. (Prompt caching splits input further, as the next section covers.) Output is the more expensive of the two, often several times the input rate, so a verbose assistant costs more than its prompt size alone would suggest.

Rates vary widely by tier. Anthropic’s lineup runs from Haiku, the fast and inexpensive tier, through Sonnet — currently Sonnet 5, the everyday workhorse — up to Opus for hard reasoning work. Above all three sits the Mythos-class frontier — Claude Fable 5, Anthropic’s most capable widely available model and its most expensive, priced at roughly double the per-token cost of Opus. Across the current lineup the gap from cheapest to priciest now spans about an order of magnitude, so choosing the right model for each job is the single biggest lever on your bill.

Tier	Relative cost	Best suited for
Haiku	Cheapest	High-volume, latency-sensitive, simpler tasks
Sonnet	Mid-range	Everyday coding, drafting, and agent work
Opus	High-end (many times Haiku per token)	Hard reasoning and long-horizon agentic coding
Fable (Mythos)	Most expensive (about double Opus)	The most demanding, long-horizon autonomous work

Prompt caching adds a wrinkle. When you reuse a large, stable prefix — a long system prompt, say, or a big reference document — Anthropic can cache it. Writing to the cache is priced above ordinary input (how far above depends on the cache duration you choose), while reading from it on later calls costs a fraction of the normal input rate. Used well, caching cuts the cost of repeated context sharply. That’s why cache-read and cache-write tokens are worth tracking on their own rather than lumping all input together. Exact per-model rates live on Anthropic’s API pricing page and change over time, so treat any number you commit to memory as provisional.

That leaves four distinct token categories on a single request, each priced differently:

Token category	How it’s priced
Input	Base rate for the prompt, system instructions, and history you send
Output	Highest rate — typically several times the input rate
Cache write	Above the input rate (how much depends on cache duration), charged when a prefix is first cached
Cache read	A fraction of the input rate on later calls that reuse the cached prefix
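
As a rough illustration of how the four categories combine into one request's cost, the sketch below prices a single call. The rates are placeholders invented for the example, not Anthropic's prices, and the names `Rates`, `Usage`, and `request_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Price per million tokens, in dollars. Placeholder values only."""
    input: float
    output: float
    cache_write: float
    cache_read: float


@dataclass(frozen=True)
class Usage:
    """Token counts for one request, split into the four billed categories."""
    input: int = 0
    output: int = 0
    cache_write: int = 0
    cache_read: int = 0


def request_cost(usage: Usage, rates: Rates) -> float:
    """Multiply each category's tokens by its own rate, then sum."""
    per_million = (
        usage.input * rates.input
        + usage.output * rates.output
        + usage.cache_write * rates.cache_write
        + usage.cache_read * rates.cache_read
    )
    return per_million / 1_000_000


# Hypothetical rates: output at 5x input, cache write at 1.25x, cache read at 0.1x.
EXAMPLE_RATES = Rates(input=2.00, output=10.00, cache_write=2.50, cache_read=0.20)

example = Usage(input=1_000, output=500, cache_read=20_000)
cost = request_cost(example, EXAMPLE_RATES)
```

With these made-up rates, the 500 output tokens cost more than the 1,000 input tokens, which is the pattern the table describes.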

The subscription side is simpler, but easier to lose track of. A Claude Pro or Max plan, or a row of Team seats, is a fixed monthly charge that doesn’t move with usage, and it lives entirely outside the usage API. No billing endpoint returns it. You account for it by hand.

Why is Claude spend hard to track?

Tokens are simple. Keeping a running total across a real project is not. A few things get in the way:

Keys multiply. Once you have a staging environment, a production app, and a personal experiment, you’re holding several API keys, sometimes split across separate Anthropic organizations or accounts, each with its own slice of spend. The number you care about is the total across all of them, and once they span more than one account — let alone more than one provider — nothing adds that up for you.
Usage is spiky. A batch job, a traffic surge, or one runaway retry loop can multiply a day’s spend several times over. By the time it shows up on the invoice, the money is already gone.
There’s no forecast. Anthropic shows you what you have used. It doesn’t show what you’re on track to use. The question you actually need answered — “at this rate, what will next month cost?” — isn’t answered anywhere in the console.
Claude is rarely your only provider. If you also call OpenAI, run inference on RunPod, or serve a site through Vercel and Cloudflare, the Claude figure is one line in a bill you assemble by hand from a half-dozen dashboards.
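
The "usage is spiky" point is easier to act on with a simple check. The sketch below flags any day whose spend exceeds a multiple of the average of the days before it. The function name `find_spikes`, the three-day minimum of history, and the 3x threshold are illustrative assumptions, not how any particular tool detects spikes.

```python
from statistics import mean


def find_spikes(daily_spend: list[float], window: int = 7, factor: float = 3.0) -> list[int]:
    """Return indexes of days whose spend exceeds `factor` times the
    average of up to `window` preceding days."""
    spikes = []
    for i, spend in enumerate(daily_spend):
        history = daily_spend[max(0, i - window):i]
        if len(history) < 3:  # too little history to judge
            continue
        baseline = mean(history)
        if baseline > 0 and spend > factor * baseline:
            spikes.append(i)
    return spikes


# Made-up daily totals in dollars; day 5 simulates a runaway retry loop.
spend = [4.0, 5.0, 4.5, 5.5, 5.0, 22.0, 5.0]
spike_days = find_spikes(spend)
```

A check like this only helps if it runs against fresh daily figures; run monthly against an invoice, it reports the spike after the money is spent.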

How can you reduce your Claude API bill?

Tracking is half the job. Acting on what you see is the other half. A few levers move a Claude bill the most:

Right-size the model. Send high-volume, simpler calls to Haiku, keep everyday work on Sonnet, and reserve Opus — or the frontier Fable tier above it — for the work that genuinely needs it. With the priciest tier running about ten times the per-token cost of the cheapest, matching model to task is the biggest cut available.
Lean hard on prompt caching. A large, stable prefix read from cache costs a fraction of full input. Keep that prefix byte-stable so it stays cached, rather than re-sending it uncached on every call. For workloads that repeat a lot of context, this is often one of the largest savings available.
Trim output. Output bills several times the input rate. Asking for concise answers, capping max tokens, and avoiding needless re-generation all pull the expensive side down.
Batch the non-urgent work. Move evals, backfills, and offline jobs off the interactive path so a spike there doesn’t blur your live spend.
Watch the trend, not just the total. A per-day, per-model view turns a creeping spike into something you catch in days rather than discover on the invoice.

Here’s how quickly this compounds. A large system prompt re-sent uncached on every request can cost several times what the same prefix costs as a cache read. Turning that one prefix into a cache hit often saves more than switching models does.
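
To put rough numbers on that, the sketch below compares re-sending a prefix uncached with writing it to the cache once and reading it afterwards. The multipliers (cache write at 1.25 times input, cache read at 0.1 times input) and the $2 per million input rate are assumptions chosen for illustration; check Anthropic's pricing page for current figures.

```python
def prefix_cost_uncached(prefix_tokens: int, calls: int, input_rate: float) -> float:
    """Cost of sending the same prefix as ordinary input on every call."""
    return prefix_tokens * calls * input_rate / 1_000_000


def prefix_cost_cached(prefix_tokens: int, calls: int, input_rate: float,
                       write_mult: float = 1.25, read_mult: float = 0.10) -> float:
    """Cost of one cache write followed by cache reads on the remaining calls.

    Assumes every later call hits the cache, i.e. the prefix never changes
    and the cache entry never expires between calls.
    """
    if calls <= 0:
        return 0.0
    write = prefix_tokens * input_rate * write_mult
    reads = prefix_tokens * (calls - 1) * input_rate * read_mult
    return (write + reads) / 1_000_000


# Hypothetical workload: a 50,000-token system prompt sent on 1,000 calls.
INPUT_RATE = 2.00  # dollars per million input tokens, placeholder
uncached = prefix_cost_uncached(50_000, 1_000, INPUT_RATE)
cached = prefix_cost_cached(50_000, 1_000, INPUT_RATE)
```

Under these assumptions the uncached path costs $100 and the cached path about $10.12. A single call is the one case where caching costs more, because the write premium is never recovered.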

Spreadsheet or dashboard: which should you use?

The usual first answer is a spreadsheet. You log into the Anthropic Console, read the usage page, copy the figure into a sheet, and repeat for every other provider. It works for a while. Then you miss a few days, the numbers go stale, and the sheet quietly stops matching reality, usually right when a spike hits.

The deeper problem is that a spreadsheet is a snapshot. Keeping it current means re-reading every console and re-typing every figure by hand. It won’t break spend down by model without manual work. And it only ever shows you where the month has been, never where it’s heading.

A dashboard does the arithmetic for you. Pull the latest usage with one click and you get month-to-date, per model, across every provider. The point isn’t prettier charts. It’s that nothing between the raw usage and the number you read is yours to convert or keep current.

How do you forecast next month’s Claude bill?

Forecasting next month’s spend doesn’t require anything exotic. The reliable approach is a run rate. Take your spend over the last several days, turn it into a daily average, and project it across a full month.

A 30-day line chart of daily Claude spend, with a recent upward slope. — A 30-day spend trend makes a developing spike visible days before the invoice would.

CostCompass uses exactly this method. It takes your trailing seven-day burn rate, multiplies it by the number of days in next month, then adds next month’s fixed subscriptions. The result is one forward number: at this rate, here is what next month will cost. Seven days is short enough to react when usage shifts, yet long enough to even out a single heavy day against a quiet one.
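
A minimal sketch of that calculation follows. It mirrors the steps described above, but the function name, signature, and sample figures are illustrative assumptions rather than CostCompass's actual code.

```python
import calendar
from datetime import date


def forecast_next_month(daily_spend: list[float], today: date,
                        monthly_subscriptions: float = 0.0) -> float:
    """Project next month's cost from the trailing seven-day run rate.

    daily_spend: metered spend per day, oldest first, most recent last.
    """
    recent = daily_spend[-7:]
    if not recent:
        return monthly_subscriptions
    burn_rate = sum(recent) / len(recent)  # average dollars per day

    # Work out which calendar month comes next, rolling the year over in December.
    year = today.year + (1 if today.month == 12 else 0)
    month = 1 if today.month == 12 else today.month + 1
    days_next_month = calendar.monthrange(year, month)[1]

    return burn_rate * days_next_month + monthly_subscriptions


# Made-up week of spend averaging $10 a day, plus a hypothetical $20 plan.
week = [8.0, 12.0, 9.0, 11.0, 10.0, 13.0, 7.0]
projection = forecast_next_month(week, date(2026, 6, 15), monthly_subscriptions=20.0)
```

Here the week averages $10 a day, next month (July) has 31 days, and the flat plan adds $20, so the projection is $330.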

How does CostCompass track your Claude API costs?

CostCompass connects to Anthropic’s Admin API and reads your Claude usage directly, broken down by model, one day at a time. Anthropic reports usage split into ordinary input, cache writes, cache reads, and output, and CostCompass reads all of it, so the per-model view is built from your real traffic. Anthropic’s report is in tokens, not a finished total, so CostCompass prices each category at Anthropic’s published per-model rates (the reduced cache-read rate, the higher cache-write rate, and the standard input and output rates) to arrive at a close month-to-date estimate.

Anthropic only publishes a day’s total once that day has closed, so a naive daily read would miss today entirely. CostCompass reads the current day hour by hour instead and folds it into the same per-day view, so today’s spend shows up to the latest hour rather than waiting until tomorrow. Once the day closes, that live figure settles to Anthropic’s final daily number. Each refresh also re-reads the previous month, so late billing adjustments are picked up too.
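
The daily-plus-hourly merge can be sketched as below. The data shapes (a dict of settled daily totals and a list of hourly amounts for today) are assumptions made for the example; they are not the format Anthropic's API returns or the one CostCompass stores.

```python
from datetime import date


def month_to_date(settled_daily: dict[date, float],
                  today: date,
                  todays_hourly: list[float]) -> dict[date, float]:
    """Build a per-day view of the current month.

    Closed days come from settled daily totals; today is the sum of the
    hourly readings seen so far.
    """
    view = {
        day: amount
        for day, amount in settled_daily.items()
        if (day.year, day.month) == (today.year, today.month) and day < today
    }
    view[today] = sum(todays_hourly)  # live estimate up to the latest hour
    return view


# Made-up figures in dollars. The May entry falls outside the current month.
settled = {date(2026, 5, 31): 9.0, date(2026, 6, 13): 6.0, date(2026, 6, 14): 7.5}
today = date(2026, 6, 15)
view = month_to_date(settled, today, todays_hourly=[0.5, 1.25, 0.75])
total = sum(view.values())
```

On the next day's refresh, June 15 would arrive as a settled daily total and replace the hourly estimate.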

The CostCompass dashboard showing month-to-date spend with a forecast and burn rate. — Month-to-date spend across every connected provider, with a forecast and burn rate, rolled up for you in a single view.

Two things make it practical for a single developer. First, your key is encrypted in your browser before it’s ever stored. The Anthropic Admin API key you connect is sealed with your vault password and saved only as ciphertext that CostCompass can’t decrypt, so our App Server only ever holds that ciphertext. Your vault password stays in your browser. When we fetch Anthropic usage data, the key is decrypted in your browser and forwarded to Anthropic through a broker that holds it only for the duration of the call and is built not to log it. The plaintext stays out of our database and our logs.

Second, Claude doesn’t sit alone. The same dashboard rolls Anthropic up alongside your other connected AI and compute providers — OpenAI and the models you reach through OpenRouter included, plus voice work on ElevenLabs — into one month-to-date total, with the same forecast and per-provider breakdown covering all of them. For the wider walkthrough, the guide on tracking AI costs across your providers weighs every approach by effort.

A horizontal bar chart of month-to-date spend by provider, Anthropic at the top. — Where the month's spend actually went, provider by provider.

For the part no API exposes — a Claude Pro, Max, Team, or Enterprise subscription — you add the plan on the Claude card itself. The flat monthly fee then joins that card’s month-to-date total and forecast beside the metered tokens, prorated by the day so it’s counted once and never missed. If the plan is all you have, with no API key, a subscription-only card tracks just that.
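
One simple way to prorate a flat fee by the day is sketched below. The text above does not spell out the exact rule CostCompass applies, so the linear, calendar-day proration here (counting today as a full day) is an assumption, as are the function name and the sample fee.

```python
import calendar
from datetime import date


def prorated_subscription(monthly_fee: float, today: date) -> float:
    """Share of a flat monthly fee attributed to the month so far,
    counting today as a full day."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return monthly_fee * today.day / days_in_month


# Hypothetical $30 plan, halfway through a 30-day month.
so_far = prorated_subscription(30.0, date(2026, 6, 15))
```

By the last day of the month the prorated share equals the full fee, so the plan is counted exactly once.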

Getting started takes three steps:

In Anthropic, create an Admin API key. This needs an organization account, and only an admin can create the key. The usage report comes from the Admin API, not from a standard API key.
Paste it into CostCompass. It’s encrypted in your browser before it’s stored, so our server only ever holds ciphertext.
Confirm the month-to-date figure matches what you expect. From then on, a click on Refresh pulls your latest metered Claude usage, and the running total and the forecast roll up from it for you.

Frequently asked questions

Does CostCompass store my Anthropic API key?: Not in any form we can read. Your API key is encrypted in your browser with your vault password before it ever leaves your machine. CostCompass stores only the resulting ciphertext, an opaque blob it has no way to decrypt. Your vault password stays in your browser. When we fetch Anthropic usage data, the key is decrypted locally and forwarded to Anthropic through a broker that holds it only for the duration of that call and is built not to log it. The plaintext never reaches our database or our logs.
Does my Claude Pro or Max subscription cover my API usage?: No — they're billed separately. A Claude Pro, Max, Team, or Enterprise plan covers the Claude app, while anything your own code sends through the API is metered per token on top of it. Programmatic tools like Claude Code and the Agent SDK now draw from a separate per-plan credit pool charged at full API rates, so that spend is real metered usage even when it rides a subscription. CostCompass reads the metered side, so your API spend sits beside the flat subscription figure you enter by hand.
Do my plan's usage limits change what I pay?: No. A Claude Pro or Max plan has usage limits — a rolling five-hour window and a weekly cap — but those throttle how much you can run on the plan before it resets, not what it costs. Hit a limit and you wait for the window to reset; the flat monthly fee doesn't move either way. They're a throughput cap, like the API's rate limits, not a billing line, so there's no extra charge to track. The only Claude usage that adds to your bill is metered API and credit-pool spend, which is exactly the side CostCompass pulls in.
Does CostCompass track Claude Code and Agent SDK spend?: Yes, the part that runs through the API. Claude Code, the Agent SDK, and anything else calling Claude programmatically are metered through the same usage report CostCompass reads, so they land in your per-model breakdown instead of hiding inside one opaque line. Claude Code's own /usage command shows a single session; CostCompass rolls every key and tool into one month-to-date total across the whole organization.
Why is my Claude bill higher than the per-token rate suggests?: Several multipliers stack on the headline rate. Output tokens bill at several times the input rate, so a verbose model costs more than its prompt size implies; writing to the prompt cache costs above normal input; and priority service runs dearer than standard. The advertised per-token number is the floor, not the whole bill. Seeing spend per model and per day is how you find which one is driving the total.
Does prompt caching actually lower my Claude costs?: It shifts cost more than it erases it. Reading a cached prefix on later calls costs a fraction of normal input, but the first cache write costs above input and output tokens bill in full either way, so a workload dominated by long generations saves far less than the headline cache discount suggests. Caching only pays off when the cached prefix stays byte-identical — a timestamp or session id slipped into a system prompt quietly invalidates it and you pay to write the cache again. CostCompass folds cached and uncached tokens into each model's total, so the saving shows up in that model's cost.
Why does CostCompass need an Anthropic Admin API key?: The usage report lives behind Anthropic's Admin API, which returns usage grouped by model and token category across your whole organization. A standard API key can make Claude calls but can't read the usage report, so CostCompass asks for an Admin API key instead. You create it in your Anthropic organization settings — it needs an organization account, and an admin mints the key.
How current is the Claude spend CostCompass shows?: It's current as of the last time you clicked Refresh — CostCompass doesn't poll Anthropic in the background. Earlier days use Anthropic's settled daily usage, while today is read hour by hour, so each Refresh brings today's spend current to the most recent hour instead of leaving it blank until the day ends. Today's figure is a live estimate that firms up to the final daily number once the day closes; the month-to-date total and forecast recompute from every pull.
Can I track a single Anthropic workspace?: Yes. A card covers your whole organization by default — every workspace combined — but you can scope it to one workspace, chosen from the card's settings. Add a separate card per workspace to follow each on its own, or point one at just the default workspace where keys land unless you say otherwise. The org-wide view stays available either way.
Why use CostCompass instead of the Anthropic Console?: Anthropic's Console shows what you've already spent, not where your spend is heading. There's no running forecast, and no read on whether today's burn rate is trending over budget. CostCompass gives you a live month-to-date total and a projection of what next month will cost at your current pace, pulled in with a click. Add a second provider later — an image API, a hosting bill, an inference box — and it's already folded into one combined total instead of another dashboard to reconcile.

About the author

Joubert Berger builds CostCompass, a spend-intelligence dashboard that pulls usage from AI and compute providers into one month-to-date total, a forecast, and a per-provider breakdown. This guide reflects how CostCompass reads each provider's own usage API — see the security model for how your keys are handled.

Stop guessing at the Claude bill

Connect Anthropic once and pull your metered Claude spend — month-to-date, forecast, and per-model breakdown — with one click, rolled up for you.

Start free