OpenAI cost tracking

Know what OpenAI is actually costing you

OpenAI bills the API by model and by token, across a GPT-5 lineup that stretches from cheap nano models to expensive reasoning runs. Here is how the billing works, and how to keep a running total and a forecast in front of you before the invoice lands.

See your own OpenAI spend — start free

By Joubert Berger. Updated June 13, 2026.

An antique almanac engraving: a graduated row of balance-weights ascending from tiny to large, the largest cut away to reveal a copper inner core, with a single key threaded through their ring-tops. — One lineup from nano to reasoning — and the largest cost is hidden inside.

OpenAI's invoice gives you one number: what the API cost last month. It won't tell you whether a single reasoning-heavy feature ate half of it, or whether this month is already running hotter than the last. For a solo developer or a small team building on the API, that figure lands after the month is over, when there's nothing left to change about it.

The short version: read the organization costs API with an Admin key — it returns the billed amount for every line item, per day, across the whole org — then total it, split it by model, and project the run rate forward. What follows is how OpenAI bills for the API, where your spend data lives and what it takes to read it, and how to get that running total in front of you mid-month, while there's still time to change course.

How does OpenAI bill for usage?

OpenAI spend has two sides, and a complete picture needs both. The first is metered API usage: pay-as-you-go calls priced by the token, varying by model. The second is a flat-rate subscription: ChatGPT Plus and Pro for individuals, Business and Enterprise seats for organizations. Most developers building on the platform live in the first. The two are billed in completely different places, which is why a single running total takes some assembly.

Start with metered usage. OpenAI charges for the API by the token, and the rate depends on which model handled the request. Every call has two metered parts: the input tokens you send (your prompt, system instructions, tool definitions, and any conversation history) and the output tokens the model generates back. Output is the more expensive side, usually several times the input rate, so a long completion costs more than the prompt size alone suggests. For most self-serve accounts the metered side is prepaid: you load API credits, usage drains the balance, and an optional auto-recharge tops it up — what arrives at month’s end is less an invoice than a balance that has quietly gone down.
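
As a rough illustration of that arithmetic, the sketch below prices a single call from its input and output token counts. The function name is hypothetical and the per-million-token rates are placeholders invented for the example, not OpenAI's prices.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Dollar cost of one call, given rates in dollars per million tokens."""
    return (input_tokens * input_rate_per_m
            + output_tokens * output_rate_per_m) / 1_000_000


# Placeholder rates (NOT real prices): $1.00 per 1M input, $8.00 per 1M output.
short_answer = call_cost(2_000, 200, 1.00, 8.00)
long_answer = call_cost(2_000, 4_000, 1.00, 8.00)
print(f"short: ${short_answer:.4f}  long: ${long_answer:.4f}")
```

Both calls send the same prompt; only the completion length differs, and the longer answer costs several times more.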

The lever that moves your bill the most is model choice. The modern lineup spans a wide range of prices and, beyond that, several different units of billing. The flagship GPT-5 family handles text and reasoning and is billed per token, at rates that span more than two orders of magnitude: a gpt-5.4-nano call costs a tiny fraction of a gpt-5.5-pro one. Around it sit models for other modalities, each metered its own way.

What you’re generating	Example models	How OpenAI bills it
Text & reasoning	gpt-5.5, gpt-5.4, gpt-5.4-mini/nano	Per input / output token
Realtime & voice	gpt-realtime-2	Per token (audio far above text) or per minute
Images	gpt-image-2, gpt-image-1-mini	Per generated image, plus prompt tokens
Video	sora-2, sora-2-pro	Per second of generated video

The older o-series reasoning models (o3, o4-mini) still show up on some bills, winding down as the GPT-5 family takes over their role. Between tokens, minutes, images, and seconds of video, a single bill can mix several units of metering.

Reasoning is the wrinkle that trips people up. Ask a GPT-5 model to think hard, whether through a high reasoning-effort setting or just a hard problem, and before it answers it generates a block of hidden reasoning tokens — its internal chain of thought, billed at the output rate even though you never see them. A prompt that looks small can carry a large, invisible output charge. (Exact per-model rates live on OpenAI’s API pricing page and change often — treat any number you’ve memorized as provisional.)
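
To see how much of a charge the hidden tokens can account for, here is a small sketch that adds reasoning tokens to the visible output before pricing. The token counts and rates are invented for illustration, and the function is hypothetical.

```python
def cost_with_reasoning(input_tokens: int, visible_output_tokens: int,
                        reasoning_tokens: int,
                        input_rate_per_m: float, output_rate_per_m: float):
    """Return (total cost, share of that cost caused by reasoning tokens)."""
    # Reasoning tokens are billed at the output rate, like visible output.
    reasoning_cost = reasoning_tokens * output_rate_per_m / 1_000_000
    visible_cost = visible_output_tokens * output_rate_per_m / 1_000_000
    input_cost = input_tokens * input_rate_per_m / 1_000_000
    total = input_cost + visible_cost + reasoning_cost
    return total, reasoning_cost / total


# Placeholder numbers: a short prompt, a short answer, a lot of thinking.
total, reasoning_share = cost_with_reasoning(500, 300, 6_000, 1.00, 8.00)
print(f"total ${total:.4f}, {reasoning_share:.0%} of it from reasoning")
```

With these made-up numbers, the part of the response you never see drives most of the charge.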

A couple of mechanisms can pull your actual bill below a naive tokens-times-rate estimate. OpenAI automatically caches large repeated prompt prefixes and charges a reduced rate when a later call reuses that input; the Batch API runs non-urgent work asynchronously at half price. Use both where you can. They also mean a figure computed from token counts at standard rates is an upper bound — your real invoice usually lands a little under it.
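
The sketch below compares a naive estimate with the same call under each discount. The standard and cached rates are placeholders, not OpenAI's prices; the half-price batch factor comes from the paragraph above. Whether the two discounts stack on one call is not covered here, so the example applies them separately.

```python
def estimate(input_tokens: int, output_tokens: int,
             input_rate: float, output_rate: float,
             cached_tokens: int = 0, cached_rate: float = 0.0,
             batch: bool = False) -> float:
    """Dollar estimate for one call; all rates are dollars per million tokens."""
    fresh_tokens = input_tokens - cached_tokens
    cost = (fresh_tokens * input_rate
            + cached_tokens * cached_rate
            + output_tokens * output_rate) / 1_000_000
    # Batch API: half price, per the text above.
    return cost * 0.5 if batch else cost


# Placeholder rates: $1.00 input, $8.00 output, $0.10 cached input per 1M tokens.
naive = estimate(10_000, 1_000, 1.00, 8.00)
cached = estimate(10_000, 1_000, 1.00, 8.00, cached_tokens=8_000, cached_rate=0.10)
batched = estimate(10_000, 1_000, 1.00, 8.00, batch=True)
print(f"naive ${naive:.4f}  cached ${cached:.4f}  batched ${batched:.4f}")
```

Either discount lands below the naive figure, which is why the tokens-times-rate number works as a ceiling.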

Then there’s the part that isn’t metered at all. A ChatGPT Plus, Pro, Business, or Enterprise plan is a fixed monthly charge that lives entirely outside the API billing system. No usage endpoint returns it, so you have to account for it by hand.

Where does OpenAI’s spend data actually live?

Your spend comes from OpenAI’s organization costs API, and reaching it is less straightforward than reading the bill. It takes a particular kind of key, and it returns data at an org-wide scope. Both are worth knowing before you start.

First, the key. The endpoint only accepts an Admin key (prefixed sk-admin-) that holds the Usage read permission, created under your organization’s settings. The ordinary project keys you use to make model calls (sk-proj-…) are refused outright. This is a different class of credential from the one your application ships with.
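
A minimal sketch of what the authenticated request might look like, using only the Python standard library. The endpoint path and the `start_time`, `bucket_width`, and `limit` parameter names are assumptions taken from OpenAI's API reference and are worth verifying there; the `OPENAI_ADMIN_KEY` variable name and the placeholder key are hypothetical. The block only builds the request and does not send it.

```python
import os
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# Assumed endpoint for the organization costs report.
COSTS_URL = "https://api.openai.com/v1/organization/costs"


def build_costs_request(admin_key: str, start: datetime,
                        limit: int = 31) -> urllib.request.Request:
    """Build (but do not send) a GET request for daily cost buckets."""
    query = urllib.parse.urlencode({
        "start_time": int(start.timestamp()),  # Unix seconds
        "bucket_width": "1d",                  # one bucket per day
        "limit": limit,                        # number of buckets to return
    })
    return urllib.request.Request(
        f"{COSTS_URL}?{query}",
        headers={"Authorization": f"Bearer {admin_key}"},
    )


admin_key = os.environ.get("OPENAI_ADMIN_KEY", "sk-admin-placeholder")
request = build_costs_request(admin_key, datetime(2026, 6, 1, tzinfo=timezone.utc))
print(request.full_url)
```

Sending it is a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON body. A project key in the same header would be refused, as described above.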

Second, the scope. That admin key reads spend for the whole organization at once, across every project and every project key underneath it. The dashboard gives you the opposite, a per-project view. Where the console makes you click through each project and add the pieces up yourself, one admin key already returns the sum. The costs report comes back as a billed amount for every line item, bucketed by day. Because line items cover far more than tokens, the same read picks up storage, audio, image generation, fine-tuning, and the rest of what OpenAI charges for, not just model inference.
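
Once a page of the report is parsed, totalling it is a short loop. The sketch below assumes a response shaped as daily buckets under `data`, each holding `results` with an `amount.value` and a `line_item`; those field names are assumed from OpenAI's API reference and should be checked against it. The sample page and its line-item names are made up.

```python
from collections import defaultdict


def total_by_line_item(page: dict) -> dict[str, float]:
    """Sum billed amounts per line item across all daily buckets."""
    totals: dict[str, float] = defaultdict(float)
    for bucket in page.get("data", []):
        for result in bucket.get("results", []):
            name = result.get("line_item") or "unattributed"
            totals[name] += float(result["amount"]["value"])
    return dict(totals)


def total_by_day(page: dict) -> dict[int, float]:
    """Sum billed amounts per bucket, keyed by the bucket's start time."""
    return {
        bucket["start_time"]: sum(
            float(r["amount"]["value"]) for r in bucket.get("results", [])
        )
        for bucket in page.get("data", [])
    }


# Hand-written sample in the assumed shape (values are invented).
sample_page = {
    "data": [
        {"start_time": 1780272000, "results": [
            {"line_item": "gpt-5.4-mini, input", "amount": {"value": 1.25, "currency": "usd"}},
            {"line_item": "file storage", "amount": {"value": 0.5, "currency": "usd"}},
        ]},
        {"start_time": 1780358400, "results": [
            {"line_item": "gpt-5.4-mini, input", "amount": {"value": 0.75, "currency": "usd"}},
        ]},
    ]
}
print(total_by_line_item(sample_page))
print(total_by_day(sample_page))
```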

A horizontal bar chart of month-to-date OpenAI spend broken down by model — GPT-5.5, GPT-5.4, GPT-5.4 mini, and o4-mini. — Month-to-date OpenAI spend grouped by model, drawn from the billed amounts in the organization costs API.

Why is OpenAI spend hard to keep ahead of?

The billing is mechanical. Staying ahead of it across a real project is the hard part. Four things get in the way:

Reasoning tokens are invisible. A high-reasoning feature can burn far more output than its prompts imply, and you don’t see the thinking tokens that drove the charge, only the total, after the fact.
The lineup sprawls. With nano, mini, flagship, and reasoning-heavy models all in play, often within one codebase, the blended rate depends on a mix that shifts as you tune which model handles which job. A change that routes more traffic to a pricier model moves the bill without any change in request volume.
Usage is spiky. A batch job or one runaway retry loop can move a day’s spend several-fold, and by the time it surfaces on the invoice the spend is already done.
OpenAI is rarely your only provider. If you also call Anthropic for Claude or Google for Gemini, route through OpenRouter, run inference on RunPod, or serve a site through Vercel and Cloudflare, the OpenAI figure is one line in a bill you have to pull together by hand from a half-dozen dashboards.
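
One simple way to flag the spiky days described above is to compare each day against the median of the days before it. This is an illustrative heuristic, not a description of how CostCompass or OpenAI detects anything; the window, the threshold factor, and the spend figures are all arbitrary choices for the example.

```python
from statistics import median


def find_spikes(daily_spend: list[float], window: int = 7,
                factor: float = 2.0) -> list[int]:
    """Indexes of days whose spend exceeds `factor` times the median
    of the previous `window` days."""
    spikes = []
    for i in range(window, len(daily_spend)):
        baseline = median(daily_spend[i - window:i])
        if baseline > 0 and daily_spend[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Invented daily totals in dollars; day 8 is a runaway retry loop.
daily = [4.0, 5.0, 4.5, 5.5, 4.0, 5.0, 4.5, 5.0, 16.0, 5.5]
print(find_spikes(daily))  # prints [8]
```

A median baseline is used so that one heavy day does not raise the bar for the days that follow it.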

How can you reduce your OpenAI API bill?

Tracking is only half the job. A handful of levers bring an OpenAI bill down:

Right-size the model. Route simple, high-volume calls to a mini or nano model and reserve the flagship and reasoning models for work that needs them. Because the cheapest and priciest GPT-5 models sit more than a hundred-fold apart per token, one routing change can cut a workload’s cost sharply.
Dial reasoning effort down. Reasoning tokens bill at the output rate, so a lower reasoning-effort setting on tasks that don’t need deep thinking trims the invisible output charge.
Lean on cached input. Keep large, stable prefixes like system prompts and reference documents consistent, so OpenAI’s automatic caching charges the reduced rate instead of full input on every call.
Batch the non-urgent work. Move backfills, evals, and offline jobs to the Batch API for half-price throughput.
Set a spend limit in the OpenAI console. OpenAI is rolling out a hard monthly spend limit that stops API traffic once you reach the cap, alongside softer spend alerts that email you as spend crosses the thresholds you set. The alerts turn a silent overrun into something you catch while the month is still live; the hard cap is your backstop against a runaway bill.
Watch the per-day trend. A per-day, per-model view turns a creeping spike into something you catch in days, not weeks.

Of these, right-sizing the model usually beats the rest combined. A service that defaults to a reasoning model on every request can spend more on hidden thinking than on its actual answers; routing the easy cases to a mini model and saving reasoning for the hard ones is often the single biggest cut available.
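
To put a number on the routing argument, the sketch below compares a workload sent entirely to a large model with the same workload after most requests move to a small one. The per-request costs are placeholders set a hundred-fold apart to mirror the spread mentioned above; the request volume and the 80% share are invented.

```python
def blended_cost(requests: int, share_to_small: float,
                 cost_small: float, cost_large: float) -> float:
    """Total cost when `share_to_small` of requests go to the small model."""
    small = requests * share_to_small * cost_small
    large = requests * (1 - share_to_small) * cost_large
    return small + large


# Placeholder per-request costs: $0.0005 (small) vs $0.05 (large).
before = blended_cost(100_000, 0.0, 0.0005, 0.05)
after = blended_cost(100_000, 0.8, 0.0005, 0.05)
print(f"all large: ${before:,.0f}  routed: ${after:,.0f}")
```

With these made-up figures the routed mix costs roughly a fifth of the all-large baseline, with no change in request volume.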

How do you forecast next month’s OpenAI bill?

Forecasting doesn’t take anything exotic. The dependable method is a run rate: average your spend over the last several days, then project that daily figure across a full month. The same run-rate approach to forecasting AI spending works whatever provider you’re on.

A 30-day line chart of daily OpenAI spend with a sharp mid-month spike that settles to a higher baseline. — A 30-day trend makes a spike — say, a new reasoning-model feature — visible days before the invoice would.

CostCompass uses exactly this method. It scales your trailing seven-day burn rate to the length of next month and adds any fixed subscriptions, giving you a single forward number for what next month will cost at the current pace. Seven days is a deliberate window: short enough to pick up a recent change like a new feature or a model swap, long enough that one heavy day doesn’t drag the average around.
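
The method described above can be sketched in a few lines. This is an illustration of the arithmetic, not CostCompass's actual code; the function name and the sample figures are hypothetical.

```python
import calendar
from datetime import date


def forecast_next_month(last_7_days: list[float], today: date,
                        fixed_subscriptions: float = 0.0) -> float:
    """Scale the trailing daily burn rate to the length of next month,
    then add flat subscription charges."""
    daily_burn = sum(last_7_days) / len(last_7_days)
    if today.month == 12:
        year, month = today.year + 1, 1
    else:
        year, month = today.year, today.month + 1
    days_next_month = calendar.monthrange(year, month)[1]
    return daily_burn * days_next_month + fixed_subscriptions


# Invented figures: seven days of metered spend plus a $20 flat plan.
recent = [10.0, 12.0, 9.0, 11.0, 10.0, 13.0, 12.0]
forecast = forecast_next_month(recent, date(2026, 6, 13), fixed_subscriptions=20.0)
print(f"${forecast:.2f}")  # prints $361.00
```

Here the seven days average $11 a day, July has 31 days, and the flat plan adds $20.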

How does CostCompass track your OpenAI API costs?

CostCompass reads your spend straight from OpenAI — the amounts OpenAI actually charged you, not an estimate it works out from token counts. Two things come from that. Your total tracks your OpenAI bill closely, because the discounts OpenAI already applied are baked into the figures it reads. And it counts more than your chat models: storage, audio, image generation, fine-tuning, and the rest of what OpenAI charges for all show up. Each model gets its own line so you can see what’s driving the cost, while the non-model charges are grouped together under “Other services” — so a storage or audio charge doesn’t hide off the books.

The CostCompass dashboard showing month-to-date spend with a forecast and burn rate. — Month-to-date spend across every connected provider, with a forecast and burn rate, in a single view.

All of that runs on the one credential you connected — that Admin key. Two things about how CostCompass works make it practical for a single developer. Your key is encrypted in your browser before it’s ever stored — locked with your vault password and saved only as scrambled text we can’t unlock. When it’s time to fetch, your browser unlocks the key and passes it to a relay that uses it only for the moment of each call and never logs it; the real key never reaches our database or our logs.

And OpenAI doesn’t sit alone. The same dashboard rolls it up alongside your other connected AI and compute providers, into one month-to-date total, a forecast, and a per-model breakdown. How you reach that cross-provider number, whether by hand, through tooling, or read straight from each billing API, is its own decision; the ways to track AI costs across providers lay the options out side by side.

Your ChatGPT subscription — Plus, Pro, Business, or Enterprise — is the one piece no usage API reports. Add it to your OpenAI card and the flat monthly cost sits in that card’s month-to-date total and forecast next to the metered API spend, prorated across the month. Only have the plan and no API key? A subscription-only card tracks that on its own.

Getting started takes three steps:

1. In OpenAI, create an Admin key with the Usage read permission.
2. Paste it into CostCompass. It’s encrypted in your browser before it’s stored, so our server only ever holds ciphertext.
3. Confirm the month-to-date figure matches what you expect. From then on, a click on Refresh pulls your latest metered API usage, and the running total and forecast roll up from it for you.

Frequently asked questions

Does this work with Azure OpenAI?

No. Azure OpenAI is billed through your Azure account and shows up in Azure's own cost tools, not in your OpenAI billing. CostCompass reads spend from your OpenAI account (platform.openai.com), so it covers keys you create there, not models you run through Azure. If you use both, the OpenAI side flows into CostCompass and the Azure side stays in the Azure portal.

Can it tell me what each of my customers or end-users cost?

Not per end-user. OpenAI reports spend for your whole organization, split by model and by service; there's no per-customer breakdown in what it bills. If you put each customer in their own OpenAI project, you can point a CostCompass card at a single project and track that slice on its own. Splitting a shared project's spend back to individual users isn't something OpenAI bills for, so that has to come from usage tracking in your own app.

Can CostCompass show how much prepaid credit I have left?

No, and no tool can, because OpenAI doesn't expose your remaining credit through its API. What the API reports is what you've already spent, not the balance you have left. CostCompass gives you the other half: a running month-to-date total and a forecast of where the month is heading, so you can see how fast you're burning through whatever you topped up, instead of finding an empty balance mid-call.

How does it handle reasoning tokens?

When a GPT-5 (or older o-series) model "thinks" before answering, OpenAI charges you for that hidden reasoning and folds it into what the model costs. Since CostCompass shows what OpenAI actually charged, that reasoning cost is already included in the model's figure, and there's nothing extra for you to add up.

Why doesn't a day's spend match my local calendar day?

OpenAI groups each day's spend by one global clock (UTC), not your local time zone, and CostCompass shows the days exactly as OpenAI reports them. So if you're several hours ahead of or behind UTC, a charge from late in your evening can show up on the next day. It all adds up the same over the month; only the most recent day can look shifted against your local calendar, until that day closes.

Why might the CostCompass figure differ slightly from OpenAI's dashboard?

It should track closely, because CostCompass reads the amount OpenAI billed rather than estimating from token counts, so the Batch API's reduced rate, automatic cached-input pricing, and other discounts are already baked into the figure it reads. The main thing to expect is a short delay: OpenAI's billing data lands roughly a day behind, so the most recent day reads low (in CostCompass and in OpenAI's own dashboard alike) until it catches up, usually within a day. Smaller gaps can come from promotional credits applied after the fact, or from when you last refreshed.

Can CostCompass cap or stop my OpenAI spend?

No. CostCompass reads your spend; it doesn't sit in front of your API calls, so it can't block a request or enforce a limit. OpenAI's own console can: it's rolling out a hard monthly spend limit that stops API traffic once your account reaches the cap you set, until you raise it or the next billing cycle begins. Alongside it sit softer spend alerts that email you as spend climbs without interrupting traffic. A hard cap is blunt, though, because it kills every call until you lift it. Catching a spike early, while traffic still flows, stays the better first line of defense. That's CostCompass's part: a running total and a per-day trend that make a spike visible when you look, while there's still time to act.

Do ChatGPT or Codex usage limits change my API bill?

No. Codex on a ChatGPT plan, like the ChatGPT app itself, carries usage limits: a rolling five-hour window plus weekly caps that throttle how much you can run on the plan before it resets. Those gate plan usage; they don't change what your metered API costs. CostCompass reads the metered side, meaning pay-as-you-go calls billed per token through your Platform account, which is also where Codex lands when you sign it in with an API key rather than your plan. One thing the costs API doesn't expose: if you hit a plan limit and buy extra Codex credits to keep going, that top-up is a subscription-side charge, so enter it by hand alongside your flat plan cost.

Why use CostCompass instead of OpenAI's usage dashboard?

OpenAI's dashboard shows what you've already spent on OpenAI, after the fact, with no projection of where your spend is heading. CostCompass gives you a forward-looking picture (a live month-to-date total with a forecast and a per-model split) for OpenAI and the rest of your connected providers, in one view. You can see a spike forming while there's still time to do something about it.
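
To make the UTC day boundary from the calendar-day question concrete, the sketch below converts a local evening timestamp to the UTC date it would be grouped under. The fixed UTC-7 offset and the timestamp are arbitrary examples, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def utc_billing_day(local_time: datetime) -> str:
    """Return the UTC calendar date (ISO format) a timestamp falls on."""
    return local_time.astimezone(timezone.utc).date().isoformat()


# A fixed UTC-7 offset, used only for illustration.
utc_minus_7 = timezone(timedelta(hours=-7))
evening_call = datetime(2026, 6, 12, 21, 30, tzinfo=utc_minus_7)
print(evening_call.date().isoformat(), "->", utc_billing_day(evening_call))
# prints 2026-06-12 -> 2026-06-13
```

A call made at 9:30 pm on June 12 in that zone is already 4:30 am on June 13 in UTC, so it is counted under the 13th.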

About the author

Joubert Berger builds CostCompass, a spend-intelligence dashboard that pulls usage from AI and compute providers into one month-to-date total, a forecast, and a per-provider breakdown. This guide reflects how CostCompass reads each provider's own usage API — see the security model for how your keys are handled.

Stop guessing at the OpenAI bill

Connect OpenAI once and pull your metered spend with one click — a running total and a forecast of where the month is heading.

Start free