Know where your DeepSeek balance is heading
DeepSeek bills from a prepaid balance. You top it up, and every call draws it down. The console shows the number right now, but never the slope. Here is how the billing works, and how to turn a draining balance into a running total and a forecast in the same view as everything else you run.
DeepSeek doesn't send you a bill at the end of the month. It bills the other way around: you top up an account balance, and every call quietly draws it down. The console shows you that balance, one number, true as of the moment you look. It says nothing about how fast that number is falling, where it lands by month-end, or how it sits against the other providers drawing on the same budget.
What follows is how DeepSeek prices the API, why a balance is harder to stay ahead of than an itemized invoice, and how to turn that single draining number into a running total and a forecast before it runs low on you.

How does DeepSeek bill for usage?
DeepSeek runs on a prepaid balance. You add credit to your account, and every API call deducts from it as you go — there’s no monthly invoice and no subscription tier to reason about, just a number that goes down. DeepSeek draws from two pools when you have them: a granted balance (promotional or complimentary credit) is spent first, and your own topped-up balance covers the rest. Both drain the same way, and the account view shows their sum.
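The two-pool order can be sketched in a few lines. This is a minimal illustration of the rule described above (granted credit first, topped-up credit second), not DeepSeek's actual ledger logic; the function name and the amounts are made up for the example.

```python
from decimal import Decimal


def draw_down(granted: Decimal, topped_up: Decimal, charge: Decimal) -> tuple[Decimal, Decimal]:
    """Spend `charge` from the granted pool first, then from the topped-up pool.

    Hypothetical helper: it does not guard against a charge larger than
    both pools combined, in which case the topped-up pool goes negative.
    """
    from_granted = min(granted, charge)
    from_topped_up = charge - from_granted
    return granted - from_granted, topped_up - from_topped_up


# Illustrative amounts only.
granted, topped_up = draw_down(Decimal("2.00"), Decimal("10.00"), Decimal("3.50"))
print(granted, topped_up, granted + topped_up)  # 0.00 8.50 8.50
```

The account view would show the last figure, the sum of the two pools.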
What pulls that balance down is metered the way most token APIs are: by the input tokens you send — your prompt, system instructions, and any context — and the output tokens the model generates back. Output is the pricier side, so a long or chatty completion costs more than the size of the prompt suggests.
DeepSeek offers two modes, and the difference matters for cost. There’s a plain non-thinking mode for general work, and a thinking mode that reasons through a problem before it answers. That reasoning isn’t free. The model produces a chain of thought first, and those tokens count toward the output length you’re billed for, even though they’re working notes and not the final reply. A request that looks small can carry a large, mostly invisible output charge when thinking mode is doing real work. That’s the part that catches people out the first time they lean on it. (DeepSeek is mid-transition on model names: the long-standing deepseek-chat and deepseek-reasoner are scheduled to be retired in mid-2026 in favor of the deepseek-v4 lineup, deepseek-v4-flash and deepseek-v4-pro. Treat any specific model name as a moving target and check the docs for the current set.)
One mechanism works in your favor. DeepSeek prices repeated prompt prefixes through context caching: when a later call reuses input the service has already seen, that portion bills at a much cheaper cache-hit rate instead of the full cache-miss rate. Keep your stable prefixes — long system prompts, reference material — consistent across calls and a real slice of your input cost moves to the cheaper tier.
So a single DeepSeek call can draw on three token prices at once, each quoted per 1M tokens:
| Token type | How DeepSeek prices it |
|---|---|
| Input — cache miss | The full input rate, for prompt content it hasn’t seen |
| Input — cache hit | A fraction of that rate, when a later call reuses a prefix |
| Output | The expensive side: what the model generates back. Thinking-mode reasoning tokens are counted here too, at the same output rate |
Exact rates live on DeepSeek’s pricing page and have changed before, so treat any figure you’ve memorized as provisional.
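To see how the three prices combine on one call, here is a small sketch of the arithmetic. The rates below are placeholders chosen to make the math easy to follow, not DeepSeek's real prices, and the `Rates` and `call_cost` names are invented for this example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens. Placeholder values, NOT DeepSeek's actual prices."""
    input_cache_miss: Decimal
    input_cache_hit: Decimal
    output: Decimal


def call_cost(rates: Rates, cache_miss_tokens: int, cache_hit_tokens: int,
              answer_tokens: int, reasoning_tokens: int = 0) -> Decimal:
    """Cost of one call. Reasoning tokens bill at the output rate."""
    output_tokens = answer_tokens + reasoning_tokens
    total = (cache_miss_tokens * rates.input_cache_miss
             + cache_hit_tokens * rates.input_cache_hit
             + output_tokens * rates.output)
    return total / Decimal(1_000_000)


rates = Rates(Decimal("1.00"), Decimal("0.10"), Decimal("4.00"))  # hypothetical

# Same prompt, same 500-token answer, with and without 6,000 reasoning tokens.
plain = call_cost(rates, 2_000, 8_000, 500)
thinking = call_cost(rates, 2_000, 8_000, 500, reasoning_tokens=6_000)
print(plain, thinking)  # 0.0048 0.0288
```

With these made-up rates, the reasoning tokens account for most of the second call's cost even though the prompt and the visible answer are identical.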
Why is DeepSeek spend hard to keep ahead of?
The pricing itself is simple. Staying ahead of it is the hard part, and most of the difficulty comes down to what DeepSeek does and doesn’t show you:
- The account balance is the only signal to poll. DeepSeek’s account API hands back your current balance and nothing else: no account-level usage history, no per-model report to break down or replay. You can see where you stand, but there’s no account endpoint to ask where the money went.
- Prepaid hides the slope. A balance tells you the level. It says nothing about the rate it’s falling at. By the time the number looks low enough to notice, the spend that pulled it down is already behind you, and a countdown gives no early warning.
- Thinking tokens are invisible. A reasoning-heavy request can burn far more output than its prompt implies, and the chain of thought that drove the charge isn’t part of the final answer. In the account balance, that charge shows up only after the fact.
- DeepSeek is rarely your only provider. If you also call Anthropic for Claude, build on OpenAI, run Gemini, or route through OpenRouter — plus whatever you spend on inference hosts and serving — the DeepSeek balance is one slice of a bill you otherwise stitch together by hand from a stack of dashboards.
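For a sense of how little the balance signal carries, here is a rough sketch of polling it yourself. The endpoint URL and the response field names (`balance_infos`, `currency`, `total_balance`) are written from memory of DeepSeek's API reference and should be treated as assumptions to verify against the current docs; the sample payload is illustrative, not a captured response.

```python
import json
import urllib.request
from decimal import Decimal

# Assumed endpoint; confirm against DeepSeek's API reference before relying on it.
BALANCE_URL = "https://api.deepseek.com/user/balance"


def fetch_balance_payload(api_key: str) -> dict:
    """GET the balance document using an ordinary API key as a bearer token."""
    req = urllib.request.Request(
        BALANCE_URL,
        headers={"Authorization": f"Bearer {api_key}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def total_balance(payload: dict, currency: str = "USD") -> Decimal:
    """Pull one currency's total out of the payload (field names are assumed)."""
    for info in payload.get("balance_infos", []):
        if info.get("currency") == currency:
            return Decimal(info["total_balance"])
    raise KeyError(f"no {currency} balance in payload")


# Illustrative payload in the assumed shape: a level, with no history attached.
SAMPLE = {
    "is_available": True,
    "balance_infos": [
        {"currency": "USD", "total_balance": "110.00",
         "granted_balance": "10.00", "topped_up_balance": "100.00"},
    ],
}
print(total_balance(SAMPLE))  # 110.00
```

Whatever the exact shape turns out to be, the point stands: one reading gives you a level, and anything about rate or history has to come from comparing readings over time.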
How can you reduce your DeepSeek bill?
Tracking shows you where the balance goes. A few levers slow the drain:
- Match the mode to the task. Thinking mode earns its keep on hard problems, but it bills the reasoning tokens as output. Route the straightforward, high-volume calls to the non-thinking mode and save the reasoning model for work that needs it. That one routing choice tends to move the bill more than any other tweak.
- Lean on context caching. Keep large, stable prefixes consistent so they land on the cheaper cache-hit rate instead of paying full input on every call. Long system prompts and reference documents are where this pays off.
- Keep outputs tight. Output is the expensive side, and thinking tokens ride along with it. Trimming verbose responses and reining in how much the model reasons on tasks that don’t need deep thought cuts the part of the bill that grows fastest.
- Watch the trend, not the balance. A number that only goes down tells you little until it’s nearly gone. A running total with a slope on it shows a creeping spike within a few days, well before the account runs dry.
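Putting a slope on the balance takes only two timestamped readings. The sketch below is one simple way to do it, assuming no top-up landed between the readings; the function names and the figures are hypothetical.

```python
from datetime import datetime
from decimal import Decimal
from typing import Optional


def daily_burn(readings: list[tuple[datetime, Decimal]]) -> Decimal:
    """Average daily drop between the first and last reading.

    Assumes the balance was not topped up between the two readings.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    (t0, b0), (t1, b1) = readings[0], readings[-1]
    days = Decimal((t1 - t0).total_seconds()) / Decimal(86400)
    if days <= 0:
        raise ValueError("readings must span a positive amount of time")
    return (b0 - b1) / days


def runway_days(balance: Decimal, burn: Decimal) -> Optional[Decimal]:
    """Days until the balance hits zero at the current burn, or None if not falling."""
    return balance / burn if burn > 0 else None


readings = [
    (datetime(2025, 3, 1), Decimal("50.00")),
    (datetime(2025, 3, 5), Decimal("42.00")),
]
burn = daily_burn(readings)
print(burn, runway_days(readings[-1][1], burn))  # 2.00 21
```

A balance of 42.00 looks comfortable on its own; the same number with a slope of 2.00 a day says you have about three weeks.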
How do you forecast next month’s DeepSeek bill?
Forecasting a prepaid balance doesn’t need anything exotic, just a burn rate: take your spend over the last several days, turn it into a daily average, and project that across a full month.
CostCompass does exactly this. It scales your trailing seven-day burn rate to the number of days in next month and adds any fixed subscriptions you’ve entered. The result is one forward number for what next month costs if the current pace holds. Seven days is a deliberate window: long enough that one heavy day doesn’t throw the average, short enough to pick up a recent change like a shift to thinking mode. For DeepSeek the projection builds on a balance-derived total, so read it as a directional estimate. It gets the order of magnitude right and sharpens as more clean windows land, though it won’t be pinned to the last cent.
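The arithmetic behind that projection fits in a few lines. This is a sketch of the calculation as described above, not CostCompass's actual code; the function name, the daily figures, and the subscription amount are invented for illustration.

```python
import calendar
from datetime import date
from decimal import Decimal


def forecast_next_month(last_7_days_spend: list[Decimal], today: date,
                        fixed_subscriptions: Decimal = Decimal("0")) -> Decimal:
    """Scale the trailing daily average to the length of next month, plus fixed costs."""
    daily_average = sum(last_7_days_spend, Decimal("0")) / len(last_7_days_spend)
    # Roll over to January of the following year when today is in December.
    year, month = (today.year + 1, 1) if today.month == 12 else (today.year, today.month + 1)
    days_in_next_month = calendar.monthrange(year, month)[1]
    return daily_average * days_in_next_month + fixed_subscriptions


# Hypothetical week of balance-derived spend, in USD.
week = [Decimal(x) for x in ("1.20", "0.80", "1.00", "1.50", "0.90", "1.10", "0.50")]
print(forecast_next_month(week, date(2025, 1, 15), Decimal("20.00")))  # 48.00
```

Here the week averages 1.00 a day, February 2025 has 28 days, and the 20.00 subscription is added on top.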
How does CostCompass track your DeepSeek costs?
CostCompass tracks DeepSeek by watching the one thing DeepSeek will tell it: your account balance. Each time you refresh, it reads the current balance and compares it against the previous reading, and the drop between the two becomes the spend for that window. There’s no account-level usage report behind this for CostCompass to read, so the figure is the total you spent across DeepSeek, marked approximate with a small asterisk. It can’t be a per-model breakdown, because the data to support one doesn’t exist. Tracking begins with your first refresh after connecting, which captures a baseline and shows no spend yet; from the next refresh on, every dip in the balance rolls into your running total.
That approach has two edges worth knowing up front. Spend from before your first refresh isn’t recoverable, since there’s no history to read back, so the total starts from that first reading instead of the first of the month. And a top-up in the middle of a window can mask the spend it overlaps. If you add credit and burn some in the same stretch, the balance may end higher than it started, so that window under-counts. A later window starts clean from the new balance and won’t recover the masked spend. Neither edge is something you can act on — they’re just what derived spend looks like — but it’s worth knowing the masked stretch reads low and never fills back in.
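Both edges are easy to see in a small sketch of balance-derived spend. This illustrates the technique, not CostCompass's implementation; in particular, counting a window where the balance rose as zero spend is an assumption made here for simplicity, and the readings are made up.

```python
from decimal import Decimal


def derived_spend(readings: list[Decimal]) -> Decimal:
    """Sum the drops between consecutive balance readings.

    The first reading is only a baseline, so a single reading yields zero.
    """
    total = Decimal("0")
    for previous, current in zip(readings, readings[1:]):
        drop = previous - current
        if drop > 0:
            total += drop
        # A rise means a top-up landed in this window. Assumption: count it
        # as zero, so any spend the top-up overlapped is masked for good.
    return total


# Hypothetical readings. Between the 3rd and 4th, 10.00 was added and 2.00 spent.
readings = [Decimal(x) for x in ("20.00", "18.50", "17.00", "25.00", "24.00")]
print(derived_spend(readings))  # 4.00
```

The true spend across those readings is 6.00; the 2.00 burned in the top-up window never shows up, and the window after it starts clean from 25.00.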

Two things make that practical for a solo developer. First, your key is encrypted in your browser before anything is stored. The DeepSeek key you paste is sealed with your vault password on your own device, and only the sealed version is ever saved — a blob CostCompass can’t open, because the vault password stays with you. When it’s time to read your balance, your browser unseals the key and passes it to a relay that uses it for that single call to DeepSeek and is built not to log or keep it. What we hold at rest is locked ciphertext that nobody can use against your account.
Second, DeepSeek doesn’t sit alone. The same dashboard rolls its running total up with every other AI and compute provider you’ve connected — Claude, OpenAI, Gemini, your hosting — into one figure with a forecast. DeepSeek becomes a line in your whole-stack spend instead of a balance you check in a console of its own.
Getting started takes three steps:
- In DeepSeek, create an API key — the ordinary kind, no special permission needed.
- Paste it into CostCompass. It’s encrypted in your browser before it’s stored, so the server only ever holds ciphertext.
- CostCompass captures a baseline balance on your first Refresh, and from the next Refresh on it shows your running total and forecast — DeepSeek included in the whole-stack number.
Frequently asked questions
- What does CostCompass need to connect DeepSeek?
- One ordinary DeepSeek API key — the same kind you already use to make calls. You create it in the DeepSeek platform under API keys and paste it into CostCompass. There's no admin tier, no special billing permission, and no cloud console to wire up, so DeepSeek is the simplest provider here to connect. The one key is enough for CostCompass to read your account balance, which is all it needs.
- Does CostCompass store my DeepSeek API key?
- Not in any form we can read. Your API key is encrypted in your browser with your vault password before it ever leaves your device, and CostCompass keeps only the resulting ciphertext, a blob it has no way to open, because your vault password stays in your browser. When it's time to read your DeepSeek balance, the key is unsealed in your browser and forwarded to DeepSeek through a relay that holds it just for that one call and is built not to log it. The key does pass through our infrastructure for that request, but the plaintext is never written to our database or our logs. What sits there at rest is locked ciphertext that nobody can use against your account.
- Does CostCompass break DeepSeek spend down by model?
- No, and that's a real limit we'd rather be honest about. DeepSeek's account API exposes your balance but no account-level usage report, so there is nothing to break down by model. CostCompass reads the one figure DeepSeek gives it exactly — your balance — and reports the drop between two readings as your spend, marked approximate because it's derived rather than billed. A per-model split would be guesswork, so it doesn't pretend to one.
- How does CostCompass know what I spent if DeepSeek has no usage report?
- It watches your balance. Each time you refresh, CostCompass reads your current DeepSeek balance and compares it to the previous reading; the drop between the two is the spend for that window. Because the figure is derived from a balance and not reported as billed usage, it carries an approximate marker (a small asterisk in the dashboard). Two things follow from this. Your first refresh after connecting just captures a baseline and shows no spend yet; from the second refresh on, the drop from the previous reading becomes spend — so anything spent before that first reading isn't recoverable, because there's no history to read back. And a top-up in the middle of a window can mask the spend it overlaps, which makes that stretch under-count. A later window starts clean from the new balance but won't recover what was masked.
- Can it forecast next month's DeepSeek bill?
- Yes. CostCompass takes your trailing seven-day burn rate, scales it to the length of next month, and adds any fixed subscriptions you've entered. For DeepSeek that projection rides on the balance-derived total, so treat it as a directional read on where your spend is heading, not a to-the-cent figure. It's the same forecasting engine that runs across every provider you've connected, so DeepSeek folds into one whole-stack number.
- Why is my DeepSeek bill higher than the prompts I sent suggest?
- Usually it's thinking mode. When DeepSeek reasons through a problem, it generates a chain of thought before the final answer, and those reasoning tokens count toward the output you pay for even though they aren't part of the final answer. Output is already the pricier side of the bill, so a short prompt that triggers heavy reasoning can cost far more than its length implies. From the account balance CostCompass reads, that charge shows up only after the fact. Routing simple calls to non-thinking mode and reining in how much the model reasons on the rest is the fastest way to pull that number back down.
- Why use CostCompass instead of the DeepSeek platform?
- The DeepSeek console shows your balance right now and your recent calls, after the fact, for DeepSeek alone. It won't tell you how fast that balance is falling, where it lands by month-end, or how it sits next to the rest of your bill. CostCompass turns each balance change into a running total and a forecast, and rolls DeepSeek up with Claude, OpenAI, Gemini, and your hosting spend into a single view — kept current the moment you click Refresh, instead of a number you have to go read and a slope you have to work out yourself.
About the author
Joubert Berger builds CostCompass, a spend-intelligence dashboard that pulls usage from AI and compute providers into one month-to-date total, a forecast, and a per-provider breakdown. This guide reflects how CostCompass reads each provider's own usage API — see the security model for how your keys are handled.
Stop watching the DeepSeek balance drop
Connect DeepSeek once and turn your prepaid balance into a running total and a next-month forecast — rolled up with every other provider, in one click.