Turn RunPod's per-second GPU spend into a running cost
RunPod rents GPUs by the second and runs serverless endpoints that scale on their own. It keeps charging for storage between jobs too, and all of it draws down a prepaid balance with no monthly invoice to anchor against. Here is how the billing works, and how to keep a running cost and a forecast in front of you while there's still time to act.
RunPod doesn't send you a bill at the end of the month. It draws down a prepaid balance, by the second, while you watch. A GPU pod meters for every second it's up. Serverless bills the seconds your workers actually run. A network volume keeps charging for space you've reserved long after the pod that used it is gone.
The spend is all there in RunPod's billing, day by day as you use it. The work left is turning those per-second meters into one running cost, and a forecast, before the balance runs out from under you.

How does RunPod bill for GPUs, serverless, and storage?
RunPod is a GPU cloud, and almost everything it charges for is metered on time or on space. GPU pods, serverless, and storage carry most of a working developer’s bill, and each meters differently.
GPU pods are billed by the second. You rent a machine of a given GPU type, and the meter runs from the moment it starts until you stop it. It’s pay-as-you-go, with no long-term commitment by default. The rate depends on the GPU class you pick and on where it runs: Secure Cloud (enterprise tier-3/tier-4 datacenters) costs more than Community Cloud (vetted third-party hosts). Community also offers spot instances, cheaper but interruptible, so they suit checkpoint-tolerant training more than a live service.
Serverless endpoints bill the seconds your workers actually run, and what you’re billed for depends on the worker type. Flex workers scale to zero between requests, so you pay only for the seconds spent handling them (plus a short cold start when one spins up). Active workers stay always-on at a steadier rate, with no cold start, but you pay for the idle time too. Which one fits depends on how constant your traffic is.
Network volumes — persistent storage you attach to pods — bill per gigabyte-month, and this is the meter that catches people. It keeps charging for the space you reserve whether or not a pod is using it. A volume left behind after a job finishes goes on costing until you delete it.
| What you’re paying for | How RunPod meters it |
|---|---|
| GPU pods | Per second of uptime, by GPU type and cloud (Secure/Community/spot) |
| Serverless (Flex) | Per second a worker runs, scaling to zero between requests |
| Serverless (Active) | Per second always-on, idle time included |
| Network volumes | Per gigabyte-month, billed whether or not a pod is attached |
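
The two meters in the table reduce to simple arithmetic. The sketch below is illustrative only: the function names and every rate in it are assumptions made up for the example, not RunPod's actual prices, which live on its pricing page.

```python
SECONDS_PER_HOUR = 3600


def pod_cost(seconds_up: float, hourly_rate: float) -> float:
    """Cost of a pod billed per second, given its listed hourly rate."""
    return seconds_up * hourly_rate / SECONDS_PER_HOUR


def volume_cost(size_gb: float, rate_per_gb_month: float, months: float) -> float:
    """Cost of reserved storage billed per gigabyte-month.

    The size is what you reserved, not what you filled, and the meter
    runs whether or not a pod is attached.
    """
    return size_gb * rate_per_gb_month * months


if __name__ == "__main__":
    # Hypothetical rates, chosen only to make the arithmetic visible.
    training_run = pod_cost(seconds_up=90 * 60, hourly_rate=0.60)
    forgotten_volume = volume_cost(size_gb=500, rate_per_gb_month=0.05, months=3)
    print(f"90-minute pod run: ${training_run:.2f}")
    print(f"500 GB volume left for 3 months: ${forgotten_volume:.2f}")
```

With these made-up numbers, the volume left behind costs far more than the run that created it, which is the pattern the storage meter tends to produce.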
A couple of things make RunPod’s billing friendlier than most. Outbound data transfer is free, so there’s no egress charge to model, and the whole thing runs on prepaid credit: you top up a balance and the usage draws it down. RunPod also offers multi-month Savings Plans that trade a prepaid commitment for a lower rate on steady pod usage. (The current rates per GPU class and cloud live on RunPod’s own pricing page — treat any figure you’ve memorized as provisional.)
Why is RunPod spend hard to keep ahead of?
The per-second, prepaid model is what makes RunPod cheap and flexible, and also what makes the spend hard to read at a glance. What surprises people is rarely the GPU rate. It’s the pod left up overnight, or the volume nobody got around to deleting. For a solo developer, the forgotten network volume is usually the bigger leak of the two, because a pod you left running at least nags at your memory while idle storage never does. Most of the difficulty is structural:
- There’s no invoice to anchor to. A prepaid balance just goes down. Without a monthly statement that says “here is what you owe,” the only signals you get are a shrinking number and the work you remember doing, and the two are easy to lose track of between top-ups.
- Per-second metering hides the rate of burn. A pod left running over a weekend, an endpoint that quietly scaled up under load: each is just seconds ticking by. The individual charges are tiny. It’s the combined rate, across everything running at once, that draws the balance down.
- Idle storage spends silently. Spin up a pod with a 500 GB network volume for a training run. When the run finishes you shut the pod down but leave the volume. The GPU meter stops; the volume keeps billing for its 500 GB until you delete it. Storage you provisioned and forgot is the spend nothing in the moment flags.
- The cost is spread across resources. Pods, serverless endpoints, and volumes each accrue on their own, and reading whether this month is heavy means adding them up across however many GPU types and endpoints you’ve spun up.
- RunPod is rarely your only provider. Alongside your GPU rentals you’re likely calling Anthropic for Claude, building on OpenAI or Gemini, maybe routing through OpenRouter — so RunPod is one slice of a bill you otherwise stitch together by hand from a stack of dashboards.
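
To make the combined rate of burn concrete, here is a minimal sketch that adds up everything running at once and turns a prepaid balance into hours of runway. The function names, the 730-hour month used to convert storage to an hourly figure, and all rates are assumptions for illustration, not RunPod figures.

```python
from typing import Iterable

HOURS_PER_MONTH = 730  # assumed average month length, for converting GB-month rates


def hourly_burn(
    pod_hourly_rates: Iterable[float],
    volume_gb: float,
    rate_per_gb_month: float,
) -> float:
    """Combined hourly burn: every running pod plus reserved storage."""
    pods = sum(pod_hourly_rates)
    storage = volume_gb * rate_per_gb_month / HOURS_PER_MONTH
    return pods + storage


def runway_hours(balance: float, burn_per_hour: float) -> float:
    """Hours until a prepaid balance reaches zero at the current pace."""
    if burn_per_hour <= 0:
        return float("inf")  # nothing is metering, so the balance never drains
    return balance / burn_per_hour


if __name__ == "__main__":
    # Hypothetical: two pods left up, plus a 500 GB volume.
    burn = hourly_burn([0.60, 0.35], volume_gb=500, rate_per_gb_month=0.05)
    print(f"Burning ${burn:.3f}/hour, runway {runway_hours(50.0, burn):.1f} hours")
```

No single charge here is large. The runway figure is what shows how quickly they add up against a fixed balance.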

Is the RunPod cost exact or estimated?
Exact, and that sets RunPod apart from most providers CostCompass tracks. A token-metered provider doesn’t hand back a money figure, so CostCompass has to read raw usage and price it against a rate card, which makes the number a close estimate. RunPod is different. Its billing reports the actual amount spent per resource, per day, and CostCompass reads that figure straight through rather than reconstructing it.
The only thing CostCompass adds is the roll-up: it groups those reported costs by resource type, keeps a running month-to-date total, and reads your prepaid balance alongside them.
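
As a rough illustration of that roll-up, the sketch below groups daily reported costs by resource type and sums the current month. The record shape (`day`, `resource_type`, `cost_usd`) is a hypothetical stand-in. It is not RunPod's API response format or CostCompass's internal code.

```python
from collections import defaultdict
from datetime import date


def month_to_date_by_type(records: list[dict], today: date) -> dict[str, float]:
    """Sum reported daily costs for the current month, grouped by resource type."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        day = record["day"]
        in_current_month = (day.year, day.month) == (today.year, today.month)
        if in_current_month and day <= today:
            totals[record["resource_type"]] += record["cost_usd"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical daily cost lines, already expressed in money by the provider.
    sample = [
        {"day": date(2025, 3, 1), "resource_type": "gpu_pod", "cost_usd": 4.0},
        {"day": date(2025, 3, 2), "resource_type": "gpu_pod", "cost_usd": 6.0},
        {"day": date(2025, 3, 2), "resource_type": "network_volume", "cost_usd": 1.5},
        {"day": date(2025, 2, 28), "resource_type": "serverless", "cost_usd": 9.0},
    ]
    by_type = month_to_date_by_type(sample, today=date(2025, 3, 10))
    print(by_type, "total:", sum(by_type.values()))
```

Because the inputs are already money figures, the roll-up is pure addition. There is no rate card anywhere in it.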
How can you reduce your RunPod bill?
Tracking shows you where the seconds go; a few levers actually slow the burn:
- Stop pods you’re not using. Per-second billing rewards stopping the meter. A pod left up between sessions is the most common silent cost, and the easiest to cut.
- Delete volumes you’ve finished with. Storage bills per gigabyte-month whether or not a pod is attached, so cleaning up the volumes left behind by completed jobs stops you paying month after month for data you no longer touch.
- Pick the right cloud and instance for the job. Community Cloud costs less than Secure Cloud, and spot instances less than on-demand. For checkpoint-tolerant training that can take an interruption, both are a real saving. Keep Secure Cloud and on-demand for the work that needs the reliability. And right-size the card itself: a model that fits in less VRAM doesn’t need your largest GPU, and the per-second rate tracks the class you pick.
- Match the serverless worker to your traffic. Flex workers scale to zero and suit spiky, low-volume endpoints. Active workers cost less per second once an endpoint runs enough of the time to justify staying on. The wrong choice quietly overpays either way.
- Lock in a Savings Plan for steady baseline usage. If some pods run predictably month after month, RunPod’s multi-month Savings Plans drop the rate on that baseline in exchange for committing the credit up front. It’s worth it only once the usage is steady enough that the commitment won’t sit idle.
- Watch the by-resource split. A ranked view of pods, serverless, and volumes catches a forgotten pod or an orphaned volume in days. A draining balance, on its own, only tells you something’s off later.
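
The Flex-versus-Active choice in the list above comes down to a break-even utilization. Here is a minimal sketch, assuming a Flex worker bills only for busy seconds, an Active worker bills for every second, and cold-start time is ignored. The per-second rates are invented for the example.

```python
def break_even_utilization(flex_rate: float, active_rate: float) -> float:
    """Fraction of time an endpoint must be busy before Active becomes cheaper.

    Flex cost per wall-clock second is utilization * flex_rate, while Active
    cost is active_rate regardless of load, so the two meet at
    active_rate / flex_rate.
    """
    return active_rate / flex_rate


def cheaper_worker(utilization: float, flex_rate: float, active_rate: float) -> str:
    """Return which worker type costs less at a given busy fraction (0 to 1)."""
    flex_cost = utilization * flex_rate  # pays only for busy seconds
    active_cost = active_rate            # pays for idle seconds too
    return "flex" if flex_cost < active_cost else "active"


if __name__ == "__main__":
    # Hypothetical per-second rates for one worker.
    flex, active = 0.0004, 0.0003
    print(f"Break-even at {break_even_utilization(flex, active):.0%} busy")
    for busy in (0.2, 0.9):
        print(f"{busy:.0%} busy -> {cheaper_worker(busy, flex, active)}")
```

A real comparison would also count cold-start seconds on the Flex side, which pushes the break-even point somewhat lower.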
How do you forecast next month’s RunPod bill?
Forecasting per-second usage doesn’t need anything exotic, just a burn rate: take your spend over the last several days, turn it into a daily average, and project it across a full month.
CostCompass does exactly this. It scales your trailing seven-day burn rate to the number of days in next month and adds any fixed subscriptions you’ve entered. The result is one forward number for what next month costs if the current pace holds. Seven days is the window: long enough that one heavy training day doesn’t throw the average, short enough to pick up a recent change like an endpoint that started staying busy. It’s the same engine that runs across every provider you’ve connected, so RunPod folds into one whole-stack projection. That’s especially useful here, where a prepaid balance gives you no invoice to extrapolate from.
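
The arithmetic described above can be sketched in a few lines. This is an illustration of the stated method (trailing seven-day average, scaled to the length of next month, plus fixed subscriptions), not CostCompass's actual code. The function name and inputs are assumptions.

```python
import calendar
from datetime import date


def forecast_next_month(
    last_7_days_spend: list[float],
    today: date,
    fixed_subscriptions: float = 0.0,
) -> float:
    """Project next month's cost from the trailing seven-day burn rate."""
    if len(last_7_days_spend) != 7:
        raise ValueError("expected exactly seven daily spend figures")
    daily_average = sum(last_7_days_spend) / 7

    # Work out which month comes next, rolling over the year in December.
    if today.month == 12:
        year, month = today.year + 1, 1
    else:
        year, month = today.year, today.month + 1
    days_next_month = calendar.monthrange(year, month)[1]

    return daily_average * days_next_month + fixed_subscriptions


if __name__ == "__main__":
    # Hypothetical week with one heavy training day.
    week = [3.0, 2.5, 14.0, 3.5, 2.0, 3.0, 4.0]
    projected = forecast_next_month(week, date(2025, 1, 15), fixed_subscriptions=20.0)
    print(f"Projected spend for next month: ${projected:.2f}")
```

Averaging over the full week is what keeps the single heavy day from setting the pace for the whole projection.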
How does CostCompass track your RunPod costs?
CostCompass reads your billing from RunPod’s API: your GPU pod, serverless, and network-volume spend, each as a daily reported cost, plus your current prepaid balance. It rolls that into a running month-to-date total grouped by resource type. Because RunPod reports the money directly, there’s no pricing step in between, so the figure is RunPod’s own. When you first connect, tracking starts from that point forward, and each refresh brings the running total and the balance current.

Two things make that practical for a solo developer. First, your key is encrypted in your browser before anything is stored. The key you paste is sealed with your vault password on your own device, and only the sealed version is ever saved: a blob CostCompass can’t open, because the vault password stays with you. When it’s time to read your usage, your browser unseals the key and passes it to a relay that uses it for that one call to RunPod and is built not to log or keep it. What we hold at rest is locked ciphertext. The key your account runs on never reaches us in usable form.
Second, RunPod doesn’t sit alone. The same dashboard rolls its running cost up with every other AI and compute provider you’ve connected, like Claude, OpenAI, and your other hosting, into one figure with a forecast. How you get to that cross-provider number (by hand, through tooling, or straight from each API) is its own decision. The ways to track AI costs across providers lay the options out side by side.
| | RunPod’s console | CostCompass |
|---|---|---|
| Timing | Usage and a draining balance, read after the fact | Running cost, current when you Refresh |
| Forecast | No burn-rate forecast; a prepaid balance only | Month-end projection from your burn rate |
| View | Per-second spend across pods, endpoints, volumes | Grouped by resource into one running cost |
| Scope | RunPod only | RunPod beside every other provider |
Getting started takes three steps:
- In RunPod, open Settings → API Keys and create an API key.
- Paste it into CostCompass. The key is encrypted in your browser before it’s stored, so the server only ever holds ciphertext.
- Click Refresh. CostCompass reads your spend and balance, groups it by resource, and from there your running cost and forecast roll up — with RunPod folded into the whole-stack number.
Frequently asked questions
- What does CostCompass need to connect RunPod?
- One RunPod API key, created in the RunPod console under Settings → API Keys. CostCompass uses it to read your billing (GPU pod, serverless endpoint, and network-volume spend) plus your current prepaid balance. There's nothing to configure per pod or per endpoint. The one key covers your whole account, and CostCompass checks it reaches RunPod when you connect, so a bad key shows up at setup rather than as an empty first refresh.
- Does CostCompass store my RunPod key?
- Not in any form it can use. Before the key leaves your device, your browser seals it with your vault password, and the only thing that reaches our servers is that sealed blob. We have no way to open it, because your vault password never leaves your browser. When it's time to read your usage, the key is unsealed in your browser and handed to a relay that forwards the one request to RunPod and is built not to log or keep it. What sits in our database at rest is locked ciphertext, useless without your vault password.
- What RunPod spend does CostCompass track?
- The three things that make up a RunPod bill (GPU pods, serverless endpoints, and network volumes), plus your prepaid balance. The headline breakdown groups your spend into those resource types, and you can drill into a type to see the detail underneath, like which GPU model, which endpoint, or which volume drove it. Two honest limits are worth knowing. Pod spend is grouped by GPU type rather than by individual pod, so you see "all your RTX 4090 time" instead of a line per pod. And the breakdown doesn't split Secure from Community Cloud, spot from on-demand, or Flex from Active serverless, because RunPod's billing groups by resource rather than by those purchasing modes.
- Is the RunPod cost exact or estimated?
- Exact. A token-metered provider doesn't hand back a money figure, so CostCompass has to price raw usage against a rate card. RunPod's billing reports the actual money spent per resource, per day, and CostCompass reads that figure straight through. The cost you see is RunPod's own billed amount, which is why these resource lines aren't marked "estimated" the way a usage-priced provider's are.
- Does CostCompass show my RunPod prepaid balance?
- Yes. RunPod runs on prepaid credit, so alongside your spend CostCompass reads your current balance each time it refreshes. That matters more than usual here. Without a monthly invoice to settle against, the balance is the one thing that tells you how much runway is left at the current pace.
- Can it forecast next month's RunPod bill?
- Yes. CostCompass takes your trailing seven-day burn rate, scales it to the length of next month, and adds any fixed subscriptions you've entered. The result is one forward number for what next month costs if the current pace holds. It's the same forecasting engine that runs across every provider you've connected, so RunPod folds into the whole-stack projection alongside everything else. That helps most when a prepaid balance gives you no invoice to extrapolate from.
- Why use CostCompass instead of RunPod's console?
- RunPod's console shows your usage and your draining balance for RunPod alone — pods and endpoints metered by the second, volumes by the gigabyte. It won't tell you the pace you're burning at, or where the month lands. CostCompass turns that per-second spend into one running cost, projects where the month ends from your burn rate, and rolls RunPod up with Claude, OpenAI, and every other provider you've connected into a single view. That view is current the moment you click Refresh, instead of a balance you keep glancing at and doing the math on yourself.
About the author
Joubert Berger builds CostCompass, a spend-intelligence dashboard that pulls usage from AI and compute providers into one month-to-date total, a forecast, and a per-provider breakdown. This guide reflects how CostCompass reads each provider's own usage API — see the security model for how your keys are handled.
Stop watching your RunPod balance drain without a number
Connect RunPod once and turn its per-second GPU and serverless spend, plus storage, into a live running cost and forecast, broken down by resource and rolled up with every other provider, in one click.