Per-Model Pricing
The billing unit varies by model type. Image generation models typically charge per image or per megapixel of output, where higher resolutions cost proportionally more. Video generation models charge per second of generated video or a flat rate per video. Other models, such as LLMs or audio models, charge per request or per output unit specific to the model. Models that do not have a fixed per-output price fall back to per-second billing based on the GPU machine type used to run the request. This fallback applies to some less common models and to your own Serverless endpoints.

| Model type | Billing unit | How it works |
|---|---|---|
| Image generation | Per image or per megapixel | Flat rate per image, or proportional to output resolution |
| Video generation | Per second or per video | Per second of generated video, or flat rate per video |
| Other models | Per request or compute seconds | Flat rate per request, or per-second billing by GPU type |
Prices vary by model and may change. Check the model’s page or the pricing page for current rates. You can also query prices programmatically through the Platform APIs.
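The billing units in the table can be turned into a rough cost estimate. The following is a minimal sketch, not an official calculator: the `Price` type, the unit names, and every rate are hypothetical placeholders, and it assumes one megapixel means 1,000,000 output pixels with no rounding. Check the model's page for the actual unit and rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Hypothetical price record; field names are illustrative only."""
    unit: str        # "image", "megapixel", "video_second", "video", "request", "gpu_second"
    rate_usd: float  # price per one unit (placeholder, not a real rate)


def estimate_cost(price: Price, *, count: int = 1, width: int = 0,
                  height: int = 0, seconds: float = 0.0) -> float:
    """Estimate the cost of `count` outputs under the given billing unit."""
    if price.unit in ("image", "video", "request"):
        # Flat rate per output, regardless of size or duration.
        units = count
    elif price.unit == "megapixel":
        # Assumption: 1 megapixel = 1,000,000 pixels, no rounding applied.
        units = count * (width * height) / 1_000_000
    elif price.unit in ("video_second", "gpu_second"):
        # Billed by duration: generated video seconds or GPU compute seconds.
        units = count * seconds
    else:
        raise ValueError(f"unknown billing unit: {price.unit!r}")
    return units * price.rate_usd


if __name__ == "__main__":
    # Placeholder rates, chosen only to show the arithmetic.
    print(estimate_cost(Price("megapixel", 0.01), width=2000, height=1000))
    print(estimate_cost(Price("video_second", 0.05), seconds=6))
    print(estimate_cost(Price("image", 0.03), count=4))
```

With per-megapixel pricing, doubling both width and height quadruples the pixel count and therefore the estimated cost, which is the "proportional to output resolution" behavior described in the table.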
What You Pay For
You are billed for successfully generated outputs. The billing unit (image, megapixel, video second, etc.) is defined per model. When a model does not have a fixed output price, billing falls back to per-second pricing based on the GPU machine type that processed your request.
What You Are Not Charged For
Server errors are never billed. If a request fails with an HTTP 500 or higher status code, no charge is incurred. Time spent waiting in the queue before a runner starts processing your request is also free. Only the actual inference work counts toward your bill.
Checking Prices Programmatically
You can retrieve pricing information for any model endpoint through the Platform APIs. This is useful for building cost estimation into your application or comparing rates across models.
Platform APIs for Models
Full reference for pricing, usage, and analytics APIs
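Once you have a rate for an endpoint, the billing rules described earlier can be applied to your own request logs to estimate a bill. The sketch below is illustrative only: `RequestRecord`, its fields, and the rates are hypothetical and do not mirror any Platform API response. It assumes that only successful (2xx) requests are billable. The documentation explicitly guarantees only that 5xx responses are free, so treating every other non-2xx status as unbilled is an assumption here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RequestRecord:
    """Hypothetical log entry for one request; not a Platform API schema."""
    status_code: int          # final HTTP status of the request
    queue_seconds: float      # time waiting for a runner (never billed)
    inference_seconds: float  # time the runner spent on the request
    output_units: float       # images, megapixels, video seconds, etc.


def billable_units(rec: RequestRecord, per_gpu_second: bool) -> float:
    """Return the quantity that counts toward the bill for one request."""
    # Server errors (5xx) are never billed. Treating all other non-2xx
    # statuses as unbilled is an assumption of this sketch.
    if not (200 <= rec.status_code < 300):
        return 0.0
    # Queue time is deliberately ignored in both branches.
    if per_gpu_second:
        return rec.inference_seconds
    return rec.output_units


def estimate_bill(records: Iterable[RequestRecord], rate_usd: float,
                  per_gpu_second: bool = False) -> float:
    """Sum billable units across requests and multiply by the unit rate."""
    return sum(billable_units(r, per_gpu_second) for r in records) * rate_usd


if __name__ == "__main__":
    log = [
        RequestRecord(200, queue_seconds=3.0, inference_seconds=2.0, output_units=1),
        RequestRecord(500, queue_seconds=1.0, inference_seconds=4.0, output_units=0),
        RequestRecord(200, queue_seconds=10.0, inference_seconds=1.5, output_units=2),
    ]
    print(estimate_bill(log, rate_usd=0.5))                          # per-output pricing
    print(estimate_bill(log, rate_usd=0.001, per_gpu_second=True))   # GPU-second fallback
```

Running the same log against the rates of several endpoints gives a simple way to compare models for a given workload. The rates themselves should come from the pricing page or the Platform APIs rather than being hard-coded.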