Reliability | fal.ai Model APIs

fal is designed for production workloads and includes several built-in mechanisms to ensure your requests succeed.

Queue-Based Processing

The queue system handles traffic surges gracefully and provides request tracking. When you submit a request, it enters a managed queue that ensures reliable processing even during peak demand.

Automatic Retries

When using the queue, fal automatically retries requests that fail due to:

Server errors (503): The model endpoint was temporarily unavailable
Timeouts (504): The request took too long due to transient issues
Connection errors: Network issues within fal infrastructure
Rate limits (429): The request waits and is retried automatically when you temporarily exceed your concurrent request limit

Requests are retried up to 10 times with intelligent backoff.

Limit retry duration: Set a start timeout (`start_timeout`) to cap the total time a request can spend waiting, including retries. Once the timeout is reached, no further retries occur.

No charge for server errors: Failed requests that return 5xx status codes are not billed.

Automatic retries only apply to queue-based requests. Direct synchronous requests return errors immediately without retry.
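Because direct synchronous requests are not retried for you, a caller that wants similar behavior has to add it on the client side. The following is a minimal sketch of one common approach (exponential backoff with full jitter), not fal's own retry algorithm, which the text above describes only as "intelligent backoff". `RequestFailed`, `call_with_retries`, and the delay values are hypothetical names and defaults chosen for illustration; `send` stands in for whatever function performs your synchronous request.

```python
import random
import time
from typing import Callable

# Status codes the section above lists as retried by the queue.
RETRYABLE_STATUS = {429, 503, 504}


class RequestFailed(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"request failed with status {status}")
        self.status = status


def call_with_retries(
    send: Callable[[], dict],
    max_retries: int = 10,          # mirrors the "up to 10 times" figure above
    base_delay: float = 0.5,        # assumed value, tune for your workload
    max_delay: float = 30.0,        # assumed value, tune for your workload
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send(), retrying retryable failures with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except (RequestFailed, ConnectionError) as exc:
            retryable = (
                isinstance(exc, ConnectionError)
                or exc.status in RETRYABLE_STATUS
            )
            # Give up on non-retryable errors or once retries are exhausted.
            if not retryable or attempt == max_retries:
                raise
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the wrapper can be exercised in tests without real waiting. Client errors such as 400 are raised immediately, since retrying them would not change the outcome.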

For per-request control over retries and timeouts, see the Queue page. It covers disabling retries with the `X-Fal-No-Retry` header, setting a start timeout with `X-Fal-Request-Timeout`, and using `client_timeout` to set a client-side deadline.

| Timeout | Enforced by | Effect on retries |
| --- | --- | --- |
| `start_timeout` / `X-Fal-Request-Timeout` | Server | Stops retries when total elapsed time is exceeded |
| `client_timeout` (Python SDK, `subscribe` only) | Client | Client stops polling; server and retries continue |
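The difference between the two timeouts is easiest to see from the client's side. The sketch below is an illustrative model of a client-side deadline, not the SDK's implementation: it stops polling once the deadline passes and does nothing to the request itself, which matches the table's note that the server and its retries continue. `poll_until_done`, `get_status`, and the `"COMPLETED"` status string are assumptions made for this example.

```python
import time
from typing import Callable


def poll_until_done(
    get_status: Callable[[], str],   # hypothetical: returns the request status
    client_timeout: float,           # client-side deadline in seconds
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the request completes or the client-side deadline passes."""
    deadline = clock() + client_timeout
    while True:
        status = get_status()
        if status == "COMPLETED":
            return status
        if clock() >= deadline:
            # Only the client gives up here. Nothing is sent to the server,
            # so the request (and any server-side retries) keeps running.
            raise TimeoutError("client deadline reached; request may still be running")
        sleep(poll_interval)
```

If the goal is to stop the server from retrying after a certain time, a client-side deadline like this is not enough; that is what the server-enforced start timeout is for.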

Model Fallbacks

For supported models, fal may automatically reroute requests to equivalent alternative endpoints when the primary endpoint is temporarily unavailable. Rerouting happens only after fal has retried the request up to five times; if those retries fail, the request is routed to a fallback endpoint. This mechanism improves overall reliability and reduces the likelihood of failed requests.

Fallbacks are enabled by default for all accounts. To disable them for your whole account, contact your account team. To disable them for a single request, pass the `x-app-fal-disable-fallbacks` header. For any questions, contact our sales team.
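The per-request controls mentioned on this page are all HTTP headers, so they can be collected in one small helper. This is a sketch under stated assumptions: the header names come from the text above, but the header values (`"1"`, `"true"`) and the unit of the timeout (seconds) are guesses made for illustration and should be checked against the Queue page before use. `build_reliability_headers` is a hypothetical helper, not part of any fal SDK.

```python
from typing import Dict, Optional


def build_reliability_headers(
    no_retry: bool = False,
    start_timeout_seconds: Optional[float] = None,
    disable_fallbacks: bool = False,
) -> Dict[str, str]:
    """Build per-request headers for retry, timeout, and fallback control."""
    headers: Dict[str, str] = {}
    if no_retry:
        # ASSUMPTION: the value "1" is a placeholder; only the name is documented here.
        headers["X-Fal-No-Retry"] = "1"
    if start_timeout_seconds is not None:
        if start_timeout_seconds <= 0:
            raise ValueError("start_timeout_seconds must be positive")
        # ASSUMPTION: the timeout is expressed in seconds.
        headers["X-Fal-Request-Timeout"] = str(start_timeout_seconds)
    if disable_fallbacks:
        # ASSUMPTION: the value "true" is a placeholder; only the name is documented here.
        headers["x-app-fal-disable-fallbacks"] = "true"
    return headers
```

A caller would merge the result into the headers of a queue request, for example `build_reliability_headers(start_timeout_seconds=30, disable_fallbacks=True)` for a request that should stop retrying after the start timeout and never be rerouted.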
