Queue-Based Processing
The queue system handles traffic surges gracefully and provides request tracking. When you submit a request, it enters a managed queue that ensures reliable processing even during peak demand.
Automatic Retries
When using the queue, fal automatically retries requests that fail due to:
- Server errors (503): The model endpoint was temporarily unavailable
- Timeouts (504): The request took too long due to transient issues
- Connection errors: Network issues within fal infrastructure
- Rate limits (429): The request waits and retries automatically when you temporarily exceed your concurrent request limit
Automatic retries only apply to queue-based requests. Direct synchronous requests return errors immediately without retry.
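Because direct synchronous requests are not retried for you, a client that calls an endpoint synchronously may want its own retry loop. The following is a minimal sketch under stated assumptions, not fal's implementation: `is_retryable`, `call_with_retries`, the `send` callable, and the attempt and delay defaults are all hypothetical names and values chosen for illustration. It reuses the status codes from the list above and treats a missing response as a client-side connection error.

```python
import time
from typing import Callable, Optional

# Status codes the list above names as retried by the queue.
RETRYABLE_STATUSES = {429, 503, 504}


def is_retryable(status: Optional[int]) -> bool:
    """Return True for the failure classes listed above.

    `status` is None when no HTTP response arrived (a connection error).
    """
    return status is None or status in RETRYABLE_STATUSES


def call_with_retries(
    send: Callable[[], Optional[int]],
    max_attempts: int = 3,    # hypothetical default, not a fal setting
    base_delay: float = 0.5,  # hypothetical default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Call `send` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical callable that performs one synchronous request
    and returns the HTTP status code, or None on a connection error.
    """
    status = send()
    for attempt in range(1, max_attempts):
        if not is_retryable(status):
            break
        sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
        status = send()
    return status
```

The `sleep` parameter exists only so the backoff can be replaced in tests. For queue-based requests none of this is needed, since the queue performs the retries itself.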
Model Fallbacks
For supported models, fal may automatically reroute requests to equivalent alternative endpoints if the primary endpoint is temporarily unavailable. Rerouting happens only after fal has retried the request up to five times and those retries have failed; the request is then routed to a fallback endpoint. This mechanism improves overall reliability and reduces the likelihood of failed requests. Fallbacks are enabled by default for all accounts. To disable fallbacks for your whole account, let your account team know. To disable them for a single request, pass the x-fal-disable-fallbacks header. For any questions, contact our sales team.
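As a rough illustration of the per-request opt-out, the sketch below builds (but does not send) a queue request that carries the x-fal-disable-fallbacks header. Only the header name comes from the text above. The header value "true", the queue URL shape, the "Key" authorization scheme, the model ID, and the helper name `build_queue_request` are assumptions made for this example and should be checked against the API reference.

```python
import json
import urllib.request
from typing import Any, Dict


def build_queue_request(
    model_id: str,
    payload: Dict[str, Any],
    api_key: str,
    disable_fallbacks: bool = False,
) -> urllib.request.Request:
    """Build a POST request for the queue without sending it.

    Hypothetical helper: the URL pattern and auth scheme are assumptions.
    """
    headers = {
        "Authorization": f"Key {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    if disable_fallbacks:
        # Header name is from the documentation; the value is an assumption.
        headers["x-fal-disable-fallbacks"] = "true"
    return urllib.request.Request(
        url=f"https://queue.fal.run/{model_id}",  # assumed URL pattern
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# "example/model" is a placeholder model ID, not a real endpoint.
request = build_queue_request(
    "example/model", {"prompt": "a red bicycle"}, "YOUR_KEY", disable_fallbacks=True
)
```

Passing the request to `urllib.request.urlopen` would submit it. With `disable_fallbacks=False` the header is omitted and the account-level default applies.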