HTTP headers that control request behavior across all inference methods on fal.
When you call a model or your own deployed app on fal, you can pass platform-level HTTP headers that control how the request is handled. These headers are separate from the model’s input arguments (like prompt or image_size) and from SDK method parameters (like start_timeout or client_timeout). They apply at the infrastructure level, controlling retries, payload storage, media expiration, and routing.

Some of these headers have dedicated SDK parameters that set them automatically. For example, passing start_timeout=30 in the SDK sets X-Fal-Request-Timeout: 30 under the hood. Others, like X-Fal-Store-IO, can only be set via the headers dict. This page documents all platform headers in one place. For headers that have SDK parameters, the corresponding method pages are linked.
Server-side time-to-start deadline in seconds. Despite the header name, this does not limit total request time. The clock starts when the request is submitted and covers queue wait, runner acquisition, and failed retry attempts. Once a runner successfully begins processing, the timeout stops and inference can run as long as it needs. If the deadline is reached before any runner starts processing, the server returns 504 Gateway Timeout with X-Fal-Request-Timeout-Type: user. To limit total client-side wait time (including processing), use client_timeout on subscribe() instead.
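A client that talks to the API over raw HTTP may want to tell this start-deadline timeout apart from any other 504. The sketch below checks both signals described above. `is_start_timeout` is a hypothetical helper (not part of any fal SDK), and it assumes you already have the response status code and headers in hand.

```python
from typing import Mapping


def is_start_timeout(status_code: int, headers: Mapping[str, str]) -> bool:
    """Return True if a response looks like a start-deadline timeout.

    Based on the description above: status 504 together with
    X-Fal-Request-Timeout-Type: user.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    timeout_type = normalized.get("x-fal-request-timeout-type", "")
    return status_code == 504 and timeout_type.strip() == "user"
```

A caller could use this to decide between raising the deadline and surfacing a generic gateway error.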
Routing hint that tells fal to try to route the request to a specific runner. Useful for session affinity — for example, keeping requests pinned to a runner that already has a LoRA adapter or conversation state loaded in memory. If the hinted runner is unavailable, fal routes to any available runner.
Queue priority for the request. Priority applies to the per-endpoint queue — every request to the same endpoint shares one queue, regardless of who sent it. A low-priority request sits behind all normal-priority requests. This means setting "low" on a shared model API deprioritizes your request relative to all other users of that model.
Prevent fal from storing request payloads (JSON inputs and outputs). Payloads are stored for 30 days by default and power the request history in your dashboard.
Header: X-Fal-Store-IO
Default: "1" (stored for 30 days)
Values: "0" to disable storage
import fal_client

# Opt out of payload storage for this request only.
result = fal_client.subscribe(
    "fal-ai/nano-banana-2",
    arguments={"prompt": "a sunset"},
    headers={"X-Fal-Store-IO": "0"},
)
This only prevents storage of the JSON payloads. CDN files generated during processing are still accessible (subject to media expiration settings).
Disable automatic retries for this request. By default, queue-based requests are retried for up to 10 total attempts on server errors (503, 504, connection errors).
Header: X-Fal-No-Retry
Default: Retries enabled
Values: "1", "true", "yes" to disable
import fal_client

# Disable automatic retries for this request only.
result = fal_client.subscribe(
    "fal-ai/nano-banana-2",
    arguments={"prompt": "a sunset"},
    headers={"X-Fal-No-Retry": "1"},
)
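As a rough mental model of the retry rule described above (not fal's actual implementation), the decision to retry could be written as follows. `should_retry` and its parameters are hypothetical names, and the exact-match comparison of header values is an assumption.

```python
from typing import Optional

MAX_ATTEMPTS = 10                      # total attempts, per the text above
RETRYABLE_STATUSES = {503, 504}        # server errors named above
DISABLE_VALUES = {"1", "true", "yes"}  # X-Fal-No-Retry values that disable retries


def should_retry(attempts_made: int,
                 status_code: Optional[int],
                 no_retry_header: Optional[str] = None) -> bool:
    """Decide whether another attempt is allowed.

    status_code=None stands in for a connection error.
    """
    # An explicit opt-out wins over everything else.
    if no_retry_header is not None and no_retry_header in DISABLE_VALUES:
        return False
    # The limit counts total attempts, including the first one.
    if attempts_made >= MAX_ATTEMPTS:
        return False
    return status_code is None or status_code in RETRYABLE_STATUSES
```

Under this model, a request sent with X-Fal-No-Retry: 1 fails on the first 503 instead of being attempted again.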
Disable automatic model fallbacks for this request. By default, fal may reroute requests to equivalent alternative endpoints if the primary is unavailable.
Header: x-app-fal-disable-fallback
Default: Fallbacks enabled
import fal_client

# Disable automatic model fallbacks for this request only.
result = fal_client.subscribe(
    "fal-ai/nano-banana-2",
    arguments={"prompt": "a sunset"},
    headers={"x-app-fal-disable-fallback": "true"},
)
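The three opt-out headers above (storage, retries, fallbacks) can be combined in one headers dict. The sketch below is one way to build that dict from boolean flags. `platform_headers` and `generate` are hypothetical helpers, not SDK functions, and the call inside `generate` assumes the fal_client package is installed and credentials are configured.

```python
from typing import Dict


def platform_headers(store_io: bool = True,
                     retry: bool = True,
                     fallback: bool = True) -> Dict[str, str]:
    """Build a headers dict containing only the non-default opt-outs."""
    headers: Dict[str, str] = {}
    if not store_io:
        headers["X-Fal-Store-IO"] = "0"
    if not retry:
        headers["X-Fal-No-Retry"] = "1"
    if not fallback:
        headers["x-app-fal-disable-fallback"] = "true"
    return headers


def generate(prompt: str):
    # Imported lazily so platform_headers() works without the SDK installed.
    import fal_client

    return fal_client.subscribe(
        "fal-ai/nano-banana-2",
        arguments={"prompt": prompt},
        headers=platform_headers(store_io=False, retry=False),
    )
```

Emitting only the non-default headers keeps requests identical to plain calls when no opt-out is requested.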
Reject the request with 429 if the endpoint’s queue already has more than this many requests waiting (across all callers). Useful for latency-sensitive applications that prefer to fail fast rather than wait in a long queue.
Query param: fal_max_queue_length
Default: No limit
Type: Integer
This parameter is passed as a query parameter on the URL, not as a header. The SDKs do not currently expose it as a named parameter; use the raw URL approach or pass it via headers.
cURL
curl -X POST "https://queue.fal.run/fal-ai/nano-banana-2?fal_max_queue_length=10" \
  -H "Authorization: Key $FAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a sunset"}'
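The same request can be sketched in Python with only the standard library, mirroring the cURL call above. `queue_url` and `submit_or_fail_fast` are hypothetical helper names. The sketch assumes FAL_KEY is set in the environment and that a successful response body is JSON.

```python
import json
import os
import urllib.error
import urllib.request
from urllib.parse import urlencode

QUEUE_BASE = "https://queue.fal.run"


def queue_url(endpoint: str, max_queue_length: int) -> str:
    """Build the queue URL with fal_max_queue_length as a query parameter."""
    query = urlencode({"fal_max_queue_length": max_queue_length})
    return f"{QUEUE_BASE}/{endpoint}?{query}"


def submit_or_fail_fast(endpoint: str, arguments: dict, max_queue_length: int):
    """Submit a request; return None if rejected with 429 (queue too long)."""
    request = urllib.request.Request(
        queue_url(endpoint, max_queue_length),
        data=json.dumps(arguments).encode("utf-8"),
        headers={
            "Authorization": f"Key {os.environ['FAL_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)  # assumes a JSON response body
    except urllib.error.HTTPError as err:
        if err.code == 429:
            return None  # queue too long; the caller decides what to do next
        raise
```

Returning None on 429 lets a latency-sensitive caller fall back to another path immediately instead of treating the rejection as an unexpected error.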