run sends a direct HTTP request to fal.run and returns the result immediately, with no queue involved. subscribe uses the queue under the hood but handles polling automatically, so it feels just as simple while giving you automatic retries and reliability.
Both are good starting points for scripts, prototyping, and any situation where you just want the output without managing the request lifecycle yourself. Use run when you want the fastest possible path with no overhead, and subscribe when you want queue-backed reliability. For production workloads that need parallel processing or webhooks, use async inference instead.
Using run (Direct)
run sends a direct HTTP request to fal.run. There is no queue, no retries, and no status polling. The response comes back in the same HTTP connection.
Using subscribe (Queue-backed)
subscribe submits to the queue and polls until the result is ready. You get automatic retries, timeout handling, and scaling, with the same simple blocking interface as run.
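The submit-then-poll pattern described above can be sketched with the standard library alone. This is not the SDK's implementation: `FakeQueue`, `subscribe_sketch`, and the `poll_interval` argument are hypothetical names invented for this illustration, and the in-memory queue stands in for the real service.

```python
import itertools
import time


class FakeQueue:
    """In-memory stand-in for a request queue (hypothetical, illustration only)."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done
        self._ids = itertools.count(1)
        self._arguments = None

    def submit(self, arguments):
        # Accept the request and hand back an identifier right away.
        self._arguments = arguments
        return f"req-{next(self._ids)}"

    def status(self, request_id):
        # Report IN_PROGRESS a fixed number of times, then COMPLETED.
        if self._remaining > 0:
            self._remaining -= 1
            return "IN_PROGRESS"
        return "COMPLETED"

    def result(self, request_id):
        return {"request_id": request_id, "echo": self._arguments}


def subscribe_sketch(queue, arguments, poll_interval=0.0):
    """Submit, poll until the request completes, then return the result."""
    request_id = queue.submit(arguments)
    while queue.status(request_id) != "COMPLETED":
        time.sleep(poll_interval)
    return queue.result(request_id)


result = subscribe_sketch(FakeQueue(polls_until_done=3), {"prompt": "a cat"})
```

The caller sees a single blocking call, while the submit, poll, and fetch steps stay hidden inside the helper. That is the sense in which subscribe keeps the same interface as run.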
With Progress Updates
Track progress while waiting for results:
run() Parameters
path
Endpoint path appended to the model ID. Leave empty for the default root endpoint. See path reference.
timeout
Client-side HTTP timeout in seconds. This is a standard HTTP connection timeout — how long the client library waits for the server to send back a complete response. If the connection is idle for longer than this, the client raises a timeout error. It does not affect the server.
start_timeout
Server-side deadline covering total elapsed time until a runner successfully starts processing. Returns 504 if exceeded. See start_timeout reference for full details on how the clock behaves across retries.
hint
Routing hint for session affinity — routes requests to the same runner. See hint reference.
headers
Additional HTTP headers for platform-level controls like disabling retries or payload storage. See headers reference.
subscribe() Parameters
subscribe() accepts all the parameters above except timeout, plus the queue-specific options below. For shared parameters (path, start_timeout, hint, headers, priority), see the submit() reference.
client_timeout
Client-side total deadline in seconds (Python SDK only). This limits the total time subscribe() blocks, including queue wait and processing. When exceeded, the client stops polling and raises a FalClientTimeoutError. The request may still be processing on the server after the client gives up.
If you set client_timeout without setting start_timeout, the SDK automatically sets start_timeout = client_timeout so the server also respects your deadline. If you set start_timeout larger than client_timeout, the SDK emits a warning.
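The interaction between the two deadlines can be sketched as a small helper. This only mirrors the rules stated above. `resolve_timeouts` is a hypothetical function, not part of the SDK, and using Python's `warnings` module is an assumption about how the warning is surfaced.

```python
import warnings


def resolve_timeouts(client_timeout=None, start_timeout=None):
    """Return (client_timeout, start_timeout) after applying the documented rules.

    Hypothetical helper for illustration; the SDK's internals may differ.
    """
    if client_timeout is not None:
        if start_timeout is None:
            # Only client_timeout given: the server deadline follows it.
            start_timeout = client_timeout
        elif start_timeout > client_timeout:
            # The client would stop polling before the server deadline expires.
            warnings.warn(
                "start_timeout is larger than client_timeout",
                stacklevel=2,
            )
    return client_timeout, start_timeout
```

For example, `resolve_timeouts(client_timeout=30)` yields `(30, 30)`, while `resolve_timeouts(client_timeout=10, start_timeout=60)` returns the values unchanged and emits a warning.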
with_logs / logs
When enabled, status updates include runner logs (print statements from your model’s code). Without this, the logs field in progress updates is empty. Useful for debugging or showing generation progress to users. The parameter is named with_logs in Python and logs in JavaScript.
on_enqueue
A callback function that fires once, immediately after the request enters the queue. It receives the request_id as its argument. Use this to store the request ID for later reference — for example, to display it in a UI or save it to a database so you can retrieve the result later even if the client disconnects.
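A minimal sketch of an enqueue callback that records the request ID is shown here. `fake_subscribe` is a hypothetical stand-in that only imitates the callback contract described above, and the `"req-123"` identifier is a placeholder rather than a real request ID format.

```python
saved_request_ids = []


def remember_request(request_id):
    """Store the request ID so the result can be looked up later."""
    saved_request_ids.append(request_id)


def fake_subscribe(arguments, on_enqueue=None):
    """Hypothetical stand-in: calls on_enqueue once, then returns a result."""
    request_id = "req-123"  # placeholder identifier, for illustration only
    if on_enqueue is not None:
        on_enqueue(request_id)
    return {"request_id": request_id, "output": arguments}


result = fake_subscribe({"prompt": "a cat"}, on_enqueue=remember_request)
```

In a real application the callback would more likely write to a database or update UI state than append to a list.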
on_queue_update / onQueueUpdate
A callback function that fires each time the client polls for status. It receives a status object with a status field indicating the current state.
In Python, the status is one of three types: Queued (has position), InProgress (has logs), or Completed (has logs and metrics). In JavaScript, check the status string field: "IN_QUEUE", "IN_PROGRESS", or "COMPLETED".
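A queue-update callback in Python typically branches on the status type. The sketch below defines its own stand-in classes so that it runs without the SDK. The class names follow the text above, but their exact fields and constructors in the real SDK are an assumption, and `describe_update` is a hypothetical helper.

```python
from dataclasses import dataclass, field


# Stand-in status types for illustration; the real SDK classes may differ.
@dataclass
class Queued:
    position: int


@dataclass
class InProgress:
    logs: list = field(default_factory=list)


@dataclass
class Completed:
    logs: list = field(default_factory=list)
    metrics: dict = field(default_factory=dict)


def describe_update(update):
    """Turn a status object into a short human-readable message."""
    if isinstance(update, Queued):
        return f"queued at position {update.position}"
    if isinstance(update, InProgress):
        return f"in progress, {len(update.logs)} log entries"
    if isinstance(update, Completed):
        return f"completed, metrics: {update.metrics}"
    return "unknown status"
```

A function shaped like `describe_update` could be passed as the queue-update callback, with its return value printed or forwarded to a progress display.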
When to Use
Synchronous methods are best for simple scripts, prototyping, and any situation where you just want the result without managing the request lifecycle. Use run when you want the fastest path with no overhead. Use subscribe when you want queue-backed reliability with automatic retries.
For production workloads that need parallel request processing, webhooks, or full visibility into queue status, use async inference instead.
Error Responses
When a request fails due to infrastructure issues (timeouts, runner errors, etc.), the response includes a JSON body with detail and error_type fields, along with an X-Fal-Error-Type header. For the full list of error types and their meanings, see Request Error Types.
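As a rough sketch, a client could pull these fields out of a failed response as follows. `parse_error_response` is a hypothetical helper, preferring the header over the body is a design choice made here rather than documented behaviour, and the error type string in the usage note is a placeholder, not a real value.

```python
import json


def parse_error_response(headers, body):
    """Extract error_type and detail from an infrastructure error response.

    Hypothetical helper; `headers` is a plain dict and `body` a JSON string.
    """
    payload = json.loads(body) if body else {}
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    error_type = lowered.get("x-fal-error-type") or payload.get("error_type")
    return {"error_type": error_type, "detail": payload.get("detail")}
```

Calling it with `{"X-Fal-Error-Type": "example_error_type"}` and a body of `'{"detail": "something failed", "error_type": "example_error_type"}'` returns both fields in one dictionary, which can then be matched against the documented error types.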
For model-level validation errors (invalid inputs, content policy violations, etc.), see Model Errors.