stream() sends a direct HTTP request to fal.run using Server-Sent Events (SSE). The SDK wraps the SSE connection into an iterator, so each event arrives as a parsed object. Streaming does not use the queue, so there are no automatic retries.
Streaming is only supported by models that expose a /stream endpoint. Check the model's API page to confirm support before calling stream().

Using stream()
Each SSE event arrives as a text line prefixed with `data: `. The SDKs parse these lines into objects automatically, so a model might stream several progress updates followed by the final result.
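To make the wire format concrete, here is a minimal stdlib-only sketch of how `data: `-prefixed SSE lines map to parsed objects — roughly what the SDK's iterator does for you. The event payloads and their fields are hypothetical:

```python
import json

def parse_sse_lines(lines):
    """Yield one parsed object per `data: ` line, skipping blanks and comments."""
    for line in lines:
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

# Hypothetical event stream: a progress update, then the final result.
raw = [
    'data: {"status": "IN_PROGRESS", "percent": 50}',
    "",  # SSE events are separated by blank lines
    'data: {"status": "COMPLETED", "output": "done"}',
]

events = list(parse_sse_lines(raw))
print(events[-1]["output"])  # the final event carries the result
```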
stream() Parameters
path
Endpoint path appended to the model ID. Defaults to "/stream" for streaming endpoints. See path reference.
timeout
Client-side HTTP timeout in seconds — how long the client waits for the SSE connection. See timeout reference.
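As an illustration of how the path parameter combines with the model ID — assuming the direct endpoint is simply the model ID plus path on fal.run, as described above — a hypothetical helper might build the request target like this (the model ID is a placeholder; timeout would be passed separately to the HTTP client):

```python
def build_stream_url(model_id: str, path: str = "/stream") -> str:
    # Hypothetical helper: the path is appended to the model ID on fal.run.
    return f"https://fal.run/{model_id}{path}"

print(build_stream_url("fal-ai/any-llm"))
# https://fal.run/fal-ai/any-llm/stream
```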
stream() does not support hint, priority, start_timeout, client_timeout, or headers because it bypasses the queue and sends a direct HTTP request. There are no retries. If you need queue-backed reliability, use submit() and poll for status with with_logs=True to track progress.

When to Use Streaming
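The queue-backed alternative above can be sketched as a simple poll loop. This is a stdlib-only illustration with a stubbed status callable standing in for the SDK's status call — the real call signature and the status/log shapes are assumptions:

```python
import time

def poll_until_done(get_status, interval=0.0):
    """Poll a status callable until it reports completion, printing any logs."""
    while True:
        status = get_status(with_logs=True)
        for line in status.get("logs", []):
            print(line)
        if status["status"] == "COMPLETED":
            return status
        time.sleep(interval)

# Stubbed status source simulating a queued request's lifecycle.
_states = iter([
    {"status": "IN_QUEUE", "logs": []},
    {"status": "IN_PROGRESS", "logs": ["step 1/2"]},
    {"status": "COMPLETED", "logs": ["step 2/2"], "output": "done"},
])

def fake_status(with_logs=False):
    return next(_states)

final = poll_until_done(fake_status)
print(final["output"])  # done
```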
Streaming is best for LLMs, chat models, showing real-time progress to users, and reducing perceived latency in interactive applications. It is not needed for models that return a single result with no intermediate output, or for backend-to-backend integrations where you only need the final response. In those cases, run() or subscribe() is simpler.