stream() method in the fal client SDKs.
Under the hood, streaming uses Server-Sent Events (SSE), a one-way protocol where the server pushes events to the client over a single HTTP connection. You define a streaming endpoint by returning a FastAPI StreamingResponse with SSE-formatted events from an @fal.endpoint("/stream") method. For bidirectional communication where clients send multiple inputs over a persistent connection, see Realtime Endpoints instead.
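To make the wire format concrete, here is a minimal sketch of how a client could split an SSE stream into JSON events. The fal client SDKs do this for you; the helper name iter_sse_data and the sample payloads are hypothetical and only illustrate the protocol, not the SDK's internals.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE event found in `lines`."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and comments are ignored here.


# Two events as they would appear on the wire (sample data, not real output).
wire = 'data: {"step": 5}\n\ndata: {"step": 10}\n\n'
events = list(iter_sse_data(wire.splitlines()))
```

Each event is one or more data: lines followed by a blank line, which is why a one-way stream of independent JSON messages fits over a single HTTP response.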
Streaming vs Realtime
Streaming and realtime endpoints serve different interaction patterns. Streaming is one-way (server to client) and suited for progressive output from a single request. Realtime is bidirectional (client and server) and suited for interactive applications with back-to-back requests over a persistent WebSocket connection.

| Feature | Streaming (SSE) | Realtime (WebSocket) |
|---|---|---|
| Direction | One-way (server to client) | Bidirectional |
| Connection | New connection per request | Persistent, reusable |
| Best for | Progressive output, previews | Interactive apps, back-to-back requests |
| Protocol | JSON over SSE | Binary msgpack |
Example: Streaming Intermediate Steps with SDXL
This example shows how to stream intermediate image previews during Stable Diffusion XL generation. It uses a TinyVAE for fast preview decoding and the pipeline’s callback system to capture progress at each step.
Example Details
This example uses madebyollin/taesdxl, a TinyVAE that decodes intermediate latents roughly 10x faster than the full VAE. The diffusers callback_on_step_end hook captures latents at each denoising step, but the callback only streams every 5 steps to balance responsiveness with overhead. A thread-safe queue passes events from the pipeline thread (which runs inference) to the streaming generator (which yields SSE events to the client).
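The queue hand-off described above can be sketched with the standard library alone. This is a simplified illustration, not the example's actual code: fake_pipeline stands in for the SDXL pipeline, the event fields are hypothetical, and the exact step on which a preview fires (here, every fifth completed step) is an assumption.

```python
import queue
import threading
from typing import Iterator

PREVIEW_EVERY = 5  # stream one preview per 5 steps, as described above
_DONE = object()   # sentinel that tells the generator the run has finished


def fake_pipeline(num_steps: int, events: queue.Queue) -> None:
    """Hypothetical stand-in for the thread that runs inference."""

    def on_step_end(step: int) -> None:
        # Plays the role of callback_on_step_end: throttle before enqueueing.
        if (step + 1) % PREVIEW_EVERY == 0:
            events.put({"step": step + 1, "total": num_steps})

    try:
        for step in range(num_steps):
            on_step_end(step)
    finally:
        # Always signal completion so the consumer never blocks forever.
        events.put(_DONE)


def stream_previews(num_steps: int) -> Iterator[dict]:
    """Generator side: drain the queue while the worker thread runs."""
    events: queue.Queue = queue.Queue()
    worker = threading.Thread(
        target=fake_pipeline, args=(num_steps, events), daemon=True
    )
    worker.start()
    while True:
        item = events.get()  # blocks until the worker enqueues something
        if item is _DONE:
            break
        yield item
    worker.join()


previews = list(stream_previews(20))
```

Putting the sentinel in a finally block means the generator terminates even if the worker raises, which matters when the generator is feeding an open HTTP response.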
Client-Side Usage
Endpoint Path Requirement
The fal_client.stream() (Python) and fal.stream() (JavaScript) functions automatically append /stream to your endpoint ID. This means your app must define a streaming endpoint at /stream using @fal.endpoint("/stream").
For example, calling fal_client.stream("your-username/your-app-name", ...) will connect to https://fal.run/your-username/your-app-name/stream.
Key Points
Your streaming endpoint must return a FastAPI StreamingResponse with media_type="text/event-stream". Each event is yielded as f"data: {json.dumps(payload)}\n\n" (note the double newline, which is part of the SSE spec). For images to display in the Playground, include both url and content_type in the event payload. Throttle your streaming to avoid sending every intermediate result, and use lower quality or resolution for previews to save bandwidth.
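As a rough sketch of the event formatting described above, the helper below builds one SSE event from a payload. The function names, the example URL, the step field, and the flat payload layout are all assumptions for illustration; adapt the payload shape to your app's output schema.

```python
import json
from typing import Iterable, Iterator


def sse_event(payload: dict) -> str:
    """Serialize one payload as an SSE event, ending with a blank line."""
    return f"data: {json.dumps(payload)}\n\n"


def sse_stream(payloads: Iterable[dict]) -> Iterator[str]:
    """Generator suitable for passing to a streaming response body."""
    for payload in payloads:
        yield sse_event(payload)


# Hypothetical preview payload carrying both url and content_type.
example = sse_event(
    {
        "url": "https://example.com/preview.jpg",
        "content_type": "image/jpeg",
        "step": 5,
    }
)
```

In the endpoint itself, a generator like sse_stream would be the body passed to StreamingResponse with media_type="text/event-stream".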