Streaming vs Realtime: Streaming (SSE) is one-way (server → client) and ideal for progressive output. For bidirectional communication where clients send multiple requests over a persistent connection, see Realtime Endpoints.
When to Use Streaming
| Feature | Streaming (SSE) | Realtime (WebSocket) |
|---|---|---|
| Direction | One-way (server → client) | Bidirectional (client ↔ server) |
| Connection | New connection per request | Persistent, reusable |
| Latency | Higher (new connection each time) | Lower (connection reuse) |
| Best for | Progressive output, previews | Interactive apps, back-to-back requests |
| Protocol | JSON over SSE | Binary msgpack |
Use streaming when:
- You want to show progressive output (e.g., image generation previews, video updates)
- Clients send a single request and receive multiple updates
- You don’t need bidirectional communication

Use realtime when:
- Users send multiple requests in quick succession (e.g., interactive image editing)
- You need the lowest possible latency between requests
- You’re building interactive, back-and-forth experiences
Example: Streaming Intermediate Steps with SDXL
This example shows how to stream intermediate image previews during Stable Diffusion XL generation. It uses a TinyVAE for fast preview decoding and the pipeline’s callback system to capture progress at each step.
Example Details
This SDXL example demonstrates several techniques:
- TinyVAE for previews: Uses `madebyollin/taesdxl`, which is ~10x faster than the full VAE for decoding intermediate latents
- Pipeline callback: Uses diffusers’ `callback_on_step_end` to capture progress at each denoising step
- Throttled streaming: Only streams every 5 steps to balance responsiveness with overhead
- Thread-safe queue: Uses a queue to safely pass events from the pipeline thread to the streaming generator
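The throttling and queue hand-off described above can be sketched with the standard library alone. This is a minimal illustration, not the example's actual code: `run_pipeline`, `stream_progress`, and the event fields are hypothetical names, and a plain loop stands in for the diffusers pipeline and its step-end callback.

```python
import queue
import threading

PREVIEW_EVERY = 5  # stream one event per 5 denoising steps
_DONE = object()   # sentinel marking the end of the run


def run_pipeline(num_steps, events):
    """Stand-in for the diffusion pipeline; runs in a worker thread."""

    def on_step_end(step):
        # Plays the role of a step-end callback: throttle, then enqueue.
        if (step + 1) % PREVIEW_EVERY == 0:
            events.put({"step": step + 1, "total": num_steps})

    try:
        for step in range(num_steps):
            on_step_end(step)  # a real pipeline would denoise before this call
    finally:
        events.put(_DONE)  # always unblock the consumer, even on error


def stream_progress(num_steps):
    """Generator that yields events as the worker thread produces them."""
    events = queue.Queue()  # queue.Queue is safe to share between threads
    worker = threading.Thread(target=run_pipeline, args=(num_steps, events))
    worker.start()
    while True:
        event = events.get()  # blocks until the worker enqueues something
        if event is _DONE:
            break
        yield event
    worker.join()
```

In a real endpoint, the callback would also decode the current latents with the TinyVAE and attach the preview image to each event before enqueueing it.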
Client-Side Usage
Endpoint Path Requirement: The `fal_client.stream()` (Python) and `fal.stream()` (JavaScript) functions automatically append `/stream` to your endpoint ID. This means your app must define a streaming endpoint at `/stream` using `@fal.endpoint("/stream")`. For example, calling `fal_client.stream("your-username/your-app-name", ...)` will connect to `https://fal.run/your-username/your-app-name/stream`.
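As a rough sketch of the Python client side, the snippet below shows the path rule and a loop over streamed events. It assumes that `fal_client.stream()` accepts the endpoint ID plus an `arguments` dictionary and yields one decoded event per update; `stream_url`, `collect_stream_events`, and the `stream_fn` parameter are hypothetical helpers introduced here so the loop can be exercised without a network call.

```python
def stream_url(app_id):
    """URL a streaming call connects to, per the path rule above."""
    return f"https://fal.run/{app_id}/stream"


def collect_stream_events(app_id, arguments, stream_fn=None):
    """Run one streaming request and return every event it produced."""
    if stream_fn is None:
        # Imported lazily so the sketch can be tried offline with a fake.
        import fal_client
        stream_fn = fal_client.stream  # assumed to yield decoded events
    events = []
    for event in stream_fn(app_id, arguments=arguments):
        events.append(event)  # a real client would render the preview here
    return events
```

Calling `collect_stream_events("your-username/your-app-name", {"prompt": "..."})` would then gather every update from the app's `/stream` endpoint, provided the app defines one.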
Key Points
- Use `StreamingResponse` from FastAPI with `media_type="text/event-stream"`
- Format events as SSE: `yield f"data: {json.dumps(payload)}\n\n"` (note the double newline)
- Include `content_type`: Use `{"image": {"url": "data:...", "content_type": "image/jpeg"}}` for playground display
- Throttle streaming: Don’t stream every intermediate result; balance responsiveness with overhead
- Use lower quality for previews: Save bandwidth with lower resolution or compression for intermediate results
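The SSE framing and image payload shape from the points above could look like the following. This is a small sketch using only the standard library: `preview_payload` and `format_sse` are hypothetical helper names, and the `step` field is an illustrative assumption rather than a required part of the payload.

```python
import base64
import json


def preview_payload(jpeg_bytes, step):
    """Wrap JPEG bytes in the image shape used for playground display."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "step": step,  # assumed field, included for illustration only
        "image": {
            "url": f"data:image/jpeg;base64,{encoded}",
            "content_type": "image/jpeg",
        },
    }


def format_sse(payload):
    """Serialize one event; the blank line terminates the SSE message."""
    return f"data: {json.dumps(payload)}\n\n"
```

A streaming generator would yield `format_sse(preview_payload(...))` for each throttled preview and hand that generator to `StreamingResponse` with `media_type="text/event-stream"`.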