your-username/your-app-name, and all of the inference methods work identically whether you are calling a marketplace model or your own deployed app.
This page shows quick examples of each calling pattern. subscribe is the simplest option since it handles polling for you and blocks until the result is ready. For production workloads where you need to manage many requests in parallel, submit gives you full control over the request lifecycle. For full details on parameters, response shapes, status polling, and cancellation, see the Inference documentation.
Subscribe
Submits a request to the queue, polls automatically, and returns the result when ready. This is the simplest calling pattern since it handles the request lifecycle for you. Optionally receive progress updates via callbacks.
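As a minimal sketch of the subscribe pattern, assuming the Python `fal_client` package is installed and `FAL_KEY` is set in the environment: the helper below calls `subscribe` with log streaming enabled and forwards each log message to a callback. The helper name `run_and_wait` and its `client` and `log` parameters are illustrative and not part of the SDK.

```python
from typing import Any, Callable, Optional


def run_and_wait(
    app_id: str,
    arguments: dict,
    client: Optional[Any] = None,
    log: Callable[[str], None] = print,
) -> Any:
    """Block until the app returns a result, forwarding log lines to `log`."""
    if client is None:
        # Imported lazily so the helper can also be driven by a stand-in client.
        import fal_client as client

    def on_queue_update(update: Any) -> None:
        # Only updates that carry logs are forwarded; others are ignored here.
        for entry in getattr(update, "logs", None) or []:
            log(entry["message"])

    return client.subscribe(
        app_id,
        arguments=arguments,
        with_logs=True,
        on_queue_update=on_queue_update,
    )
```

Calling `run_and_wait("your-username/your-app-name", {"prompt": "a cat"})` would block until the result is ready; the `prompt` field is a placeholder for whatever input your app defines.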
Synchronous Inference
Full details on subscribe, progress updates, and timeout handling
Queue (Async)
Use the queue for fire-and-forget workflows: submit a request, get a request ID, and retrieve the result later by polling or via a webhook.
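The submit-then-poll flow can be sketched as follows, assuming the Python `fal_client` package. `submit`, `status`, `result`, and the `Completed` status class come from that client; the two helper functions and the `client` parameter are hypothetical names introduced for illustration.

```python
from typing import Any, Optional


def _default_client() -> Any:
    # Imported lazily; requires `pip install fal-client` and FAL_KEY to be set.
    import fal_client
    return fal_client


def submit_request(app_id: str, arguments: dict, client: Optional[Any] = None) -> str:
    """Enqueue a request and return its ID without waiting for the result."""
    client = client or _default_client()
    handle = client.submit(app_id, arguments=arguments)
    return handle.request_id


def fetch_result_if_ready(
    app_id: str, request_id: str, client: Optional[Any] = None
) -> Optional[Any]:
    """Return the result if the request has completed, otherwise None."""
    client = client or _default_client()
    status = client.status(app_id, request_id, with_logs=False)
    if isinstance(status, client.Completed):
        return client.result(app_id, request_id)
    return None
```

Storing the request ID (for example in a database row) lets a separate worker call `fetch_result_if_ready` later, which is what makes this pattern suitable for managing many requests in parallel.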
Asynchronous Inference
Full details on the queue system, status polling, and REST API reference
Streaming
Use streaming for apps that produce progressive output via Server-Sent Events (SSE). Your app must define a streaming endpoint at /stream using @fal.endpoint("/stream").
Building Streaming Endpoints
How to implement SSE streaming in your fal.App
Streaming Inference
Client-side streaming details and REST API
Real-Time (WebSocket)
Use real-time for bidirectional, low-latency communication over a persistent connection. Your app must define a @fal.realtime("/realtime") endpoint.
Building Realtime Endpoints
How to implement WebSocket endpoints in your fal.App
Real-Time Inference
Client-side real-time details and proxy setup
Webhooks
Submit a request and receive the result at a URL you specify, instead of polling.
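A minimal sketch of submitting with a webhook, assuming the Python `fal_client` package, whose `submit` function accepts a `webhook_url` argument. The helper name `submit_with_webhook`, the `client` parameter, and the example URL are placeholders.

```python
from typing import Any, Optional


def submit_with_webhook(
    app_id: str,
    arguments: dict,
    webhook_url: str,
    client: Optional[Any] = None,
) -> str:
    """Enqueue a request whose result is delivered to `webhook_url`."""
    if client is None:
        # Imported lazily so the helper can also be driven by a stand-in client.
        import fal_client as client

    handle = client.submit(app_id, arguments=arguments, webhook_url=webhook_url)
    # Keep the request ID so the incoming webhook call can be matched to it.
    return handle.request_id
```

For example, `submit_with_webhook("your-username/your-app-name", {"prompt": "a cat"}, "https://example.com/fal-webhook")` returns immediately with the request ID, and no polling is needed afterwards.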
When the request completes, fal sends a POST request to your webhook URL with the result.
Webhooks API
Full details on webhook payloads, retries, verification, and IP allowlisting
Passing Headers
You can pass custom headers with any calling method to control platform behavior.
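The sketch below shows one way this could look from Python. It assumes that your installed `fal_client` version accepts a `headers` keyword on `subscribe`; verify this against the Inference documentation before relying on it. The helper name `subscribe_with_headers` and the `client` parameter are illustrative.

```python
from typing import Any, Mapping, Optional


def subscribe_with_headers(
    app_id: str,
    arguments: dict,
    headers: Mapping[str, str],
    client: Optional[Any] = None,
) -> Any:
    """Call the app and attach custom request headers to the call."""
    if client is None:
        # Imported lazily so the helper can also be driven by a stand-in client.
        import fal_client as client

    # Assumption: the client forwards `headers` as HTTP headers on the request.
    return client.subscribe(app_id, arguments=arguments, headers=dict(headers))
```

A call such as `subscribe_with_headers("your-username/your-app-name", {"prompt": "a cat"}, {"X-Fal-Runner-Hint": "session-1"})` would attach a runner hint; the header name and value here are used only as an illustration, so check the supported header list for your use case.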
Next Steps
Inference Documentation
Full details on all calling methods, parameters, status polling, and the request handle
Async Inference (Queue)
Submit, status, result, cancel, webhooks, and streaming status updates
Client Setup
Install and configure the fal client SDK
Handle Inputs & Outputs
Define the input/output schema for your endpoints