Once your app is deployed, you can call it using the same client SDKs and patterns used for any model on fal. Your app’s endpoint ID is your-username/your-app-name, and all of the inference methods work identically whether you are calling a marketplace model or your own deployed app.

This page shows quick examples of each calling pattern. subscribe is the simplest option since it handles polling for you and blocks until the result is ready. For production workloads where you need to manage many requests in parallel, submit gives you full control over the request lifecycle. For full details on parameters, response shapes, status polling, and cancellation, see the Inference documentation.

Subscribe

Submits a request to the queue, polls automatically, and returns the result when ready. This is the simplest calling pattern since it handles the request lifecycle for you. Optionally receive progress updates via callbacks.
import fal_client

result = fal_client.subscribe("your-username/your-app-name", arguments={
    "prompt": "a sunset over mountains"
})
print(result)
With progress updates:
import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "your-username/your-app-name",
    arguments={"prompt": "a sunset over mountains"},
    with_logs=True,
    on_queue_update=on_queue_update,
)
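Because subscribe blocks until the request finishes, transient failures (a dropped connection, a timeout) surface as exceptions in the caller. A small retry wrapper can harden that path; this is a generic sketch, and call_with_retry is an illustrative helper, not part of fal_client:

```python
import time

def call_with_retry(fn, attempts=3, delay=1.0,
                    retry_on=(ConnectionError, TimeoutError)):
    """Call fn(), retrying up to `attempts` times on the given exception types."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
```

You would then wrap the call itself, e.g. `call_with_retry(lambda: fal_client.subscribe("your-username/your-app-name", arguments={...}))`.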

Synchronous Inference

Full details on subscribe, progress updates, and timeout handling

Queue (Async)

For fire-and-forget workflows. Submit a request, get a request ID, and retrieve the result later by polling or webhook.
import fal_client

handler = fal_client.submit("your-username/your-app-name", arguments={
    "prompt": "a sunset over mountains"
})

print(f"Request ID: {handler.request_id}")

# Check status
status = handler.status()
print(status)

# Get result when ready
result = handler.get()
print(result)
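In production you typically wrap the status check in a backoff loop rather than calling it once. Here is a minimal sketch, where get_status stands in for handler.status and is_done for whatever completion check your SDK version provides (both names are illustrative):

```python
import time

def poll_until_done(get_status, is_done, initial=0.5, factor=2.0,
                    max_delay=10.0, timeout=120.0):
    """Call get_status() with exponential backoff until is_done(status) or timeout."""
    delay = initial
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("request did not complete in time")
        time.sleep(delay)
        delay = min(delay * factor, max_delay)  # back off, capped at max_delay
```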

Asynchronous Inference

Full details on the queue system, status polling, and REST API reference

Streaming

For apps that produce progressive output via Server-Sent Events (SSE). Your app must define a streaming endpoint at /stream using @fal.endpoint("/stream").
import fal_client

for event in fal_client.stream("your-username/your-app-name", arguments={
    "prompt": "a sunset over mountains"
}):
    print(event)
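The shape of each event is defined by your app, not by the SDK. If your /stream endpoint emits partial results as dicts with a delta field (here assumed to be "text" purely for illustration), you can accumulate them into a final value with a helper like this sketch:

```python
def collect_stream(events, key="text"):
    """Concatenate the app-defined delta field from a sequence of stream events."""
    parts = []
    for event in events:
        chunk = event.get(key)
        if chunk:
            parts.append(chunk)
    return "".join(parts)
```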

Real-Time (WebSocket)

For bidirectional, low-latency communication over a persistent connection. Your app must define a @fal.realtime("/realtime") endpoint.
import fal_client

with fal_client.realtime("your-username/your-app-name") as connection:
    connection.send({"prompt": "Hello, world!"})
    result = connection.recv()
    print(result)
Async version:
import asyncio
import fal_client

async def main():
    async with fal_client.realtime_async("your-username/your-app-name") as connection:
        await connection.send({"prompt": "Hello, world!"})
        result = await connection.recv()
        print(result)

asyncio.run(main())
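Since the connection is persistent, a realtime session usually sends many messages over one socket rather than reconnecting per request. The sketch below shows that pattern against any object with the send/recv interface above; EchoConnection is a hypothetical stand-in used only so the loop is self-contained:

```python
class EchoConnection:
    """Stand-in for a realtime connection exposing send() and recv()."""
    def __init__(self):
        self._queue = []
    def send(self, message):
        self._queue.append({"echo": message["prompt"]})
    def recv(self):
        return self._queue.pop(0)

def run_session(connection, prompts):
    """Send each prompt in turn over one connection and collect one reply each."""
    replies = []
    for prompt in prompts:
        connection.send({"prompt": prompt})
        replies.append(connection.recv())
    return replies
```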

Webhooks

Submit a request and receive the result at a URL you specify, instead of polling.
import fal_client

handler = fal_client.submit(
    "your-username/your-app-name",
    arguments={"prompt": "a sunset over mountains"},
    webhook_url="https://your-server.com/api/webhook",
)

print(f"Request ID: {handler.request_id}")
When the request completes, fal sends a POST to your webhook URL with the result:
{
  "request_id": "abc-123",
  "status": "OK",
  "payload": { ... }
}
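On your server, the handler needs to parse that body and branch on status before touching the payload. A minimal sketch, assuming the payload shape shown above (framework-agnostic; body is the raw POST bytes):

```python
import json

def handle_webhook(body: bytes):
    """Parse a webhook POST body and return (request_id, payload) on success."""
    data = json.loads(body)
    request_id = data["request_id"]
    if data.get("status") == "OK":
        return request_id, data.get("payload")
    # Non-OK statuses: surface the payload for logging/inspection
    raise RuntimeError(f"request {request_id} failed: {data.get('payload')}")
```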

Webhooks API

Full details on webhook payloads, retries, verification, and IP allowlisting

Passing Headers

You can pass custom headers with any calling method to control platform behavior:
import fal_client

result = fal_client.subscribe(
    "your-username/your-app-name",
    arguments={"prompt": "a sunset"},
    headers={
        "x-fal-no-retry": "1",
    },
)
See Platform Headers for all available headers, and Retries for retry control. Each inference method page also documents its available SDK parameters.

Next Steps