HTTP over WebSockets
For applications that require real-time interaction or streaming, fal offers a WebSocket-based integration. It lets you establish a persistent connection and stream data back and forth between your client and the fal API, using the same applications you would otherwise call over HTTP.
WebSocket Endpoint
To use the WebSocket functionality, connect to the HTTP app you want to call, but through the ws.fal.run endpoint:
wss://ws.fal.run/{appId}
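As a rough sketch, you can open such a connection from Python with the third-party websocket-client package. The Authorization header below mirrors the Authorization: Key <FAL_KEY> scheme used by fal's HTTP endpoints; treat that header format, and the package choice, as assumptions rather than part of this document.

import os

from websocket import create_connection  # pip install websocket-client

app_id = "fal-ai/any-llm"  # any HTTP app id you would normally call

# Assumption: ws.fal.run accepts the same `Authorization: Key <FAL_KEY>`
# header as the HTTP endpoints.
connection = create_connection(
    f"wss://ws.fal.run/{app_id}",
    header=[f"Authorization: Key {os.environ['FAL_KEY']}"],
)
connection.close()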
Communication Protocol
Once connected, the communication follows a specific protocol with JSON messages for control flow and raw data for the actual response stream:
- Payload Message: Send a JSON message containing the payload for your application. This is equivalent to the request body you would send to the HTTP endpoint.
- Start Metadata: Receive a JSON message containing the HTTP response headers from your application. This lets you understand the type and structure of the incoming response stream.
- Response Stream: Receive the actual response data as a sequence of messages. These can be binary chunks for media content or JSON objects for structured data, depending on the Content-Type header.
- End Metadata: Receive a final JSON message indicating the end of the response stream. This signals that the request has been fully processed and that the next payload will be processed. (A client-side sketch of this exchange follows the list.)
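To make the flow concrete, here is a hedged client-side sketch of one request/response cycle over a raw connection, again using the third-party websocket-client package and the assumed Authorization header. The run_once helper and the handling of non-binary data frames are illustrative; the "start" and "end" message types follow the example interaction below.

import json
import os

from websocket import create_connection  # pip install websocket-client

def run_once(app_id: str, payload: dict) -> bytes:
    """Send one payload and collect the streamed binary response."""
    ws = create_connection(
        f"wss://ws.fal.run/{app_id}",
        # Assumption: same Authorization scheme as the HTTP endpoints.
        header=[f"Authorization: Key {os.environ['FAL_KEY']}"],
    )
    try:
        # 1. Payload message: the JSON body you would POST over HTTP.
        ws.send(json.dumps(payload))

        chunks = []
        while True:
            frame = ws.recv()
            if isinstance(frame, bytes):
                # 3. Response stream: a binary chunk (e.g. media content).
                chunks.append(frame)
                continue
            message = json.loads(frame)
            if message.get("type") == "start":
                # 2. Start metadata: status and HTTP response headers.
                print("response headers:", message["headers"])
            elif message.get("type") == "end":
                # 4. End metadata: this request is done; the connection
                #    can be reused for the next payload.
                break
            else:
                # Structured (JSON) response data for non-binary apps.
                print("data:", message)
        return b"".join(chunks)
    finally:
        ws.close()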
Example Interaction
Here’s an example of a typical interaction with the WebSocket API:
Client Sends (Payload Message):
{"prompt": "generate a 10-second audio clip of a cat purring"}
Server Responds (Start Metadata):
{ "type": "start", "request_id": "5d76da89-5d75-4887-a715-4302bf435614", "status": 200, "headers": { "Content-Type": "text/event-stream; charset=utf-8", "Transfer-Encoding": "chunked", // ... }}
Server Sends (Response Stream):
<binary audio data chunk 1>
<binary audio data chunk 2>
...
<binary audio data chunk N>
Server Sends (Completion Message):
{ "type": "end", "request_id": "5d76da89-5d75-4887-a715-4302bf435614", "status": 200, "time_to_first_byte_seconds": 0.577083}
This WebSocket integration provides a powerful mechanism for building dynamic and responsive AI applications on the fal platform. By leveraging the streaming capabilities, you can unlock new possibilities for creative and interactive user experiences.
Example Program
For instance, if you want to make fast prompts to any LLM, you can use fal-ai/any-llm.
import json

import fal.apps

with fal.apps.ws("fal-ai/any-llm") as connection:
    for i in range(3):
        connection.send(
            {
                "model": "google/gemini-flash-1.5",
                "prompt": f"What is the meaning of life? Respond in {i} words.",
            }
        )

    # Responses come back in the same order the payloads were sent.
    for i in range(3):
        response = json.loads(connection.recv())
        print(response)
And running this program would output:
{'output': '(Silence)\n', 'partial': False, 'error': None}
{'output': 'Growth\n', 'partial': False, 'error': None}
{'output': 'Personal fulfillment.\n', 'partial': False, 'error': None}
Example Program with Stream
The fal-ai/any-llm/stream model streams generated text in real time. Here’s an example of how you can use it:
import fal.apps

with fal.apps.ws("fal-ai/any-llm/stream") as connection:
    # NOTE: this app responds in 'text/event-stream' format
    # For example:
    #
    # event: event
    # data: {"output": "Growth", "partial": true, "error": null}

    for i in range(3):
        connection.send(
            {
                "model": "google/gemini-flash-1.5",
                "prompt": f"What is the meaning of life? Respond in {i+1} words.",
            }
        )

    for i in range(3):
        for bs in connection.stream():
            # Each chunk is a server-sent event; parse its "key: value" lines.
            lines = bs.decode().replace("\r\n", "\n").split("\n")
            event = {}
            for line in lines:
                if not line:
                    continue
                key, value = line.split(":", 1)
                event[key] = value.strip()
            print(event["data"])
        print("----")
And running this program would output:
{"output": "Perspective", "partial": true, "error": null}{"output": "Perspective.\n", "partial": true, "error": null}{"output": "Perspective.\n", "partial": true, "error": null}{"output": "Perspective.\n", "partial": false, "error": null}----{"output": "Find", "partial": true, "error": null}{"output": "Find meaning.\n", "partial": true, "error": null}{"output": "Find meaning.\n", "partial": true, "error": null}{"output": "Find meaning.\n", "partial": false, "error": null}----{"output": "Be", "partial": true, "error": null}{"output": "Be, love, grow.\n", "partial": true, "error": null}{"output": "Be, love, grow.\n", "partial": true, "error": null}{"output": "Be, love, grow.\n", "partial": false, "error": null}----