fal.App running on Serverless. When you deploy your own app, you get the same queue-based reliability, the same analytics dashboard, and the same client SDKs. The difference is that you control the code, the model weights, and the container environment. You can also publish your app to the marketplace so anyone can call it with their own API key.
How It Works
The best deployment approach depends on where you are starting from. If you are migrating from another provider, you can be up and running with minimal code changes. If you are starting a new project, fal can build and manage the container for you. All three paths give you the same autoscaling, observability, and runner management.
Migrating an existing server
If you already have a working HTTP server (FastAPI, Flask, or any other framework), this is the fastest path. Deploy it with @fal.function and exposed_port, and fal routes traffic to your server’s port with no code changes to your existing application.
fal.function supports all the same scaling parameters as fal.App (keep_alive, min_concurrency, max_concurrency, and more). See Migrate a Docker Server for a complete walkthrough and the full fal.function parameter reference. There are also step-by-step guides for Replicate, Modal, and RunPod.
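As a rough illustration of what "a working HTTP server" means here, the sketch below is a standard-library stand-in for an existing application: it listens on one port and answers JSON POST requests. The names EchoHandler, make_server, and call, and the default port 8000, are hypothetical and not part of fal. On fal, the function that starts a server like this is what you would decorate with @fal.function, setting exposed_port to the port the server binds.

```python
import http.client
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an existing inference server."""

    def do_POST(self):
        # Read the JSON request body
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")

        # A real server would run a model here; this one echoes the prompt
        body = json.dumps({"echo": payload.get("prompt", "")}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging to keep the example quiet
        pass


def make_server(port: int = 8000) -> ThreadingHTTPServer:
    # Bind on all interfaces so traffic routed to the port reaches the server.
    # This port is the value you would pass as exposed_port (assumption:
    # 8000 is an arbitrary choice for this sketch).
    return ThreadingHTTPServer(("0.0.0.0", port), EchoHandler)


def call(port: int, prompt: str) -> dict:
    # Small local client for checking the server before deploying it
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(
            "POST",
            "/",
            body=json.dumps({"prompt": prompt}).encode(),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

Calling make_server().serve_forever() starts the server in the foreground. The point of the sketch is that the server code itself contains nothing fal-specific, and the only detail the platform needs from it is the port.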
Migrating a custom container
If you have a Docker image with your model and dependencies baked in but not a full HTTP server, you can bring it directly. Use ContainerImage to reference your Dockerfile or pull from a registry. You keep full control over the build while using fal’s endpoint system and scaling.
Starting a new project
If you are building from scratch, use a native fal.App with pip requirements. You write a Python class, list your dependencies, and define your endpoints. fal builds the container for you.
Test and deploy
Regardless of which approach you use, the workflow is the same. Test locally with fal run, then deploy with fal deploy. After deployment, your app gets a persistent URL, a Playground for browser-based testing, and automatic scaling based on incoming traffic. See the Quick Start to try it in under two minutes.