When you run fal deploy, fal builds your code into a container image, pushes it to a registry, and makes it available at a permanent endpoint ID. From that point, runners spin up on demand to handle requests, scale automatically based on traffic, and shut down when idle. You can roll back to any previous version instantly, deploy to separate environments for staging and production, and tune scaling parameters without redeploying.
Before diving into this section, make sure you have installed the CLI and built your app following the Development guides. The pages here cover everything after your code is written: understanding what runners are, deploying to production, managing versions and environments, choosing hardware, and configuring scaling. If you are migrating from another platform, the migration guides can help you get started faster.
Quick Start
The simplest deployment is a single command: fal deploy. It gives your app an endpoint ID such as your-username/my-model that callers use with the fal client SDKs.
Runners and Requests
Before deploying, it helps to understand the execution model. When a caller submits a request, it enters a persistent queue and is dispatched to a runner. Runners are compute instances that pull your container image, run setup(), and serve requests until they scale down. Understanding how requests flow through the queue and how runners start, process, and shut down is essential for debugging latency, configuring scaling, and managing costs.
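The flow described above can be illustrated with a small, single-process simulation. This is a toy sketch, not fal's implementation: the Runner class, its state names ("starting", "serving", "stopped"), and the process function are all hypothetical and exist only to show a queue feeding a runner that runs its setup step once, serves requests, and stops when the queue is empty.

```python
from collections import deque


class Runner:
    """Hypothetical stand-in for a runner: set up once, then serve requests."""

    def __init__(self):
        self.state = "starting"  # illustrative state names, not fal's
        self.setup_calls = 0
        self.handled = []

    def setup(self):
        # One-time startup work, such as loading model weights.
        self.setup_calls += 1
        self.state = "serving"

    def serve(self, request):
        if self.state != "serving":
            raise RuntimeError("runner must finish setup() before serving")
        self.handled.append(request)
        return request.upper()  # placeholder for real inference work

    def stop(self):
        self.state = "stopped"


def process(queue):
    """Drain the queue, starting a runner only when there is work to do."""
    runner = None
    results = []
    while queue:
        if runner is None:
            # First request arrives: start a runner on demand.
            runner = Runner()
            runner.setup()
        results.append(runner.serve(queue.popleft()))
    if runner is not None:
        runner.stop()  # no work left, so the runner scales down
    return results, runner
```

In this sketch the startup cost is paid once per runner rather than once per request, which is why the first request after an idle period tends to be the slow one.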
Understanding Requests
Request lifecycle, retry interaction, and platform architecture diagram
Understanding Runners
Runner lifecycle states, startup, shutdown, and scaling behavior
Caching
How Docker layers, model weights, and compiled artifacts are cached across runners
Deploying and Managing
Each deploy creates a new versioned revision of your app, so you can roll back to any previous version instantly if something goes wrong. You choose a rollout strategy (recreate for speed, rolling for zero downtime), configure authentication (private, public, or shared billing), and optionally deploy to separate environments for staging and production.
Deploy to Production
Build, configure, and ship your app with rollout strategies and auth modes
Manage Deployments
List, update, and delete deployed apps
Rollbacks
Switch between revisions or restart runners with rollouts
Environments
Isolate staging and production with separate secrets and config
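The revision model described under Deploying and Managing can be pictured as an append-only history with a pointer to the active entry. The sketch below is a conceptual illustration under that assumption, not fal's API: the AppHistory class and its method names are hypothetical.

```python
class AppHistory:
    """Hypothetical model: deploys append revisions, rollback moves a pointer."""

    def __init__(self):
        self.revisions = []  # every image ever deployed, oldest first
        self.active = None   # index of the revision currently serving traffic

    def deploy(self, image):
        # A deploy never overwrites history; it adds a revision and activates it.
        self.revisions.append(image)
        self.active = len(self.revisions) - 1
        return self.active

    def rollback(self, revision):
        # Rolling back only repoints the active revision, so nothing is rebuilt.
        if not 0 <= revision < len(self.revisions):
            raise ValueError(f"unknown revision: {revision}")
        self.active = revision

    def current(self):
        return None if self.active is None else self.revisions[self.active]


history = AppHistory()
history.deploy("image-v1")
history.deploy("image-v2")
history.rollback(0)  # switch back to the first revision
```

Because a rollback in this model only changes which existing revision is active, it involves no rebuild, which is consistent with rollbacks being described as instant.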