Use a Custom Container Image
The easiest way to understand how to run a containerized application is to see an example. Let’s convert the example from the previous section into a containerized application.
```python
import fal
from fal.container import ContainerImage
from fal.toolkit import Image, optimize
from pydantic import BaseModel, Field

dockerfile_str = """
FROM python:3.11

RUN apt-get update && apt-get install -y ffmpeg
RUN python -m venv .venv
ENV PATH="$PWD/.venv/bin:$PATH"
RUN pip install "accelerate" "transformers>=4.30.2" "diffusers>=0.26" "torch>=2.2.0"
"""

class Input(BaseModel):
    prompt: str = Field(
        description="The prompt to generate an image from.",
        examples=[
            "A cinematic shot of a baby racoon wearing an intricate italian priest robe.",
        ],
    )

class Output(BaseModel):
    image: Image = Field(
        description="The generated image.",
    )

class FalModel(
    fal.App,
    image=ContainerImage.from_dockerfile_str(dockerfile_str),
    kind="container",
):
    machine_type = "GPU"

    def setup(self) -> None:
        import torch
        from diffusers import AutoPipelineForText2Image

        # Load SDXL
        self.pipeline = AutoPipelineForText2Image.from_pretrained(
            "stabilityai/stable-diffusion-xl-base-1.0",
            torch_dtype=torch.float16,
            variant="fp16",
        )
        self.pipeline.to("cuda")

        # Apply fal's spatial optimizer to the pipeline.
        self.pipeline.unet = optimize(self.pipeline.unet)
        self.pipeline.vae = optimize(self.pipeline.vae)

        # Warm up the model.
        self.pipeline(
            prompt="a cat",
            num_inference_steps=30,
        )

    @fal.endpoint("/")
    def text_to_image(self, input: Input) -> Output:
        result = self.pipeline(
            prompt=input.prompt,
            num_inference_steps=30,
        )
        [image] = result.images
        return Output(image=Image.from_pil(image))
```
Voila! 🎉 The container-specific changes (the `dockerfile_str` definition, the `image=ContainerImage.from_dockerfile_str(dockerfile_str)` argument, and `kind="container"`) are the only modifications you need to make; the rest remains your familiar fal application.
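Assuming you saved this as `app.py` (a hypothetical filename; any module works), you can then build and test the container with the fal CLI. The `file.py::AppName` reference shown here is the usual pattern, though the exact syntax may vary with your fal CLI version:

```bash
# Build the image and run the app ephemerally for testing
fal run app.py::FalModel

# Deploy it as a persistent endpoint once it works
fal deploy app.py::FalModel
```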
fal-Specific Considerations
When deploying your application on fal, you don’t need to worry about enabling Docker Buildx or BuildKit. We take care of it for you. However, there are several fal-specific requirements you must follow:
Required Package Versions
fal has specific dependencies that must be installed with exact versions:
```text
pydantic==2.10.6
protobuf==4.25.1
boto3==1.35.74
```
CRITICAL: These packages must be installed LAST in your Dockerfile to ensure they override any conflicting versions installed by other dependencies.
```dockerfile
FROM falai/base:3.11-12.1.0

# Install your application dependencies first
RUN pip install torch transformers your-packages

# ALWAYS install fal packages last to avoid version conflicts
RUN pip install --no-cache-dir \
    boto3==1.35.74 \
    protobuf==4.25.1 \
    pydantic==2.10.6
```
Ensure curl Is Installed

fal expects `curl` to be available inside the container, so make sure your image installs it:
```dockerfile
FROM ubuntu:22.04

# Install curl
RUN apt-get update && apt-get install -y curl
```
1. File Upload Instead of COPY
`COPY` and `ADD` from the local filesystem are not supported at the moment, so you cannot copy files from the host into the container. Instead, you can use fal's `fal.toolkit` to upload files and reference them in the container via links:
```python
from fal.toolkit import File

# Upload the file to fal's CDN and get a public URL for it.
json_url = File.from_path("my-file.json", repository="cdn").url

dockerfile_str = f"""
FROM python:3.11-slim
RUN apt-get update && apt-get install -y curl
RUN curl '{json_url}' > my-file.json
"""
```
Alternatively, you can use `ADD` to download the file directly from the URL:
```python
json_url = File.from_path("requirements.txt", repository="cdn").url

dockerfile_str = f"""
FROM python:3.11-slim
ADD {json_url} /app/requirements.txt
WORKDIR /app
RUN pip install -r requirements.txt
"""
```
2. Container Image Best Practices
When building container images for fal, follow these best practices:
Use the fal Base Image
It's recommended to use `falai/base:3.11-12.1.0` as your base image, since it ships with the right Python version, CUDA, and more. Most importantly, its small size improves startup times!
Pin All Package Versions
Pinning every package version makes builds reproducible and leaves no door open for breakage from newer, incompatible releases!
```dockerfile
# Good: Pinned versions ensure reproducible builds
RUN pip install torch==2.6.0 transformers==4.51.3

# Bad: Unpinned versions can break your app
RUN pip install torch transformers
```
Clean Up Package Caches
Cleaning up package caches keeps the image small, which shortens builds and cold starts and makes for faster iteration!
```dockerfile
RUN apt-get update && apt-get install -y package \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir package==version
```
Use Multi-Stage Builds for Smaller Images
Multi-stage builds are a great way to significantly reduce image size, saving time both building the container and downloading it at startup!
```dockerfile
# Build stage
FROM python:3.11 AS builder
# Note: COPY/ADD from the host is not supported on fal, so fetch
# requirements.txt from a URL (see "File Upload Instead of COPY" above).
ADD <requirements-url> /requirements.txt
RUN pip install --user -r /requirements.txt

# Runtime stage
FROM python:3.11-slim
COPY --from=builder /root/.local /root/.local
# Make the user-installed packages visible on PATH
ENV PATH=/root/.local/bin:$PATH
```
Docker Templates
To help you get started quickly and avoid common pitfalls, here are production-ready Docker templates for common use cases:
Base Python Template
Perfect for applications that only need Python packages from pip or simple apt packages.
```python
dockerfile_str = """
FROM falai/base:3.11-12.1.0

USER root

RUN apt-get update && apt-get install -y --no-install-recommends \\
    git \\
    wget \\
    curl \\
    && rm -rf /var/lib/apt/lists/*

# Install your application packages
RUN pip install --no-cache-dir \\
    requests==2.31.0 \\
    numpy==1.24.3 \\
    pandas==2.0.3

# IMPORTANT: Install fal-required packages LAST to ensure correct versions
RUN pip install --no-cache-dir \\
    boto3==1.35.74 \\
    protobuf==4.25.1 \\
    pydantic==2.10.6
"""

class FalModel(
    fal.App,
    image=ContainerImage.from_dockerfile_str(dockerfile_str),
    kind="container",
):
    # Your application code
    ...
```
PyTorch + HuggingFace Template
For deep learning applications using PyTorch and HuggingFace ecosystem.
```dockerfile
FROM falai/base:3.11-12.1.0

# Install PyTorch with CUDA support first
RUN pip install --no-cache-dir \
    torch==2.6.0 \
    accelerate==1.6.0 \
    transformers==4.51.3 \
    diffusers==0.31.0 \
    hf_transfer==0.1.9 \
    peft==0.15.0 \
    sentencepiece==0.2.0 \
    --extra-index-url \
    https://download.pytorch.org/whl/cu124

# IMPORTANT: Install fal-required packages LAST to ensure correct versions
RUN pip install --no-cache-dir \
    boto3==1.35.74 \
    protobuf==4.25.1 \
    pydantic==2.10.6

# Set CUDA environment variables
ENV CUDA_HOME=/usr/local/cuda
ENV PATH=$CUDA_HOME/bin:$PATH
ENV LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0 7.5 8.0 8.6 8.9 9.0 9.0a"
```
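As with the base template, this Dockerfile is passed to fal as a string. A minimal sketch of the wrapper (the class name is illustrative, and the Dockerfile contents are elided to the template above):

```python
import fal
from fal.container import ContainerImage

dockerfile_str = """
FROM falai/base:3.11-12.1.0
# ... the PyTorch + HuggingFace instructions from the template above ...
"""

class FalModel(
    fal.App,
    image=ContainerImage.from_dockerfile_str(dockerfile_str),
    kind="container",
):
    machine_type = "GPU"
    # Your application code
    ...
```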
Custom CUDA Template
Some applications require a different CUDA runtime. Here is an example that sets up CUDA 12.8:
```dockerfile
FROM nvidia/cuda:12.8.0-runtime-ubuntu22.04

# Avoid prompts during apt install
ENV DEBIAN_FRONTEND=noninteractive \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

# Install Python and system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    software-properties-common \
    && add-apt-repository ppa:deadsnakes/ppa \
    && apt-get update && apt-get install -y --no-install-recommends \
    python3.11 \
    python3.11-dev \
    python3.11-venv \
    python3-pip \
    wget \
    curl \
    ca-certificates \
    ffmpeg \
    libsndfile1 \
    && rm -rf /var/lib/apt/lists/*

# IMPORTANT: Create symlinks for python binary accessibility
# fal requires the python binary to be accessible via standard paths
RUN ln -sf /usr/bin/python3.11 /usr/bin/python3 && \
    ln -sf /usr/bin/python3.11 /usr/bin/python

# Upgrade pip
RUN python3 -m pip install --no-cache-dir --upgrade pip

# Install PyTorch first (CUDA 12.8 compatible)
RUN pip install torch==2.7.0 -f https://download.pytorch.org/whl/cu128/torch_stable.html

# Install your packages
RUN pip install --no-cache-dir \
    stable-audio-tools==0.0.19 \
    librosa==0.10.1 \
    soundfile==0.12.1

# IMPORTANT: Install fal-required packages LAST to ensure correct versions
RUN pip install --no-cache-dir \
    boto3==1.35.74 \
    protobuf==4.25.1 \
    pydantic==2.10.6
```
Common Issues & Solutions
fal Dependency Conflicts
Problem: `ImportError` or version conflicts with pydantic, protobuf, or boto3.

Solution: Always install fal-required packages last:
```dockerfile
# Install all other packages first
RUN pip install torch transformers your-other-packages

# Install fal-required packages LAST
RUN pip install --no-cache-dir \
    boto3==1.35.74 \
    protobuf==4.25.1 \
    pydantic==2.10.6
```
Python Binary Not Found
Problem: `python: command not found` or `/usr/bin/env: python: No such file or directory`.

Solution: Create proper symlinks when using custom base images:
```dockerfile
RUN ln -sf /usr/bin/python3.11 /usr/bin/python3 && \
    ln -sf /usr/bin/python3.11 /usr/bin/python
```
CUDA Related Issues
Problem: `RuntimeError: No CUDA GPUs are available`

Solution: Ensure the CUDA environment variables are set:
```dockerfile
ENV CUDA_HOME=/usr/local/cuda
ENV PATH=$CUDA_HOME/bin:$PATH
ENV LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
```
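You can also fail fast at startup rather than on the first request. A minimal sketch (the helper name and error message are illustrative) you could call at the top of `setup()`:

```python
def assert_cuda_available() -> None:
    # Call this at the top of setup() to surface GPU problems immediately.
    import torch

    if not torch.cuda.is_available():
        raise RuntimeError(
            "No CUDA GPUs are available: check the CUDA environment variables "
            "in your Dockerfile and that the app requests a GPU machine_type."
        )
```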
Container Validation Checklist
Before deploying your container app, ensure:
- All package versions are pinned
- fal-required packages (pydantic==2.10.6, protobuf==4.25.1, boto3==1.35.74) are installed LAST
- Curl is installed
- The container builds without errors with `fal run`
Using Private Docker Registries
To use private Docker registries, you need to provide registry credentials like so:
Docker Hub
```python
class FalModel(
    fal.App,
    kind="container",
    image=ContainerImage.from_dockerfile_str(
        "FROM myuser/image:tag",
        registries={
            "https://index.docker.io/v1/": {
                "username": "myuser",
                # Use `fal secrets set` first to create this secret.
                "password": "$DOCKERHUB_TOKEN",
            },
        },
    ),
):
    ...
```
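The `$DOCKERHUB_TOKEN` reference above is resolved from your fal secrets, so create it beforehand (the token value is a placeholder):

```bash
fal secrets set DOCKERHUB_TOKEN=<your-docker-hub-access-token>
```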
Google Artifact Registry
We recommend using a service account and setting a base64-encoded version of the key as a Fal secret, which you can then use in your code:
1. Create a JSON key for a service account. It should be downloaded automatically to your computer.

2. Encode it in base64 with a command like:

```bash
cat key.json | base64
```

3. Set the result as a fal secret:

```bash
fal secrets set GOOGLE_AR_JSON_BASE64=<value from above>
```

4. Use the secret as the password, and `_json_key_base64` as the username, for the Artifact Registry in your code:

```python
class FalModel(
    fal.App,
    kind="container",
    image=ContainerImage.from_dockerfile_str(
        "FROM us-central1-docker.pkg.dev/myuser/image:tag",
        registries={
            "us-central1-docker.pkg.dev": {
                "username": "_json_key_base64",
                "password": "$GOOGLE_AR_JSON_BASE64",
            },
        },
    ),
):
    ...
```
For more details and options, check out Google's documentation.
Amazon Elastic Container Registry
```python
class FalModel(
    fal.App,
    kind="container",
    image=ContainerImage.from_dockerfile_str(
        "FROM 123456789012.dkr.ecr.us-east-1.amazonaws.com/image:tag",
        registries={
            "https://123456789012.dkr.ecr.us-east-1.amazonaws.com": {
                "username": "AWS",
                # Use `aws ecr get-login-password --region us-east-1` to get a token.
                # Note that tokens only last 12 hours, so it is better to create them
                # dynamically here instead of creating one and setting it as a fal secret.
                # https://awscli.amazonaws.com/v2/documentation/api/latest/reference/ecr/get-login-password.html
                "password": aws_token,
            },
        },
    ),
):
    ...
```
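Since the token expires, one way to produce `aws_token` dynamically at deploy time is with boto3. This is a sketch, not fal API; it assumes AWS credentials are available in the environment where you deploy:

```python
import base64

import boto3

def get_ecr_password(region: str = "us-east-1") -> str:
    # The ECR authorization token is base64-encoded "AWS:<password>",
    # equivalent to the output of `aws ecr get-login-password`.
    ecr = boto3.client("ecr", region_name=region)
    data = ecr.get_authorization_token()["authorizationData"][0]
    return base64.b64decode(data["authorizationToken"]).decode().split(":", 1)[1]

aws_token = get_ecr_password()
```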
Build Secrets
We currently only support secret mounts.
```python
class FalModel(
    fal.App,
    kind="container",
    image=ContainerImage.from_dockerfile_str(
        """
        FROM python:3.11
        RUN --mount=type=secret,id=aws-key-id,env=AWS_ACCESS_KEY_ID \\
            --mount=type=secret,id=aws-secret-key,env=AWS_SECRET_ACCESS_KEY \\
            --mount=type=secret,id=aws-session-token,env=AWS_SESSION_TOKEN \\
            aws s3 cp ...
        """,
        secrets={
            # Use `fal secrets set` first to create these secrets.
            "aws-key-id": "$AWS_ACCESS_KEY_ID",
            "aws-secret-key": "$AWS_SECRET_ACCESS_KEY",
            "aws-session-token": "$AWS_SESSION_TOKEN",
        },
    ),
):
    ...
```
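The secret values referenced above come from your fal account, so set them once beforehand (the values are placeholders):

```bash
fal secrets set AWS_ACCESS_KEY_ID=<key-id>
fal secrets set AWS_SECRET_ACCESS_KEY=<secret-key>
fal secrets set AWS_SESSION_TOKEN=<session-token>
```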