Product Changelog

November 25, 2025

Scale your application with the new scaling delay feature

Control how quickly your app scales up with the new scaling delay setting.

Scaling delay - the number of seconds the system waits for a request to be picked up by a runner before triggering a scale-up

Example:

import fal

class MyApp(fal.App):
    scaling_delay = 30  # wait up to 30 seconds for a runner to pick up a request before scaling up
    # ...

See scaling docs.

November 17, 2025

Reduce cold start times with shared compiled PyTorch caches

Dramatically reduce cold start times for torch.compile() models with the new inductor cache utilities.

Load pre-compiled CUDA kernels in ~2 seconds instead of recompiling for 20-30 seconds on each worker
GPU-specific caching automatically organized by GPU type (H100, H200, A100)
Two usage patterns: manual control with load_inductor_cache() / sync_inductor_cache(), or automatic with the synchronized_inductor_cache() context manager
Persistent shared storage at /data/inductor-caches/<GPU_TYPE>/<cache_key>.zip
First worker compiles and shares, subsequent workers load instantly

Example:

import torch
from fal.toolkit import synchronized_inductor_cache

# Fragment: runs inside a method of your app where self.model is already loaded
with synchronized_inductor_cache("mymodel/v1"):
    self.model = torch.compile(self.model)
    self.warmup()  # Compilation happens once, synced automatically

See compilation cache docs.

November 14, 2025

Get Slack notifications for serverless app failures

Never miss critical issues with instant Slack alerts for your serverless applications.

Connect your workspace with one-click OAuth installation
Choose notification channel from a dropdown of your Slack channels
Instant alerts for:
- App startup failures and timeouts
- Critical platform issues
- Real-time error notifications
Team visibility - everyone in the channel sees important updates
Configure at https://fal.ai/dashboard/notifications/settings

November 4, 2025

Stop and kill runners directly from the dashboard

No more switching to the CLI to manage your runners. You now have full lifecycle control right from the dashboard.

Graceful shutdown or force kill runners with a single click
Access at https://fal.ai/dashboard/apps/{username}/{appname}/runners

Stream platform logs to your own endpoint with drains

Integrate fal’s logging with your existing observability stack using the new Serverless Drains feature.

Automatic log forwarding from apps, runners, and file operations in NDJSON format
Works with Datadog, Splunk, Elasticsearch, or any HTTP endpoint
Configure at https://fal.ai/dashboard/drains
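
Because drains deliver logs as NDJSON, a receiving endpoint can decode the request body one line at a time, with each non-empty line holding one JSON object. The following is a minimal sketch of that decoding step only. The field names (level, message, source) are hypothetical and are not taken from fal's drain schema.

```python
import json
from typing import Iterator


def parse_ndjson(body: bytes) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of an NDJSON payload."""
    for raw_line in body.split(b"\n"):
        raw_line = raw_line.strip()  # also drops a trailing \r
        if not raw_line:
            continue  # tolerate blank lines between records
        yield json.loads(raw_line)


# Hypothetical payload; real drain records may use different fields.
sample = (
    b'{"level": "info", "message": "runner started", "source": "runner"}\n'
    b'{"level": "error", "message": "upload failed", "source": "files"}\n'
)

records = list(parse_ndjson(sample))
errors = [r for r in records if r.get("level") == "error"]
```

Splitting on the newline byte rather than calling str.splitlines() avoids breaking records on other Unicode line separators that may legally appear inside JSON strings.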

November 2, 2025

Upload larger files with improved timeout handling

We’ve significantly improved the reliability of file uploads from URLs, especially for large datasets and model files.

Extended timeout to 10 minutes for fal files upload and fal files upload-url
Upload multi-GB files without timeout errors
See fal files docs

November 1, 2025

Restart all runners without redeploying

Apply environment changes or recover from bad states instantly with the new fal apps rollout command.

Restart all runners for an app without creating a new deployment
Graceful by default (runners finish current requests) or use --force for immediate restart
Pick up new secrets, environment variables, or clear memory issues
See fal apps rollout docs

Stop specific runners without affecting others

Target individual runners for maintenance with graceful shutdown via fal runners stop.

Stop specific runners without affecting others, useful for targeted maintenance
See fal runners docs

Debug production runners with interactive shell access

Jump directly into any running container to troubleshoot issues in real-time with fal runners shell.

SSH-like access to inspect files, environment variables, and dependencies
Debug production issues without redeploying
See fal runners shell docs

October 31, 2025

See everything happening in your app with the events timeline

Complete activity history for runners, deployments, and config changes in one place.

Unified timeline of runner events, deployments, and config changes
Access at https://fal.ai/dashboard/apps/{username}/{appname}/events

October 25, 2025

Get from zero to deployed in minutes with in-app onboarding

New interactive guide walks you through your first serverless deployment step-by-step.

Step-by-step walkthrough from installation to deployment with copy-paste examples
Access at https://fal.ai/dashboard/serverless-get-started

October 22, 2025

Delete files from fal storage

Remove files and directories with the new fal files rm command.

Recursive deletion: fal files rm path/to/file-or-directory
See fal files docs

October 21, 2025

Platform APIs v1 officially released

Programmatically manage your model deployments with the new Platform APIs.

Model discovery - search and metadata retrieval for 600+ models
Pricing and cost estimation - real-time pricing information
Usage tracking - detailed line items with quantities and prices
Analytics - request counts, error rates, and latency percentiles
Available at https://api.fal.ai/v1 - see docs

Get notified when you hit concurrent requests limits

Never wonder why requests are queuing—we now send notifications when you reach your concurrency limit.

Email and dashboard notifications with smart throttling (immediate, 1h, 1d, weekly)
Limit value included in 429 responses for programmatic handling
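
With the limit included in 429 responses, a client can size its own request pool instead of retrying blindly. The sketch below shows one possible approach. It assumes the limit arrives as a top-level integer field named "limit" in a JSON body; that field name is a placeholder, so check the actual response format before relying on it.

```python
import json
from typing import Optional


def concurrency_limit_from_429(status_code: int, body: str) -> Optional[int]:
    """Return the limit reported by a 429 response, or None if unavailable.

    Assumption: the body is JSON with an integer "limit" field (hypothetical name).
    """
    if status_code != 429:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    limit = payload.get("limit")
    if isinstance(limit, bool) or not isinstance(limit, int):
        return None
    return limit


def max_in_flight(limit: Optional[int], default: int = 1) -> int:
    """Pick a client-side cap on in-flight requests from the reported limit."""
    return limit if limit is not None and limit > 0 else default


reported = concurrency_limit_from_429(429, '{"limit": 10}')
cap = max_in_flight(reported)
```

Falling back to a conservative default when the limit cannot be read keeps the client from hammering the API if the response shape differs from this assumption.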

Debug errors faster with the new errors page

Comprehensive error analytics to identify and resolve issues quickly.

Server vs client error rates with 4xx/5xx breakdown and sparklines
Error timeline with status code distribution and endpoint-level breakdown
Access at https://fal.ai/dashboard/apps/{username}/{appname}/errors

October 20, 2025

Stop or kill individual runners from the command line

Precise control over each runner’s lifecycle without touching the dashboard.

fal runners stop - gracefully stop a runner, allowing in-flight requests to complete
fal runners kill - immediately terminate a runner without waiting
See fal runners docs

October 16, 2025

See exactly how long runners spend starting up

Identify GPU availability bottlenecks and optimize cold start times.

Pending uptime metrics show how long runners wait before becoming active
Track PENDING, DOCKER_PULL, and SETUP state durations separately
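
To illustrate what tracking the states separately gives you, here is a small sketch that turns a list of state transitions into per-state durations. The transition log and its timestamps are invented for illustration; the dashboard computes these metrics for you, and the shape of the data here is an assumption.

```python
from datetime import datetime, timedelta

# Hypothetical transition log: (state, time the runner entered that state).
transitions = [
    ("PENDING", datetime(2025, 10, 16, 12, 0, 0)),
    ("DOCKER_PULL", datetime(2025, 10, 16, 12, 0, 40)),
    ("SETUP", datetime(2025, 10, 16, 12, 1, 10)),
    ("RUNNING", datetime(2025, 10, 16, 12, 1, 55)),
]


def state_durations(events):
    """Map each state to the time spent in it before the next transition."""
    durations = {}
    for (state, entered), (_, left) in zip(events, events[1:]):
        durations[state] = durations.get(state, timedelta()) + (left - entered)
    return durations


startup = state_durations(transitions)
total_startup = sum(startup.values(), timedelta())
```

In this made-up example the runner spends 40 seconds in PENDING, which is the kind of wait that may point to GPU availability rather than image size or setup code.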

October 15, 2025

Connect fal docs to Cursor with MCP

Access the complete fal documentation directly in Cursor using Model Context Protocol.

Complete documentation in your IDE with AI-powered suggestions
Simple setup: add fal MCP server to your mcp.json - see guide

Personalized dashboard with creator and developer views

The dashboard now adapts to your workflow with two distinct experiences.

Creator view - gallery-focused with favorite models and visual generation history
Developer view - metrics-driven with usage stats, error tracking, and API analytics
Quick stats showing credits, requests, and errors with sparklines

October 13, 2025

Add custom headers to your API requests

Integrate seamlessly with analytics, auth, and middleware by passing custom HTTP headers.

Add custom headers for analytics, authentication, or middleware integration
Works with all client libraries

Multi-GPU inference and training with fal.distributed

Scale AI workloads across multiple GPUs with the new fal.distributed module.

Data parallelism - generate multiple outputs simultaneously (e.g., 4 images on 4 GPUs)
Model parallelism - split large models across GPUs for faster generation
Distributed training - synchronized gradient updates with DDP
Supports 2, 4, or 8 GPU configurations on H100s and A100s
See distributed docs

October 10, 2025

Dedicated pages for Analytics, Runners, Logs, and Versions

Complete app details redesign gives each deployment aspect its own focused view.

New Analytics page - runner-focused metrics with date range filtering
New Runners page - app-scoped runner view with enhanced filters
New Logs page - dedicated log viewer for debugging
New Versions page - manage and view app revisions
Enhanced Overview - endpoint stats and performance metrics at a glance

October 9, 2025

Compare models side-by-side in the new Sandbox

Find the perfect model by testing multiple options in parallel with the same prompt.

Run multiple models simultaneously with the same prompt
Available at https://fal.ai/sandbox

October 8, 2025

Manage deployments from Python without async/await

New synchronous client makes serverless management feel just like the CLI.

Manage apps, runners, and deployments programmatically without async/await
Same API as CLI: client.apps.*, client.runners.*, client.deploy()
See Python client docs

October 6, 2025

Bring your own container to any deployment

Full control over your runtime environment with custom Docker images.

Use ContainerImage.from_dockerfile_str() or ContainerImage.from_dockerfile()
Install any dependencies, tools, or system packages you need
See custom containers guide

October 3, 2025

Dynamic auto-scaling with percentage-based buffers

Scale more intelligently by setting concurrency buffers as percentages instead of fixed numbers.

Configure buffer as a percentage of current concurrency for dynamic scaling
See scaling docs
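
As a rough illustration of why a percentage scales better than a fixed number, the sketch below computes a buffer at several concurrency levels. Both the rounding rule (round up) and the idea that the target is concurrency plus buffer are assumptions made for this example, not a description of how the platform calculates it.

```python
import math


def buffer_from_percentage(current_concurrency: int, buffer_percent: float) -> int:
    """Extra capacity as a percentage of current concurrency.

    Rounding up is an assumption made for this sketch.
    """
    return math.ceil(current_concurrency * buffer_percent / 100)


for concurrency in (4, 10, 40):
    extra = buffer_from_percentage(concurrency, 25)
    print(f"concurrency={concurrency} buffer={extra} target={concurrency + extra}")
```

A fixed buffer of 1 would be generous at a concurrency of 4 and negligible at 40, whereas the percentage keeps the headroom proportional to load.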

Runner logs with streaming and filtering

Real-time log streaming and powerful filtering for faster debugging.

Stream logs in real-time with fal runners logs --follow
Filter by time range with --since and --until
Search logs with --search parameter
Scrollable and searchable in the dashboard with SSE-powered updates
See fal runners logs docs

Include local files in your deployments automatically

Bring configs, utilities, and code from your local machine into serverless apps.

Specify files with relative or absolute paths to include at runtime
Works with fal run and fal deploy
See app files docs

Find what you need faster with reorganized navigation

Clearer dashboard structure groups features by workflow: Generate, Serverless, and Manage.

Generate group: Sandbox, Model Gallery
Serverless group: Apps, Logs, Files, Runners
Manage group: Usage, Billing, API Keys, Webhooks, Team Members

October 2, 2025

Know exactly which version each runner is running

Track deployments better with revision IDs shown on every runner.

Revision ID displayed on runners to track which version is running
State renamed: “DEAD” → “TERMINATED” for clarity

October 1, 2025

Filter logs with custom labels and powerful queries

Find what you need instantly with EXACT/CONTAINS matching and multi-condition filters.

EXACT or CONTAINS matching for label values
Multiple conditions with OR logic (e.g., status IN ["error", "warning"])
Available in dashboard and API
Examples: error_type = "ValidationError", endpoint CONTAINS "/api/v2/"
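
The matching rules above can be modeled in a few lines. This is only a sketch of the semantics as described (EXACT compares the whole value, CONTAINS looks for a substring, and multiple conditions are combined with OR). The Condition class and matches_any helper are hypothetical names, not part of fal's API.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping


@dataclass(frozen=True)
class Condition:
    label: str
    mode: str  # "EXACT" or "CONTAINS"
    value: str

    def matches(self, labels: Mapping[str, str]) -> bool:
        actual = labels.get(self.label)
        if actual is None:
            return False  # a missing label never matches
        if self.mode == "EXACT":
            return actual == self.value
        if self.mode == "CONTAINS":
            return self.value in actual
        raise ValueError(f"unknown match mode: {self.mode}")


def matches_any(labels: Mapping[str, str], conditions: Iterable[Condition]) -> bool:
    """OR logic: a log entry is kept if at least one condition matches."""
    return any(c.matches(labels) for c in conditions)


# Invented log labels for illustration.
logs = [
    {"status": "error", "endpoint": "/api/v2/generate", "error_type": "ValidationError"},
    {"status": "info", "endpoint": "/api/v1/health"},
    {"status": "warning", "endpoint": "/api/v2/upload"},
]

# status IN ["error", "warning"] expressed as two EXACT conditions joined by OR.
status_filter = [
    Condition("status", "EXACT", "error"),
    Condition("status", "EXACT", "warning"),
]
selected = [log for log in logs if matches_any(log, status_filter)]

v2_only = [log for log in logs if Condition("endpoint", "CONTAINS", "/api/v2/").matches(log)]
```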

See what runners are doing during startup

Track exactly where runners are in the startup process—pending, pulling images, or setting up.

fal runners list now shows PENDING, DOCKER_PULL, and SETUP states
Understand deployment progress in real-time

View all app endpoints and config at a glance

Redesigned app details page surfaces the information you need most.

Endpoints, configuration, and status all in one place

September 27, 2025

Monitor and clear your request queue from the CLI

Check how many requests are queued and flush them when needed.

fal queue size app_name - check queue size for an app
fal queue flush app_name - flush all pending requests
See fal queue docs

September 10, 2025

View runner history with time-based filtering

See terminated runners and filter by state to debug failures.

fal runners list --since "1h" - view runners from the last hour (max 24h)
fal runners list --state dead - filter by state (running, pending, setup, dead)
Helpful for debugging failed deployments and understanding runner lifecycle
See fal runners list docs

August 29, 2025

Reorganize files in fal storage without re-uploading

Move and rename files instantly with the new fal files mv command.

Rename or move files in fal storage: fal files mv source destination
See fal files docs

August 26, 2025

See all your endpoint URLs immediately when testing

No more guessing which URL to use—CLI shows playground, sync, and async routes for every run.

CLI prints playground, synchronous, and asynchronous routes for fal run