keep_alive
Default: 60 seconds
Keep runners alive after their last request completes.
min_concurrency
Default: 0
Keep at least this many runners alive at all times, regardless of traffic.
concurrency_buffer
Default: 0
Maintain extra runners beyond current demand.
Takes precedence over min_concurrency when higher.
concurrency_buffer_perc
Default: 0
Set buffer as a percentage of current request volume.
Actual buffer is the maximum of concurrency_buffer and concurrency_buffer_perc / 100 * request volume.
max_multiplexing
Default: 1
Number of concurrent requests each runner handles simultaneously.
scaling_delay
Default: 0 seconds
Wait time before scaling up when a request is queued.
startup_timeout
Default: Varies
Maximum time allowed for setup() to complete.
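The buffer rule and the multiplexing setting described above can be worked through with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, rounding fractional values up to whole runners is an assumption, and it does not reproduce the platform's actual scheduler.

```python
import math


def effective_buffer(concurrency_buffer: int,
                     concurrency_buffer_perc: float,
                     request_volume: int) -> int:
    """Larger of the fixed buffer and the percentage-based buffer.

    Rounding the percentage buffer up to a whole runner is an assumption.
    """
    perc_buffer = math.ceil(concurrency_buffer_perc * request_volume / 100)
    return max(concurrency_buffer, perc_buffer)


def runners_for_requests(concurrent_requests: int,
                         max_multiplexing: int = 1) -> int:
    """Runners needed if each runner handles max_multiplexing requests at once."""
    return math.ceil(concurrent_requests / max_multiplexing)


# 50 in-flight requests, fixed buffer of 2, percentage buffer of 20%:
# the percentage buffer (10) is larger, so it is the one that applies.
print(effective_buffer(2, 20, 50))

# 10 concurrent requests with max_multiplexing = 4 need 3 runners.
print(runners_for_requests(10, 4))
```

With both buffer settings left at their default of 0, the effective buffer is 0 and no extra runners are kept beyond current demand.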
Persistence Across Deploys
Scaling parameters set via CLI or dashboard (keep_alive, min_concurrency, concurrency_buffer, etc.) persist across deployments by default. You don’t lose your tuning when you deploy a code change.
To reset all parameters back to code values, deploy with --reset-scale:
Deploy Behavior & Priority
Full explanation of how code, CLI, and dashboard settings interact
Cost Considerations
More warm runners = lower latency but higher cost. Balance based on your needs:
- Latency-critical apps: Accept higher cost for warm runners (min_concurrency, keep_alive)
- Cost-sensitive apps: Optimize cold start duration instead (container images, caching)
- Variable traffic: Use buffers and scaling delays
Full Scaling Reference
Complete guide to scaling configuration including CLI and dashboard methods