Cold Starts vs Warm Starts

A cold start occurs when a new runner needs to be created from scratch. The runner goes through PENDING → SETUP → IDLE (or PENDING → DOCKER_PULL → SETUP → IDLE if the Docker image isn’t cached) before it can serve requests. A warm start occurs when an existing IDLE runner is reused to handle a new request: IDLE → RUNNING.
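The two paths can be sketched as a small state model. This is only an illustration of the transitions described above, not the platform's API: the `RunnerState` enum and the helper functions `cold_start_path` and `warm_start_path` are hypothetical names introduced here.

```python
from enum import Enum


class RunnerState(Enum):
    # State names taken from the description above; the enum itself is hypothetical.
    PENDING = "PENDING"
    DOCKER_PULL = "DOCKER_PULL"
    SETUP = "SETUP"
    IDLE = "IDLE"
    RUNNING = "RUNNING"


def cold_start_path(image_cached: bool) -> list[RunnerState]:
    """States a new runner passes through before it can serve requests."""
    path = [RunnerState.PENDING]
    if not image_cached:
        # The Docker image must be pulled first when it is not cached.
        path.append(RunnerState.DOCKER_PULL)
    path.extend([RunnerState.SETUP, RunnerState.IDLE])
    return path


def warm_start_path() -> list[RunnerState]:
    """An existing idle runner picks up the request directly."""
    return [RunnerState.IDLE, RunnerState.RUNNING]


if __name__ == "__main__":
    for cached in (True, False):
        names = " -> ".join(s.value for s in cold_start_path(cached))
        print(f"cold start (image cached={cached}): {names}")
    print("warm start:", " -> ".join(s.value for s in warm_start_path()))
```

The warm path skips every phase before IDLE, which is why reusing a runner is so much faster than creating one.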

What Triggers Cold Starts

  • No warm runners available (all busy or expired)
  • Traffic spike exceeds warm runner capacity
  • First deployment
  • Runners expired during low traffic periods

Factors Affecting Cold Start Duration

  • Image size: Larger Docker images take longer to pull
  • Model size: Larger models take longer to download and load
  • Setup complexity: Complex initialization in setup() adds time
  • Cache state: First runs are slower; subsequent runs benefit from caching
  • Hardware availability: GPU availability varies by region and time
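To see how these factors combine, here is a rough back-of-the-envelope model that treats cold start time as the sum of its phases. It is a sketch under stated assumptions, not a measurement of any real platform: the function name, all parameters, and the default throughput figures are hypothetical placeholders that you would replace with numbers observed in your own deployment.

```python
def estimate_cold_start_seconds(
    image_mb: float,
    model_mb: float,
    setup_s: float,
    *,
    image_cached: bool = False,
    model_cached: bool = False,
    pull_mb_per_s: float = 100.0,      # assumed image pull throughput
    download_mb_per_s: float = 100.0,  # assumed model download throughput
    load_mb_per_s: float = 1000.0,     # assumed disk-to-memory load throughput
    scheduling_s: float = 0.0,         # assumed wait for hardware to become available
) -> float:
    """Rough cold start estimate: the sum of the phases listed above."""
    # Image size: skipped entirely when the Docker image is cached.
    pull_s = 0.0 if image_cached else image_mb / pull_mb_per_s
    # Model size: the download is skipped when weights are cached,
    # but loading them into memory still costs time.
    download_s = 0.0 if model_cached else model_mb / download_mb_per_s
    load_s = model_mb / load_mb_per_s
    return scheduling_s + pull_s + download_s + load_s + setup_s


if __name__ == "__main__":
    first_run = estimate_cold_start_seconds(8000, 4000, setup_s=5.0)
    cached_run = estimate_cold_start_seconds(
        8000, 4000, setup_s=5.0, image_cached=True, model_cached=True
    )
    print(f"first run:  {first_run:.1f} s")
    print(f"cached run: {cached_run:.1f} s")
```

Even in this simplified model, the gap between the first run and the cached run shows why cache state tends to dominate: with both caches warm, only model loading and setup() remain.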

How to Reduce Cold Starts

Each of these strategies targets a different phase of the cold start: