Scaling

Machine Types

You can scale your app to a different machine type by using the fal apps scale command. For more info on available machine types, see the resources.

Change Machine Type For New Runners

Changing the machine type affects only new runners; existing runners keep running on their current machine type. To move existing runners as well, kill them manually with fal runners kill. Their replacements will start on the new machine type.

Terminal window
fal apps scale myapp --machine-type GPU-A100

Allow Using Multiple Machine Types

Sometimes you may want to allow your app to run on multiple machine types, for example to have a larger pool of available machines. To do so, pass --machine-type once for each type.

Terminal window
fal apps scale myapp --machine-type GPU-A100-40G --machine-type GPU-A100-80G

Min Concurrency

Min concurrency is the minimum number of runners that are kept alive for your app at all times. If your app takes a while to start up and you expect bursts of requests, consider setting this to a higher number.

Terminal window
fal apps scale myapp --min-concurrency 2

Max Concurrency

Max concurrency is the maximum total number of runners that can be spun up for your app when there are more requests than available runners.

Terminal window
fal apps scale myapp --max-concurrency 10

Keep Alive

Keep alive is the number of seconds a runner (beyond min concurrency) will be kept alive for your app. Depending on your traffic pattern, you might want to set this to a higher number, especially if your app is slow to start up.

Terminal window
fal apps scale myapp --keep-alive 300
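
As a rough mental model of how keep alive interacts with min concurrency, the rule above can be sketched in Python. This is an illustration only, not fal's actual scheduler: the function name `can_shut_down`, its parameters, and the assumption that the keep-alive window is measured as idle time are all hypothetical.

```python
def can_shut_down(idle_seconds: float, keep_alive: float,
                  active_runners: int, min_concurrency: int) -> bool:
    """Hypothetical rule: may this runner be stopped?

    Assumes keep alive is counted from the moment the runner became idle,
    which the text does not state explicitly.
    """
    # Runners needed to satisfy min concurrency are never stopped.
    if active_runners <= min_concurrency:
        return False
    # Extra runners are kept until the keep-alive window has elapsed.
    return idle_seconds >= keep_alive
```

With --keep-alive 300 and --min-concurrency 2, this sketch keeps a third runner for 300 seconds before it becomes eligible for shutdown, while the first two runners stay up regardless.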

Max Multiplexing

Max multiplexing is the maximum number of requests that a single runner can handle at the same time. This is useful if your app instance can process multiple requests concurrently, which typically depends on the machine type and the amount of resources your app needs to process a request.

Terminal window
fal apps scale myapp --max-multiplexing 10
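
To see how min concurrency, max concurrency, and max multiplexing fit together, the following Python sketch estimates how many runners a given number of in-flight requests would call for. It is a simplified illustration under stated assumptions, not fal's scheduling logic: the function `target_runners` and its parameters are hypothetical, and it assumes requests are spread evenly across runners.

```python
import math


def target_runners(pending_requests: int, min_concurrency: int,
                   max_concurrency: int, max_multiplexing: int) -> int:
    """Illustrative estimate of the runner count; not fal's real algorithm."""
    if max_multiplexing < 1:
        raise ValueError("max_multiplexing must be at least 1")
    if min_concurrency > max_concurrency:
        raise ValueError("min_concurrency cannot exceed max_concurrency")
    # Each runner can take up to max_multiplexing requests at once.
    needed = math.ceil(pending_requests / max_multiplexing)
    # Never drop below min concurrency or exceed max concurrency.
    return max(min_concurrency, min(needed, max_concurrency))


print(target_runners(0, 2, 10, 10))    # idle: min concurrency applies
print(target_runners(35, 2, 10, 10))   # 35 requests at 10 per runner
print(target_runners(500, 2, 10, 10))  # burst: capped by max concurrency
```

Under this model, raising max multiplexing lowers the number of runners needed for the same load, while max concurrency caps the total no matter how many requests arrive.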