Downloading Files
You can download files from external sources using the download_file function. This function handles downloading files from URLs with built-in caching and error handling.
This is particularly useful for downloading datasets, configuration files, or any external resources your application needs.
Downloading Model Weights
You can download model weights from external sources using the download_model_weights function. This function is specifically designed for downloading model weights and provides several useful features:
- Predefined storage location: Automatically stores weights in an optimized directory structure
- Smart caching: Avoids re-downloading weights that are already present unless forced
- Authentication support: Supports custom request headers for private repositories
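As a minimal sketch of the authentication feature, the helper below passes a bearer token as a custom request header. It assumes that download_model_weights is importable from fal.toolkit and accepts force and request_headers keyword arguments; the helper name fetch_private_weights and the injectable download parameter are hypothetical and exist only for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def fetch_private_weights(
    url: str,
    token: str,
    download: Optional[Callable[..., Path]] = None,
    force: bool = False,
) -> Path:
    """Download weights from a private source by sending a bearer token."""
    if download is None:
        # Imported lazily so the helper can be exercised without fal installed.
        # Assumption: fal.toolkit exposes download_model_weights.
        from fal.toolkit import download_model_weights as download

    # Assumption: the keyword names are request_headers and force.
    headers: Dict[str, str] = {"Authorization": f"Bearer {token}"}
    return download(url, force=force, request_headers=headers)
```

Leaving force at False relies on the caching behavior described above; setting it to True would request a fresh download.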
Improving Hugging Face download speeds
The Hugging Face library caches files locally to prevent duplicate downloads. Within Fal, this cache is automatically placed on the /data persistent volume via the HF_HOME environment variable, which is set to /data/.cache/huggingface.
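To confirm where the cache will end up, a small check like the one below can help. It is a sketch that follows the Hugging Face convention of placing the hub cache at $HF_HOME/hub unless HF_HUB_CACHE overrides it; the function names are hypothetical, and the fallback ignores XDG_CACHE_HOME for brevity.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def hf_hub_cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the hub cache directory from HF_HUB_CACHE or HF_HOME."""
    env = os.environ if env is None else env
    # Simplified default: ~/.cache/huggingface when HF_HOME is not set.
    hf_home = env.get("HF_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache", "huggingface"
    )
    return Path(env.get("HF_HUB_CACHE") or Path(hf_home) / "hub")


def is_on_volume(path: Path, volume: str = "/data") -> bool:
    """Return True if path is the volume root or lives underneath it."""
    return Path(volume) in [path, *path.parents]
```

Calling is_on_volume(hf_hub_cache_dir()) inside a run should return True when HF_HOME has the value described above.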
The steps below offer additional speedups.
Set Hugging Face token
Ensure your Hugging Face token is set so that Fal runs can use it. Authenticated downloads seem to be faster than anonymous ones:
fal secret set HF_TOKEN=xxx
Save weights to the /data volume
Hugging Face weights need to be stored within the /data volume. This:
- Speeds up runner starts by removing the need for files to be reconstructed from the cache (which is already on /data)
- Ensures enough disk space is available
Speed up initial weights downloading
Depending on the size of the weights, the initial call to snapshot_download can take a while.
Hugging Face seems to reduce download speeds for large models after some time. While individual transfer threads often start at 50+ MB/s, the speed drops to 5-6 MB/s after a while.
There are 3 ways to speed up downloads:
- Increase concurrency, max_workers, and cache size.
- Download many models in parallel: when downloading multiple models, it helps to start a separate Fal run for each one. The different source IP address reduces the risk of rate limiting.
- Restart slow downloads: restart the run after e.g. 10 minutes, which will likely cause it to go to a different physical server, and the download will resume at higher speeds.
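The first option can be sketched as follows. snapshot_download in huggingface_hub accepts a max_workers argument that controls how many files are fetched concurrently; the value 16 and the wrapper name download_snapshot are illustrative assumptions, and the cache size setting is not shown because the text does not say which option it refers to.

```python
from typing import Callable, Optional


def download_snapshot(
    repo_id: str,
    max_workers: int = 16,
    download: Optional[Callable[..., str]] = None,
) -> str:
    """Download a model snapshot with more concurrent file transfers."""
    if download is None:
        # Imported lazily so the wrapper can be exercised without the library.
        from huggingface_hub import snapshot_download as download

    # Raising max_workers above the library default lets more files
    # download at once; 16 is an arbitrary illustrative value.
    return download(repo_id, max_workers=max_workers)
```

The best value likely depends on the number and size of files in the repository, so it is worth measuring rather than assuming.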
Speed up model files check
Even after fully caching weights, calling snapshot_download with a large model can sometimes take 45+ seconds.
The Hugging Face library takes this time to do API calls and metadata checks. You can use local_files_only=True to skip this step, which typically makes the call return in less than 1 second.
Using local_files_only=True will throw an error if the files are not completely cached. To prevent that, it is a good idea to wrap the call in a try/except block and retry it without local_files_only.
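A minimal sketch of that fallback is shown below. The wrapper name cached_snapshot and the injectable download parameter are hypothetical; the broad except clause is a simplification, and a narrower exception type from huggingface_hub (such as LocalEntryNotFoundError) could be caught instead.

```python
from typing import Any, Callable, Optional


def cached_snapshot(
    repo_id: str,
    download: Optional[Callable[..., str]] = None,
    **kwargs: Any,
) -> str:
    """Return the local snapshot path, using the network only if needed."""
    if download is None:
        # Imported lazily so the wrapper can be exercised without the library.
        from huggingface_hub import snapshot_download as download

    try:
        # Fast path: skip API calls and metadata checks entirely.
        return download(repo_id, local_files_only=True, **kwargs)
    except Exception:
        # Cache missing or incomplete: retry with network access allowed.
        return download(repo_id, local_files_only=False, **kwargs)
```

On a warm cache only the first call runs, so the slow metadata check is paid only when the files are not yet present.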