Core Concepts

Understanding these essential terms will help you follow the tutorials and deploy your first model successfully.

App

An App is a Python class that wraps your AI model for deployment. Your app defines what packages it needs, how to load your model, and how users interact with it.

import fal

class MyApp(fal.App):
    machine_type = "GPU-H100"  # Choose your hardware

    def setup(self):
        # Load your model here.
        # Executed on each runner.
        ...

    @fal.endpoint("/")
    def generate(self, input_data):
        # Your endpoint logic here, usually a model call.
        ...

Machine Type

Machine Type specifies the hardware (CPU or GPU) your app runs on. Choose based on your model’s needs: "CPU" for lightweight models, "GPU-H100" for most AI models, or "GPU-B200" for large models.
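
The guidance above can be written as a small lookup. This is only a sketch: the helper name and its two boolean parameters are hypothetical, and only the three machine type strings come from this page.

```python
def choose_machine_type(needs_gpu: bool, large_model: bool = False) -> str:
    """Map the guidance above onto a machine_type string.

    Both parameters are hypothetical inputs for this sketch.
    """
    if not needs_gpu:
        return "CPU"        # lightweight models
    if large_model:
        return "GPU-B200"   # large models
    return "GPU-H100"       # most AI models


print(choose_machine_type(needs_gpu=False))
print(choose_machine_type(needs_gpu=True))
print(choose_machine_type(needs_gpu=True, large_model=True))
```

The returned string is what you would assign to machine_type in your App class.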

Runner

A Runner is a compute instance that executes your app using your chosen machine type. Runners automatically start when requests arrive and shut down when idle to save costs.
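
To make the start-on-request, stop-when-idle behavior concrete, here is a toy simulation in plain Python. It is not how the platform is implemented: ToyRunner, ToyScaler, and the idle_timeout value are all hypothetical names invented for this illustration, and time is passed in explicitly so the example stays deterministic.

```python
class ToyRunner:
    """Stand-in for a runner: setup() runs once, then it serves requests."""

    def __init__(self):
        self.setup_calls = 0
        self.setup()

    def setup(self):
        self.setup_calls += 1  # a real app would load its model here

    def handle(self, payload):
        return f"processed {payload}"


class ToyScaler:
    """Starts a runner on demand and stops it after idle_timeout seconds."""

    def __init__(self, idle_timeout: float):
        self.idle_timeout = idle_timeout
        self.runner = None
        self.last_used = None
        self.starts = 0

    def tick(self, now: float):
        # Shut the runner down if it has been idle for too long.
        if self.runner is not None and now - self.last_used >= self.idle_timeout:
            self.runner = None

    def request(self, payload, now: float):
        self.tick(now)
        if self.runner is None:  # cold start
            self.runner = ToyRunner()
            self.starts += 1
        self.last_used = now
        return self.runner.handle(payload)


scaler = ToyScaler(idle_timeout=10.0)
scaler.request("a", now=0.0)   # cold start: first runner is created
scaler.request("b", now=5.0)   # reuses the first runner
scaler.request("c", now=30.0)  # first runner idled out, a second one starts
print(scaler.starts)           # prints 2
```

The third request pays the startup cost again because setup() has to run on the new runner, which is why work done in setup() affects how quickly an idle app responds.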

Endpoint

An Endpoint is a function in your app that users can call via API. It defines how your model processes inputs and returns outputs.
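
As a rough sketch of what "call via API" can look like from the client side, the snippet below builds a JSON POST request with the standard library without sending it. The URL, the API key, the "prompt" field, and the Authorization header scheme are all placeholders or assumptions for illustration; use the URL shown for your own app and confirm the authentication details in the API reference.

```python
import json
import urllib.request

# Placeholder values: substitute the URL of your app and your own API key.
ENDPOINT_URL = "https://example.invalid/my-app/"
API_KEY = "YOUR_API_KEY"


def build_request(url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an endpoint."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed header scheme; confirm against the API reference.
            "Authorization": f"Key {api_key}",
        },
        method="POST",
    )


# "prompt" is a hypothetical input field for this example.
req = build_request(ENDPOINT_URL, API_KEY, {"prompt": "a red bicycle"})
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

The JSON body corresponds to the input your endpoint function receives, and the JSON response corresponds to whatever it returns.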

Playground

Each endpoint automatically gets a Playground: a web interface where you can test your model with different inputs before integrating it into your application.

fal run vs fal deploy

  • fal run: Test your app on a single cloud GPU during development. Creates a temporary URL that disappears when you stop the command.

  • fal deploy: Deploy your app to production. Creates a permanent URL that stays available until you delete it.

Use fal run while building and testing, then fal deploy when ready for production use.