Scaling, instances & sleep/wake

Your project runs as one or more instances of its live deployment. Scaling adjusts how many instances run and how much resource each gets.

Scale a project

Adjust scaling through the API, replacing PROJECT_ID and <api_key> with your own values:

curl -X POST https://api.cantila.app/v1/projects/PROJECT_ID/scale \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{ "instances": 2 }'

Inspect the running instances:

curl https://api.cantila.app/v1/projects/PROJECT_ID/instances \
  -H "Authorization: Bearer <api_key>"
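The two calls above can also be scripted. The following is a minimal Python sketch using only the standard library; it mirrors the URLs, headers, and JSON body from the curl examples. The helper names (`build_scale_request`, `build_instances_request`, `send`) are hypothetical, and the assumption that responses are JSON is not confirmed by this page.

```python
import json
import urllib.request

# Base URL taken from the curl examples above.
API_BASE = "https://api.cantila.app/v1"


def build_scale_request(project_id: str, api_key: str, instances: int) -> urllib.request.Request:
    """Build (but do not send) the POST .../scale request."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/projects/{project_id}/scale",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def build_instances_request(project_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET .../instances request."""
    return urllib.request.Request(
        f"{API_BASE}/projects/{project_id}/instances",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def send(request: urllib.request.Request, timeout: float = 10.0):
    """Send a request and decode the reply.

    Assumption: the API answers with a JSON body. An empty body yields None.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
    return json.loads(raw.decode("utf-8")) if raw else None
```

For example, `send(build_scale_request("PROJECT_ID", api_key, 2))` would issue the same request as the first curl command, and `send(build_instances_request("PROJECT_ID", api_key))` the second. Keeping request construction separate from sending makes the requests easy to inspect before anything reaches the network.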

Sleep on idle vs always-on

By default, idle apps sleep to save resources and wake on demand when the next request arrives. For latency-sensitive workloads that cannot tolerate a cold start, pin the project to always-on so that an instance is always running.

Mode	Behavior
Sleep/wake	Idle app sleeps; wakes automatically on the next request
Always-on	An instance stays running; no wake delay

Tradeoff: Sleep/wake keeps lightweight or bursty apps efficient. Always-on gives up some of that efficiency in exchange for consistently fast first responses.
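One way to judge whether the wake delay matters for your workload is to measure it. The sketch below is an illustration, not part of the platform: it times the first call after the app has been idle and compares it with the median of the calls that follow. `fetch` is a hypothetical callable you supply (for instance, one that requests your app's URL), and the measurement only reflects a cold start if the app was actually asleep when the first call was made.

```python
import statistics
import time
from typing import Callable


def time_call(fetch: Callable[[], object]) -> float:
    """Return how long one call to fetch() takes, in seconds."""
    start = time.perf_counter()
    fetch()
    return time.perf_counter() - start


def wake_penalty(fetch: Callable[[], object], warm_samples: int = 5) -> float:
    """Estimate the wake delay in seconds.

    Computes the latency of the first call minus the median latency of
    the next `warm_samples` calls. Assumes the app was idle (asleep)
    before the first call; otherwise the result will be close to zero.
    """
    if warm_samples < 1:
        raise ValueError("warm_samples must be at least 1")
    first = time_call(fetch)
    warm = [time_call(fetch) for _ in range(warm_samples)]
    return first - statistics.median(warm)
```

If the estimated penalty is small compared with what your users tolerate, sleep/wake is likely sufficient; if not, always-on may be the better fit.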

Related

Logs & metrics to watch load before you scale
Rollbacks
Builds & the deploy pipeline
