
Autoscaling Your App

Satusky supports three scaling approaches. This guide explains when to use each, walks you through every CLI flag, and shows exactly what each command prints and creates in the cluster.

Approach | What scales | Trigger
Manual replicas | Pod count | You change it
HPA | Pod count | CPU or memory utilisation
VPA | CPU/memory per pod | Historical usage patterns

Start simple. Add autoscaling only when you understand your traffic patterns.


Before you start, you need:

  • 1ctl installed and authenticated (1ctl auth login)
  • A satusky.toml in your project directory, or use --image to skip the build step
  • A machine provisioned (1ctl machine list to confirm)

Create a working directory for this guide:

mkdir -p /tmp/autoscale-test
cat > /tmp/autoscale-test/satusky.toml << 'EOF'
[app]
name = "autoscale-test"
port = 80
cpu = "0.5"
memory = "256Mi"
EOF
cd /tmp/autoscale-test

For a new deployment, start with an explicit replica count. This gives you a stable baseline before adding autoscaling.

1ctl deploy --image nginx:alpine --machine compute-main-01 --replicas 1

The CLI prints a step-by-step progress indicator as it works. Because --image skips the build step, the output starts at step 2:

💡 Using pre-built image: nginx:alpine
Step 2/5: Creating/updating deployment autoscale-test ✓
Step 3/5: Configuring services autoscale-test ✓
Step 4/5: Setting up environment and storage autoscale-test ✓
Step 5/5: Configuring public routing and dependencies autoscale-test ✓
✅ 🚀 Deployment for autoscale-test is successful! Your app is live at: https://<subdomain>.satusky.com
Deployment ID: <uuid>

Increase to 2 replicas for any production workload. With two replicas, one pod can restart without taking your app offline:

1ctl deploy --image nginx:alpine --machine compute-main-01 --replicas 2

A PodDisruptionBudget (PDB) prevents Kubernetes from removing too many pods simultaneously during node maintenance or rolling updates.

A PDB is created when you pass --pdb. The platform also enables one automatically when --replicas is greater than 1; the flags below let you control it explicitly.

1ctl deploy \
--image nginx:alpine \
--machine compute-main-01 \
--replicas 2 \
--pdb \
--pdb-type percent \
--pdb-percent 50

--pdb-percent 50 means at least 50% of pods must remain available at any time. For a 2-replica deployment this keeps at least 1 pod running.

To require a fixed number of available pods instead of a percentage, use --pdb-type fixed:

1ctl deploy \
--image nginx:alpine \
--machine compute-main-01 \
--replicas 3 \
--pdb \
--pdb-type fixed \
--pdb-min-available 2

Flag | Default | Description
--pdb | false | Enable PodDisruptionBudget
--pdb-type | auto | auto, fixed, or percent
--pdb-min-available | 0 | Minimum available pods (used with --pdb-type fixed)
--pdb-percent | 0 | Minimum available percentage 1–100 (used with --pdb-type percent)

The PDB is named <app-name>-pdb in your namespace. You can verify it was created:

kubectl -n <your-namespace> get pdb
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
autoscale-test-pdb 50% N/A 1 4s
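
The ALLOWED DISRUPTIONS value in the output above follows from simple arithmetic. The sketch below assumes standard Kubernetes behaviour, where a percentage-based minimum is rounded up to a whole pod. The function names are illustrative and are not part of 1ctl.

```shell
# Minimum pods a percentage-based PDB keeps available.
# Assumption: Kubernetes rounds the percentage up to a whole pod.
pdb_min_available() {
  # $1 = replica count, $2 = --pdb-percent value
  echo $(( ($1 * $2 + 99) / 100 ))
}

# Voluntary disruptions allowed while every replica is healthy.
pdb_allowed_disruptions() {
  # $1 = replica count, $2 = --pdb-percent value
  echo $(( $1 - $(pdb_min_available "$1" "$2") ))
}

pdb_min_available 2 50         # prints 1
pdb_allowed_disruptions 2 50   # prints 1, as in the ALLOWED DISRUPTIONS column
pdb_min_available 3 50         # prints 2 (1.5 rounded up)
```

Note the effect of rounding: with 3 replicas, a 50% budget behaves like a fixed minimum of 2.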

Known limitation: deploy destroy removes the Deployment, HPA, and VPA but does not delete the PDB. Delete it manually if needed:

kubectl -n <your-namespace> delete pdb autoscale-test-pdb

HPA scales the number of pod replicas based on CPU or memory utilisation. It is the right choice for stateless apps with variable traffic (web APIs, queue workers).

1ctl deploy \
--image nginx:alpine \
--machine compute-main-01 \
--hpa \
--hpa-min-replicas 2 \
--hpa-max-replicas 10 \
--hpa-cpu-target 70 \
--pdb \
--pdb-type percent \
--pdb-percent 50

What this does:

  • Keeps at least 2 replicas running at all times
  • Scales out, up to a maximum of 10 replicas, when average CPU utilisation exceeds 70% of the requested CPU
  • Scales back in once CPU has stayed low for about 5 minutes (the Kubernetes default scale-down stabilisation window)
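
The scaling decision itself can be sketched with the formula from the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current utilisation / target utilisation), clamped to the configured minimum and maximum. This is a simplified illustration; the real controller also applies a tolerance band and the stabilisation window, which are left out here. The function name is hypothetical.

```shell
# Desired replica count for one HPA evaluation (simplified).
hpa_desired_replicas() {
  # $1 = current replicas, $2 = current average utilisation (%)
  # $3 = target utilisation (%), $4 = min replicas, $5 = max replicas
  desired=$(( ($1 * $2 + $3 - 1) / $3 ))   # integer ceiling division
  if [ "$desired" -lt "$4" ]; then desired=$4; fi
  if [ "$desired" -gt "$5" ]; then desired=$5; fi
  echo "$desired"
}

hpa_desired_replicas 2 140 70 2 10    # prints 4  (load doubled)
hpa_desired_replicas 4 35 70 2 10     # prints 2  (load halved)
hpa_desired_replicas 10 200 70 2 10   # prints 10 (capped at the maximum)
```

The last call shows why --hpa-max-replicas matters: once the cap is reached, extra load raises utilisation instead of adding pods.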

Adding memory-based scaling:

1ctl deploy \
--image nginx:alpine \
--machine compute-main-01 \
--hpa \
--hpa-min-replicas 2 \
--hpa-max-replicas 10 \
--hpa-cpu-target 70 \
--hpa-memory-target 80

--hpa-memory-target 0 disables memory scaling (default).

Flag | Default | Description
--hpa | false | Enable HorizontalPodAutoscaler
--hpa-min-replicas | 1 | Minimum replicas to keep running
--hpa-max-replicas | 10 | Maximum replicas to scale to
--hpa-cpu-target | 80 | Target average CPU utilisation percentage
--hpa-memory-target | 0 | Target average memory utilisation percentage (0 = disabled)

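The HPA is named <app-name>-hpa. You can inspect it with kubectl. The sample output below comes from an HPA with a minimum of 1 and a maximum of 5 replicas, so MINPODS and MAXPODS differ from the examples above: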

kubectl -n <your-namespace> get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
autoscale-test-hpa Deployment/autoscale-test cpu: <unknown>/70% 1 5 0 3s

Cluster requirement: for HPA to scale pods on live metrics, metrics-server must be running in the cluster. On Satusky managed clusters, metrics-server is available. The <unknown> in the TARGETS column appears briefly after creation and resolves within one scrape interval (~60 seconds) once pods are running. You will only see CPU drive scale events under real traffic.
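
If you script against this output, a small helper can tell you when the metrics have arrived. The sketch below only inspects text piped into it; the function name is illustrative, and the polling loop in the comment assumes the HPA name and namespace placeholder used in this guide.

```shell
# Reads `kubectl get hpa` output on stdin.
# Exit 0: no row reports <unknown>. Exit 1: metrics are not available yet.
hpa_targets_ready() {
  ! grep -q '<unknown>'
}

# Usage sketch (not executed here): poll every 10 seconds until metrics appear.
# until kubectl -n <your-namespace> get hpa autoscale-test-hpa | hpa_targets_ready; do
#   sleep 10
# done
```

If the loop never finishes, check that the pods are actually running before suspecting metrics-server.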


VPA analyses historical CPU and memory usage and adjusts resource requests over time. Use it to right-size your pods, not to handle traffic spikes (that is what HPA is for).

Mode | Behaviour
Off | Computes recommendations only; no changes applied
Initial | Sets resource requests when a pod starts; never updates running pods
Auto | Updates resource requests on running pods (triggers restarts)

Start with Initial mode: VPA applies its recommendations only when a pod starts and never restarts running pods. If you want to review recommendations before anything is applied, use Off instead.

1ctl deploy \
--image nginx:alpine \
--machine compute-main-01 \
--vpa \
--vpa-mode Initial \
--vpa-min-cpu 0.1 \
--vpa-max-cpu 2 \
--vpa-min-memory 64Mi \
--vpa-max-memory 1Gi

The bounds keep VPA from requesting more, or less, than you intend.
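
As a rough illustration of what the bounds do, the sketch below clamps a CPU recommendation to a minimum and maximum. It assumes the flags behave like the minAllowed and maxAllowed fields of a Kubernetes VPA resource policy, which this guide does not confirm; the function names are hypothetical. It also shows that 0.1 and 100m describe the same amount of CPU.

```shell
# Convert a CPU quantity ("0.25", "2" or "250m") to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v cores="$1" 'BEGIN { printf "%d\n", cores * 1000 + 0.5 }' ;;
  esac
}

# Clamp a recommendation to the [min, max] bounds; prints millicores.
clamp_cpu() {
  # $1 = recommendation, $2 = min bound, $3 = max bound
  rec=$(cpu_to_millicores "$1")
  lo=$(cpu_to_millicores "$2")
  hi=$(cpu_to_millicores "$3")
  if [ "$rec" -lt "$lo" ]; then rec=$lo; fi
  if [ "$rec" -gt "$hi" ]; then rec=$hi; fi
  echo "${rec}m"
}

clamp_cpu 25m 0.1 2   # prints 100m  (raised to the minimum)
clamp_cpu 3 0.1 2     # prints 2000m (capped at the maximum)
clamp_cpu 0.4 0.1 2   # prints 400m  (within bounds, unchanged)
```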

Flag | Default | Description
--vpa | false | Enable VerticalPodAutoscaler
--vpa-mode | Off | Off, Initial, or Auto
--vpa-min-cpu | - | Minimum CPU VPA may request (e.g. 100m)
--vpa-max-cpu | - | Maximum CPU VPA may request (e.g. 4)
--vpa-min-memory | - | Minimum memory VPA may request (e.g. 64Mi)
--vpa-max-memory | - | Maximum memory VPA may request (e.g. 1Gi)

The VPA object is named <app-name>-vpa:

kubectl -n <your-namespace> get vpa
NAME MODE CPU MEM PROVIDED AGE
autoscale-test-vpa Initial 3s

CPU and MEM columns populate after VPA has collected enough historical data to make a recommendation.

Cluster requirement: VPA in Initial or Auto mode requires the VPA admission controller and recommender to be installed in the cluster. On Satusky managed clusters, the VPA components are available. In Off mode the object is created but nothing is applied.

You cannot use --hpa and --vpa-mode Auto together. The CLI enforces this and exits with an error:

❌ validation failed: HPA and VPA with mode 'Auto' cannot be used together - they both try to scale resources

The reason: HPA scales pod count based on percentage of the resource request. VPA in Auto mode changes the resource request. The two controllers fight each other and produce unpredictable behaviour.
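
A short calculation makes the conflict concrete. The numbers are made up for illustration: a pod using 350m of CPU, with a request that VPA raises from 500m to 1000m. The function name is hypothetical.

```shell
# CPU utilisation as HPA sees it: usage as a percentage of the request.
cpu_utilisation_pct() {
  # $1 = usage in millicores, $2 = request in millicores
  echo $(( $1 * 100 / $2 ))
}

cpu_utilisation_pct 350 500    # prints 70 (at a 70% target: no scaling)
cpu_utilisation_pct 350 1000   # prints 35 (same load, request doubled by VPA)
```

The load has not changed, yet utilisation halves, so HPA would remove pods. The remaining pods then carry more load each, which pushes VPA's next recommendation higher again.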

Safe combinations:

  • HPA + --vpa-mode Off (VPA recommendations only, not applied)
  • HPA + --vpa-mode Initial (VPA sets requests at pod start, HPA drives count)
  • VPA --vpa-mode Auto alone (no HPA)

By default, Satusky uses a rolling update strategy (25% max surge, 25% max unavailable). You can specify the strategy explicitly:

1ctl deploy \
--image nginx:alpine \
--machine compute-main-01 \
--strategy rolling \
--rolling-max-surge 25% \
--rolling-max-unavailable 0

To replace all pods at once with a brief outage window, use recreate:

1ctl deploy \
--image nginx:alpine \
--machine compute-main-01 \
--strategy recreate

Flag | Default | Description
--strategy | rolling | Rollout strategy: rolling or recreate
--rolling-max-surge | 25% | Extra pods allowed during rollout (count or percentage, e.g. 1 or 25%)
--rolling-max-unavailable | 25% | Pods allowed to be unavailable during rollout (count or percentage, e.g. 0 or 25%)

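The rollout tuning flags map to the Kubernetes Deployment spec. For example, --rolling-max-surge 1 --rolling-max-unavailable 0 should yield maxSurge: 1 and maxUnavailable: 0 on the live Deployment. The capability table at the end of this guide lists a pending platform fix under which the current release always uses the defaults, so check the live Deployment if these values matter to you.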
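
Percentages translate into pod counts differently for the two settings. The sketch below assumes standard Kubernetes rounding: max surge rounds up, max unavailable rounds down. The function names are illustrative.

```shell
# Extra pods allowed during a rollout (percentage rounds up).
rolling_max_surge() {
  # $1 = replicas, $2 = surge percentage
  echo $(( ($1 * $2 + 99) / 100 ))
}

# Pods allowed to be unavailable during a rollout (percentage rounds down).
rolling_max_unavailable() {
  # $1 = replicas, $2 = unavailable percentage
  echo $(( $1 * $2 / 100 ))
}

rolling_max_surge 2 25          # prints 1
rolling_max_unavailable 2 25    # prints 0
rolling_max_surge 10 25         # prints 3
rolling_max_unavailable 10 25   # prints 2
```

With 2 replicas, the 25% defaults therefore add one new pod before removing an old one. To see what the live Deployment actually uses, you can run kubectl -n <your-namespace> get deployment autoscale-test -o jsonpath='{.spec.strategy.rollingUpdate}'.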


A stateless API with HPA, PDB, and conservative VPA right-sizing in Initial mode:

1ctl deploy \
--image nginx:alpine \
--machine compute-main-01 \
--cpu 0.5 \
--memory 512Mi \
--replicas 2 \
--hpa \
--hpa-min-replicas 2 \
--hpa-max-replicas 20 \
--hpa-cpu-target 65 \
--pdb \
--pdb-type fixed \
--pdb-min-available 1 \
--vpa \
--vpa-mode Initial \
--vpa-min-cpu 0.25 \
--vpa-max-cpu 2 \
--vpa-min-memory 128Mi \
--vpa-max-memory 1Gi \
--strategy rolling

What this creates in the cluster:

  • A Deployment with 2 initial replicas
  • An HPA that scales between 2 and 20 pods at 65% CPU
  • A PDB ensuring at least 1 pod is always available
  • A VPA that sets resource requests when pods start (no live restarts)

1ctl deploy get shows the current deployment record from the Satusky control plane:

1ctl deploy get
Deployment Details
──────────────────
Deployment ID: e36538c6-b539-4fe8-9f73-8061160f306e
Status: completed
URL: https://cleverhawk-kwvmacg.satusky.com
Deployed to machines: <machine-uuid>
Type: production
Region:
Zone:
Version: alpine
Port: 80
CPU Request: 0.5
Memory Request: 256Mi
Memory Limit: 256Mi
Created: 1 minute ago
Last Updated: just now

deploy get shows the deployment configuration. It does not show current replica count, HPA scaling events, or VPA recommendations.

1ctl deploy status shows the live status of the running deployment:

1ctl deploy status
Status: Running
Message: Deployment is running normally
Progress: 100%

deploy status does not show HPA scaling state (how many replicas are currently running, or what the current CPU utilisation is). For that, see your credits dashboard or inspect the HPA directly with kubectl.

1ctl credits usage shows how many machine-hours your deployment has consumed over the last 7 days:

1ctl credits usage
💡 No machine usage found for the last 7 days

Usage data appears after your first billing cycle. Higher-than-expected credit consumption during a traffic spike is a sign that HPA scaled up; to confirm it, inspect the HPA with kubectl. Related commands:

1ctl credits balance # current credit balance
1ctl credits transactions # itemised billing history

Destroy the deployment and its associated resources:

1ctl deploy destroy -y
💡 Destroying deployment e36538c6-b539-4fe8-9f73-8061160f306e...
✅ Deployment e36538c6-b539-4fe8-9f73-8061160f306e destroyed successfully

deploy destroy removes the Deployment, HPA, and VPA objects. It does not remove the PDB. If you created one, delete it manually:

kubectl -n <your-namespace> delete pdb <app-name>-pdb

Verify everything is gone:

kubectl -n <your-namespace> get deployment,hpa,vpa,pdb | grep <app-name>
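
In a script, note that grep exits with a non-zero status when it finds nothing, which here is the successful case. A minimal wrapper, with a hypothetical function name, inverts that so a clean namespace counts as success:

```shell
# Reads `kubectl get ...` output on stdin and prints any rows for the app.
# Exit 0: nothing left. Exit 1: at least one resource remains.
no_leftovers() {
  # $1 = app name
  if grep -- "$1"; then
    return 1
  fi
  echo "no resources left for $1"
}

# Usage sketch (not executed here):
# kubectl -n <your-namespace> get deployment,hpa,vpa,pdb | no_leftovers autoscale-test
```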

The following capabilities require infrastructure beyond what the CLI alone can verify:

Capability | Requirement
HPA actually scaling pods | metrics-server running; real traffic load
VPA actually updating resource requests | VPA admission controller + recommender installed
Seeing RESTARTS drop after VPA right-sizing | Real workload + time for VPA to collect data
credits usage reflecting scale events | Active deployment with billing cycle elapsed
Rolling update surge/unavailable values applying | Platform fix pending (current release: always uses defaults)

On Satusky managed clusters, metrics-server and the VPA components are installed. You do not need to install them yourself.