Autoscaling Your App
Satusky supports three scaling approaches. This guide explains when to use each, walks you through every CLI flag, and shows exactly what each command prints and creates in the cluster.
Overview
| Approach | What scales | Trigger |
|---|---|---|
| Manual replicas | Pod count | You change it |
| HPA | Pod count | CPU or memory utilisation |
| VPA | CPU/memory per pod | Historical usage patterns |
Start simple. Add autoscaling only when you understand your traffic patterns.
Prerequisites
- `1ctl` installed and authenticated (`1ctl auth login`)
- A `satusky.toml` in your project directory, or use `--image` to skip the build step
- A machine provisioned (`1ctl machine list` to confirm)
Create a working directory for this guide:

    mkdir -p /tmp/autoscale-test
    cat > /tmp/autoscale-test/satusky.toml << 'EOF'
    [app]
    name = "autoscale-test"
    port = 80
    cpu = "0.5"
    memory = "256Mi"
    EOF
    cd /tmp/autoscale-test

Step 1: Start with Fixed Replicas
For a new deployment, start with an explicit replica count. This gives you a stable baseline before adding autoscaling.

    1ctl deploy --image nginx:alpine --machine compute-main-01 --replicas 1

The CLI prints a step-by-step progress indicator as it works:

    💡 Using pre-built image: nginx:alpine
    Step 2/5: Creating/updating deployment autoscale-test ✓
    Step 3/5: Configuring services autoscale-test ✓
    Step 4/5: Setting up environment and storage autoscale-test ✓
    Step 5/5: Configuring public routing and dependencies autoscale-test ✓
    ✅ 🚀 Deployment for autoscale-test is successful! Your app is live at: https://<subdomain>.satusky.com
    Deployment ID: <uuid>

Running two replicas means one pod can restart without taking your app offline. Increase to 2 for any production workload:

    1ctl deploy --image nginx:alpine --machine compute-main-01 --replicas 2

Step 2: Add a Pod Disruption Budget
A PodDisruptionBudget (PDB) prevents Kubernetes from removing too many pods simultaneously during node maintenance or rolling updates.
A PDB is created when you pass `--pdb`. The platform also enables one automatically when `--replicas` is greater than 1, but you can control it explicitly.
PDB with a percentage minimum

    1ctl deploy \
      --image nginx:alpine \
      --machine compute-main-01 \
      --replicas 2 \
      --pdb \
      --pdb-type percent \
      --pdb-percent 50

`--pdb-percent 50` means at least 50% of pods must remain available at any time. For a 2-replica deployment this keeps at least 1 pod running.
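The percentage arithmetic can be checked by hand. The sketch below is illustrative only: the function names are hypothetical and not part of `1ctl`. It assumes the standard Kubernetes behaviour of rounding a percentage `minAvailable` up to a whole pod, and it assumes every replica is healthy when counting allowed disruptions.

```shell
# Hypothetical helpers (not part of 1ctl): translate a percentage PDB into pod counts.
# Assumption: Kubernetes rounds a percentage minAvailable UP to a whole pod.
pdb_min_available() {
  replicas=$1
  percent=$2
  # Integer ceiling of replicas * percent / 100
  echo $(( (replicas * percent + 99) / 100 ))
}

# Assumption: all replicas are currently healthy.
pdb_allowed_disruptions() {
  replicas=$1
  percent=$2
  min=$(pdb_min_available "$replicas" "$percent")
  echo $(( replicas - min ))
}

pdb_min_available 2 50         # prints 1
pdb_allowed_disruptions 2 50   # prints 1
pdb_min_available 3 50         # prints 2
```

For 2 replicas at 50% this gives 1 pod that must stay up and 1 allowed disruption. With an odd replica count the rounding matters: 3 replicas at 50% keeps 2 pods available, not 1.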
PDB with a fixed minimum

    1ctl deploy \
      --image nginx:alpine \
      --machine compute-main-01 \
      --replicas 3 \
      --pdb \
      --pdb-type fixed \
      --pdb-min-available 2

PDB flags reference
| Flag | Default | Description |
|---|---|---|
| `--pdb` | false | Enable PodDisruptionBudget |
| `--pdb-type` | auto | `auto`, `fixed`, or `percent` |
| `--pdb-min-available` | 0 | Minimum available pods (used with `--pdb-type fixed`) |
| `--pdb-percent` | 0 | Minimum available percentage 1–100 (used with `--pdb-type percent`) |
The PDB is named `<app-name>-pdb` in your namespace. You can verify it was created:

    kubectl -n <your-namespace> get pdb
    NAME                 MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
    autoscale-test-pdb   50%             N/A               1                     4s

Known limitation: `deploy destroy` removes the Deployment, HPA, and VPA but does not delete the PDB. Delete it manually if needed:

    kubectl -n <your-namespace> delete pdb autoscale-test-pdb

Step 3: Enable Horizontal Pod Autoscaler
HPA scales the number of pod replicas based on CPU or memory utilisation. It is the right choice for stateless apps with variable traffic (web APIs, queue workers).

    1ctl deploy \
      --image nginx:alpine \
      --machine compute-main-01 \
      --hpa \
      --hpa-min-replicas 2 \
      --hpa-max-replicas 10 \
      --hpa-cpu-target 70 \
      --pdb \
      --pdb-type percent \
      --pdb-percent 50

What this does:
- Keeps at least 2 replicas running at all times
- Scales up to 10 replicas when average CPU exceeds 70%
- Scales back down when CPU drops and stays low for ~5 minutes (the Kubernetes default scale-down stabilisation window)
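The scaling decision follows the standard Kubernetes HPA formula, `desired = ceil(current * currentUtilisation / targetUtilisation)`, clamped to the configured minimum and maximum. The sketch below is a simplified illustration with a hypothetical function name. It ignores the controller's tolerance band and stabilisation window, so treat it as a way to reason about targets rather than a reproduction of the controller.

```shell
# Hypothetical helper (not part of 1ctl or kubectl): simplified HPA replica calculation.
# Arguments: current replicas, observed utilisation %, target utilisation %, min, max
hpa_desired_replicas() {
  current=$1
  usage=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling of current * usage / target
  desired=$(( (current * usage + target - 1) / target ))
  # Clamp to the configured replica bounds
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

hpa_desired_replicas 2 140 70 2 10   # prints 4  (CPU at twice the target doubles the pods)
hpa_desired_replicas 4 35 70 2 10    # prints 2  (CPU at half the target halves the pods)
hpa_desired_replicas 2 700 70 2 10   # prints 10 (capped at the maximum)
```

With the flags above, 2 pods averaging 140% of their CPU request would be scaled to 4, and no amount of load pushes the count past 10.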
Adding memory-based scaling:

    1ctl deploy \
      --image nginx:alpine \
      --machine compute-main-01 \
      --hpa \
      --hpa-min-replicas 2 \
      --hpa-max-replicas 10 \
      --hpa-cpu-target 70 \
      --hpa-memory-target 80

`--hpa-memory-target 0` disables memory scaling (default).
HPA flags reference
| Flag | Default | Description |
|---|---|---|
| `--hpa` | false | Enable HorizontalPodAutoscaler |
| `--hpa-min-replicas` | 1 | Minimum replicas to keep running |
| `--hpa-max-replicas` | 10 | Maximum replicas to scale to |
| `--hpa-cpu-target` | 80 | Target average CPU utilisation percentage |
| `--hpa-memory-target` | 0 | Target average memory utilisation percentage (0 = disabled) |
The HPA is named `<app-name>-hpa`. You can inspect it:

    kubectl -n <your-namespace> get hpa
    NAME                 REFERENCE                   TARGETS              MINPODS   MAXPODS   REPLICAS   AGE
    autoscale-test-hpa   Deployment/autoscale-test   cpu: <unknown>/70%   1         5         0          3s

Cluster requirement: HPA can only scale pods on live metrics when `metrics-server` is running in the cluster. On Satusky managed clusters, metrics-server is available. The `<unknown>` in the TARGETS column appears briefly after creation and resolves within one scrape interval (~60 seconds) once pods are running. Observing CPU-driven scale events also requires real traffic.
Step 4 (Optional): Right-Size with VPA
VPA analyses historical CPU and memory usage and adjusts resource requests over time. Use it to right-size your pods, not to handle traffic spikes (that is what HPA is for).
VPA modes
| Mode | Behaviour |
|---|---|
| `Off` | Computes recommendations only; no changes applied |
| `Initial` | Sets resource requests when a pod starts; never updates running pods |
| `Auto` | Updates resource requests on running pods (triggers restarts) |
Start with `Initial` mode. VPA then applies its recommendations only when a pod starts, so running pods are never restarted, and you can review the recommendations before switching to `Auto`:

    1ctl deploy \
      --image nginx:alpine \
      --machine compute-main-01 \
      --vpa \
      --vpa-mode Initial \
      --vpa-min-cpu 0.1 \
      --vpa-max-cpu 2 \
      --vpa-min-memory 64Mi \
      --vpa-max-memory 1Gi

The bounds keep the requests VPA sets within the range you intend.
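The effect of the bounds can be pictured as a simple clamp. The sketch below uses a hypothetical function name and works in millicores, so `--vpa-min-cpu 0.1` is 100 and `--vpa-max-cpu 2` is 2000. It assumes the flags map onto the usual VPA minimum and maximum allowed values, which this guide does not state explicitly.

```shell
# Hypothetical helper (not part of 1ctl): clamp a VPA CPU recommendation to the bounds.
# Arguments: recommended millicores, minimum millicores, maximum millicores
vpa_clamp_millicores() {
  recommended=$1
  min=$2
  max=$3
  if [ "$recommended" -lt "$min" ]; then
    echo "$min"
  elif [ "$recommended" -gt "$max" ]; then
    echo "$max"
  else
    echo "$recommended"
  fi
}

vpa_clamp_millicores 50 100 2000     # prints 100  (raised to the minimum)
vpa_clamp_millicores 3500 100 2000   # prints 2000 (capped at the maximum)
vpa_clamp_millicores 400 100 2000    # prints 400  (within bounds, unchanged)
```

Under that assumption, a recommendation of 3.5 CPUs would be applied as 2 CPUs, and a recommendation below 0.1 CPU would be raised to 0.1.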
VPA flags reference
| Flag | Default | Description |
|---|---|---|
| `--vpa` | false | Enable VerticalPodAutoscaler |
| `--vpa-mode` | Off | `Off`, `Initial`, or `Auto` |
| `--vpa-min-cpu` | (none) | Minimum CPU VPA may request (e.g. `100m`) |
| `--vpa-max-cpu` | (none) | Maximum CPU VPA may request (e.g. `4`) |
| `--vpa-min-memory` | (none) | Minimum memory VPA may request (e.g. `64Mi`) |
| `--vpa-max-memory` | (none) | Maximum memory VPA may request (e.g. `1Gi`) |
The VPA object is named `<app-name>-vpa`:

    kubectl -n <your-namespace> get vpa
    NAME                 MODE      CPU   MEM   PROVIDED   AGE
    autoscale-test-vpa   Initial                          3s

The CPU and MEM columns populate after VPA has collected enough historical data to make a recommendation.
Cluster requirement: VPA in `Initial` or `Auto` mode requires the VPA admission controller and recommender to be installed in the cluster. On Satusky managed clusters, the VPA components are available. In `Off` mode the object is created but nothing is applied.
HPA + VPA Auto incompatibility
You cannot use `--hpa` and `--vpa-mode Auto` together. The CLI enforces this and exits with an error:

    ❌ validation failed: HPA and VPA with mode 'Auto' cannot be used together - they both try to scale resources

The reason: HPA scales pod count based on utilisation as a percentage of the resource request, and VPA in `Auto` mode changes that resource request. The two controllers fight each other and produce unpredictable behaviour.
Safe combinations:
- HPA + `--vpa-mode Off` (VPA recommendations only, not applied)
- HPA + `--vpa-mode Initial` (VPA sets requests at pod start, HPA drives count)
- VPA `--vpa-mode Auto` alone (no HPA)
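If you generate deploy commands from a script, you can apply the same rule before calling the CLI. The function below is a hypothetical pre-flight check written for illustration. It mirrors the rule described above but is not the CLI's own validation code.

```shell
# Hypothetical pre-flight check (not the CLI's implementation).
# Arguments: "true"/"false" for HPA, and the VPA mode ("Off", "Initial", or "Auto")
check_autoscaling_combo() {
  hpa_enabled=$1
  vpa_mode=$2
  if [ "$hpa_enabled" = "true" ] && [ "$vpa_mode" = "Auto" ]; then
    echo "unsafe: HPA with VPA mode Auto"
    return 1
  fi
  echo "ok"
}

check_autoscaling_combo true Initial        # prints ok
check_autoscaling_combo false Auto          # prints ok
check_autoscaling_combo true Auto || true   # prints the unsafe message, returns 1
```

The non-zero return code lets a wrapper script stop before it ever invokes `1ctl deploy` with an invalid combination.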
Step 5: Set a Rolling Update Strategy
By default, Satusky uses a rolling update strategy (25% max surge, 25% max unavailable). You can specify the strategy explicitly:

    1ctl deploy \
      --image nginx:alpine \
      --machine compute-main-01 \
      --strategy rolling \
      --rolling-max-surge 25% \
      --rolling-max-unavailable 0

To replace all pods at once with a brief outage window, use `recreate`:

    1ctl deploy \
      --image nginx:alpine \
      --machine compute-main-01 \
      --strategy recreate

Strategy flags reference
| Flag | Default | Description |
|---|---|---|
| `--strategy` | rolling | Rollout strategy: `rolling` or `recreate` |
| `--rolling-max-surge` | 25% | Extra pods allowed during rollout (count or percentage, e.g. `1` or `25%`) |
| `--rolling-max-unavailable` | 25% | Pods allowed to be unavailable during rollout (count or percentage, e.g. `0` or `25%`) |
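The rollout tuning flags map onto the Kubernetes Deployment spec. For example, `--rolling-max-surge 1 --rolling-max-unavailable 0` yields `maxSurge: 1` and `maxUnavailable: 0` on the live Deployment. The table under What Requires a Production Cluster lists this behaviour as pending a platform fix, so the current release may still apply the defaults.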
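Percentage values translate into pod counts differently for the two settings: in Kubernetes, a percentage `maxSurge` is rounded up and a percentage `maxUnavailable` is rounded down. The sketch below, with a hypothetical function name, works out the resulting pod-count envelope during a rollout.

```shell
# Hypothetical helper (not part of 1ctl): pod-count envelope during a rolling update.
# Arguments: replicas, max surge %, max unavailable %
rollout_bounds() {
  replicas=$1
  surge_pct=$2
  unavailable_pct=$3
  surge=$(( (replicas * surge_pct + 99) / 100 ))        # percentage surge rounds up
  unavailable=$(( replicas * unavailable_pct / 100 ))   # percentage unavailable rounds down
  echo "max_pods=$(( replicas + surge )) min_available=$(( replicas - unavailable ))"
}

rollout_bounds 2 25 25   # prints max_pods=3 min_available=2
rollout_bounds 8 25 25   # prints max_pods=10 min_available=6
```

With only 2 replicas, the 25% defaults already behave like "one extra pod, none unavailable" because of the rounding. The difference between the defaults and `--rolling-max-unavailable 0` only shows at higher replica counts.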
Full Production Example
A stateless API with HPA, PDB, and conservative VPA right-sizing in `Initial` mode:

    1ctl deploy \
      --image nginx:alpine \
      --machine compute-main-01 \
      --cpu 0.5 \
      --memory 512Mi \
      --replicas 2 \
      --hpa \
      --hpa-min-replicas 2 \
      --hpa-max-replicas 20 \
      --hpa-cpu-target 65 \
      --pdb \
      --pdb-type fixed \
      --pdb-min-available 1 \
      --vpa \
      --vpa-mode Initial \
      --vpa-min-cpu 0.25 \
      --vpa-max-cpu 2 \
      --vpa-min-memory 128Mi \
      --vpa-max-memory 1Gi \
      --strategy rolling

What this creates in the cluster:
- A `Deployment` with 2 initial replicas
- An `HPA` that scales between 2 and 20 pods, targeting 65% average CPU
- A `PDB` ensuring at least 1 pod stays available during voluntary disruptions
- A `VPA` that sets resource requests when pods start (no live restarts)
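One thing worth checking in a combined configuration is that the PDB minimum stays below the HPA minimum. If the two are equal, Kubernetes allows no voluntary disruptions when the app is scaled down to its floor, and node drains can stall. The function below is a hypothetical sanity check written for illustration. It is not a `1ctl` feature.

```shell
# Hypothetical sanity check (not part of 1ctl): compare PDB minimum with HPA minimum.
# Arguments: HPA minimum replicas, PDB minimum available pods
check_pdb_against_hpa() {
  hpa_min=$1
  pdb_min_available=$2
  headroom=$(( hpa_min - pdb_min_available ))
  if [ "$headroom" -lt 1 ]; then
    echo "blocked: no voluntary disruptions possible at $hpa_min replicas"
    return 1
  fi
  echo "ok: $headroom pod(s) can be disrupted at $hpa_min replicas"
}

check_pdb_against_hpa 2 1           # prints ok: 1 pod(s) can be disrupted at 2 replicas
check_pdb_against_hpa 2 2 || true   # prints the blocked message, returns 1
```

The example above passes this check: with `--hpa-min-replicas 2` and `--pdb-min-available 1`, one pod can always be evicted during maintenance.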
Checking Deployment State
deploy get
Shows the current deployment record from the Satusky control plane:

    1ctl deploy get
    Deployment Details
    ──────────────────
    Deployment ID: e36538c6-b539-4fe8-9f73-8061160f306e
    Status: completed
    URL: https://cleverhawk-kwvmacg.satusky.com
    Deployed to machines: <machine-uuid>
    Type: production
    Region:
    Zone:
    Version: alpine
    Port: 80
    CPU Request: 0.5
    Memory Request: 256Mi
    Memory Limit: 256Mi
    Created: 1 minute ago
    Last Updated: just now

`deploy get` shows the deployment configuration. It does not show current replica count, HPA scaling events, or VPA recommendations.
deploy status
Shows the live status of the running deployment:

    1ctl deploy status
    Status: Running
    Message: Deployment is running normally
    Progress: 100%

`deploy status` does not show HPA scaling state (how many replicas are currently running, or what the current CPU utilisation is). For that, see your credits dashboard or inspect the HPA directly with `kubectl`.
credits usage
Shows how many machine-hours your deployment has consumed over the last 7 days:

    1ctl credits usage
    💡 No machine usage found for the last 7 days

Usage data appears after your first billing cycle. Higher-than-expected credit consumption during a traffic spike is a sign that HPA scaled up as intended. Related commands:

    1ctl credits balance        # current credit balance
    1ctl credits transactions   # itemised billing history

Cleanup
Destroy the deployment and all associated resources:

    1ctl deploy destroy -y
    💡 Destroying deployment e36538c6-b539-4fe8-9f73-8061160f306e...
    ✅ Deployment e36538c6-b539-4fe8-9f73-8061160f306e destroyed successfully

`deploy destroy` removes the Deployment, HPA, and VPA objects. It does not remove the PDB. If you created one, delete it manually:

    kubectl -n <your-namespace> delete pdb <app-name>-pdb

Verify everything is gone:

    kubectl -n <your-namespace> get deployment,hpa,vpa,pdb | grep <app-name>

What Requires a Production Cluster
The following capabilities require infrastructure beyond what the CLI alone can verify:
| Capability | Requirement |
|---|---|
| HPA actually scaling pods | metrics-server running; real traffic load |
| VPA actually updating resource requests | VPA admission controller + recommender installed |
| Seeing RESTARTS drop after VPA right-sizing | Real workload + time for VPA to collect data |
| `credits usage` reflecting scale events | Active deployment with billing cycle elapsed |
| Rolling update surge/unavailable values applying | Platform fix pending (current release: always uses defaults) |
On Satusky managed clusters, metrics-server and the VPA components are installed. You do not need to install them yourself.