Autoscaling Your App

Satusky supports three scaling approaches. This guide explains when to use each, walks you through every CLI flag, and shows exactly what each command prints and creates in the cluster.

Overview

Approach	What scales	Trigger
Manual replicas	Pod count	You change it
HPA	Pod count	CPU or memory utilisation
VPA	CPU/memory per pod	Historical usage patterns

Start simple. Add autoscaling only when you understand your traffic patterns.

Prerequisites

1ctl installed and authenticated (1ctl auth login)
A satusky.toml in your project directory (or pass --image to skip the build step)
A machine provisioned (1ctl machine list to confirm)

Create a working directory for this guide:

mkdir -p /tmp/autoscale-test
cat > /tmp/autoscale-test/satusky.toml << 'EOF'
[app]
  name   = "autoscale-test"
  port   = 80
  cpu    = "0.5"
  memory = "256Mi"
EOF
cd /tmp/autoscale-test

Step 1: Start with Fixed Replicas

For a new deployment, start with an explicit replica count. This gives you a stable baseline before adding autoscaling.

1ctl deploy --image nginx:alpine --machine compute-main-01 --replicas 1

The CLI prints a step-by-step progress indicator as it works:

💡 Using pre-built image: nginx:alpine
Step 2/5: Creating/updating deployment autoscale-test ✓
Step 3/5: Configuring services autoscale-test ✓
Step 4/5: Setting up environment and storage autoscale-test ✓
Step 5/5: Configuring public routing and dependencies autoscale-test ✓
✅ 🚀 Deployment for autoscale-test is successful! Your app is live at: https://<subdomain>.satusky.com
Deployment ID: <uuid>

With a single replica, any pod restart takes your app offline. For production workloads, run at least 2 replicas so one pod can restart while the other keeps serving:

1ctl deploy --image nginx:alpine --machine compute-main-01 --replicas 2

Step 2: Add a Pod Disruption Budget

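A PodDisruptionBudget (PDB) limits how many pods Kubernetes may evict at the same time during voluntary disruptions such as node drains and cluster upgrades. It does not govern a Deployment's own rolling updates; those follow the rollout strategy described in Step 5.

Passing --pdb creates a PDB. The platform also enables one automatically when --replicas is greater than 1; the flags below let you control it explicitly.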

PDB with a percentage minimum

1ctl deploy \
  --image nginx:alpine \
  --machine compute-main-01 \
  --replicas 2 \
  --pdb \
  --pdb-type percent \
  --pdb-percent 50

--pdb-percent 50 means at least 50% of pods must remain available at any time. For a 2-replica deployment this keeps at least 1 pod running.
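
The arithmetic behind a percentage PDB can be sketched in shell. Kubernetes rounds a percentage minimum up to the next whole pod. The helper names pdb_min_available and pdb_allowed_disruptions are illustrative only; they are not part of 1ctl or kubectl.

```shell
# Hypothetical helpers (not part of 1ctl or kubectl) that reproduce PDB arithmetic.

# pdb_min_available REPLICAS PERCENT
# Kubernetes rounds a percentage minAvailable up to a whole pod.
pdb_min_available() {
  local replicas="$1" percent="$2"
  echo $(( (replicas * percent + 99) / 100 ))
}

# pdb_allowed_disruptions REPLICAS PERCENT
# Pods that may be evicted at once when every replica is healthy.
pdb_allowed_disruptions() {
  local replicas="$1" percent="$2"
  echo $(( replicas - $(pdb_min_available "$replicas" "$percent") ))
}

pdb_min_available 2 50        # prints 1
pdb_allowed_disruptions 2 50  # prints 1
pdb_min_available 7 50        # prints 4
```

With 3 replicas and --pdb-percent 50, the same arithmetic gives a minimum of 2 available pods and 1 allowed disruption.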

PDB with a fixed minimum

1ctl deploy \
  --image nginx:alpine \
  --machine compute-main-01 \
  --replicas 3 \
  --pdb \
  --pdb-type fixed \
  --pdb-min-available 2

PDB flags reference

Flag	Default	Description
`--pdb`	`false`	Enable PodDisruptionBudget
`--pdb-type`	`auto`	`auto`, `fixed`, or `percent`
`--pdb-min-available`	`0`	Minimum available pods (used with `--pdb-type fixed`)
`--pdb-percent`	`0`	Minimum available percentage 1–100 (used with `--pdb-type percent`)

The PDB is named <app-name>-pdb in your namespace. You can verify it was created:

kubectl -n <your-namespace> get pdb

NAME                 MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
autoscale-test-pdb   50%             N/A               1                     4s

Known limitation: deploy destroy removes the Deployment, HPA, and VPA but does not delete the PDB. Delete it manually if needed:
kubectl -n <your-namespace> delete pdb autoscale-test-pdb

Step 3: Enable Horizontal Pod Autoscaler

HPA scales the number of pod replicas based on CPU or memory utilisation. It is the right choice for stateless apps with variable traffic (web APIs, queue workers).

1ctl deploy \
  --image nginx:alpine \
  --machine compute-main-01 \
  --hpa \
  --hpa-min-replicas 2 \
  --hpa-max-replicas 10 \
  --hpa-cpu-target 70 \
  --pdb \
  --pdb-type percent \
  --pdb-percent 50

What this does:

Keeps at least 2 replicas running at all times
Scales up, to at most 10 replicas, when average CPU utilisation exceeds 70% of the requested CPU
Scales back down when CPU drops and stays low for ~5 minutes (Kubernetes default cooldown)
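
The replica count HPA aims for follows the formula in the Kubernetes documentation: desired = ceil(current replicas × current utilisation / target utilisation), clamped to the minimum and maximum. The sketch below reproduces only that arithmetic and ignores the controller's tolerance band and stabilisation window. hpa_desired_replicas is a hypothetical helper, not a 1ctl command, and the utilisation figures are made up for illustration.

```shell
# Hypothetical helper: the core HPA replica calculation, without tolerance
# or stabilisation handling.
# hpa_desired_replicas CURRENT UTIL_PCT TARGET_PCT MIN MAX
hpa_desired_replicas() {
  local current="$1" util="$2" target="$3" min="$4" max="$5"
  # Integer ceiling of current * util / target.
  local desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

hpa_desired_replicas 2 140 70 2 10   # prints 4  (load is double the target)
hpa_desired_replicas 4 35 70 2 10    # prints 2  (load is half the target)
hpa_desired_replicas 5 300 70 2 10   # prints 10 (capped at the maximum)
```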

Adding memory-based scaling:

1ctl deploy \
  --image nginx:alpine \
  --machine compute-main-01 \
  --hpa \
  --hpa-min-replicas 2 \
  --hpa-max-replicas 10 \
  --hpa-cpu-target 70 \
  --hpa-memory-target 80

Memory-based scaling is disabled by default (--hpa-memory-target 0).

HPA flags reference

Flag	Default	Description
`--hpa`	`false`	Enable HorizontalPodAutoscaler
`--hpa-min-replicas`	`1`	Minimum replicas to keep running
`--hpa-max-replicas`	`10`	Maximum replicas to scale to
`--hpa-cpu-target`	`80`	Target average CPU utilisation percentage
`--hpa-memory-target`	`0`	Target average memory utilisation percentage (`0` = disabled)

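The HPA is named <app-name>-hpa. You can inspect it with kubectl. MINPODS and MAXPODS show the limits in effect; the sample below comes from an HPA with a minimum of 1 and a maximum of 5, so its values differ from the command above: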

kubectl -n <your-namespace> get hpa

NAME                 REFERENCE                   TARGETS              MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test-hpa   Deployment/autoscale-test   cpu: <unknown>/70%   1         5         0          3s

Cluster requirement: HPA can only scale pods when metrics-server is running in the cluster. Satusky managed clusters include it. The <unknown> in the TARGETS column appears briefly after creation and resolves within one scrape interval (~60 seconds) once pods are running. To see CPU actually drive scale events, you also need real traffic.
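
A small filter can tell you whether the HPA has started receiving metrics. The sketch below reads kubectl get hpa output on standard input and checks for the <unknown> placeholder. hpa_metrics_ready is a hypothetical helper name, and the sample line is taken from the output above.

```shell
# Hypothetical helper: succeeds only when no target is still <unknown>.
# Reads `kubectl get hpa` output on stdin.
hpa_metrics_ready() {
  if grep -q '<unknown>'; then
    echo "metrics not available yet"
    return 1
  fi
  echo "metrics available"
}

# Sample line from a freshly created HPA.
echo 'autoscale-test-hpa   Deployment/autoscale-test   cpu: <unknown>/70%   1   5   0   3s' \
  | hpa_metrics_ready || true   # prints "metrics not available yet"
```

Against a live cluster you would pipe real output into it, for example: kubectl -n <your-namespace> get hpa autoscale-test-hpa --no-headers | hpa_metrics_ready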

Step 4 (Optional): Right-Size with VPA

VPA analyses historical CPU and memory usage and adjusts resource requests over time. Use it to right-size your pods, not to handle traffic spikes (that is what HPA is for).

VPA modes

Mode	Behaviour
`Off`	Computes recommendations only; no changes applied
`Initial`	Sets resource requests when a pod starts; never updates running pods
`Auto`	Updates resource requests on running pods (triggers restarts)

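Start with Initial mode: VPA applies its recommendation only when a pod starts, so running pods are never restarted. To review recommendations before anything is applied, use Off instead. An Initial-mode deployment looks like this: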

1ctl deploy \
  --image nginx:alpine \
  --machine compute-main-01 \
  --vpa \
  --vpa-mode Initial \
  --vpa-min-cpu 0.1 \
  --vpa-max-cpu 2 \
  --vpa-min-memory 64Mi \
  --vpa-max-memory 1Gi

The bounds keep VPA's requests within the range you set: never below the minimums and never above the maximums.
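
The effect of the bounds can be pictured as a simple clamp on whatever VPA recommends. In the sketch below, clamp is a hypothetical helper and the recommendation values are invented for illustration; the bounds are the CPU limits from the command above, expressed in millicores (0.1 CPU = 100m, 2 CPU = 2000m).

```shell
# Hypothetical helper: keep a value inside [MIN, MAX].
# clamp VALUE MIN MAX (all in the same unit, e.g. millicores)
clamp() {
  local value="$1" min="$2" max="$3"
  if [ "$value" -lt "$min" ]; then value="$min"; fi
  if [ "$value" -gt "$max" ]; then value="$max"; fi
  echo "$value"
}

clamp 50 100 2000     # prints 100  (raised to the minimum)
clamp 750 100 2000    # prints 750  (already within bounds)
clamp 3500 100 2000   # prints 2000 (capped at the maximum)
```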

VPA flags reference

Flag	Default	Description
`--vpa`	`false`	Enable VerticalPodAutoscaler
`--vpa-mode`	`Off`	`Off`, `Initial`, or `Auto`
`--vpa-min-cpu`	—	Minimum CPU VPA may request (e.g. `100m`)
`--vpa-max-cpu`	—	Maximum CPU VPA may request (e.g. `4`)
`--vpa-min-memory`	—	Minimum memory VPA may request (e.g. `64Mi`)
`--vpa-max-memory`	—	Maximum memory VPA may request (e.g. `1Gi`)

The VPA object is named <app-name>-vpa:

kubectl -n <your-namespace> get vpa

NAME                 MODE      CPU   MEM   PROVIDED   AGE
autoscale-test-vpa   Initial                          3s

CPU and MEM columns populate after VPA has collected enough historical data to make a recommendation.

Cluster requirement: VPA in Initial or Auto mode requires the VPA admission controller and recommender to be installed in the cluster. On Satusky managed clusters, the VPA components are available. In Off mode the object is created but nothing is applied.

HPA + VPA Auto incompatibility

You cannot use --hpa and --vpa-mode Auto together. The CLI enforces this and exits with an error:

❌ validation failed: HPA and VPA with mode 'Auto' cannot be used together - they both try to scale resources

The reason: HPA scales pod count based on utilisation, measured as a percentage of the resource request. VPA in Auto mode changes that request, so every VPA adjustment shifts the utilisation HPA sees. The two controllers end up fighting each other and produce unpredictable behaviour.
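
A quick calculation shows the conflict. Utilisation is usage divided by the request, so changing the request changes the percentage even when the real load stays the same. cpu_utilisation_pct is a hypothetical helper, and the millicore figures are invented for illustration.

```shell
# Hypothetical helper: utilisation as HPA sees it, in percent.
# cpu_utilisation_pct USAGE_MILLICORES REQUEST_MILLICORES
cpu_utilisation_pct() {
  local usage="$1" request="$2"
  echo $(( usage * 100 / request ))
}

# Same pod, same real usage (400m); only the request changes.
cpu_utilisation_pct 400 500    # prints 80: above a 70% target, HPA adds pods
cpu_utilisation_pct 400 1000   # prints 40: below the target, HPA removes pods
```

If VPA doubled the request from 500m to 1000m, HPA would read the drop from 80% to 40% as falling load, even though the traffic never changed.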

Safe combinations:

HPA + --vpa-mode Off (VPA recommendations only, not applied)
HPA + --vpa-mode Initial (VPA sets requests at pod start, HPA drives count)
VPA --vpa-mode Auto alone (no HPA)

Step 5: Set a Rolling Update Strategy

By default, Satusky uses a rolling update strategy (25% max surge, 25% max unavailable). You can specify the strategy explicitly:

1ctl deploy \
  --image nginx:alpine \
  --machine compute-main-01 \
  --strategy rolling \
  --rolling-max-surge 25% \
  --rolling-max-unavailable 0

To replace all pods at once with a brief outage window, use recreate:

1ctl deploy \
  --image nginx:alpine \
  --machine compute-main-01 \
  --strategy recreate

Strategy flags reference

Flag	Default	Description
`--strategy`	`rolling`	Rollout strategy: `rolling` or `recreate`
`--rolling-max-surge`	`25%`	Extra pods allowed during rollout (count or percentage, e.g. `1` or `25%`)
`--rolling-max-unavailable`	`25%`	Pods allowed to be unavailable during rollout (count or percentage, e.g. `0` or `25%`)

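The rollout tuning flags are meant to propagate to the Kubernetes Deployment spec. For example, --rolling-max-surge 1 --rolling-max-unavailable 0 should yield maxSurge: 1 and maxUnavailable: 0 on the live Deployment. Check the live Deployment before relying on this: the table under What Requires a Production Cluster notes a pending platform fix, with the current release always using the defaults.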
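
How percentage values translate into pod counts can be sketched as follows. Kubernetes rounds a percentage maxSurge up and a percentage maxUnavailable down. rollout_bounds is a hypothetical helper, not a 1ctl command, and it handles percentages only.

```shell
# Hypothetical helper: pod-count limits during a rolling update.
# rollout_bounds REPLICAS SURGE_PCT UNAVAILABLE_PCT
# Prints "<max pods during rollout> <min available pods>".
rollout_bounds() {
  local replicas="$1" surge_pct="$2" unavail_pct="$3"
  local surge=$(( (replicas * surge_pct + 99) / 100 ))   # maxSurge rounds up
  local unavail=$(( replicas * unavail_pct / 100 ))      # maxUnavailable rounds down
  echo "$(( replicas + surge )) $(( replicas - unavail ))"
}

rollout_bounds 4 25 25   # prints "5 3"
rollout_bounds 2 25 25   # prints "3 2"
rollout_bounds 10 25 0   # prints "13 10"
```

To see the values on a live Deployment, one option is: kubectl -n <your-namespace> get deployment autoscale-test -o jsonpath='{.spec.strategy.rollingUpdate}'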

Full Production Example

A stateless API with HPA, PDB, and conservative VPA right-sizing in Initial mode:

1ctl deploy \
  --image nginx:alpine \
  --machine compute-main-01 \
  --cpu 0.5 \
  --memory 512Mi \
  --replicas 2 \
  --hpa \
  --hpa-min-replicas 2 \
  --hpa-max-replicas 20 \
  --hpa-cpu-target 65 \
  --pdb \
  --pdb-type fixed \
  --pdb-min-available 1 \
  --vpa \
  --vpa-mode Initial \
  --vpa-min-cpu 0.25 \
  --vpa-max-cpu 2 \
  --vpa-min-memory 128Mi \
  --vpa-max-memory 1Gi \
  --strategy rolling

What this creates in the cluster:

A Deployment with 2 initial replicas
An HPA that scales between 2 and 20 pods at 65% CPU
A PDB ensuring at least 1 pod stays available during voluntary disruptions
A VPA that sets resource requests when pods start (no live restarts)
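
Before deploying a combination like this, it can help to sanity-check that the numbers fit together. If a fixed PDB minimum is greater than or equal to the HPA minimum, no pod can be evicted at minimum scale and node drains will stall. The sketch below is one way to check; check_autoscale_config is a hypothetical helper and is not something the CLI provides.

```shell
# Hypothetical pre-flight check for HPA and fixed-PDB values.
# check_autoscale_config HPA_MIN HPA_MAX PDB_MIN_AVAILABLE
check_autoscale_config() {
  local hpa_min="$1" hpa_max="$2" pdb_min="$3"
  if [ "$hpa_min" -gt "$hpa_max" ]; then
    echo "error: HPA minimum exceeds HPA maximum"
    return 1
  fi
  if [ "$pdb_min" -ge "$hpa_min" ]; then
    echo "warning: PDB allows no disruptions at minimum scale"
    return 1
  fi
  echo "ok: $(( hpa_min - pdb_min )) disruption(s) allowed at minimum scale"
}

# Values from the example above: 2 to 20 replicas, PDB minimum of 1.
check_autoscale_config 2 20 1   # prints "ok: 1 disruption(s) allowed at minimum scale"
```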

Checking Deployment State

deploy get

Shows the current deployment record from the Satusky control plane:

1ctl deploy get

Deployment Details
──────────────────
Deployment ID: e36538c6-b539-4fe8-9f73-8061160f306e
Status: completed
URL: https://cleverhawk-kwvmacg.satusky.com
Deployed to machines: <machine-uuid>
Type: production
Region:
Zone:
Version: alpine
Port: 80
CPU Request: 0.5
Memory Request: 256Mi
Memory Limit: 256Mi
Created: 1 minute ago
Last Updated: just now

deploy get shows the deployment configuration. It does not show current replica count, HPA scaling events, or VPA recommendations.

deploy status

Shows the live status of the running deployment:

1ctl deploy status

Status: Running
Message: Deployment is running normally
Progress: 100%

deploy status does not show HPA scaling state (how many replicas are currently running, or what the current CPU utilisation is). For that, see your credits dashboard or inspect the HPA directly with kubectl.

credits usage

Shows how many machine-hours your deployment has consumed over the last 7 days:

1ctl credits usage

💡 No machine usage found for the last 7 days

Usage data appears after your first billing cycle. Higher-than-expected credit consumption during a traffic spike is a sign that HPA scaled up; to confirm it, inspect the HPA directly with kubectl. Related commands:

1ctl credits balance       # current credit balance
1ctl credits transactions  # itemised billing history

Cleanup

Destroy the deployment and its associated resources (the PDB is the exception, as described below):

1ctl deploy destroy -y

💡 Destroying deployment e36538c6-b539-4fe8-9f73-8061160f306e...
✅ Deployment e36538c6-b539-4fe8-9f73-8061160f306e destroyed successfully

deploy destroy removes the Deployment, HPA, and VPA objects. It does not remove the PDB. If you created one, delete it manually:

kubectl -n <your-namespace> delete pdb <app-name>-pdb

Verify everything is gone (the command prints nothing when no matching resources remain):

kubectl -n <your-namespace> get deployment,hpa,vpa,pdb | grep <app-name>
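
Because empty grep output is easy to misread, a small wrapper can make the result explicit. The sketch below reads kubectl get output on standard input, prints any line that still mentions the app, and succeeds only when none remain. leftovers is a hypothetical helper name.

```shell
# Hypothetical helper: report resources that still mention the app.
# leftovers APP_NAME  (reads `kubectl get ...` output on stdin)
leftovers() {
  local app="$1"
  # grep prints matching lines and succeeds when at least one is found.
  if grep -F -- "$app"; then
    return 1
  fi
  echo "nothing left for $app"
}

# Empty input stands in for a namespace with no remaining resources.
printf '' | leftovers autoscale-test   # prints "nothing left for autoscale-test"
```

Against a live cluster you might run: kubectl -n <your-namespace> get deployment,hpa,vpa,pdb 2>/dev/null | leftovers autoscale-test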

What Requires a Production Cluster

The following capabilities require infrastructure beyond what the CLI alone can verify:

Capability	Requirement
HPA actually scaling pods	`metrics-server` running; real traffic load
VPA actually updating resource requests	VPA admission controller + recommender installed
Seeing RESTARTS drop after VPA right-sizing	Real workload + time for VPA to collect data
`credits usage` reflecting scale events	Active deployment with billing cycle elapsed
Rolling update surge/unavailable values applying	Platform fix pending (current release: always uses defaults)

On Satusky managed clusters, metrics-server and the VPA components are installed. You do not need to install them yourself.