Autoscaling

Satusky exposes three deployment scaling modes: manual replicas, Horizontal Pod Autoscaler (HPA), and Vertical Pod Autoscaler (VPA). Each maps to a Kubernetes mechanism.

Current scaling modes

Mode	Current CLI surface	Best for
Manual	`1ctl deploy --replicas <n>`	Predictable workloads
HPA	`--hpa --hpa-min-replicas --hpa-max-replicas --hpa-cpu-target --hpa-memory-target`	Stateless workloads with variable demand
VPA	`--vpa --vpa-mode ...`	Workloads whose resource sizing is uncertain

Manual scaling

Manual scaling is the simplest path: the deployment declares a replica count and Kubernetes maintains it.

Horizontal Pod Autoscaler

HPA changes replica count based on observed utilization. The CLI currently exposes the main user controls:

1ctl deploy --hpa \
  --hpa-min-replicas 2 \
  --hpa-max-replicas 10 \
  --hpa-cpu-target 80
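
To make the effect of these flags concrete, the following is a minimal sketch of the replica calculation that the Kubernetes documentation describes for HPA: scale the current replica count by the ratio of observed to target utilization, round up, and clamp to the configured bounds. The function name and the default tolerance of 0.1 are assumptions for illustration; this is not Satusky's implementation, and the real controller also handles missing metrics and pod readiness.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Illustrative HPA calculation: ceil(current * observed / target), clamped."""
    ratio = current_utilization / target_utilization
    # Close to the target, the replica count is left unchanged to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below the minimum or above the maximum replica bound.
    return max(min_replicas, min(max_replicas, desired))


# Values mirror the command above: min 2, max 10, CPU target 80%.
print(desired_replicas(4, 120, 80, 2, 10))  # 6
print(desired_replicas(4, 40, 80, 2, 10))   # 2
print(desired_replicas(8, 160, 80, 2, 10))  # 10 (capped by the maximum)
```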

The backend model also supports more detailed HPA behavior fields, such as stabilization windows and maximum pod changes per period. These lower-level tuning controls are not yet part of the public CLI contract.

Important distinction: HPA control-loop timing and custom stabilization policy are related but distinct guarantees. The architecture should not hard-code a “3 minutes up / 5 minutes down” rule unless that rule is deliberately made part of the product contract.
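
As an illustration of why the window is better treated as a parameter than as a fixed rule, here is a rough sketch of scale-down stabilization as Kubernetes documents it: the controller keeps recent recommendations and, when scaling down, uses the highest one inside the window. The function name and the history format are hypothetical, and the window length is an input rather than a product guarantee.

```python
def stabilized_scale_down(history, now, window_seconds):
    """Pick the scale-down target from recent recommendations.

    history: list of (timestamp_seconds, recommended_replicas), newest last.
    """
    if not history:
        raise ValueError("history must contain at least the current recommendation")
    # Keep only recommendations made within the stabilization window.
    recent = [replicas for ts, replicas in history if now - ts <= window_seconds]
    # Always consider the newest recommendation, even with a zero-length window.
    recent.append(history[-1][1])
    # Using the highest recent value delays scale-down until demand stays low.
    return max(recent)


history = [(0, 10), (60, 6), (120, 4)]
print(stabilized_scale_down(history, now=120, window_seconds=300))  # 10
print(stabilized_scale_down(history, now=120, window_seconds=90))   # 6
print(stabilized_scale_down(history, now=120, window_seconds=0))    # 4
```

The same recommendation history yields three different targets depending only on the window, which is why a fixed window becomes a behavioral promise once it is documented.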

Vertical Pod Autoscaler

VPA adjusts pod resource requests over time, either by publishing recommendations only or by applying them, depending on the mode.

Mode	Behavior
`Off`	Recommendations only
`Initial`	Apply recommendations when new pods start
`Auto`	Apply recommendations to live workloads, which can require eviction/restart
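
The table above can be read as two questions per mode: are recommendations applied when a pod starts, and may running pods be evicted to apply them? The sketch below encodes that reading. The enum and helper names are hypothetical, and treating `Auto` as also applying at pod start follows upstream Kubernetes VPA behavior rather than anything stated here about Satusky.

```python
from enum import Enum


class VpaMode(Enum):
    OFF = "Off"
    INITIAL = "Initial"
    AUTO = "Auto"


def applies_at_pod_start(mode):
    """True if recommendations are written into new pods' requests."""
    return mode in (VpaMode.INITIAL, VpaMode.AUTO)


def may_evict_running_pods(mode):
    """True if live pods can be restarted to pick up new requests."""
    return mode is VpaMode.AUTO


for mode in VpaMode:
    print(mode.value, applies_at_pod_start(mode), may_evict_running_pods(mode))
```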

HPA and VPA

HPA and VPA should not both actively control the same resource dimension. HPA calculates utilization against requested resources; VPA changes those requests. If both act on the same resource, VPA keeps moving the denominator that HPA depends on, and the control loops can fight each other.
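
A small worked example shows the interaction. With usage held constant, raising the CPU request halves the utilization that HPA sees, which can trigger a scale-down that was never warranted by load. The helper names, the millicore figures, and the shape of the conflict check are all assumptions for illustration, not an existing Satusky validation.

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """Utilization as HPA measures it: usage relative to the request."""
    return 100 * usage_millicores / request_millicores


def conflicting_resources(hpa_resources, vpa_resources, vpa_mode):
    """Resources that both autoscalers would actively control."""
    # In "Off" mode VPA only recommends, so it cannot fight HPA.
    if vpa_mode == "Off":
        return set()
    return set(hpa_resources) & set(vpa_resources)


usage = 400  # millicores actually consumed; unchanged throughout
print(cpu_utilization_percent(usage, 500))   # 80.0, at an 80% target
print(cpu_utilization_percent(usage, 1000))  # 40.0 after the request is doubled

print(conflicting_resources({"cpu"}, {"cpu", "memory"}, "Auto"))  # {'cpu'}
print(conflicting_resources({"cpu"}, {"memory"}, "Auto"))         # set()
```

One common way to avoid the overlap is to let HPA scale on one resource while VPA manages a different one, or to run VPA in `Off` mode and use its output only as sizing guidance.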

Pod Disruption Budgets

A Pod Disruption Budget protects availability during voluntary disruption such as node drain or maintenance. The current CLI can configure PDB behavior and automatically enables one for multi-replica deployments.

PDBs do not protect against involuntary failures such as node loss. Surviving those requires placing replicas across failure domains.
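
To show what a budget permits during a drain, here is a minimal sketch of the arithmetic behind a PDB that uses a minimum-available count: the number of voluntary evictions allowed is the count of healthy pods above that minimum, never less than zero. The function name and the example values are hypothetical, and the sketch does not state what Satusky sets by default.

```python
def allowed_disruptions(healthy_pods, min_available):
    """Voluntary evictions a PDB would permit right now."""
    # Evictions are only allowed while healthy pods exceed the minimum.
    return max(0, healthy_pods - min_available)


print(allowed_disruptions(healthy_pods=3, min_available=2))  # 1
print(allowed_disruptions(healthy_pods=2, min_available=2))  # 0
print(allowed_disruptions(healthy_pods=1, min_available=2))  # 0
```

The last case illustrates the limit described above: if a node failure has already taken pods below the minimum, the budget blocks further voluntary evictions but cannot bring the lost pods back.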

Current gaps and target direction

Gap	Target
Backend supports richer HPA behavior than the CLI documents.	Either expose it intentionally or keep it internal and avoid documenting product guarantees that are not user-configurable.
Scaling strategy is deployment-centric today.	Future machine-aware scheduling should compose with scaling, not fork it.
Metrics visibility is scattered across deploy/status surfaces.	Observability should make current utilization, desired replicas, and autoscaler decisions easier to inspect.