Autoscaling

Satusky exposes three deployment scaling modes: manual replicas, Horizontal Pod Autoscaler (HPA), and Vertical Pod Autoscaler (VPA). Each maps to a Kubernetes mechanism.

| Mode | Current CLI surface | Best for |
| --- | --- | --- |
| Manual | `1ctl deploy --replicas <n>` | Predictable workloads |
| HPA | `--hpa --hpa-min-replicas --hpa-max-replicas --hpa-cpu-target --hpa-memory-target` | Stateless workloads with variable demand |
| VPA | `--vpa --vpa-mode ...` | Workloads whose resource sizing is uncertain |

Manual scaling is the simplest path: the deployment declares a replica count and Kubernetes maintains it.

HPA changes replica count based on observed utilization. The CLI currently exposes the main user controls:

1ctl deploy --hpa \
--hpa-min-replicas 2 \
--hpa-max-replicas 10 \
--hpa-cpu-target 80
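
To show what these flags ask Kubernetes to do, here is a minimal sketch of the replica calculation described in the upstream Kubernetes HPA documentation: scale the current replica count by the ratio of observed to target utilization, round up, and clamp to the configured bounds. The function name and parameters are hypothetical, and the sketch ignores details such as missing metrics and pod readiness, so treat it as an illustration rather than Satusky's implementation.

```python
import math

def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the core HPA rule for a single utilization metric.

    Hypothetical helper: names and signature are illustrative only.
    """
    ratio = current_utilization / target_utilization
    # When the ratio is close enough to 1.0, no scaling happens
    # (0.1 is the upstream default tolerance).
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds set by --hpa-min-replicas / --hpa-max-replicas.
    return max(min_replicas, min(max_replicas, desired))

print(desired_replicas(4, 120, 80, 2, 10))  # 4 * 1.5 -> 6
print(desired_replicas(4, 84, 80, 2, 10))   # within tolerance -> 4
print(desired_replicas(4, 400, 80, 2, 10))  # 20, clamped to the max of 10
```

With `--hpa-cpu-target 80`, four replicas running at 120% of their requested CPU would be scaled to six, while a small overshoot such as 84% leaves the count unchanged.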

The backend model also supports more detailed HPA behavior fields, such as stabilization windows and maximum pod changes per period. These lower-level tuning controls are not yet part of the public CLI contract.

Important distinction: HPA control-loop timing and custom stabilization policy are related, but not the same promise. The architecture should not hard-code a “3 minutes up / 5 minutes down” rule unless that is deliberately made part of the product contract.
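
As a rough illustration of what a scale-down stabilization window does, the sketch below follows the upstream Kubernetes approach of acting on the highest replica recommendation seen within the window, so that a brief dip in load does not immediately remove pods. The function name, the history format, and the window values are assumptions chosen for the example; they are not Satusky defaults or guarantees.

```python
def stabilize_scale_down(history, now, window_seconds):
    """Return the replica count to act on for scale-down.

    history: list of (timestamp_seconds, recommended_replicas) tuples,
             including the most recent recommendation.
    Hypothetical helper: illustrates the idea only.
    """
    # Keep only recommendations made inside the stabilization window.
    recent = [replicas for (ts, replicas) in history if now - ts <= window_seconds]
    # Acting on the highest recent recommendation delays scale-down
    # until the lower value has persisted for the whole window.
    return max(recent)

history = [(0, 8), (60, 5), (120, 3)]
print(stabilize_scale_down(history, now=120, window_seconds=300))  # 8
print(stabilize_scale_down(history, now=120, window_seconds=30))   # 3
```

The same recommendation history yields different outcomes depending on the window, which is why a specific window length is a product decision rather than a property of the control loop.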

VPA adjusts pod resource requests over time, either by publishing recommendations or by applying them to pods.

| Mode | Behavior |
| --- | --- |
| Off | Recommendations only |
| Initial | Apply recommendations when new pods start |
| Auto | Apply recommendations to live workloads, which can require eviction/restart |

HPA and VPA should not both actively control the same resource dimension. HPA calculates utilization against requested resources; VPA changes those requests. If both mutate the same denominator, the control loops can fight each other.
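
The shared-denominator problem can be made concrete with a small worked example. The numbers below are invented for illustration: actual CPU usage stays constant, VPA raises the request, and the utilization that HPA observes drops even though load has not changed.

```python
import math

def cpu_utilization_percent(usage_millicores, request_millicores):
    """Utilization as HPA sees it: usage relative to the requested amount."""
    return 100 * usage_millicores / request_millicores

usage = 400  # millicores per pod, unchanged throughout (illustrative value)

before = cpu_utilization_percent(usage, 500)  # request before VPA acts
after = cpu_utilization_percent(usage, 800)   # request after VPA raises it
print(before, after)  # 80.0 50.0

# With a target of 80% and 4 replicas, the lower reading now suggests
# removing a pod even though real usage never moved.
print(math.ceil(4 * after / 80))  # 3
```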

Recommended product rule:

  • use HPA active + VPA off/recommendation-only, or
  • use VPA active without HPA, depending on the workload.
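
One way such a rule could be enforced is a validation check at deploy time. This is a sketch under stated assumptions: the function name is hypothetical, and treating both `Initial` and `Auto` as "active" VPA modes is an interpretation of the rule above, not a documented Satusky behavior.

```python
from typing import Optional

# Assumption: modes that change applied requests count as "active".
ACTIVE_VPA_MODES = {"Initial", "Auto"}

def scaling_conflict(hpa_enabled: bool, vpa_mode: Optional[str]) -> bool:
    """Return True when HPA and an active VPA mode are combined.

    vpa_mode is None when VPA is not configured at all.
    """
    return hpa_enabled and vpa_mode in ACTIVE_VPA_MODES

print(scaling_conflict(True, "Off"))    # False: HPA with recommendations only
print(scaling_conflict(False, "Auto"))  # False: VPA active without HPA
print(scaling_conflict(True, "Auto"))   # True: both loops would be active
```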

A Pod Disruption Budget protects availability during voluntary disruption such as node drain or maintenance. The current CLI can configure PDB behavior and automatically enables one for multi-replica deployments.

PDBs do not protect against involuntary failures such as node loss. That requires replica placement across failure domains.
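
For intuition about how a PDB limits voluntary disruption, the sketch below computes how many pods could be evicted at once under a `minAvailable`-style budget. The helper name and the sample values are hypothetical; the defaults Satusky applies to multi-replica deployments are not specified here.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of pods that may be voluntarily evicted right now."""
    # Evictions are allowed only while the healthy count stays
    # at or above the budget's minimum.
    return max(0, healthy_pods - min_available)

print(allowed_disruptions(3, 2))  # 1: one pod can be drained at a time
print(allowed_disruptions(2, 2))  # 0: a node drain waits until a pod recovers
print(allowed_disruptions(1, 2))  # 0: already below budget, nothing may be evicted
```

The third case also shows the limit of the mechanism: the budget can block further evictions, but it cannot restore pods that were lost to a node failure.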

| Gap | Target |
| --- | --- |
| Backend supports richer HPA behavior than the CLI documents. | Either expose it intentionally or keep it internal and avoid documenting product guarantees that are not user-configurable. |
| Scaling strategy is deployment-centric today. | Future machine-aware scheduling should compose with scaling, not fork it. |
| Metrics visibility is scattered across deploy/status surfaces. | Observability should make current utilization, desired replicas, and autoscaler decisions easier to inspect. |