Kubernetes HorizontalPodAutoscaler Builder

Generate an HPA YAML manifest that scales pods on CPU or memory metrics

Build a Kubernetes HorizontalPodAutoscaler manifest (autoscaling/v2) with a target Deployment, min/max replicas, CPU and memory utilization targets, and a scale-down stabilization window.

What is a HorizontalPodAutoscaler?

An HPA automatically adjusts the replica count of a workload based on observed metrics such as CPU or memory utilization. When load rises above the target it adds pods; when load drops it removes them, within the min and max replica bounds you set.

Generate a HorizontalPodAutoscaler in seconds

An HPA keeps your application responsive under load and cost-efficient when idle by scaling pod replicas based on real metrics. This builder produces an autoscaling/v2 manifest targeting a Deployment or StatefulSet, with CPU and memory utilization targets and a scale-down stabilization window.
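
As a rough idea of the output described above, here is a minimal sketch of such a manifest. The names web and web-hpa and every numeric value are illustrative assumptions, not the builder's defaults.

```yaml
# Sketch of an autoscaling/v2 HPA. The names "web" and "web-hpa" and all
# numeric values are illustrative assumptions, not builder defaults.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment          # StatefulSet is also a valid target kind
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # percent of the CPU request
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75   # percent of the memory request
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes before removing pods
```

Assuming the file is saved as hpa.yaml, it could be applied with kubectl apply -f hpa.yaml and inspected with kubectl get hpa web-hpa.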

How it works

The HPA controller periodically (every 15 seconds by default) reads metrics for the pods owned by your target workload and compares them against your targets. For a Utilization target, it expresses each pod's current usage as a percentage of the pod's declared resource request and averages the result across all pods. If the average is above the target percentage it scales up; if it is below, it scales down. Small deviations are ignored (by default, anything within 10% of the target), and the replica count always stays within minReplicas and maxReplicas.

The desired replica count is roughly ceil(currentReplicas × currentMetric / targetMetric). Because utilization is relative to requests, the target pods must declare resources.requests for every metric you scale on; otherwise the controller cannot compute utilization, reports the metric as <unknown>, and takes no scaling action for it.
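
To make the formula concrete, here is a worked example under assumed numbers: 4 current replicas and a 60% CPU utilization target, once at 90% observed utilization and once at 20%. The figures are illustrative only.

```latex
\[
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
    \frac{\text{currentMetric}}{\text{targetMetric}} \right\rceil
\]
% Scale up: 4 replicas at 90% average CPU utilization, target 60%
\[
\left\lceil 4 \times \frac{90}{60} \right\rceil = \lceil 6 \rceil = 6
\]
% Scale down: 4 replicas at 20% average CPU utilization, target 60%
\[
\left\lceil 4 \times \frac{20}{60} \right\rceil = \lceil 1.33\ldots \rceil = 2
\]
```

The result is then clamped to the configured bounds: with minReplicas set to 3, for instance, the second case would settle at 3 replicas rather than 2.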

Tips and notes

  • Always set resource requests on every container in the target workload's pod template; they are the denominator for utilization math.
  • Use a scale-down stabilization window (e.g. 300s) to prevent flapping, and keep scale-up responsive for traffic spikes.
  • Combine CPU and memory targets when a workload is bound by both; the HPA scales to satisfy whichever demands more replicas.
  • CPU/memory targets rely on the resource metrics API, so metrics-server (or an equivalent provider) must be installed in the cluster.
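
The first tip can be sketched as a Deployment fragment. The name, labels, image, and request sizes below are hypothetical placeholders; the part that matters to the HPA is the resources.requests block.

```yaml
# Sketch of a Deployment an HPA could target. Name, labels, image, and request
# sizes are hypothetical; spec.replicas is left unset so the HPA owns the count.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0   # placeholder image
          resources:
            requests:
              cpu: 250m       # denominator for CPU utilization
              memory: 256Mi   # denominator for memory utilization
```

With a 250m CPU request, a 60% utilization target corresponds to an average of 150m of CPU per pod.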