feat: add HPA (horizontal pod autoscaler) support to performance config #21
tangxiaoxin06 wants to merge 1 commit into
Conversation
Replicas were a static value, forcing operators to manually adjust them and redeploy under traffic spikes. This change adds an optional autoscaling block per-environment that drives a Kubernetes HPA (autoscaling/v2) targeting the application StatefulSet.

Backend:
- Add AutoscalingConfig (enabled/min/max/CPU/memory targets) on ApplicationPerformanceConfig.EnvironmentConfig (JSON blob, no schema migration needed).
- In ArtifactDeployTask, skip writing spec.replicas when autoscaling is enabled to avoid fighting the HPA controller, and create/update or delete the HPA resource accordingly.

Frontend:
- Expose an "Autoscaling (HPA)" checkbox with min/max replicas and CPU/memory target utilization inputs in the performance tab; the replicas input hides while autoscaling is on.
- Add matching types, zod schema and zh/en translations.

Requires metrics-server in the target cluster for HPA to function.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
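For context, the persisted block is small. A minimal sketch of what the AutoscalingConfig holder might look like, assuming the field names implied by the getters used in the diff (the `enabled` field is inferred from the description rather than copied from the code):

```java
// Sketch only — not the PR's exact class. Nested on
// ApplicationPerformanceConfig.EnvironmentConfig and stored in the existing JSON blob,
// which is why no schema migration is needed.
public class AutoscalingConfig {
    private Boolean enabled;                 // opt-in per environment
    private Integer minReplicas;             // HPA lower bound
    private Integer maxReplicas;             // HPA upper bound
    private Integer targetCpuUtilization;    // percent, drives a "cpu" resource metric
    private Integer targetMemoryUtilization; // percent, drives a "memory" resource metric
    // getters and setters omitted for brevity
}
```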
Pull request overview
Adds per-environment Horizontal Pod Autoscaler (HPA) configuration support so apps can optionally be scaled by Kubernetes HPA instead of a fixed replica count.
Changes:
- Extend performance env config model/types/schema to include an optional `autoscaling` block (enabled/min/max/CPU+memory targets).
- Update the UI to toggle autoscaling and show autoscaling-specific inputs while hiding the replicas input when enabled.
- Update backend deployment to create/update/delete an `autoscaling/v2` HPA targeting the app's StatefulSet and stop setting `spec.replicas` when autoscaling is enabled.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| web/locales/zh.ts | Adds new zh translations for autoscaling fields/hint. |
| web/locales/en.ts | Adds new en translations for autoscaling fields/hint. |
| web/lib/api/types.ts | Extends API types with AutoscalingConfig. |
| web/app/apps/schema.ts | Extends zod schema to validate autoscaling fields. |
| web/app/apps/components/application-performance-info.tsx | Adds autoscaling UI controls and conditional rendering of replicas input. |
| src/main/java/com/github/wellch4n/oops/task/ArtifactDeployTask.java | Creates/patches HPA when enabled; avoids setting StatefulSet replicas under HPA. |
| src/main/java/com/github/wellch4n/oops/data/ApplicationPerformanceConfig.java | Adds persisted autoscaling config fields to the env config model. |
```java
var cfg = perfEnvConfig.getAutoscaling();
int minReplicas = cfg.getMinReplicas() == null ? 1 : cfg.getMinReplicas();
int maxReplicas = cfg.getMaxReplicas() == null ? Math.max(minReplicas, 10) : cfg.getMaxReplicas();
if (maxReplicas < minReplicas) {
    maxReplicas = minReplicas;
}

List<io.fabric8.kubernetes.api.model.autoscaling.v2.MetricSpec> metrics = new ArrayList<>();
if (cfg.getTargetCpuUtilization() != null) {
    metrics.add(buildResourceMetric("cpu", cfg.getTargetCpuUtilization()));
}
if (cfg.getTargetMemoryUtilization() != null) {
    metrics.add(buildResourceMetric("memory", cfg.getTargetMemoryUtilization()));
}
if (metrics.isEmpty()) {
    metrics.add(buildResourceMetric("cpu", 70));
}
```
minReplicas, maxReplicas, and target utilizations are taken directly from config without validation/clamping (other than max < min). If values are out of range (e.g. utilization <=0 or >100), the HPA patch will fail and break deploys. Consider validating these fields server-side (clamp to valid ranges or throw a descriptive exception) before building the HPA spec.
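One way to do that server-side, sketched here as hypothetical clampReplicas/clampUtilization helpers (not code from this PR):

```java
// Hypothetical helpers that normalize user-supplied autoscaling values before the HPA
// spec is built, so an out-of-range utilization can't produce a patch the API server
// rejects and break the deploy.
final class AutoscalingValidation {

    /** Lower-bounds a replica count at 1, falling back when the field was left empty. */
    static int clampReplicas(Integer value, int fallback) {
        return value == null ? fallback : Math.max(1, value);
    }

    /** autoscaling/v2 resource targets are percentages, so keep them within 1..100. */
    static int clampUtilization(int value) {
        return Math.min(100, Math.max(1, value));
    }
}

// At the call site this would replace the raw reads shown in the diff above, e.g.:
//   int minReplicas = AutoscalingValidation.clampReplicas(cfg.getMinReplicas(), 1);
//   int maxReplicas = Math.max(minReplicas, AutoscalingValidation.clampReplicas(cfg.getMaxReplicas(), 10));
//   metrics.add(buildResourceMetric("cpu", AutoscalingValidation.clampUtilization(cfg.getTargetCpuUtilization())));
```

Throwing a descriptive exception instead of clamping is the stricter alternative mentioned above; either way, the values get checked before the HPA object reaches the API server.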
```tsx
<Input
  type="number"
  min={1}
  placeholder="1"
  className="w-24"
  value={field.value ?? ""}
  onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
/>
```
parseInt(e.target.value, 10) can produce NaN for certain valid <input type="number"> intermediate values (e.g. "-", "e"), which then gets stored in RHF state and may cause React warnings (value becomes NaN) and broken validation. Consider converting via valueAsNumber and mapping NaN to undefined (or validating/sanitizing the string before parsing).
```tsx
<Input
  type="number"
  min={1}
  placeholder="10"
  className="w-24"
  value={field.value ?? ""}
  onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
/>
```
Same NaN issue as above: parseInt(e.target.value, 10) on a type="number" input can yield NaN, which then becomes the controlled value and can trigger React warnings / invalid form state. Prefer valueAsNumber + Number.isNaN guard (or sanitize before parsing).
```tsx
value={field.value ?? ""}
onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
```
Same NaN issue as above for the utilization input: parseInt can yield NaN for intermediate/invalid number strings, and then value={field.value ?? ""} will pass NaN through as a controlled value. Guard against NaN (e.g. valueAsNumber + isNaN -> undefined).
Suggested change:

```diff
- value={field.value ?? ""}
- onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
+ value={typeof field.value === "number" && !Number.isNaN(field.value) ? field.value : ""}
+ onChange={e => {
+   const value = e.target.valueAsNumber
+   field.onChange(Number.isNaN(value) ? undefined : value)
+ }}
```
```tsx
<Input
  type="number"
  min={1}
  max={100}
  placeholder="—"
  className="pr-8"
  value={field.value ?? ""}
  onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
/>
```
Same NaN issue as above for the utilization input: parseInt can return NaN, which is then stored in form state and fed back into the controlled input value. Consider handling NaN explicitly (map to undefined) to avoid warnings and validation problems.
```ts
minReplicas: z.number().int().min(1).optional(),
maxReplicas: z.number().int().min(1).optional(),
targetCpuUtilization: z.number().int().min(1).max(100).optional(),
targetMemoryUtilization: z.number().int().min(1).max(100).optional(),
```
The schema allows autoscaling.minReplicas and autoscaling.maxReplicas independently, but doesn't enforce maxReplicas >= minReplicas. Since the backend silently clamps maxReplicas up to minReplicas, users can save an invalid combination without feedback. Consider adding a cross-field refinement so the UI can show a validation error when both are set and max < min.
Suggested change:

```diff
  targetMemoryUtilization: z.number().int().min(1).max(100).optional(),
+ }).superRefine((autoscaling, ctx) => {
+   if (
+     autoscaling.minReplicas !== undefined &&
+     autoscaling.maxReplicas !== undefined &&
+     autoscaling.maxReplicas < autoscaling.minReplicas
+   ) {
+     ctx.addIssue({
+       code: z.ZodIssueCode.custom,
+       message: "Max replicas must be greater than or equal to min replicas",
+       path: ["maxReplicas"],
+     })
+   }
+ })
```
```java
} catch (Exception e) {
    log.warn("Failed to delete HPA for {}/{}: {}", namespace, applicationName, e.getMessage());
```
When autoscaling is disabled, HPA deletion errors are swallowed for all exception types. If deletion fails for reasons other than NotFound (e.g. RBAC/network), the old HPA may remain and continue reconciling replicas, which can conflict with setting StatefulSet.spec.replicas. Consider only ignoring NotFound, and failing the deploy task (or surfacing a clear error) for other deletion failures.
Suggested change:

```diff
- } catch (Exception e) {
-     log.warn("Failed to delete HPA for {}/{}: {}", namespace, applicationName, e.getMessage());
+ } catch (io.fabric8.kubernetes.client.KubernetesClientException e) {
+     if (e.getCode() == 404) {
+         log.debug("HPA for {}/{} does not exist, nothing to delete", namespace, applicationName);
+     } else {
+         log.error("Failed to delete HPA for {}/{}:", namespace, applicationName, e);
+         throw e;
+     }
+ } catch (RuntimeException e) {
+     log.error("Failed to delete HPA for {}/{}:", namespace, applicationName, e);
+     throw e;
```
Well done. However, as this was unplanned, I need to run some local tests before merging it into the codebase. It will take some time, but I'll get it done shortly.
Summary
- Creates an `autoscaling/v2` HPA targeting the app's StatefulSet
- Stops setting `spec.replicas` to avoid fighting the HPA controller

Motivation
Autoscaling is a great fit for a PaaS. HPA is a K8s primitive that should be first-class.
Requirements
Target cluster must have metrics-server installed for HPA to compute utilization. Otherwise HPA stays in
`<unknown>` state.

Test plan
- `kubectl get hpa -n <ns>` shows the HPA