Skip to content

feat: add HPA (horizontal pod autoscaler) support to performance config #21

Open
tangxiaoxin06 wants to merge 1 commit into
wellCh4n:mainfrom
tangxiaoxin06:feat/performance-hpa
Open

feat: add HPA (horizontal pod autoscaler) support to performance config #21
tangxiaoxin06 wants to merge 1 commit into
wellCh4n:mainfrom
tangxiaoxin06:feat/performance-hpa

Conversation

@tangxiaoxin06
Copy link
Copy Markdown

Summary

  • Adds optional HPA support to per-environment performance config
  • Backend creates/updates/deletes autoscaling/v2 HPA targeting the app's StatefulSet
  • When autoscaling is enabled, OOPS stops setting spec.replicas to avoid fighting the HPA controller
  • UI exposes a checkbox + min/max replicas + CPU/memory target utilization inputs
  • Translations added for zh/en

Motivation

great fit for a PaaS. HPA is a K8s primitive that should be first-class.

Requirements

Target cluster must have metrics-server installed for HPA to compute utilization. Otherwise HPA stays in <unknown> state.

Test plan

  • Backend compiles on JDK 21
  • Deploy an app with autoscaling disabled — behavior identical to before
  • Enable autoscaling, set min=1 max=5 CPU=70, redeploy → kubectl get hpa -n <ns> shows the HPA
  • Disable autoscaling → HPA deleted
  • Frontend renders checkbox; replicas input hidden when autoscaling is on

Replicas were a static value, forcing operators to manually adjust
them and redeploy under traffic spikes. This change adds an optional
autoscaling block per-environment that drives a Kubernetes HPA
(autoscaling/v2) targeting the application StatefulSet.

Backend:
- Add AutoscalingConfig (enabled/min/max/CPU/memory targets) on
  ApplicationPerformanceConfig.EnvironmentConfig (JSON blob, no
  schema migration needed).
- In ArtifactDeployTask, skip writing spec.replicas when autoscaling
  is enabled to avoid fighting the HPA controller, and create/update
  or delete the HPA resource accordingly.

Frontend:
- Expose an "Autoscaling (HPA)" checkbox with min/max replicas and
  CPU/memory target utilization inputs in the performance tab;
  replicas input hides while autoscaling is on.
- Add matching types, zod schema and zh/en translations.

Requires metrics-server in the target cluster for HPA to function.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds per-environment Horizontal Pod Autoscaler (HPA) configuration support so apps can optionally be scaled by Kubernetes HPA instead of a fixed replica count.

Changes:

  • Extend performance env config model/types/schema to include an optional autoscaling block (enabled/min/max/CPU+memory targets).
  • Update the UI to toggle autoscaling and show autoscaling-specific inputs while hiding the replicas input when enabled.
  • Update backend deployment to create/update/delete an autoscaling/v2 HPA targeting the app’s StatefulSet and stop setting spec.replicas when autoscaling is enabled.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
web/locales/zh.ts Adds new zh translations for autoscaling fields/hint.
web/locales/en.ts Adds new en translations for autoscaling fields/hint.
web/lib/api/types.ts Extends API types with AutoscalingConfig.
web/app/apps/schema.ts Extends zod schema to validate autoscaling fields.
web/app/apps/components/application-performance-info.tsx Adds autoscaling UI controls and conditional rendering of replicas input.
src/main/java/com/github/wellch4n/oops/task/ArtifactDeployTask.java Creates/patches HPA when enabled; avoids setting StatefulSet replicas under HPA.
src/main/java/com/github/wellch4n/oops/data/ApplicationPerformanceConfig.java Adds persisted autoscaling config fields to the env config model.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +139 to +155
var cfg = perfEnvConfig.getAutoscaling();
int minReplicas = cfg.getMinReplicas() == null ? 1 : cfg.getMinReplicas();
int maxReplicas = cfg.getMaxReplicas() == null ? Math.max(minReplicas, 10) : cfg.getMaxReplicas();
if (maxReplicas < minReplicas) {
maxReplicas = minReplicas;
}

List<io.fabric8.kubernetes.api.model.autoscaling.v2.MetricSpec> metrics = new ArrayList<>();
if (cfg.getTargetCpuUtilization() != null) {
metrics.add(buildResourceMetric("cpu", cfg.getTargetCpuUtilization()));
}
if (cfg.getTargetMemoryUtilization() != null) {
metrics.add(buildResourceMetric("memory", cfg.getTargetMemoryUtilization()));
}
if (metrics.isEmpty()) {
metrics.add(buildResourceMetric("cpu", 70));
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minReplicas, maxReplicas, and target utilizations are taken directly from config without validation/clamping (other than max < min). If values are out of range (e.g. utilization <=0 or >100), the HPA patch will fail and break deploys. Consider validating these fields server-side (clamp to valid ranges or throw a descriptive exception) before building the HPA spec.

Copilot uses AI. Check for mistakes.
Comment on lines +314 to +321
<Input
type="number"
min={1}
placeholder="1"
className="w-24"
value={field.value ?? ""}
onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
/>
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseInt(e.target.value, 10) can produce NaN for certain valid <input type="number"> intermediate values (e.g. "-", "e"), which then gets stored in RHF state and may cause React warnings (value becomes NaN) and broken validation. Consider converting via valueAsNumber and mapping NaN to undefined (or validating/sanitizing the string before parsing).

Copilot uses AI. Check for mistakes.
Comment on lines +334 to +341
<Input
type="number"
min={1}
placeholder="10"
className="w-24"
value={field.value ?? ""}
onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
/>
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same NaN issue as above: parseInt(e.target.value, 10) on a type="number" input can yield NaN, which then becomes the controlled value and can trigger React warnings / invalid form state. Prefer valueAsNumber + Number.isNaN guard (or sanitize before parsing).

Copilot uses AI. Check for mistakes.
Comment on lines +361 to +362
value={field.value ?? ""}
onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same NaN issue as above for the utilization input: parseInt can yield NaN for intermediate/invalid number strings, and then value={field.value ?? ""} will pass NaN through as a controlled value. Guard against NaN (e.g. valueAsNumber + isNaN -> undefined).

Suggested change
value={field.value ?? ""}
onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
value={typeof field.value === "number" && !Number.isNaN(field.value) ? field.value : ""}
onChange={e => {
const value = e.target.valueAsNumber
field.onChange(Number.isNaN(value) ? undefined : value)
}}

Copilot uses AI. Check for mistakes.
Comment on lines +379 to +387
<Input
type="number"
min={1}
max={100}
placeholder="—"
className="pr-8"
value={field.value ?? ""}
onChange={e => field.onChange(e.target.value === "" ? undefined : parseInt(e.target.value, 10))}
/>
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same NaN issue as above for the utilization input: parseInt can return NaN, which is then stored in form state and fed back into the controlled input value. Consider handling NaN explicitly (map to undefined) to avoid warnings and validation problems.

Copilot uses AI. Check for mistakes.
Comment thread web/app/apps/schema.ts
minReplicas: z.number().int().min(1).optional(),
maxReplicas: z.number().int().min(1).optional(),
targetCpuUtilization: z.number().int().min(1).max(100).optional(),
targetMemoryUtilization: z.number().int().min(1).max(100).optional(),
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The schema allows autoscaling.minReplicas and autoscaling.maxReplicas independently, but doesn't enforce maxReplicas >= minReplicas. Since the backend silently clamps maxReplicas up to minReplicas, users can save an invalid combination without feedback. Consider adding a cross-field refinement so the UI can show a validation error when both are set and max < min.

Suggested change
targetMemoryUtilization: z.number().int().min(1).max(100).optional(),
targetMemoryUtilization: z.number().int().min(1).max(100).optional(),
}).superRefine((autoscaling, ctx) => {
if (
autoscaling.minReplicas !== undefined &&
autoscaling.maxReplicas !== undefined &&
autoscaling.maxReplicas < autoscaling.minReplicas
) {
ctx.addIssue({
code: z.ZodIssueCode.custom,
message: "Max replicas must be greater than or equal to min replicas",
path: ["maxReplicas"],
})
}

Copilot uses AI. Check for mistakes.
Comment on lines +133 to +134
} catch (Exception e) {
log.warn("Failed to delete HPA for {}/{}: {}", namespace, applicationName, e.getMessage());
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When autoscaling is disabled, HPA deletion errors are swallowed for all exception types. If deletion fails for reasons other than NotFound (e.g. RBAC/network), the old HPA may remain and continue reconciling replicas, which can conflict with setting StatefulSet.spec.replicas. Consider only ignoring NotFound, and failing the deploy task (or surfacing a clear error) for other deletion failures.

Suggested change
} catch (Exception e) {
log.warn("Failed to delete HPA for {}/{}: {}", namespace, applicationName, e.getMessage());
} catch (io.fabric8.kubernetes.client.KubernetesClientException e) {
if (e.getCode() == 404) {
log.debug("HPA for {}/{} does not exist, nothing to delete", namespace, applicationName);
} else {
log.error("Failed to delete HPA for {}/{}:", namespace, applicationName, e);
throw e;
}
} catch (RuntimeException e) {
log.error("Failed to delete HPA for {}/{}:", namespace, applicationName, e);
throw e;

Copilot uses AI. Check for mistakes.
@wellCh4n
Copy link
Copy Markdown
Owner

Well done. However, as this was unplanned, I need to run some local tests before merging it into the codebase. It will take some time, but I'll get it done shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants