fix(k8s): handle ignored repo.Update errors in provisioner by jackthepunished · Pull Request #5 · jackthepunished/thecloud

jackthepunished · 2026-04-29T20:23:17Z

Summary

Closes Ignored errors from repo.Update() in K8s provisioner may cause silent state inconsistency poyrazK/thecloud#310. Four repo.Update() calls in internal/repositories/k8s/provisioner.go previously discarded their error with _ =, causing silent state inconsistency between in-memory cluster state and the database.
Provision now wraps and returns the persistence error after flipping status to Running, so callers know the state was not durably committed.
provisionControlPlane routes update failures (ControlPlaneIPs, encrypted kubeconfig) through failCluster so the cluster is consistently marked Failed.
failCluster logs the secondary update failure rather than silently dropping it, while still returning the original failure error to the caller.

Test plan

go build ./... clean
go test ./internal/repositories/k8s/... passes
New regression subtest TestFailCluster/UpdateFails_StillReturnsOriginalError verifies the original failure error is preserved when status persistence fails

The password reset flow logged the plaintext token alongside the user's email at Debug level so it could be picked up during MVP testing. Anyone with log access could mint a password reset and take over the account. Replace the Debug log with an Info entry that records only the user_id and reset-token row id. The plaintext token must reach the user via email; once an EmailService is wired in it should be the sole consumer. Closes poyrazK#293

…r injection Object keys flowed straight into the Content-Disposition header via fmt.Sprintf("attachment; filename=%s", key), so any key containing CRLF, double quotes, or backslashes could split the response or inject arbitrary headers. Path-bearing keys also leaked the full bucket path to the client as the suggested filename. Replace both call sites (authenticated download and presigned download) with a single helper, contentDispositionAttachment, that: - reduces the key to its basename via path.Base - emits an ASCII-only `filename="..."` fallback with control bytes, non-ASCII bytes, quotes, and backslashes mapped to `_` - emits the full Unicode basename in `filename*=UTF-8''...` per RFC 5987 with proper attr-char percent-encoding - guarantees CR/LF can never reach the wire Add table-driven unit tests covering response-splitting, quote/backslash injection, non-ASCII names, nested keys, and empty keys. Closes poyrazK#225, poyrazK#226

Webhook delivery in NotifyService.deliverToWebhook used http.DefaultClient and only inspected transport errors, so 4xx/5xx responses were silently treated as successful deliveries. Subscribers returning failures (bad endpoints, transient 5xx) had messages dropped without any log signal. Check resp.StatusCode and emit a structured warn log on >=400 responses, add a 15s timeout via a dedicated http.Client, drain the response body before close, and stop discarding the error from NewRequestWithContext. Closes poyrazK#338 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes poyrazK#310. Previously four repo.Update() calls in the kubeadm provisioner discarded their error with `_ =`, hiding silent state inconsistency when the database write failed: - Provision: returning success after Status flip to Running. - provisionControlPlane: persisting ControlPlaneIPs and KubeconfigEncrypted. - failCluster: persisting ClusterStatusFailed. Now: - Provision wraps and returns the persistence error so the caller knows state was not durably committed. - provisionControlPlane routes update failures through failCluster so the cluster is consistently marked Failed. - failCluster logs (rather than silently drops) a secondary update failure while preserving the original failure error for the caller. Adds a regression test ensuring failCluster surfaces the original error even when the status persistence fails.

coderabbitai · 2026-04-29T20:23:23Z

Warning

Rate limit exceeded

@jackthepunished has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 54 minutes and 54 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d43e9cb6-8f72-4012-bd5e-812975d9b15f

📥 Commits

Reviewing files that changed from the base of the PR and between e2cc272 and d978ba6.

📒 Files selected for processing (7)

internal/core/services/notify.go
internal/core/services/notify_unit_test.go
internal/core/services/password_reset.go
internal/handlers/storage_handler.go
internal/handlers/storage_handler_content_disposition_test.go
internal/repositories/k8s/provisioner.go
internal/repositories/k8s/provisioner_extra_test.go

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/issue-310-k8s-update-errors

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Review rate limit: 0/1 reviews remaining, refill in 54 minutes and 54 seconds.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

This PR improves reliability and security across several areas by ensuring persistence errors are no longer silently ignored (notably in the k8s provisioner), and by hardening HTTP header generation and webhook delivery handling.

Changes:

k8s provisioner: stop discarding repo.Update() errors; route update failures through error returns / failCluster.
storage downloads: generate a safe RFC 6266/5987 Content-Disposition header to prevent header injection/response splitting.
notify + password reset: harden webhook delivery (request creation + response handling) and stop logging plaintext reset tokens.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
internal/repositories/k8s/provisioner.go	Propagates cluster persistence failures instead of ignoring them; logs secondary update failures in `failCluster`.
internal/repositories/k8s/provisioner_extra_test.go	Adds regression coverage ensuring `failCluster` preserves the original error even if status persistence fails.
internal/handlers/storage_handler.go	Introduces `contentDispositionAttachment` to safely set `Content-Disposition` on downloads.
internal/handlers/storage_handler_content_disposition_test.go	Adds unit tests for sanitization/encoding behavior of `Content-Disposition`.
internal/core/services/notify.go	Adds a bounded HTTP client, checks webhook request creation errors, drains response bodies, and logs delivery failures.
internal/core/services/notify_unit_test.go	Adds regression coverage for webhook non-2xx behavior via `httptest` + log capture.
internal/core/services/password_reset.go	Stops logging plaintext reset tokens; logs token issuance metadata instead.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-29T20:26:43Z

+			"cluster_id", cluster.ID, "error", updateErr, "original_error", err)
+	}
 	p.logger.Error(msg, "cluster_id", cluster.ID, "error", err)
 	return fmt.Errorf("%s: %w", msg, err)


failCluster wraps err with %w even when err is nil (several call sites pass nil). This yields a confusing error message (e.g. %!w(<nil>)) and prevents reliable unwrapping. Consider returning errors.New(msg) (or fmt.Errorf(msg)) when err == nil, and only using %w when a real underlying error is present.

Suggested change

return fmt.Errorf("%s: %w", msg, err)

if err != nil {

return fmt.Errorf("%s: %w", msg, err)

}

return fmt.Errorf("%s", msg)

Copilot · 2026-04-29T20:26:44Z

+	if resp.StatusCode >= 400 {
+		s.logger.Warn("webhook delivery failed",
+			"endpoint", sub.Endpoint,
+			"subscription_id", sub.ID,
+			"status", resp.StatusCode)
+		return


The code checks resp.StatusCode >= 400, but the comment/test intention says “non-2xx”. This currently treats 3xx redirects as success. If the desired behavior is “2xx only”, update the condition to fail for any status outside [200, 299] (or adjust the wording if 3xx is intentionally allowed).

Copilot · 2026-04-29T20:26:44Z

+			name:             "trailing slash falls back to download",
+			key:              "folder/",
+			wantFilename:     `filename="folder"`, // path.Base normalizes
+			wantFilenameStar: `filename*=UTF-8''folder`,
+		},


This test case name/comment says “trailing slash falls back to download”, but the expected value is filename="folder" (because path.Base("folder/") returns "folder"). Rename the case/comment to match the actual behavior (or change the expectation if you truly want the fallback).

Copilot · 2026-04-29T20:26:44Z

+// contentDispositionAttachment builds a safe `Content-Disposition: attachment`
+// header for a stored object.
+//
+// Object keys can contain path segments, non-ASCII characters, control
+// characters, quotes, backslashes, or CRLF that — if interpolated naively —
+// would either corrupt the header (HTTP response splitting) or let an attacker
+// inject additional headers. The output therefore emits two parameters per
+// RFC 6266:
+//
+//   - `filename="..."`     ASCII-only fallback for legacy clients. All bytes
+//                          outside the safe printable range and the two
+//                          characters that are special inside a quoted-string
+//                          (`"` and `\`) are replaced with `_`.
+//   - `filename*=UTF-8''…` RFC 5987 percent-encoded form preserving the
+//                          original Unicode basename for modern clients.
+//
+// `path.Base` is used to discard any path segments embedded in the key. If the
+// resulting name is empty we fall back to "download".
+func contentDispositionAttachment(key string) string {
+	name := path.Base(key)
+	if name == "." || name == "/" || name == "" {
+		name = "download"
+	}
+
+	return fmt.Sprintf(`attachment; filename="%s"; filename*=UTF-8''%s`,
+		asciiFilenameFallback(name), rfc5987Encode(name))
+}


The PR title/description focuses on k8s provisioner persistence errors, but this diff also introduces unrelated behavior changes in storage downloads (new Content-Disposition sanitization), notify webhook delivery (status handling + client), and password reset logging. To keep review/revert risk low, either update the PR description to include these additional changes and their motivation, or split them into separate PRs.

jackthepunished and others added 4 commits April 29, 2026 22:49

Copilot AI review requested due to automatic review settings April 29, 2026 20:23

Copilot started reviewing on behalf of jackthepunished April 29, 2026 20:23 View session

Copilot AI reviewed Apr 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(k8s): handle ignored repo.Update errors in provisioner#5

fix(k8s): handle ignored repo.Update errors in provisioner#5
jackthepunished wants to merge 4 commits intomainfrom
fix/issue-310-k8s-update-errors

jackthepunished commented Apr 29, 2026

Uh oh!

coderabbitai Bot commented Apr 29, 2026

Rate limit exceeded

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 29, 2026

Uh oh!

Copilot AI Apr 29, 2026

Uh oh!

Copilot AI Apr 29, 2026

Uh oh!

Copilot AI Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jackthepunished commented Apr 29, 2026

Summary

Test plan

Uh oh!

coderabbitai Bot commented Apr 29, 2026

Rate limit exceeded

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants