Describe steps for safely releasing provision script changes#2384

Open
djeebus wants to merge 2 commits into main from provision-readme

@djeebus djeebus commented Apr 13, 2026

No description provided.

cursor bot commented Apr 13, 2026

PR Summary

Low Risk
Low risk: documentation-only change with no runtime or behavioral impact.

Overview
Adds documentation describing how provision.sh runs and is cached during template builds, and outlines a safe rollout sequence for updating the script (including bumping build-provision-version and coordinating envd/sdk releases) to avoid cache stampedes and prompt users to rebuild templates.

Reviewed by Cursor Bugbot for commit 1da7437. Bugbot is set up for automated code reviews on this repo.

@djeebus djeebus marked this pull request as ready for review April 14, 2026 00:35
Current process:

1. Deploy template managers with the new version of `provision.sh`.
2. Increment the `build-provision-version` environment variable in the template manager deployment.

🔴 Step 2 of the rollout guide incorrectly describes build-provision-version as an environment variable in the template manager deployment, but it is actually a LaunchDarkly feature flag — updating it requires the LaunchDarkly dashboard, not a deployment manifest change. This distinction matters operationally: env var changes require pod restarts, while feature flag changes take effect immediately and support per-template/per-team targeting for gradual rollout.

Extended reasoning...

What the bug is

Line 16 of provision.md instructs: "Increment the build-provision-version environment variable in the template manager deployment." This is factually incorrect. build-provision-version is a LaunchDarkly integer feature flag, not a deployment environment variable.

The code path

In packages/shared/pkg/featureflags/flags.go:166, the flag is declared as:

BuildProvisionVersion = newIntFlag("build-provision-version", 0)

In packages/orchestrator/pkg/template/build/phases/base/hash.go:35-43, it is read via the LaunchDarkly client:

bb.featureFlags.IntFlag(ctx, featureflags.BuildProvisionVersion, ...)

with per-template and per-team targeting contexts passed in. There is no environment variable named build-provision-version anywhere in the codebase.
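To illustrate why incrementing the flag matters for caching, here is a minimal sketch in which an integer version is mixed into a cache key. It is not the actual logic in hash.go, which is not shown in this thread: the helper name `baseCacheKey`, its inputs, and the image name are hypothetical and chosen only for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// baseCacheKey is a hypothetical helper, not the real hash.go code.
// It hashes the base layer inputs together with the provision version,
// so a different version produces a different key and the previously
// cached layer is no longer matched.
func baseCacheKey(baseImage string, provisionVersion int) string {
	h := sha256.New()
	fmt.Fprintf(h, "%s\n%d", baseImage, provisionVersion)
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// The image name is an arbitrary example value.
	before := baseCacheKey("ubuntu:22.04", 0)
	after := baseCacheKey("ubuntu:22.04", 1)
	fmt.Println("cache reused after bump:", before == after)
}
```

Under this assumption, every template whose flag value changes rebuilds its base layer on the next build, which is why bumping the value for everyone at once can cause a stampede.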

Why existing docs don't prevent this

The document is newly introduced in this PR, so there is no prior documentation to catch the error. The flag name closely resembles what an env var might be called, which makes the mistake easy to introduce.

Impact

An operator following this runbook during a live rollout would search the deployment YAML or Helm charts for an env var that does not exist. They would be unable to execute step 2, blocking the rollout. Even if they eventually discovered the correct mechanism, the delay could mean either missed coordinated rollout timing or incorrect configuration methods being applied (e.g., accidentally adding a spurious env var to the deployment).

Additionally, the document misses an important capability: because this is a LaunchDarkly feature flag with per-template and per-team targeting, operators can perform a gradual rollout to a subset of templates or teams — which directly addresses the cache stampede concern raised by the document itself. By calling it an env var, this capability is obscured from the reader.

How to fix

Replace line 16 with accurate guidance, e.g.: "Increment the build-provision-version LaunchDarkly feature flag via the LaunchDarkly dashboard. Because this flag supports per-template and per-team targeting, you can roll out the cache invalidation gradually to avoid overwhelming template managers."
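A gradual rollout of this kind can be pictured with the sketch below. It does not use the LaunchDarkly SDK and does not reflect how targeting rules are actually stored. The type `flagRule`, the method `versionFor`, and the team IDs are hypothetical stand-ins for "a default value plus per-team overrides".

```go
package main

import "fmt"

// flagRule is a hypothetical model of an integer flag with targeting:
// a default value served to everyone, plus overrides for specific teams.
type flagRule struct {
	defaultVersion int
	teamOverrides  map[string]int
}

// versionFor returns the override for teamID if one exists,
// otherwise the default value.
func (r flagRule) versionFor(teamID string) int {
	if v, ok := r.teamOverrides[teamID]; ok {
		return v
	}
	return r.defaultVersion
}

func main() {
	// Only the canary team sees the incremented value, so only its
	// templates would rebuild; everyone else keeps the cached layer.
	rule := flagRule{
		defaultVersion: 1,
		teamOverrides:  map[string]int{"team-canary": 2},
	}
	for _, team := range []string{"team-canary", "team-other"} {
		fmt.Printf("%s -> provision version %d\n", team, rule.versionFor(team))
	}
}
```

Once the canary builds look healthy, raising the default value would extend the new version to the remaining teams.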

Step-by-step proof

  1. An operator reads step 2: "Increment the build-provision-version environment variable in the template manager deployment."
  2. They open the template manager Kubernetes deployment manifest or Helm values file and search for build-provision-version.
  3. They find no such env var — because it does not exist as an env var anywhere in the deployment configuration.
  4. They are stuck and must escalate or investigate the codebase independently.
  5. Eventually they find packages/shared/pkg/featureflags/flags.go:166 and hash.go:35-43, realizing the value must be updated in LaunchDarkly — a completely separate system requiring different tooling and access.

Note on duplicate reports

Multiple verifiers flagged this as a duplicate of other bugs (bug_003/bug_006). The synthesis agent has canonicalized these into merged_bug_001. The underlying issue is confirmed by all verifiers: the documentation is factually wrong about the mechanism for this flag.

@@ -0,0 +1,18 @@
The provison.sh script file sets up any base dependencies required for running envd and sdk commands.

🟡 Line 1 of provision.md contains a typo: 'provison.sh' is missing the second 'i' and should be 'provision.sh'. This is a minor documentation inconsistency since the rest of the file (lines 5, 15) spells it correctly.

Extended reasoning...

What the bug is: The very first sentence of the newly added provision.md file contains a misspelling: 'The provison.sh script file...' — the word 'provison' is missing the second 'i', making it different from the actual filename 'provision.sh'.

The specific code path: Line 1 of the diff reads: 'The provison.sh script file sets up any base dependencies required for running envd and sdk commands.' The misspelling exists only in this opening sentence.

Why existing code doesn't prevent it: This is a documentation file with no automated spell-checking or linting in place to catch typos. The rest of the document consistently uses the correct spelling 'provision.sh' (lines 5, 7, 8, 15), which confirms the opening sentence is a one-off typo rather than a systemic naming issue.

Impact: The impact is minimal — this is documentation-only and has no runtime effect. However, a misspelling in the very first sentence can cause confusion for readers trying to locate or reference the actual script file named 'provision.sh'.

How to fix it: Change 'provison.sh' to 'provision.sh' on line 1.
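Since no automated check catches this kind of typo, a small guard could look like the sketch below. It is an assumption about what such a check might be, not something that exists in the repo: the function name `findMisspellings` and the sample text are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// findMisspellings returns the 1-based numbers of the lines in text
// that contain the given misspelling as a substring.
func findMisspellings(text, typo string) []int {
	var hits []int
	for i, line := range strings.Split(text, "\n") {
		if strings.Contains(line, typo) {
			hits = append(hits, i+1)
		}
	}
	return hits
}

func main() {
	// Sample text only; the first line carries the typo, the third does not.
	doc := "The provison.sh script file sets up base dependencies.\n" +
		"\n" +
		"The provision.sh script is executed during builds."
	fmt.Println(findMisspellings(doc, "provison"))
}
```

The correctly spelled `provision` does not contain `provison` as a substring, so only the misspelled line is reported.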

Step-by-step proof:

  1. Open the new file at packages/orchestrator/pkg/template/build/phases/base/provision.md
  2. Read line 1: 'The provison.sh script file sets up any base dependencies...'
  3. Count the letters: p-r-o-v-i-s-o-n → 'provison' has no second 'i', but the correct name is p-r-o-v-i-s-i-o-n → 'provision'
  4. Compare with line 5: 'The provision.sh script is executed immediately...' — correctly spelled
  5. Compare with line 15: 'Deploy template managers with the new version of provision.sh.' — correctly spelled
  6. Conclusion: line 1 is a typo that contradicts the rest of the document and the actual filename.

@dobrac dobrac left a comment

bot comments are valid
