Describe steps for safely releasing provision script changes #2384
PR Summary: Low Risk. Reviewed by Cursor Bugbot for commit 1da7437.
Current process:

1. Deploy template managers with the new version of `provision.sh`.
2. Increment the `build-provision-version` environment variable in the template manager deployment.
🔴 Step 2 of the rollout guide incorrectly describes build-provision-version as an environment variable in the template manager deployment, but it is actually a LaunchDarkly feature flag — updating it requires the LaunchDarkly dashboard, not a deployment manifest change. This distinction matters operationally: env var changes require pod restarts, while feature flag changes take effect immediately and support per-template/per-team targeting for gradual rollout.
Extended reasoning...
What the bug is
Line 16 of provision.md instructs: "Increment the build-provision-version environment variable in the template manager deployment." This is factually incorrect. build-provision-version is a LaunchDarkly integer feature flag, not a deployment environment variable.
The code path
In `packages/shared/pkg/featureflags/flags.go:166`, the flag is declared as:
`BuildProvisionVersion = newIntFlag("build-provision-version", 0)`
In `packages/orchestrator/pkg/template/build/phases/base/hash.go:35-43`, it is read via the LaunchDarkly client:
`bb.featureFlags.IntFlag(ctx, featureflags.BuildProvisionVersion, ...)`
with per-template and per-team targeting contexts passed in. There is no environment variable named `build-provision-version` anywhere in the codebase.
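To make the declare-then-evaluate pattern described above concrete, here is a minimal, self-contained sketch. It does not use the LaunchDarkly SDK or the repository's `featureflags` package: `evalContext`, `intFlag`, and `flagStore` are hypothetical stand-ins, and the per-team override map is only an assumption about how targeting might behave.

```go
package main

import "fmt"

// evalContext is a hypothetical stand-in for the per-template and per-team
// targeting contexts that the review says are passed to the flag lookup.
type evalContext struct {
	TemplateID string
	TeamID     string
}

// intFlag is a hypothetical integer flag: a key plus a fallback value,
// loosely mirroring newIntFlag("build-provision-version", 0).
type intFlag struct {
	Key      string
	Fallback int
}

// flagStore is a toy in-memory replacement for a feature flag service.
type flagStore struct {
	global  map[string]int            // flag key -> value served to everyone
	perTeam map[string]map[string]int // flag key -> team ID -> override
}

// IntFlag resolves a flag: team override first, then the global value,
// then the flag's fallback when nothing is configured.
func (s *flagStore) IntFlag(f intFlag, ec evalContext) int {
	if v, ok := s.perTeam[f.Key][ec.TeamID]; ok {
		return v
	}
	if v, ok := s.global[f.Key]; ok {
		return v
	}
	return f.Fallback
}

func main() {
	flag := intFlag{Key: "build-provision-version", Fallback: 0}
	store := &flagStore{
		global:  map[string]int{flag.Key: 1},
		perTeam: map[string]map[string]int{flag.Key: {"team-a": 2}},
	}
	fmt.Println(store.IntFlag(flag, evalContext{TeamID: "team-a"}))
	fmt.Println(store.IntFlag(flag, evalContext{TeamID: "team-b"}))
}
```

The point of the sketch is that the value is resolved per evaluation context at read time, which is why no deployment manifest change is involved.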
Why existing docs don't prevent this
The document is newly introduced in this PR, so there is no prior documentation to catch the error. The flag name closely resembles what an env var might be called, which makes the mistake easy to introduce.
Impact
An operator following this runbook during a live rollout would search the deployment YAML or Helm charts for an env var that does not exist. They would be unable to execute step 2, blocking the rollout. Even if they eventually discovered the correct mechanism, the delay could mean either missed coordinated rollout timing or incorrect configuration methods being applied (e.g., accidentally adding a spurious env var to the deployment).
Additionally, the document misses an important capability: because this is a LaunchDarkly feature flag with per-template and per-team targeting, operators can perform a gradual rollout to a subset of templates or teams — which directly addresses the cache stampede concern raised by the document itself. By calling it an env var, this capability is obscured from the reader.
How to fix
Replace line 16 with accurate guidance, e.g.: "Increment the build-provision-version LaunchDarkly feature flag via the LaunchDarkly dashboard. Because this flag supports per-template and per-team targeting, you can roll out the cache invalidation gradually to avoid overwhelming template managers."
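Why incrementing the version invalidates cached builds can be illustrated with a small sketch. It assumes, without confirmation from the source, that the flag value is mixed into a hash that serves as the build cache key; `cacheKey` and its inputs are hypothetical and are not the contents of `hash.go`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
)

// cacheKey is a hypothetical cache key: a SHA-256 digest over the base image
// name and the provision version. Changing either input changes the key.
func cacheKey(baseImage string, provisionVersion int) string {
	h := sha256.New()
	h.Write([]byte(baseImage))
	h.Write([]byte{0}) // separator so the two fields cannot run together
	h.Write([]byte(strconv.Itoa(provisionVersion)))
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	before := cacheKey("ubuntu:22.04", 1)
	after := cacheKey("ubuntu:22.04", 2)
	// A different key means the cached base layer is no longer found,
	// so the next build for this context must run provisioning again.
	fmt.Println("cache invalidated:", before != after)
}
```

Under this assumption, raising the version only for a targeted subset of teams or templates changes the key only for their builds, which spreads the rebuild load over time.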
Step-by-step proof
- An operator reads step 2: "Increment the `build-provision-version` environment variable in the template manager deployment."
- They open the template manager Kubernetes deployment manifest or Helm values file and search for `build-provision-version`.
- They find no such env var, because it does not exist as an env var anywhere in the deployment configuration.
- They are stuck and must escalate or investigate the codebase independently.
- Eventually they find `packages/shared/pkg/featureflags/flags.go:166` and `hash.go:35-43`, realizing the value must be updated in LaunchDarkly, a completely separate system requiring different tooling and access.
Note on duplicate reports
Multiple verifiers flagged this as a duplicate of other bugs (bug_003/bug_006). The synthesis agent has canonicalized these into merged_bug_001. The underlying issue is confirmed by all verifiers: the documentation is factually wrong about the mechanism for this flag.
@@ -0,0 +1,18 @@
The provison.sh script file sets up any base dependencies required for running envd and sdk commands.
🟡 Line 1 of provision.md contains a typo: 'provison.sh' is missing the second 'i' and should be 'provision.sh'. This is a minor documentation inconsistency since the rest of the file (lines 5, 15) spells it correctly.
Extended reasoning...
What the bug is: The very first sentence of the newly added provision.md file contains a misspelling: 'The provison.sh script file...' — the word 'provison' is missing the second 'i', making it different from the actual filename 'provision.sh'.
The specific code path: Line 1 of the diff reads: 'The provison.sh script file sets up any base dependencies required for running envd and sdk commands.' The misspelling exists only in this opening sentence.
Why existing code doesn't prevent it: This is a documentation file with no automated spell-checking or linting in place to catch typos. The rest of the document consistently uses the correct spelling 'provision.sh' (lines 5, 7, 8, 15), which confirms the opening sentence is a one-off typo rather than a systemic naming issue.
Impact: The impact is minimal — this is documentation-only and has no runtime effect. However, a misspelling in the very first sentence can cause confusion for readers trying to locate or reference the actual script file named 'provision.sh'.
How to fix it: Change 'provison.sh' to 'provision.sh' on line 1.
Step-by-step proof:
- Open the new file at packages/orchestrator/pkg/template/build/phases/base/provision.md
- Read line 1: 'The provison.sh script file sets up any base dependencies...'
- Count the letters: p-r-o-v-i-s-o-n → 'provison' has no second 'i', but the correct name is p-r-o-v-i-s-i-o-n → 'provision'
- Compare with line 5: 'The provision.sh script is executed immediately...' — correctly spelled
- Compare with line 15: 'Deploy template managers with the new version of `provision.sh`.' (correctly spelled)
- Conclusion: line 1 is a typo that contradicts the rest of the document and the actual filename.