Overview
Port the LLM-Based Quality Gate from open-agreements/open-agreements (PRs #324 + #325, ~$0.067/PR total cost) to UseJunior/safe-docx, with a checklist tuned to safe-docx's bug history. Two-phase architecture: pre-merge blocking gate + post-merge non-blocking audit.
Checklist (14 items, safe-docx-tuned)
Items 1–13 derived from two PR-history mining passes (Codex + Gemini deep research) against real safe-docx merged PRs. Item #14 (test quality) added on cross-project prior.
- read_file response metadata parity (fix(docx-core): declare xmlns:w14/w15 on comments root before writing prefixed attributes (#154) #180; fix(docx-mcp): warn when read_file budget is exceeded by a single node (closes #184) #186; fix(docx-mcp): surface comment_load_error on the default budgeted read path (closes #189) #191)
- w:fldChar outside w:del (fix(docx-core): validate w:delInstrText placement and reject w:fldChar inside <w:del> #211; fix(docx-core): partition field-closure validation by ECMA-376 story (#212) #225; fix(docx-core): fragment w:fldChar outside w:del per ECMA-376 Part 4 #228)
- DocumentViewNode.heading stays canonical (fix(docx-core): harden heading detection (#157 Phase 1) #178; fix(docx-core): suppress non-sectional false-positive headings (closes #187) #188; feat(docx-core): add derived heading object to DocumentViewNode (closes #179) #190)
- … (*PrChange) (feat(docx-core): emit pPrChange/trPrChange/tcPrChange from layout setters (#140) #167; feat(docx-mcp): emit rPrChange from clear_formatting MCP tool (#141) #170; feat(docx-core): emit rPrChange for formatted paragraph replacements #215)
- SUPPORT.md Table A drift vs. implementation ([120.8] Regression suite for canonical revision emission across the surface #143, review of: replaceParagraphTextRange should emit w:rPrChange when run formatting changes #173; addCommentReply should emit body revision markup OR SUPPORT.md should be softened #174)
Rollout phases
- Phase 1, advisory mode (single PR): ship the full 14-item checklist with LLM_GATE_BLOCKING=false. The comment posts but never sets a failure status. Validate on ~3 incoming PRs.
- Phase 2, blocking + override label + synchronize trigger (second PR): set LLM_GATE_BLOCKING=true, create the llm-gate/override label (if not pre-created), add synchronize to the pull_request trigger (open-agreements lesson), and add "Aggregate and post review" to the required status checks on main.
- Phase 3, post-merge audit (third PR): ship llm-based-quality-gate-post-merge.yml + harness. Verify on the next merge to main.
Pre-flight (operator)
- GEMINI_API_KEY repository secret
- llm-gate/override label (yellow #FBCA04)
- LLM_GATE_BLOCKING, LLM_GATE_MAX_PARALLEL, LLM_GATE_PRICE_PER_M_INPUT, LLM_GATE_PRICE_PER_M_OUTPUT, LLM_GATE_MODEL, LLM_GATE_CLI_VERSION (set atomically with Phase 1 PR)
Cost expectation
~$0.033/PR pre-merge with 14 items (extrapolated from open-agreements #325: $0.0334 / 14 items ≈ $0.0024/item). The post-merge audit roughly doubles this to ~$0.067/PR total. Docs-only PRs cost about the same per check as substantive PRs because input tokens dominate.
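The arithmetic behind these figures can be checked with a short Python sketch. The reference numbers ($0.0334 for 14 items) come from the text above; the function names are hypothetical, and `token_cost` only assumes that the two LLM_GATE_PRICE_PER_M_* variables hold USD prices per million tokens, which this plan does not spell out.

```python
# Reference figures quoted in this plan (open-agreements #325).
REFERENCE_COST_USD = 0.0334
REFERENCE_ITEMS = 14


def per_item_cost(total_usd=REFERENCE_COST_USD, items=REFERENCE_ITEMS):
    """Average cost of one checklist item on one PR."""
    return total_usd / items


def estimate_pr_cost(items, post_merge_audit=False):
    """Linear extrapolation of per-PR cost from the reference run.

    Assumption: the post-merge audit costs about as much as the
    pre-merge gate, so enabling it doubles the total.
    """
    pre_merge = per_item_cost() * items
    return pre_merge * 2 if post_merge_audit else pre_merge


def token_cost(input_tokens, output_tokens, price_per_m_input, price_per_m_output):
    """Cost of one model call, assuming prices are USD per million tokens."""
    return (input_tokens / 1_000_000) * price_per_m_input + (
        output_tokens / 1_000_000
    ) * price_per_m_output


if __name__ == "__main__":
    print(f"per item:       ${per_item_cost():.4f}")
    print(f"pre-merge only: ${estimate_pr_cost(14):.3f}")
    print(f"with audit:     ${estimate_pr_cost(14, post_merge_audit=True):.3f}")
```

Because input tokens dominate, `token_cost` changes little when the output side shrinks, which is consistent with docs-only PRs costing about as much per check as substantive ones.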
Bootstrap problem mitigation
Phase 1 PR will fail the gate's base_ref checkout (action.yml doesn't exist on main yet), the same trap as open-agreements PR #324. Since no LLM-gate check is in the required status checks yet, this is non-blocking. Apply the override label if needed.
Phase 2 mitigation: temporarily remove the new required check from branch protection in the same PR that adds it; re-add immediately after merge.
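How the advisory/blocking switch and the override label might interact can be sketched in Python as below. This is an illustration under stated assumptions, not the gate's actual code: `decide`, `GateResult`, and `blocking_from_env` are hypothetical names, the default of "false" for an unset LLM_GATE_BLOCKING is assumed, and so is the rule that the llm-gate/override label turns a blocking failure into a pass.

```python
import os
from dataclasses import dataclass

OVERRIDE_LABEL = "llm-gate/override"  # label name taken from this plan


@dataclass
class GateResult:
    conclusion: str  # "success" or "failure"
    note: str


def blocking_from_env(env=None):
    """Read LLM_GATE_BLOCKING; assumed to default to advisory mode."""
    env = os.environ if env is None else env
    return env.get("LLM_GATE_BLOCKING", "false").strip().lower() == "true"


def decide(failed_items, labels, blocking):
    """Map checklist findings to a status-check conclusion."""
    if not failed_items:
        return GateResult("success", "all checklist items passed")
    if not blocking:
        # Phase 1 behaviour: post findings, never set a failure status.
        return GateResult("success", f"advisory: {len(failed_items)} finding(s)")
    if OVERRIDE_LABEL in labels:
        # Assumed override semantics: a maintainer label bypasses the gate.
        return GateResult("success", "findings overridden by label")
    return GateResult("failure", f"blocking: {len(failed_items)} finding(s)")


if __name__ == "__main__":
    print(decide(["test quality"], [], blocking=False))
    print(decide(["test quality"], [], blocking=True))
    print(decide(["test quality"], [OVERRIDE_LABEL], blocking=True))
```

Keeping the decision in one pure function makes it straightforward to exercise each rollout phase in a test harness before the check becomes required on main.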
Precedent