
feat: user sovereignty — AI models recommend, users decide (v0.13.2.0)#603

Merged
garrytan merged 5 commits into main from garrytan/user-sovereignty
Mar 28, 2026


@garrytan
Owner

Summary

Previously, when Claude and Codex agreed on a scope change, they auto-incorporated it without asking the user. Real incident: Codex said "merge two skills into one," Claude assessed "outside voice is right" and just did it. The user had to push back twice.

This PR adds User Sovereignty as gstack's third core principle. Two root causes fixed:

  • review.ts cross-model tension template said [Your assessment of who's right.], letting Claude judge and act. Now says [Present both perspectives neutrally. State what context you might be missing.] with expanded options (Accept/Keep/Investigate/Defer).
  • autoplan/SKILL.md.tmpl had "auto-decide replaces the USER's judgment" with no carve-out for scope changes. Now has a User Challenge category that is never auto-decided, with proper contract updates (intro, important rules, audit trail schema, gate handling).
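The rule described in the two bullets above can be sketched in TypeScript. This is only an illustration under stated assumptions: the actual change is to prompt templates, not application code, and every name here (`DecisionKind`, `Decision`, `Resolution`, `resolve`, and the `"routine"` category) is hypothetical. Only the "User Challenge" category and the four option labels come from the PR text.

```typescript
// Hypothetical sketch: none of these types or functions exist in gstack.
// Only "User Challenge" and the Accept/Keep/Investigate/Defer options
// are taken from the PR description.
type DecisionKind = "routine" | "user-challenge";

type GateOption = "Accept" | "Keep" | "Investigate" | "Defer";

interface Decision {
  kind: DecisionKind;
  summary: string;
  // True when both models (e.g. Claude and Codex) recommend the change.
  modelsAgree: boolean;
}

type Resolution =
  | { action: "auto-decide"; summary: string }
  | { action: "ask-user"; summary: string; options: GateOption[] };

function resolve(decision: Decision): Resolution {
  // A user challenge is never auto-decided. Agreement between the models
  // is deliberately ignored here: it is a recommendation, not a decision.
  if (decision.kind === "user-challenge") {
    return {
      action: "ask-user",
      summary: decision.summary,
      options: ["Accept", "Keep", "Investigate", "Defer"],
    };
  }
  return { action: "auto-decide", summary: decision.summary };
}
```

Under this sketch, the incident from the summary (`kind: "user-challenge"`, `modelsAgree: true`) resolves to `ask-user` with all four options rather than being applied silently.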

Changes across 6 source files + 25 regenerated SKILL.md files:

  • ETHOS.md: User Sovereignty section with Karpathy "Iron Man suit" and Willison "agents are merchants of complexity" references
  • preamble.ts: User sovereignty statement injected into all 21 tier 1+ skill voices
  • autoplan: 8 locations updated (Decision Classification, auto-decide exceptions, Phase 1/2/3 overrides, audit trail, Final Approval Gate, Important Rules, cognitive load, gate options)
  • CEO + Eng review templates: Outside Voice Integration Rule

Pre-Landing Review

Template-only changes (prompt text, no application code). Existing skill-validation and gen-skill-docs tests cover template correctness.

Outside Voice (Codex)

Codex plan review found 8 issues. Key accepted finding: autoplan contract breakage (the intro text, important rules, audit trail schema, and gate handling all needed updating). A blocker carve-out for security/feasibility was added per user decision.

Test plan

  • bun test passes (pre-existing failures only, verified on main)
  • bun run gen:skill-docs regenerates all 30 SKILL.md files successfully
  • "User sovereignty" confirmed in 21 generated SKILL.md files
  • "Two gates" confirmed in autoplan/SKILL.md Important Rules
  • "User Challenge" confirmed in autoplan/SKILL.md (10 occurrences)
  • "Outside Voice Integration Rule" confirmed in plan-ceo-review and plan-eng-review
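The phrase checks in the test plan could be automated along these lines. This is a minimal sketch, not the project's actual test code: `countOccurrences` and `filesContaining` are hypothetical helpers, matching is assumed to be case-sensitive, and the file contents are passed in as a plain map so the sketch does not depend on any particular file-system API.

```typescript
// Hypothetical helpers for checking phrases in generated SKILL.md files.
// Assumption: matching is exact and case-sensitive.

// Counts non-overlapping occurrences of `phrase` in `text`.
function countOccurrences(text: string, phrase: string): number {
  if (phrase.length === 0) return 0;
  return text.split(phrase).length - 1;
}

// Returns the sorted names of files whose content contains `phrase`.
function filesContaining(files: Record<string, string>, phrase: string): string[] {
  return Object.entries(files)
    .filter(([, text]) => countOccurrences(text, phrase) > 0)
    .map(([name]) => name)
    .sort();
}
```

With the generated files loaded into a map, `filesContaining(files, "User sovereignty").length` would be compared against 21, and `countOccurrences(autoplanText, "User Challenge")` against 10.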

🤖 Generated with Claude Code

garrytan and others added 3 commits March 28, 2026 08:46
When Claude and Codex agree on a scope change, they now present it to the
user instead of auto-incorporating it. Adds User Sovereignty as the third
core principle in ETHOS.md. Fixes the cross-model tension template in
review.ts to present both perspectives neutrally instead of judging. Adds
User Challenge category to autoplan with proper contract updates (intro,
important rules, audit trail, gate handling). Adds Outside Voice Integration
Rule to CEO and eng review templates.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

github-actions bot commented Mar 28, 2026

E2E Evals: ❌ FAIL

56/57 tests passed | $6.00 total cost | 12 parallel runners

| Suite | Result | Cost |
|---|---|---|
| e2e-browse | 4/4 | $0.16 |
| e2e-deploy | 6/6 | $1.09 |
| e2e-design | 3/3 | $0.53 |
| e2e-plan | 7/7 | $1.09 |
| e2e-qa-workflow | 3/3 | $1.03 |
| e2e-review | 6/7 | $1.27 |
| e2e-workflow | 3/3 | $0.35 |
| llm-judge | 24/24 | $0.48 |

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

Failures

  • ❌ /review base branch detection: error_max_turns

…iting it

Codex kept overwriting agents/openai.yaml with a browse-only description.
Two fixes: (1) better description covering full PM/dev/eng/CEO/QA scope,
(2) add agents/ to the filesystem boundary so Codex stops modifying it.
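The commit message does not say how the filesystem boundary is expressed; it may simply be prompt text listing paths Codex must not touch. If it were enforced in code, a path check might look like the following sketch. `PROTECTED_PREFIXES` and `isInsideBoundary` are hypothetical names, and only the `agents/` entry comes from the commit message.

```typescript
// Hypothetical sketch: gstack may express this boundary purely as prompt
// text. Only the "agents/" prefix is taken from the commit message.
const PROTECTED_PREFIXES: readonly string[] = ["agents/"];

// Returns true when a repo-relative path falls under a protected prefix.
function isInsideBoundary(
  relativePath: string,
  prefixes: readonly string[] = PROTECTED_PREFIXES,
): boolean {
  // Normalize Windows separators and strip a leading "./" before comparing.
  const normalized = relativePath.replace(/\\/g, "/").replace(/^\.\//, "");
  return prefixes.some((prefix) => normalized.startsWith(prefix));
}
```

Because each prefix ends in a slash, a sibling directory such as `my-agents/` or a file named `agents.md` is not treated as protected.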
@garrytan garrytan merged commit 247fc3b into main Mar 28, 2026
18 checks passed