diff --git a/GEMINI.md b/GEMINI.md new file mode 100644 index 000000000..e9279a78d --- /dev/null +++ b/GEMINI.md @@ -0,0 +1,309 @@ +# gstack development (Gemini) + +## Commands + +```bash +bun install # install dependencies +bun test # run free tests (browse + snapshot + skill validation) +bun run test:evals # run paid evals: LLM judge + E2E (diff-based, ~$4/run max) +bun run test:evals:all # run ALL paid evals regardless of diff +bun run test:gate # run gate-tier tests only (CI default, blocks merge) +bun run test:periodic # run periodic-tier tests only (weekly cron / manual) +bun run test:e2e # run E2E tests only (diff-based, ~$3.85/run max) +bun run test:e2e:all # run ALL E2E tests regardless of diff +bun run eval:select # show which tests would run based on current diff +bun run dev # run CLI in dev mode, e.g. bun run dev goto https://example.com +bun run build # gen docs + compile binaries +bun run gen:skill-docs # regenerate SKILL.md files from templates +bun run skill:check # health dashboard for all skills +bun run dev:skill # watch mode: auto-regen + validate on change +bun run eval:list # list all eval runs from ~/.gstack-dev/evals/ +bun run eval:compare # compare two eval runs (auto-picks most recent) +bun run eval:summary # aggregate stats across all eval runs +``` + +`test:evals` requires `ANTHROPIC_API_KEY`. Codex E2E tests (`test/codex-e2e.test.ts`) +use Codex's own auth from `~/.codex/` config — no `OPENAI_API_KEY` env var needed. +E2E tests stream progress in real-time (tool-by-tool via `--output-format stream-json +--verbose`). Results are persisted to `~/.gstack-dev/evals/` with auto-comparison +against the previous run. + +**Diff-based test selection:** `test:evals` and `test:e2e` auto-select tests based +on `git diff` against the base branch. Each test declares its file dependencies in +`test/helpers/touchfiles.ts`. Changes to global touchfiles (session-runner, eval-store, +touchfiles.ts itself) trigger all tests. 
Use `EVALS_ALL=1` or the `:all` script +variants to force all tests. Run `eval:select` to preview which tests would run. + +**Two-tier system:** Tests are classified as `gate` or `periodic` in `E2E_TIERS` +(in `test/helpers/touchfiles.ts`). CI runs only gate tests (`EVALS_TIER=gate`); +periodic tests run weekly via cron or manually. Use `EVALS_TIER=gate` or +`EVALS_TIER=periodic` to filter. When adding new E2E tests, classify them: +1. Safety guardrail or deterministic functional test? -> `gate` +2. Quality benchmark, Opus model test, or non-deterministic? -> `periodic` +3. Requires external service (Codex, Gemini)? -> `periodic` + +## Testing + +```bash +bun test # run before every commit — free, <2s +bun run test:evals # run before shipping — paid, diff-based (~$4/run max) +``` + +`bun test` runs skill validation, gen-skill-docs quality checks, and browse +integration tests. `bun run test:evals` runs LLM-judge quality evals and E2E +tests via `claude -p`. Both must pass before creating a PR. 
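The selection rules above (per-test touchfiles, global touchfiles, `EVALS_ALL`, and `EVALS_TIER`) can be sketched as one pure function. This is a minimal illustration, not the contents of `test/helpers/touchfiles.ts`: the `SelectionInput` shape, the `selectTests` name, the test names, and the use of simple path prefixes instead of glob patterns are all assumptions made for the example. It also assumes the tier filter is applied after the diff-based selection.

```typescript
// Hypothetical sketch: the real touchfiles.ts may be organized differently.
type Tier = "gate" | "periodic";

interface SelectionInput {
  changedFiles: string[];               // e.g. paths from `git diff --name-only` against the base branch
  touchfiles: Record<string, string[]>; // test name -> path prefixes it depends on (assumed format)
  globalTouchfiles: string[];           // a change under any of these selects every test
  tiers: Record<string, Tier>;          // test name -> tier (stand-in for E2E_TIERS)
  evalsAll?: boolean;                   // stand-in for EVALS_ALL=1
  evalsTier?: Tier;                     // stand-in for EVALS_TIER
}

function selectTests(input: SelectionInput): string[] {
  const { changedFiles, touchfiles, globalTouchfiles, tiers, evalsAll, evalsTier } = input;

  // True when at least one changed file falls under one of the given prefixes.
  const touches = (prefixes: string[]): boolean =>
    changedFiles.some((file) => prefixes.some((prefix) => file.startsWith(prefix)));

  // Forcing all tests or touching a global file bypasses per-test matching.
  const runEverything = Boolean(evalsAll) || touches(globalTouchfiles);

  const byDiff = Object.entries(touchfiles)
    .filter(([, prefixes]) => runEverything || touches(prefixes))
    .map(([name]) => name);

  // Assumed ordering: the tier filter narrows whatever the diff step selected.
  return evalsTier ? byDiff.filter((name) => tiers[name] === evalsTier) : byDiff;
}

// Usage with made-up test names: only the gate-tier test that depends on review/ is selected.
const selected = selectTests({
  changedFiles: ["review/SKILL.md.tmpl"],
  touchfiles: {
    "review-basic": ["review/"],
    "ship-flow": ["ship/"],
    "codex-second-opinion": ["codex/", "review/"],
  },
  globalTouchfiles: ["test/helpers/touchfiles.ts"],
  tiers: { "review-basic": "gate", "ship-flow": "gate", "codex-second-opinion": "periodic" },
  evalsTier: "gate",
});
console.log(selected); // ["review-basic"]
```

Keeping the selection pure (inputs in, test names out) makes it straightforward to preview, which is roughly the role `eval:select` plays in the workflow described above.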
+ +## Project structure + +``` +gstack/ +├── browse/ # Headless browser CLI (Playwright) +│ ├── src/ # CLI + server + commands +│ │ ├── commands.ts # Command registry (single source of truth) +│ │ └── snapshot.ts # SNAPSHOT_FLAGS metadata array +│ ├── test/ # Integration tests + fixtures +│ └── dist/ # Compiled binary +├── scripts/ # Build + DX tooling +│ ├── gen-skill-docs.ts # Template → SKILL.md generator +│ ├── skill-check.ts # Health dashboard +│ └── dev-skill.ts # Watch mode +├── test/ # Skill validation + eval tests +│ ├── helpers/ # skill-parser.ts, session-runner.ts, llm-judge.ts, eval-store.ts +│ ├── fixtures/ # Ground truth JSON, planted-bug fixtures, eval baselines +│ ├── skill-validation.test.ts # Tier 1: static validation (free, <1s) +│ ├── gen-skill-docs.test.ts # Tier 1: generator quality (free, <1s) +│ ├── skill-llm-eval.test.ts # Tier 3: LLM-as-judge (~$0.15/run) +│ └── skill-e2e-*.test.ts # Tier 2: E2E via claude -p (~$3.85/run, split by category) +├── qa-only/ # /qa-only skill (report-only QA, no fixes) +├── plan-design-review/ # /plan-design-review skill (report-only design audit) +├── design-review/ # /design-review skill (design audit + fix loop) +├── ship/ # Ship workflow skill +├── review/ # PR review skill +├── plan-ceo-review/ # /plan-ceo-review skill +├── plan-eng-review/ # /plan-eng-review skill +├── autoplan/ # /autoplan skill (auto-review pipeline: CEO → design → eng) +├── benchmark/ # /benchmark skill (performance regression detection) +├── canary/ # /canary skill (post-deploy monitoring loop) +├── codex/ # /codex skill (multi-AI second opinion via OpenAI Codex CLI) +├── land-and-deploy/ # /land-and-deploy skill (merge → deploy → canary verify) +├── office-hours/ # /office-hours skill (YC Office Hours — startup diagnostic + builder brainstorm) +├── investigate/ # /investigate skill (systematic root-cause debugging) +├── retro/ # Retrospective skill (includes /retro global cross-project mode) +├── bin/ # CLI utilities 
(gstack-repo-mode, gstack-slug, gstack-config, etc.) +├── document-release/ # /document-release skill (post-ship doc updates) +├── cso/ # /cso skill (OWASP Top 10 + STRIDE security audit) +├── design-consultation/ # /design-consultation skill (design system from scratch) +├── setup-deploy/ # /setup-deploy skill (one-time deploy config) +├── .github/ # CI workflows + Docker image +│ ├── workflows/ # evals.yml (E2E on Ubicloud), skill-docs.yml, actionlint.yml +│ └── docker/ # Dockerfile.ci (pre-baked toolchain + Playwright/Chromium) +├── setup # One-time setup: build binary + symlink skills +├── SKILL.md # Generated from SKILL.md.tmpl (don't edit directly) +├── SKILL.md.tmpl # Template: edit this, run gen:skill-docs +├── ETHOS.md # Builder philosophy (Boil the Lake, Search Before Building) +└── package.json # Build scripts for browse +``` + +## SKILL.md workflow + +SKILL.md files are **generated** from `.tmpl` templates. To update docs: + +1. Edit the `.tmpl` file (e.g. `SKILL.md.tmpl` or `browse/SKILL.md.tmpl`) +2. Run `bun run gen:skill-docs` (or `bun run build` which does it automatically) +3. Commit both the `.tmpl` and generated `.md` files + +To add a new browse command: add it to `browse/src/commands.ts` and rebuild. +To add a snapshot flag: add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts` and rebuild. + +**Merge conflicts on SKILL.md files:** NEVER resolve conflicts on generated SKILL.md +files by accepting either side. Instead: (1) resolve conflicts on the `.tmpl` templates +and `scripts/gen-skill-docs.ts` (the sources of truth), (2) run `bun run gen:skill-docs` +to regenerate all SKILL.md files, (3) stage the regenerated files. Accepting one side's +generated output silently drops the other side's template changes. + +## Platform-agnostic design + +Skills must NEVER hardcode framework-specific commands, file patterns, or directory +structures. Instead: + +1. **Read GEMINI.md** for project-specific config (test commands, eval commands, etc.) +2. 
**If missing, AskUserQuestion** — let the user tell you or let gstack search the repo +3. **Persist the answer to GEMINI.md** so we never have to ask again + +This applies to test commands, eval commands, deploy commands, and any other +project-specific behavior. The project owns its config; gstack reads it. + +## Writing SKILL templates + +SKILL.md.tmpl files are **prompt templates read by Claude or Gemini**, not bash scripts. +Each bash code block runs in a separate shell — variables do not persist between blocks. + +Rules: +- **Use natural language for logic and state.** Don't use shell variables to pass + state between code blocks. Instead, tell the agent what to remember and reference + it in prose (e.g., "the base branch detected in Step 0"). +- **Don't hardcode branch names.** Detect `main`/`master`/etc dynamically via + `gh pr view` or `gh repo view`. Use `{{BASE_BRANCH_DETECT}}` for PR-targeting + skills. Use "the base branch" in prose, `` in code block placeholders. +- **Keep bash blocks self-contained.** Each code block should work independently. + If a block needs context from a previous step, restate it in the prose above. +- **Express conditionals as English.** Instead of nested `if/elif/else` in bash, + write numbered decision steps: "1. If X, do Y. 2. Otherwise, do Z." + +## Browser interaction + +When you need to interact with a browser (QA, dogfooding, cookie setup), use the +`/browse` skill or run the browse binary directly via `$B `. NEVER use +`mcp__claude-in-chrome__*` tools — they are slow, unreliable, and not what this +project uses. + +## Vendored symlink awareness + +When developing gstack, `.claude/skills/gstack` may be a symlink back to this +working directory (gitignored). This means skill changes are **live immediately** — +great for rapid iteration, risky during big refactors where half-written skills +could break other agent sessions using gstack concurrently. 
+ +**Check once per session:** Run `ls -la .claude/skills/gstack` to see if it's a +symlink or a real copy. If it's a symlink to your working directory, be aware that: +- Template changes + `bun run gen:skill-docs` immediately affect all gstack invocations +- Breaking changes to SKILL.md.tmpl files can break concurrent gstack sessions +- During large refactors, remove the symlink (`rm .claude/skills/gstack`) so the + global install at `~/.claude/skills/gstack/` is used instead + +**For plan reviews:** When reviewing plans that modify skill templates or the +gen-skill-docs pipeline, consider whether the changes should be tested in isolation +before going live (especially if the user is actively using gstack in other windows). + +## Compiled binaries — NEVER commit browse/dist/ + +The `browse/dist/` directory contains compiled Bun binaries (`browse`, `find-browse`, +~58MB each). These are Mach-O arm64 only — they do NOT work on Linux, Windows, or +Intel Macs. The `./setup` script already builds from source for every platform, so +the checked-in binaries are redundant. They are tracked by git due to a historical +mistake and should eventually be removed with `git rm --cached`. + +**NEVER stage or commit these files.** They show up as modified in `git status` +because they're tracked despite `.gitignore` — ignore them. When staging files, +always use specific filenames (`git add file1 file2`) — never `git add .` or +`git add -A`, which will accidentally include the binaries. + +## Commit style + +**Always bisect commits.** Every commit should be a single logical change. When +you've made multiple changes (e.g., a rename + a rewrite + new tests), split them +into separate commits before pushing. Each commit should be independently +understandable and revertable. 
+ +Examples of good bisection: +- Rename/move separate from behavior changes +- Test infrastructure (touchfiles, helpers) separate from test implementations +- Template changes separate from generated file regeneration +- Mechanical refactors separate from new features + +When the user says "bisect commit" or "bisect and push," split staged/unstaged +changes into logical commits and push. + +## CHANGELOG + VERSION style + +**VERSION and CHANGELOG are branch-scoped.** Every feature branch that ships gets its +own version bump and CHANGELOG entry. The entry describes what THIS branch adds — +not what was already on main. + +**When to write the CHANGELOG entry:** +- At `/ship` time (Step 5), not during development or mid-branch. +- The entry covers ALL commits on this branch vs the base branch. +- Never fold new work into an existing CHANGELOG entry from a prior version that + already landed on main. If main has v0.10.0.0 and your branch adds features, + bump to v0.10.1.0 with a new entry — don't edit the v0.10.0.0 entry. + +**Key questions before writing:** +1. What branch am I on? What did THIS branch change? +2. Is the base branch version already released? (If yes, bump and create new entry.) +3. Does an existing entry on this branch already cover earlier work? (If yes, replace + it with one unified entry for the final version.) + +CHANGELOG.md is **for users**, not contributors. Write it like product release notes: + +- Lead with what the user can now **do** that they couldn't before. Sell the feature. +- Use plain language, not implementation details. "You can now..." not "Refactored the..." +- **Never mention TODOS.md, internal tracking, eval infrastructure, or contributor-facing + details.** These are invisible to users and meaningless to them. +- Put contributor/internal changes in a separate "For contributors" section at the bottom. +- Every entry should make someone think "oh nice, I want to try that." 
+- No jargon: say "every question now tells you which project and branch you're in" not + "AskUserQuestion format standardized across skill templates via preamble resolver." + +## AI effort compression + +When estimating or discussing effort, always show both human-team and agent+gstack time: + +| Task type | Human team | Agent+gstack | Compression | +|-----------|-----------|-----------|-------------| +| Boilerplate / scaffolding | 2 days | 15 min | ~100x | +| Test writing | 1 day | 15 min | ~50x | +| Feature implementation | 1 week | 30 min | ~30x | +| Bug fix + regression test | 4 hours | 15 min | ~20x | +| Architecture / design | 2 days | 4 hours | ~5x | +| Research / exploration | 1 day | 3 hours | ~3x | + +Completeness is cheap. Don't recommend shortcuts when the complete implementation +is a "lake" (achievable) not an "ocean" (multi-quarter migration). See the +Completeness Principle in the skill preamble for the full philosophy. + +## Search before building + +Before designing any solution that involves concurrency, unfamiliar patterns, +infrastructure, or anything where the runtime/framework might have a built-in: + +1. Search for "{runtime} {thing} built-in" +2. Search for "{thing} best practice {current year}" +3. Check official runtime/framework docs + +Three layers of knowledge: tried-and-true (Layer 1), new-and-popular (Layer 2), +first-principles (Layer 3). Prize Layer 3 above all. See ETHOS.md for the full +builder philosophy. + +## Local plans + +Contributors can store long-range vision docs and design documents in `~/.gstack-dev/plans/`. +These are local-only (not checked in). When reviewing TODOS.md, check `plans/` for candidates +that may be ready to promote to TODOs or implement. 
+ +## E2E eval failure blame protocol + +When an E2E eval fails during `/ship` or any other workflow, **never claim "not +related to our changes" without proving it.** These systems have invisible couplings — +a preamble text change affects agent behavior, a new helper changes timing, a +regenerated SKILL.md shifts prompt context. + +**Required before attributing a failure to "pre-existing":** +1. Run the same eval on main (or base branch) and show it fails there too +2. If it passes on main but fails on the branch — it IS your change. Trace the blame. +3. If you can't run on main, say "unverified — may or may not be related" and flag it + as a risk in the PR body + +"Pre-existing" without receipts is a lazy claim. Prove it or don't say it. + +## Long-running tasks: don't give up + +When running evals, E2E tests, or any long-running background task, **poll until +completion**. Use `sleep 180 && echo "ready"` + `TaskOutput` in a loop every 3 +minutes. Never switch to blocking mode and give up when the poll times out. Never +say "I'll be notified when it completes" and stop checking — keep the loop going +until the task finishes or the user tells you to stop. + +The full E2E suite can take 30-45 minutes. That's 10-15 polling cycles. Do all of +them. Report progress at each check (which tests passed, which are running, any +failures so far). The user wants to see the run complete, not a promise that +you'll check later. + +## Deploying to the active skill + +The active skill lives at `~/.claude/skills/gstack/`. After making changes: + +1. Push your branch +2. Fetch and reset in the skill directory: `cd ~/.claude/skills/gstack && git fetch origin && git reset --hard origin/main` +3. 
Rebuild: `cd ~/.claude/skills/gstack && bun run build` + +Or copy the binary directly: `cp browse/dist/browse ~/.claude/skills/gstack/browse/dist/browse` diff --git a/GEMINI.md.tmpl b/GEMINI.md.tmpl new file mode 100644 index 000000000..3488fa930 --- /dev/null +++ b/GEMINI.md.tmpl @@ -0,0 +1,309 @@ +# gstack development + +## Commands + +```bash +bun install # install dependencies +bun test # run free tests (browse + snapshot + skill validation) +bun run test:evals # run paid evals: LLM judge + E2E (diff-based, ~$4/run max) +bun run test:evals:all # run ALL paid evals regardless of diff +bun run test:gate # run gate-tier tests only (CI default, blocks merge) +bun run test:periodic # run periodic-tier tests only (weekly cron / manual) +bun run test:e2e # run E2E tests only (diff-based, ~$3.85/run max) +bun run test:e2e:all # run ALL E2E tests regardless of diff +bun run eval:select # show which tests would run based on current diff +bun run dev # run CLI in dev mode, e.g. bun run dev goto https://example.com +bun run build # gen docs + compile binaries +bun run gen:skill-docs # regenerate SKILL.md files from templates +bun run skill:check # health dashboard for all skills +bun run dev:skill # watch mode: auto-regen + validate on change +bun run eval:list # list all eval runs from ~/.gstack-dev/evals/ +bun run eval:compare # compare two eval runs (auto-picks most recent) +bun run eval:summary # aggregate stats across all eval runs +``` + +`test:evals` requires `ANTHROPIC_API_KEY`. Codex E2E tests (`test/codex-e2e.test.ts`) +use Codex's own auth from `~/.codex/` config — no `OPENAI_API_KEY` env var needed. +E2E tests stream progress in real-time (tool-by-tool via `--output-format stream-json +--verbose`). Results are persisted to `~/.gstack-dev/evals/` with auto-comparison +against the previous run. + +**Diff-based test selection:** `test:evals` and `test:e2e` auto-select tests based +on `git diff` against the base branch. 
Each test declares its file dependencies in +`test/helpers/touchfiles.ts`. Changes to global touchfiles (session-runner, eval-store, +touchfiles.ts itself) trigger all tests. Use `EVALS_ALL=1` or the `:all` script +variants to force all tests. Run `eval:select` to preview which tests would run. + +**Two-tier system:** Tests are classified as `gate` or `periodic` in `E2E_TIERS` +(in `test/helpers/touchfiles.ts`). CI runs only gate tests (`EVALS_TIER=gate`); +periodic tests run weekly via cron or manually. Use `EVALS_TIER=gate` or +`EVALS_TIER=periodic` to filter. When adding new E2E tests, classify them: +1. Safety guardrail or deterministic functional test? -> `gate` +2. Quality benchmark, Opus model test, or non-deterministic? -> `periodic` +3. Requires external service (Codex, Gemini)? -> `periodic` + +## Testing + +```bash +bun test # run before every commit — free, <2s +bun run test:evals # run before shipping — paid, diff-based (~$4/run max) +``` + +`bun test` runs skill validation, gen-skill-docs quality checks, and browse +integration tests. `bun run test:evals` runs LLM-judge quality evals and E2E +tests via `claude -p`. Both must pass before creating a PR. 
+ +## Project structure + +``` +gstack/ +├── browse/ # Headless browser CLI (Playwright) +│ ├── src/ # CLI + server + commands +│ │ ├── commands.ts # Command registry (single source of truth) +│ │ └── snapshot.ts # SNAPSHOT_FLAGS metadata array +│ ├── test/ # Integration tests + fixtures +│ └── dist/ # Compiled binary +├── scripts/ # Build + DX tooling +│ ├── gen-skill-docs.ts # Template → SKILL.md generator +│ ├── skill-check.ts # Health dashboard +│ └── dev-skill.ts # Watch mode +├── test/ # Skill validation + eval tests +│ ├── helpers/ # skill-parser.ts, session-runner.ts, llm-judge.ts, eval-store.ts +│ ├── fixtures/ # Ground truth JSON, planted-bug fixtures, eval baselines +│ ├── skill-validation.test.ts # Tier 1: static validation (free, <1s) +│ ├── gen-skill-docs.test.ts # Tier 1: generator quality (free, <1s) +│ ├── skill-llm-eval.test.ts # Tier 3: LLM-as-judge (~$0.15/run) +│ └── skill-e2e-*.test.ts # Tier 2: E2E via claude -p (~$3.85/run, split by category) +├── qa-only/ # /qa-only skill (report-only QA, no fixes) +├── plan-design-review/ # /plan-design-review skill (report-only design audit) +├── design-review/ # /design-review skill (design audit + fix loop) +├── ship/ # Ship workflow skill +├── review/ # PR review skill +├── plan-ceo-review/ # /plan-ceo-review skill +├── plan-eng-review/ # /plan-eng-review skill +├── autoplan/ # /autoplan skill (auto-review pipeline: CEO → design → eng) +├── benchmark/ # /benchmark skill (performance regression detection) +├── canary/ # /canary skill (post-deploy monitoring loop) +├── codex/ # /codex skill (multi-AI second opinion via OpenAI Codex CLI) +├── land-and-deploy/ # /land-and-deploy skill (merge → deploy → canary verify) +├── office-hours/ # /office-hours skill (YC Office Hours — startup diagnostic + builder brainstorm) +├── investigate/ # /investigate skill (systematic root-cause debugging) +├── retro/ # Retrospective skill (includes /retro global cross-project mode) +├── bin/ # CLI utilities 
(gstack-repo-mode, gstack-slug, gstack-config, etc.) +├── document-release/ # /document-release skill (post-ship doc updates) +├── cso/ # /cso skill (OWASP Top 10 + STRIDE security audit) +├── design-consultation/ # /design-consultation skill (design system from scratch) +├── setup-deploy/ # /setup-deploy skill (one-time deploy config) +├── .github/ # CI workflows + Docker image +│ ├── workflows/ # evals.yml (E2E on Ubicloud), skill-docs.yml, actionlint.yml +│ └── docker/ # Dockerfile.ci (pre-baked toolchain + Playwright/Chromium) +├── setup # One-time setup: build binary + symlink skills +├── SKILL.md # Generated from SKILL.md.tmpl (don't edit directly) +├── SKILL.md.tmpl # Template: edit this, run gen:skill-docs +├── ETHOS.md # Builder philosophy (Boil the Lake, Search Before Building) +└── package.json # Build scripts for browse +``` + +## SKILL.md workflow + +SKILL.md files are **generated** from `.tmpl` templates. To update docs: + +1. Edit the `.tmpl` file (e.g. `SKILL.md.tmpl` or `browse/SKILL.md.tmpl`) +2. Run `bun run gen:skill-docs` (or `bun run build` which does it automatically) +3. Commit both the `.tmpl` and generated `.md` files + +To add a new browse command: add it to `browse/src/commands.ts` and rebuild. +To add a snapshot flag: add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts` and rebuild. + +**Merge conflicts on SKILL.md files:** NEVER resolve conflicts on generated SKILL.md +files by accepting either side. Instead: (1) resolve conflicts on the `.tmpl` templates +and `scripts/gen-skill-docs.ts` (the sources of truth), (2) run `bun run gen:skill-docs` +to regenerate all SKILL.md files, (3) stage the regenerated files. Accepting one side's +generated output silently drops the other side's template changes. + +## Platform-agnostic design + +Skills must NEVER hardcode framework-specific commands, file patterns, or directory +structures. Instead: + +1. **Read GEMINI.md** for project-specific config (test commands, eval commands, etc.) +2. 
**If missing, AskUserQuestion** — let the user tell you or let gstack search the repo +3. **Persist the answer to GEMINI.md** so we never have to ask again + +This applies to test commands, eval commands, deploy commands, and any other +project-specific behavior. The project owns its config; gstack reads it. + +## Writing SKILL templates + +SKILL.md.tmpl files are **prompt templates read by Claude**, not bash scripts. +Each bash code block runs in a separate shell — variables do not persist between blocks. + +Rules: +- **Use natural language for logic and state.** Don't use shell variables to pass + state between code blocks. Instead, tell Claude what to remember and reference + it in prose (e.g., "the base branch detected in Step 0"). +- **Don't hardcode branch names.** Detect `main`/`master`/etc dynamically via + `gh pr view` or `gh repo view`. Use `{{BASE_BRANCH_DETECT}}` for PR-targeting + skills. Use "the base branch" in prose, `` in code block placeholders. +- **Keep bash blocks self-contained.** Each code block should work independently. + If a block needs context from a previous step, restate it in the prose above. +- **Express conditionals as English.** Instead of nested `if/elif/else` in bash, + write numbered decision steps: "1. If X, do Y. 2. Otherwise, do Z." + +## Browser interaction + +When you need to interact with a browser (QA, dogfooding, cookie setup), use the +`/browse` skill or run the browse binary directly via `$B `. NEVER use +`mcp__claude-in-chrome__*` tools — they are slow, unreliable, and not what this +project uses. + +## Vendored symlink awareness + +When developing gstack, `.claude/skills/gstack` may be a symlink back to this +working directory (gitignored). This means skill changes are **live immediately** — +great for rapid iteration, risky during big refactors where half-written skills +could break other Claude Code sessions using gstack concurrently. 
+ +**Check once per session:** Run `ls -la .claude/skills/gstack` to see if it's a +symlink or a real copy. If it's a symlink to your working directory, be aware that: +- Template changes + `bun run gen:skill-docs` immediately affect all gstack invocations +- Breaking changes to SKILL.md.tmpl files can break concurrent gstack sessions +- During large refactors, remove the symlink (`rm .claude/skills/gstack`) so the + global install at `~/.claude/skills/gstack/` is used instead + +**For plan reviews:** When reviewing plans that modify skill templates or the +gen-skill-docs pipeline, consider whether the changes should be tested in isolation +before going live (especially if the user is actively using gstack in other windows). + +## Compiled binaries — NEVER commit browse/dist/ + +The `browse/dist/` directory contains compiled Bun binaries (`browse`, `find-browse`, +~58MB each). These are Mach-O arm64 only — they do NOT work on Linux, Windows, or +Intel Macs. The `./setup` script already builds from source for every platform, so +the checked-in binaries are redundant. They are tracked by git due to a historical +mistake and should eventually be removed with `git rm --cached`. + +**NEVER stage or commit these files.** They show up as modified in `git status` +because they're tracked despite `.gitignore` — ignore them. When staging files, +always use specific filenames (`git add file1 file2`) — never `git add .` or +`git add -A`, which will accidentally include the binaries. + +## Commit style + +**Always bisect commits.** Every commit should be a single logical change. When +you've made multiple changes (e.g., a rename + a rewrite + new tests), split them +into separate commits before pushing. Each commit should be independently +understandable and revertable. 
+ +Examples of good bisection: +- Rename/move separate from behavior changes +- Test infrastructure (touchfiles, helpers) separate from test implementations +- Template changes separate from generated file regeneration +- Mechanical refactors separate from new features + +When the user says "bisect commit" or "bisect and push," split staged/unstaged +changes into logical commits and push. + +## CHANGELOG + VERSION style + +**VERSION and CHANGELOG are branch-scoped.** Every feature branch that ships gets its +own version bump and CHANGELOG entry. The entry describes what THIS branch adds — +not what was already on main. + +**When to write the CHANGELOG entry:** +- At `/ship` time (Step 5), not during development or mid-branch. +- The entry covers ALL commits on this branch vs the base branch. +- Never fold new work into an existing CHANGELOG entry from a prior version that + already landed on main. If main has v0.10.0.0 and your branch adds features, + bump to v0.10.1.0 with a new entry — don't edit the v0.10.0.0 entry. + +**Key questions before writing:** +1. What branch am I on? What did THIS branch change? +2. Is the base branch version already released? (If yes, bump and create new entry.) +3. Does an existing entry on this branch already cover earlier work? (If yes, replace + it with one unified entry for the final version.) + +CHANGELOG.md is **for users**, not contributors. Write it like product release notes: + +- Lead with what the user can now **do** that they couldn't before. Sell the feature. +- Use plain language, not implementation details. "You can now..." not "Refactored the..." +- **Never mention TODOS.md, internal tracking, eval infrastructure, or contributor-facing + details.** These are invisible to users and meaningless to them. +- Put contributor/internal changes in a separate "For contributors" section at the bottom. +- Every entry should make someone think "oh nice, I want to try that." 
+- No jargon: say "every question now tells you which project and branch you're in" not + "AskUserQuestion format standardized across skill templates via preamble resolver." + +## AI effort compression + +When estimating or discussing effort, always show both human-team and CC+gstack time: + +| Task type | Human team | CC+gstack | Compression | +|-----------|-----------|-----------|-------------| +| Boilerplate / scaffolding | 2 days | 15 min | ~100x | +| Test writing | 1 day | 15 min | ~50x | +| Feature implementation | 1 week | 30 min | ~30x | +| Bug fix + regression test | 4 hours | 15 min | ~20x | +| Architecture / design | 2 days | 4 hours | ~5x | +| Research / exploration | 1 day | 3 hours | ~3x | + +Completeness is cheap. Don't recommend shortcuts when the complete implementation +is a "lake" (achievable) not an "ocean" (multi-quarter migration). See the +Completeness Principle in the skill preamble for the full philosophy. + +## Search before building + +Before designing any solution that involves concurrency, unfamiliar patterns, +infrastructure, or anything where the runtime/framework might have a built-in: + +1. Search for "{runtime} {thing} built-in" +2. Search for "{thing} best practice {current year}" +3. Check official runtime/framework docs + +Three layers of knowledge: tried-and-true (Layer 1), new-and-popular (Layer 2), +first-principles (Layer 3). Prize Layer 3 above all. See ETHOS.md for the full +builder philosophy. + +## Local plans + +Contributors can store long-range vision docs and design documents in `~/.gstack-dev/plans/`. +These are local-only (not checked in). When reviewing TODOS.md, check `plans/` for candidates +that may be ready to promote to TODOs or implement. 
+ +## E2E eval failure blame protocol + +When an E2E eval fails during `/ship` or any other workflow, **never claim "not +related to our changes" without proving it.** These systems have invisible couplings — +a preamble text change affects agent behavior, a new helper changes timing, a +regenerated SKILL.md shifts prompt context. + +**Required before attributing a failure to "pre-existing":** +1. Run the same eval on main (or base branch) and show it fails there too +2. If it passes on main but fails on the branch — it IS your change. Trace the blame. +3. If you can't run on main, say "unverified — may or may not be related" and flag it + as a risk in the PR body + +"Pre-existing" without receipts is a lazy claim. Prove it or don't say it. + +## Long-running tasks: don't give up + +When running evals, E2E tests, or any long-running background task, **poll until +completion**. Use `sleep 180 && echo "ready"` + `TaskOutput` in a loop every 3 +minutes. Never switch to blocking mode and give up when the poll times out. Never +say "I'll be notified when it completes" and stop checking — keep the loop going +until the task finishes or the user tells you to stop. + +The full E2E suite can take 30-45 minutes. That's 10-15 polling cycles. Do all of +them. Report progress at each check (which tests passed, which are running, any +failures so far). The user wants to see the run complete, not a promise that +you'll check later. + +## Deploying to the active skill + +The active skill lives at `~/.claude/skills/gstack/`. After making changes: + +1. Push your branch +2. Fetch and reset in the skill directory: `cd ~/.claude/skills/gstack && git fetch origin && git reset --hard origin/main` +3. 
Rebuild: `cd ~/.claude/skills/gstack && bun run build` + +Or copy the binary directly: `cp browse/dist/browse ~/.claude/skills/gstack/browse/dist/browse` diff --git a/bin/gstack-global-discover b/bin/gstack-global-discover index ebffeeb9e..015125303 100755 Binary files a/bin/gstack-global-discover and b/bin/gstack-global-discover differ diff --git a/browse/bin/find-browse b/browse/bin/find-browse index 8f441b499..639a31139 100755 --- a/browse/bin/find-browse +++ b/browse/bin/find-browse @@ -7,7 +7,7 @@ if test -x "$DIR/find-browse"; then fi # Fallback: basic discovery with priority chain ROOT=$(git rev-parse --show-toplevel 2>/dev/null) -for MARKER in .codex .agents .claude; do +for MARKER in .gemini .codex .agents .claude; do if [ -n "$ROOT" ] && test -x "$ROOT/$MARKER/skills/gstack/browse/dist/browse"; then echo "$ROOT/$MARKER/skills/gstack/browse/dist/browse" exit 0 diff --git a/browse/src/find-browse.ts b/browse/src/find-browse.ts index 93c4a26e7..5822b7030 100644 --- a/browse/src/find-browse.ts +++ b/browse/src/find-browse.ts @@ -27,7 +27,7 @@ function getGitRoot(): string | null { export function locateBinary(): string | null { const root = getGitRoot(); const home = homedir(); - const markers = ['.codex', '.agents', '.claude']; + const markers = ['.gemini', '.codex', '.agents', '.claude']; // Workspace-local takes priority (for development) if (root) { diff --git a/browse/test/find-browse.test.ts b/browse/test/find-browse.test.ts index 2f1cdc0e2..4c3bed0db 100644 --- a/browse/test/find-browse.test.ts +++ b/browse/test/find-browse.test.ts @@ -22,24 +22,23 @@ describe('locateBinary', () => { } }); - test('priority chain checks .codex, .agents, .claude markers', () => { + test('priority chain checks .gemini, .codex, .agents, .claude markers', () => { // Verify the source code implements the correct priority order. // We read the function source to confirm the markers array order. 
const src = require('fs').readFileSync(require('path').join(__dirname, '../src/find-browse.ts'), 'utf-8'); - // The markers array should list .codex first, then .agents, then .claude + // The markers array should list .gemini first, then .codex, then .agents, then .claude const markersMatch = src.match(/const markers = \[([^\]]+)\]/); expect(markersMatch).not.toBeNull(); - const markers = markersMatch![1]; + const markers = markersMatch![1].split(',').map((m: string) => m.trim().replace(/^['"]|['"]$/g, '')); + const geminiIdx = markers.indexOf('.gemini'); const codexIdx = markers.indexOf('.codex'); const agentsIdx = markers.indexOf('.agents'); const claudeIdx = markers.indexOf('.claude'); - // All three must be present - expect(codexIdx).toBeGreaterThanOrEqual(0); - expect(agentsIdx).toBeGreaterThanOrEqual(0); - expect(claudeIdx).toBeGreaterThanOrEqual(0); - // .codex before .agents before .claude - expect(codexIdx).toBeLessThan(agentsIdx); - expect(agentsIdx).toBeLessThan(claudeIdx); + // All four must be present, in priority order + expect(geminiIdx).toBe(0); + expect(codexIdx).toBe(1); + expect(agentsIdx).toBe(2); + expect(claudeIdx).toBe(3); }); test('function signature accepts no arguments', () => { diff --git a/scripts/gen-skill-docs.ts b/scripts/gen-skill-docs.ts index 970e5a3f3..72020f070 100644 --- a/scripts/gen-skill-docs.ts +++ b/scripts/gen-skill-docs.ts @@ -30,8 +30,10 @@ const HOST: Host = (() => { if (!HOST_ARG) return 'claude'; const val = HOST_ARG.includes('=') ? HOST_ARG.split('=')[1] : process.argv[process.argv.indexOf(HOST_ARG) + 1]; if (val === 'codex' || val === 'agents') return 'codex'; + if (val === 'gemini') return 'gemini'; + if (val === 'claude') return 'claude'; - throw new Error(`Unknown host: ${val}. Use claude, codex, or agents.`); + throw new Error(`Unknown host: ${val}. 
Use claude, codex, gemini, or agents.`); })(); // HostPaths, HOST_PATHS, and TemplateContext imported from ./resolvers/types (line 7-8) @@ -2237,7 +2239,7 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath: let outputDir: string | null = null; // For codex host, route output to .agents/skills/{codexSkillName}/SKILL.md - if (host === 'codex') { + if (host === 'codex' || host === 'gemini') { const codexName = codexSkillName(skillDir === '.' ? '' : skillDir); outputDir = path.join(ROOT, '.agents', 'skills', codexName); fs.mkdirSync(outputDir, { recursive: true }); @@ -2274,7 +2276,7 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath: } // For codex host: transform frontmatter and replace Claude-specific paths - if (host === 'codex') { + if (host === 'codex' || host === 'gemini') { // Extract hook safety prose BEFORE transforming frontmatter (which strips hooks) const safetyProse = extractHookSafetyProse(tmplContent); @@ -2292,6 +2294,10 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath: content = content.replace(/\.claude\/skills\/gstack/g, ctx.paths.localSkillRoot); content = content.replace(/\.claude\/skills\/review/g, '.agents/skills/gstack/review'); content = content.replace(/\.claude\/skills/g, '.agents/skills'); + if (ctx.paths.configFile !== 'CLAUDE.md') { + content = content.replace(/CLAUDE\.md/g, ctx.paths.configFile); + } + if (outputDir) { const codexName = codexSkillName(skillDir === '.' ? 
'' : skillDir); @@ -2327,7 +2333,7 @@ const tokenBudget: Array<{ skill: string; lines: number; tokens: number }> = []; for (const tmplPath of findTemplates()) { // Skip /codex skill for codex host (self-referential — it's a Claude wrapper around codex exec) - if (HOST === 'codex') { + if (HOST === 'codex' || HOST === 'gemini') { const dir = path.basename(path.dirname(tmplPath)); if (dir === 'codex') continue; } diff --git a/scripts/resolvers/index.ts b/scripts/resolvers/index.ts index d4536312c..69d711c8c 100644 --- a/scripts/resolvers/index.ts +++ b/scripts/resolvers/index.ts @@ -1,10 +1,4 @@ -/** - * RESOLVERS record — maps {{PLACEHOLDER}} names to generator functions. - * Each resolver takes a TemplateContext and returns the replacement string. - */ - import type { TemplateContext } from './types'; - // Domain modules import { generatePreamble } from './preamble'; import { generateTestFailureTriage } from './preamble'; @@ -12,8 +6,7 @@ import { generateCommandReference, generateSnapshotFlags, generateBrowseSetup } import { generateDesignMethodology, generateDesignHardRules, generateDesignOutsideVoices, generateDesignReviewLite, generateDesignSketch } from './design'; import { generateTestBootstrap, generateTestCoverageAuditPlan, generateTestCoverageAuditShip, generateTestCoverageAuditReview } from './testing'; import { generateReviewDashboard, generatePlanFileReviewReport, generateSpecReviewLoop, generateBenefitsFrom, generateCodexSecondOpinion, generateAdversarialStep, generateCodexPlanReview, generatePlanCompletionAuditShip, generatePlanCompletionAuditReview, generatePlanVerificationExec } from './review'; -import { generateSlugEval, generateSlugSetup, generateBaseBranchDetect, generateDeployBootstrap, generateQAMethodology, generateCoAuthorTrailer } from './utility'; - +import { generateSlugEval, generateSlugSetup, generateBaseBranchDetect, generateDeployBootstrap, generateQAMethodology, generateCoAuthorTrailer, generateConfigFile } from './utility'; export 
const RESOLVERS: Record<string, (ctx: TemplateContext) => string> = { SLUG_EVAL: generateSlugEval, SLUG_SETUP: generateSlugSetup, @@ -45,4 +38,5 @@ export const RESOLVERS: Record<string, (ctx: TemplateContext) => string> = { PLAN_COMPLETION_AUDIT_REVIEW: generatePlanCompletionAuditReview, PLAN_VERIFICATION_EXEC: generatePlanVerificationExec, CO_AUTHOR_TRAILER: generateCoAuthorTrailer, + CONFIG_FILE: generateConfigFile, }; diff --git a/scripts/resolvers/preamble.ts b/scripts/resolvers/preamble.ts index fe0ba77e8..cb4fabb42 100644 --- a/scripts/resolvers/preamble.ts +++ b/scripts/resolvers/preamble.ts @@ -1,14 +1,22 @@ import type { TemplateContext } from './types'; function generatePreambleBash(ctx: TemplateContext): string { - const runtimeRoot = ctx.host === 'codex' - ? `_ROOT=$(git rev-parse --show-toplevel 2>/dev/null) + let runtimeRoot = ''; + if (ctx.host === 'codex') { + runtimeRoot = `_ROOT=$(git rev-parse --show-toplevel 2>/dev/null) GSTACK_ROOT="$HOME/.codex/skills/gstack" [ -n "$_ROOT" ] && [ -d "$_ROOT/.agents/skills/gstack" ] && GSTACK_ROOT="$_ROOT/.agents/skills/gstack" GSTACK_BIN="$GSTACK_ROOT/bin" GSTACK_BROWSE="$GSTACK_ROOT/browse/dist" -` - : ''; +`; + } else if (ctx.host === 'gemini') { + runtimeRoot = `_ROOT=$(git rev-parse --show-toplevel 2>/dev/null) +GSTACK_ROOT="$HOME/.gemini/skills/gstack" +[ -n "$_ROOT" ] && [ -d "$_ROOT/.agents/skills/gstack" ] && GSTACK_ROOT="$_ROOT/.agents/skills/gstack" +GSTACK_BIN="$GSTACK_ROOT/bin" +GSTACK_BROWSE="$GSTACK_ROOT/browse/dist" +`; + } return `## Preamble (run first) diff --git a/scripts/resolvers/testing.ts b/scripts/resolvers/testing.ts index da1381c20..ac9bbaca7 100644 --- a/scripts/resolvers/testing.ts +++ b/scripts/resolvers/testing.ts @@ -1,6 +1,6 @@ import type { TemplateContext } from './types'; -export function generateTestBootstrap(_ctx: TemplateContext): string { +export function generateTestBootstrap(ctx: TemplateContext): string { return `## Test Framework Bootstrap **Detect existing test framework and project runtime:** @@ -129,9 +129,9 @@ Write TESTING.md 
with: - Test layers: Unit tests (what, where, when), Integration tests, Smoke tests, E2E tests - Conventions: file naming, assertion style, setup/teardown patterns -### B7. Update CLAUDE.md +### B7. Update ${ctx.paths.configFile} -First check: If CLAUDE.md already has a \`## Testing\` section → skip. Don't duplicate. +First check: If ${ctx.paths.configFile} already has a \`## Testing\` section → skip. Don't duplicate. Append a \`## Testing\` section: - Run command and test directory @@ -150,7 +150,7 @@ Append a \`## Testing\` section: git status --porcelain \`\`\` -Only commit if there are changes. Stage all bootstrap files (config, test directory, TESTING.md, CLAUDE.md, .github/workflows/test.yml if created): +Only commit if there are changes. Stage all bootstrap files (config, test directory, TESTING.md, ${ctx.paths.configFile}, .github/workflows/test.yml if created): \`git commit -m "chore: bootstrap test framework ({framework name})"\` ---`; @@ -179,7 +179,7 @@ Only commit if there are changes. Stage all bootstrap files (config, test direct type CoverageAuditMode = 'plan' | 'ship' | 'review'; -function generateTestCoverageAuditInner(mode: CoverageAuditMode): string { +function generateTestCoverageAuditInner(mode: CoverageAuditMode, ctx: TemplateContext): string { const sections: string[] = []; // ── Intro (mode-specific) ── @@ -197,8 +197,8 @@ function generateTestCoverageAuditInner(mode: CoverageAuditMode): string { Before analyzing coverage, detect the project's test framework: -1. **Read CLAUDE.md** — look for a \`## Testing\` section with test command and framework name. If found, use that as the authoritative source. -2. **If CLAUDE.md has no testing section, auto-detect:** +1. **Read ${ctx.paths.configFile}** — look for a \`## Testing\` section with test command and framework name. If found, use that as the authoritative source. +2. 
**If ${ctx.paths.configFile} has no testing section, auto-detect:** \`\`\`bash setopt +o nomatch 2>/dev/null || true # zsh compat @@ -460,7 +460,7 @@ Coverage line: \`Test Coverage Audit: N new code paths. M covered (X%). K tests **7. Coverage gate:** -Before proceeding, check CLAUDE.md for a \`## Test Coverage\` section with \`Minimum:\` and \`Target:\` fields. If found, use those percentages. Otherwise use defaults: Minimum = 60%, Target = 80%. +Before proceeding, check ${ctx.paths.configFile} for a \`## Test Coverage\` section with \`Minimum:\` and \`Target:\` fields. If found, use those percentages. Otherwise use defaults: Minimum = 60%, Target = 80%. Using the coverage percentage from the diagram in substep 4 (the \`COVERAGE: X/Y (Z%)\` line): @@ -543,7 +543,7 @@ If no test framework detected → include gaps as INFORMATIONAL findings only, n ### Coverage Warning -After producing the coverage diagram, check the coverage percentage. Read CLAUDE.md for a \`## Test Coverage\` section with a \`Minimum:\` field. If not found, use default: 60%. +After producing the coverage diagram, check the coverage percentage. Read ${ctx.paths.configFile} for a \`## Test Coverage\` section with a \`Minimum:\` field. If not found, use default: 60%. 
If coverage is below the minimum threshold, output a prominent warning **before** the regular review findings: @@ -560,14 +560,14 @@ If coverage percentage cannot be determined, skip the warning silently.`); return sections.join('\n'); } -export function generateTestCoverageAuditPlan(_ctx: TemplateContext): string { - return generateTestCoverageAuditInner('plan'); +export function generateTestCoverageAuditPlan(ctx: TemplateContext): string { + return generateTestCoverageAuditInner('plan', ctx); } -export function generateTestCoverageAuditShip(_ctx: TemplateContext): string { - return generateTestCoverageAuditInner('ship'); +export function generateTestCoverageAuditShip(ctx: TemplateContext): string { + return generateTestCoverageAuditInner('ship', ctx); } -export function generateTestCoverageAuditReview(_ctx: TemplateContext): string { - return generateTestCoverageAuditInner('review'); +export function generateTestCoverageAuditReview(ctx: TemplateContext): string { + return generateTestCoverageAuditInner('review', ctx); } diff --git a/scripts/resolvers/types.ts b/scripts/resolvers/types.ts index 8fd17eece..df1ea9f86 100644 --- a/scripts/resolvers/types.ts +++ b/scripts/resolvers/types.ts @@ -1,10 +1,11 @@ -export type Host = 'claude' | 'codex'; +export type Host = 'claude' | 'codex' | 'gemini'; export interface HostPaths { skillRoot: string; localSkillRoot: string; binDir: string; browseDir: string; + configFile: string; } export const HOST_PATHS: Record<Host, HostPaths> = { @@ -13,12 +14,21 @@ export const HOST_PATHS: Record<Host, HostPaths> = { localSkillRoot: '.claude/skills/gstack', binDir: '~/.claude/skills/gstack/bin', browseDir: '~/.claude/skills/gstack/browse/dist', + configFile: 'CLAUDE.md', }, codex: { skillRoot: '$GSTACK_ROOT', localSkillRoot: '.agents/skills/gstack', binDir: '$GSTACK_BIN', browseDir: '$GSTACK_BROWSE', + configFile: 'CLAUDE.md', + }, + gemini: { + skillRoot: '$GSTACK_ROOT', + localSkillRoot: '.agents/skills/gstack', + binDir: '$GSTACK_BIN', + browseDir: '$GSTACK_BROWSE', 
+ configFile: 'GEMINI.md', }, }; diff --git a/scripts/resolvers/utility.ts b/scripts/resolvers/utility.ts index 48e9c0d82..cfa109c81 100644 --- a/scripts/resolvers/utility.ts +++ b/scripts/resolvers/utility.ts @@ -49,10 +49,11 @@ branch name wherever the instructions say "the base branch" or \`\`. ---`; } -export function generateDeployBootstrap(_ctx: TemplateContext): string { +export function generateDeployBootstrap(ctx: TemplateContext): string { + const CFG = ctx.paths.configFile; return `\`\`\`bash -# Check for persisted deploy config in CLAUDE.md -DEPLOY_CONFIG=$(grep -A 20 "## Deploy Configuration" CLAUDE.md 2>/dev/null || echo "NO_CONFIG") +# Check for persisted deploy config in ${CFG} +DEPLOY_CONFIG=$(grep -A 20 "## Deploy Configuration" ${CFG} 2>/dev/null || echo "NO_CONFIG") echo "$DEPLOY_CONFIG" # If config exists, parse it @@ -78,7 +79,7 @@ for f in $(find .github/workflows -maxdepth 1 \\( -name '*.yml' -o -name '*.yaml done \`\`\` -If \`PERSISTED_PLATFORM\` and \`PERSISTED_URL\` were found in CLAUDE.md, use them directly +If \`PERSISTED_PLATFORM\` and \`PERSISTED_URL\` were found in ${CFG}, use them directly and skip manual detection. If no persisted config exists, use the auto-detected platform to guide deploy verification. If nothing is detected, ask the user via AskUserQuestion in the decision tree below. 
@@ -371,4 +372,8 @@ export function generateCoAuthorTrailer(ctx: TemplateContext): string { return 'Co-Authored-By: OpenAI Codex '; } return 'Co-Authored-By: Claude Opus 4.6 '; +} + +export function generateConfigFile(ctx: TemplateContext): string { + return ctx.paths.configFile; } diff --git a/setup b/setup index 71306839a..2fffb817a 100755 --- a/setup +++ b/setup @@ -14,6 +14,8 @@ INSTALL_SKILLS_DIR="$(dirname "$INSTALL_GSTACK_DIR")" BROWSE_BIN="$SOURCE_GSTACK_DIR/browse/dist/browse" CODEX_SKILLS="$HOME/.codex/skills" CODEX_GSTACK="$CODEX_SKILLS/gstack" +GEMINI_SKILLS="$HOME/.gemini/skills" +GEMINI_GSTACK="$GEMINI_SKILLS/gstack" IS_WINDOWS=0 case "$(uname -s)" in @@ -26,7 +28,7 @@ LOCAL_INSTALL=0 SKILL_PREFIX=1 while [ $# -gt 0 ]; do case "$1" in - --host) [ -z "$2" ] && echo "Missing value for --host (expected claude, codex, kiro, or auto)" >&2 && exit 1; HOST="$2"; shift 2 ;; + --host) [ -z "$2" ] && echo "Missing value for --host (expected claude, codex, gemini, kiro, or auto)" >&2 && exit 1; HOST="$2"; shift 2 ;; --host=*) HOST="${1#--host=}"; shift ;; --local) LOCAL_INSTALL=1; shift ;; --no-prefix) SKILL_PREFIX=0; shift ;; @@ -35,38 +37,43 @@ while [ $# -gt 0 ]; do done case "$HOST" in - claude|codex|kiro|auto) ;; - *) echo "Unknown --host value: $HOST (expected claude, codex, kiro, or auto)" >&2; exit 1 ;; + claude|codex|gemini|kiro|auto) ;; + *) echo "Unknown --host value: $HOST (expected claude, codex, gemini, kiro, or auto)" >&2; exit 1 ;; esac # --local: install to .claude/skills/ in the current working directory if [ "$LOCAL_INSTALL" -eq 1 ]; then - if [ "$HOST" = "codex" ]; then - echo "Error: --local is only supported for Claude Code (not Codex)." >&2 + if [ "$HOST" = "codex" ] || [ "$HOST" = "gemini" ]; then + echo "Error: --local is only supported for Claude Code (not Codex/Gemini)." 
>&2 exit 1 fi INSTALL_SKILLS_DIR="$(pwd)/.claude/skills" mkdir -p "$INSTALL_SKILLS_DIR" HOST="claude" INSTALL_CODEX=0 + INSTALL_GEMINI=0 fi # For auto: detect which agents are installed INSTALL_CLAUDE=0 INSTALL_CODEX=0 +INSTALL_GEMINI=0 INSTALL_KIRO=0 if [ "$HOST" = "auto" ]; then command -v claude >/dev/null 2>&1 && INSTALL_CLAUDE=1 command -v codex >/dev/null 2>&1 && INSTALL_CODEX=1 + command -v gemini >/dev/null 2>&1 && INSTALL_GEMINI=1 command -v kiro-cli >/dev/null 2>&1 && INSTALL_KIRO=1 # If none found, default to claude - if [ "$INSTALL_CLAUDE" -eq 0 ] && [ "$INSTALL_CODEX" -eq 0 ] && [ "$INSTALL_KIRO" -eq 0 ]; then + if [ "$INSTALL_CLAUDE" -eq 0 ] && [ "$INSTALL_CODEX" -eq 0 ] && [ "$INSTALL_GEMINI" -eq 0 ] && [ "$INSTALL_KIRO" -eq 0 ]; then INSTALL_CLAUDE=1 fi elif [ "$HOST" = "claude" ]; then INSTALL_CLAUDE=1 elif [ "$HOST" = "codex" ]; then INSTALL_CODEX=1 +elif [ "$HOST" = "gemini" ]; then + INSTALL_GEMINI=1 elif [ "$HOST" = "kiro" ]; then INSTALL_KIRO=1 fi @@ -166,7 +173,7 @@ if ! ensure_playwright_browser; then echo "Installing Playwright Chromium..." 
( cd "$SOURCE_GSTACK_DIR" - bunx playwright install chromium + bun x playwright install chromium ) if [ "$IS_WINDOWS" -eq 1 ]; then @@ -383,6 +390,45 @@ create_codex_runtime_root() { fi } +# ─── Helper: create a minimal ~/.gemini/skills/gstack runtime root ────────── +create_gemini_runtime_root() { + local gstack_dir="$1" + local gemini_gstack="$2" + local agents_dir="$gstack_dir/.agents/skills" + + if [ -L "$gemini_gstack" ]; then + rm -f "$gemini_gstack" + elif [ -d "$gemini_gstack" ] && [ "$gemini_gstack" != "$gstack_dir" ]; then + rm -rf "$gemini_gstack" + fi + + mkdir -p "$gemini_gstack" "$gemini_gstack/browse" "$gemini_gstack/gstack-upgrade" "$gemini_gstack/review" + + if [ -f "$agents_dir/gstack/SKILL.md" ]; then + ln -snf "$agents_dir/gstack/SKILL.md" "$gemini_gstack/SKILL.md" + fi + if [ -d "$gstack_dir/bin" ]; then + ln -snf "$gstack_dir/bin" "$gemini_gstack/bin" + fi + if [ -d "$gstack_dir/browse/dist" ]; then + ln -snf "$gstack_dir/browse/dist" "$gemini_gstack/browse/dist" + fi + if [ -d "$gstack_dir/browse/bin" ]; then + ln -snf "$gstack_dir/browse/bin" "$gemini_gstack/browse/bin" + fi + if [ -f "$agents_dir/gstack-upgrade/SKILL.md" ]; then + ln -snf "$agents_dir/gstack-upgrade/SKILL.md" "$gemini_gstack/gstack-upgrade/SKILL.md" + fi + for f in checklist.md design-checklist.md greptile-triage.md TODOS-format.md; do + if [ -f "$gstack_dir/review/$f" ]; then + ln -snf "$gstack_dir/review/$f" "$gemini_gstack/review/$f" + fi + done + if [ -f "$gstack_dir/ETHOS.md" ]; then + ln -snf "$gstack_dir/ETHOS.md" "$gemini_gstack/ETHOS.md" + fi +} + # 4. 
Install for Claude (default) SKILLS_BASENAME="$(basename "$INSTALL_SKILLS_DIR")" SKILLS_PARENT_BASENAME="$(basename "$(dirname "$INSTALL_SKILLS_DIR")")" @@ -433,6 +479,49 @@ if [ "$INSTALL_CODEX" -eq 1 ]; then echo " codex skills: $CODEX_SKILLS" fi +# 5.5 Install for Gemini +if [ "$INSTALL_GEMINI" -eq 1 ]; then + # Gemini can also use repo-local installs from .agents/skills/ + if [ "$CODEX_REPO_LOCAL" -eq 1 ]; then + GEMINI_SKILLS="$INSTALL_SKILLS_DIR" + GEMINI_GSTACK="$INSTALL_GSTACK_DIR" + fi + mkdir -p "$GEMINI_SKILLS" + + if [ "$CODEX_REPO_LOCAL" -eq 0 ]; then + create_gemini_runtime_root "$SOURCE_GSTACK_DIR" "$GEMINI_GSTACK" + fi + + # Gemini uses same SKILL.md format as Codex (the open SKILL.md standard). + # If this is a global install, we must rewrite paths to use ~/.gemini. + if [ "$CODEX_REPO_LOCAL" -eq 0 ]; then + if [ ! -d "$SOURCE_GSTACK_DIR/.agents/skills" ]; then + echo " Generating .agents/ skill docs..." + ( cd "$SOURCE_GSTACK_DIR" && bun run gen:skill-docs --host codex ) + fi + for skill_dir in "$SOURCE_GSTACK_DIR/.agents/skills"/gstack*/; do + [ -f "$skill_dir/SKILL.md" ] || continue + skill_name="$(basename "$skill_dir")" + [ "$skill_name" = "gstack" ] && continue + target_dir="$GEMINI_SKILLS/$skill_name" + mkdir -p "$target_dir" + # Rewrite codex/claude paths to gemini + sed -e 's|\$HOME/.codex/skills/gstack|$HOME/.gemini/skills/gstack|g' \ + -e "s|~/.codex/skills/gstack|~/.gemini/skills/gstack|g" \ + -e "s|~/.claude/skills/gstack|~/.gemini/skills/gstack|g" \ + -e "s|CLAUDE\.md|GEMINI.md|g" \ + "$skill_dir/SKILL.md" > "$target_dir/SKILL.md" + done + else + # Repo-local: just link the generated skills + link_codex_skill_dirs "$SOURCE_GSTACK_DIR" "$GEMINI_SKILLS" + fi + + echo "gstack ready (gemini)." + echo " browse: $BROWSE_BIN" + echo " gemini skills: $GEMINI_SKILLS" +fi + # 6. 
Install for Kiro CLI (copy from .agents/skills, rewrite paths) if [ "$INSTALL_KIRO" -eq 1 ]; then KIRO_SKILLS="$HOME/.kiro/skills" @@ -489,10 +578,10 @@ if [ "$INSTALL_KIRO" -eq 1 ]; then fi fi -# 7. Create .agents/ sidecar symlinks for the real Codex skill target. -# The root Codex skill ends up pointing at $SOURCE_GSTACK_DIR/.agents/skills/gstack, +# 7. Create .agents/ sidecar symlinks for the real Codex/Gemini skill target. +# The root skill ends up pointing at $SOURCE_GSTACK_DIR/.agents/skills/gstack, # so the runtime assets must live there for both global and repo-local installs. -if [ "$INSTALL_CODEX" -eq 1 ]; then +if [ "$INSTALL_CODEX" -eq 1 ] || [ "$INSTALL_GEMINI" -eq 1 ]; then create_agents_sidecar "$SOURCE_GSTACK_DIR" fi
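
The updated test in `browse/test/find-browse.test.ts` checks that the markers array lists `.gemini`, `.codex`, `.agents`, `.claude` in that order. The array literal in `browse/src/find-browse.ts` uses single-quoted strings, which `JSON.parse` does not accept, so one way to read it from source text is to split on commas and strip the quotes. A minimal sketch, assuming a hypothetical helper named `parseMarkers` and an inline sample string standing in for the real file:

```typescript
// Hypothetical helper (not part of this diff): pull the entries of the
// `markers` array literal out of TypeScript source text.
function parseMarkers(src: string): string[] {
  const match = src.match(/const markers = \[([^\]]+)\]/);
  if (!match) return [];
  return match[1]
    .split(',')
    // Trim whitespace, then drop one leading and one trailing quote character.
    .map((item) => item.trim().replace(/^['"]|['"]$/g, ''))
    .filter((item) => item.length > 0);
}

// Sample input mirroring the line added to browse/src/find-browse.ts.
const sample = "const markers = ['.gemini', '.codex', '.agents', '.claude'];";
const markers = parseMarkers(sample);
console.log(markers.join(' > '));
```

Comparing the parsed array against the expected order keeps the assertion independent of quote style, at the cost of assuming no marker name contains a comma.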