Skip to content

feat: Option B PR workflow — Codex review for code, direct-commit for data#1

Merged
gorajing merged 3 commits into
mainfrom
feat/option-b-pr-workflow
Apr 26, 2026
Merged

feat: Option B PR workflow — Codex review for code, direct-commit for data#1
gorajing merged 3 commits into
mainfrom
feat/option-b-pr-workflow

Conversation

@gorajing
Copy link
Copy Markdown
Owner

@gorajing gorajing commented Apr 26, 2026

What this changes

Establishes Option B: data updates (knowledge-base/) continue to auto-commit
to main via post-ingest, while code, briefs, docs, schemas, hooks, and CI
flow through branches + PRs reviewed by the chatgpt-codex-connector
GitHub App.

Why

Today's pipeline auto-commits everything to main directly, so code changes
bypass review entirely. This PR splits the surface: data stays on the fast
path (regenerable, several hundred file changes per post-ingest run), code
goes through the App's review.

Scope

  • Code (scripts/post-ingest.ts auto-git guard + isDataPath helper)
  • PR template (.github/pull_request_template.md)
  • CI / GitHub workflows — none added (the App handles review via webhook,
    no workflow file needed)

Risk surface

The auto-git guard fails loud if the user manually stages non-data files
before running post-ingest. This is an intentional refusal that changes
prior behavior (where mixed staging would have been silently committed
to main). Error message points the user at the resolution.

Tests

  • 464 existing tests pass on each commit.
  • No new tests added; isDataPath is exercised by every post-ingest run on
    the common path. Codex's review caught a Unicode-quoting bug in the first
    pass (now fixed via -z null-terminated git output).

Out of scope

  • Branch protection on main (must be enabled in GitHub Settings).
  • Any migration of in-flight code commits from main to PR-based flow.

Verification

  • npm test → 464 / 464 passing
  • npx tsc --noEmit scripts/post-ingest.ts → clean
  • This PR itself uses the new convention (feature branch → PR → Codex
    review). Codex caught two real bugs across the first two commits:
    1. openai/codex-action@v1 accepted prompt/prompt-file, not
      review-instructions → silent no-op risk. Resolved by removing
      the workflow entirely (the App is the right delivery mechanism).
    2. git diff --cached --name-only C-quotes non-ASCII paths,
      defeating isDataPath's prefix check on Unicode filenames.
      Fixed via -z null-terminated output.

🤖 Generated with Claude Code

… data

Establishes the Option B architecture:
  - Pure data updates (knowledge-base/) continue to flow through
    post-ingest's auto-git step directly to main.
  - Code, briefs, docs, schemas, hooks, and GitHub workflows now go
    through branches + PRs and get reviewed by Codex before merge.

Three pieces:

1. .github/workflows/codex-review.yml
   Triggers on pull_request for code-touching paths only. Uses the
   openai/codex-action (action ref is a placeholder — adjust to match
   the install path: GitHub App vs action). Skips data paths to avoid
   wasting review cycles on regenerable insight files.

2. .github/pull_request_template.md
   Scaffold for PR descriptions: change summary, scope checklist, risk
   surface, tests, out-of-scope notes, verification. Helps Codex
   review against intent rather than just diff.

3. scripts/post-ingest.ts
   Adds isDataPath() helper and a guard in the auto-git step. If any
   non-data file is staged when post-ingest runs (because the user
   staged it manually), the auto-commit aborts with a clear message
   directing the user to either unstage or open a PR. Preserves the
   data-only invariant on auto-commits to main.

Setup follow-up (manual, in repo Settings):
  - Add OPENAI_API_KEY secret for the Codex action.
  - Enable branch protection on main: require PR + status checks.
  - Confirm openai/codex-action ref or install the GitHub App.

Tests: all 464 existing tests pass. No new tests added — the guard's
behavior is exercised by the smoke test of running post-ingest with
nothing staged outside knowledge-base/ (the common path).
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2e7cd45aa9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .github/workflows/codex-review.yml Outdated
with:
openai-api-key: ${{ secrets.OPENAI_API_KEY }}
# Review focus — tells Codex what to prioritize
review-instructions: |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Provide a prompt to codex-action

openai/codex-action@v1 executes codex exec only when prompt or prompt-file is provided, but this workflow passes review-instructions, which is not a defined input for that action. That means the job can succeed while never running an actual review or posting findings, so PRs appear reviewed when no Codex analysis occurred.

Useful? React with 👍 / 👎.

Comment thread .github/workflows/codex-review.yml Outdated
- "README.md"
- "CLAUDE.md"
- "AGENTS.md"
paths-ignore:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Use a valid path filter for pull_request

This trigger defines both paths and paths-ignore for the same pull_request event. GitHub Actions workflow syntax requires choosing one approach per event, so this configuration is invalid and the Codex review workflow will not trigger as intended. Replace it with a single paths block (using ! exclusions) or only paths-ignore.

Useful? React with 👍 / 👎.

Two issues with the original workflow:

1. Invalid YAML — GitHub rejects `paths` + `paths-ignore` on the same
   trigger (Line 38, Col 5: "you may only define one of paths and
   paths-ignore for a single event").

2. The workflow was unnecessary in the first place. The user's other
   repos use the chatgpt-codex-connector GitHub App, which listens
   to pull_request events directly and posts reviews as a bot account.
   No workflow file is needed; the openai/codex-action@v1 reference
   was speculative and not a real published action.

Fix: delete the workflow. Update the auto-git comment in post-ingest
to point at the App instead. PR template stays as-is since its Codex
references work for either delivery mechanism.

Setup follow-up (manual, in repo Settings):
  - Install chatgpt-codex-connector GitHub App on gorajing/zuhn
    (probably already authorized at the org level if installed on
    other gorajing repos like rezn).
  - Branch protection on main still recommended.
@gorajing
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e6ab051e48

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread scripts/post-ingest.ts
function isDataPath(path: string): boolean {
// Normalize POSIX-style separators (git output) regardless of host
const normalized = path.replace(/\\/g, "/");
return normalized.startsWith("knowledge-base/");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Handle quoted git paths before data-only classification

isDataPath assumes git diff --cached --name-only emits raw knowledge-base/... paths, but Git can C-quote non-ASCII filenames (default core.quotePath=true), producing entries like "knowledge-base/domains/tmp-\303\261.md". In that case startsWith("knowledge-base/") fails, so a pure data run is misclassified as containing non-data files and auto-git aborts unexpectedly. This regresses ingestion for valid Unicode-named files under knowledge-base/.

Useful? React with 👍 / 👎.

… paths

Codex review on PR #1 caught a regression: `git diff --cached --name-only`
C-quotes non-ASCII filenames by default (core.quotePath=true), turning
`knowledge-base/foo-ñ.md` into `"knowledge-base/foo-\303\261.md"`. The
leading quote defeats `isDataPath`'s `startsWith("knowledge-base/")`
check, so legitimate Unicode-named data files would be misclassified
as non-data and abort the auto-git step on pure data runs.

Fix: pass `-z` to git diff for null-terminated raw paths, then split
on `\0`. No quoting, no C-escapes. Removes the `.trim()` call too
since -z output has no surrounding whitespace.

Tests: 464 / 464 passing. The fix is internal to auto-git and does
not change the predicate's contract.
@gorajing gorajing merged commit 44ee233 into main Apr 26, 2026
1 check passed
@gorajing gorajing deleted the feat/option-b-pr-workflow branch April 26, 2026 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant