feat: Option B PR workflow — Codex review for code, direct-commit for data#1
Conversation
… data
Establishes the Option B architecture:
- Pure data updates (knowledge-base/) continue to flow through
post-ingest's auto-git step directly to main.
- Code, briefs, docs, schemas, hooks, and GitHub workflows now go
through branches + PRs and get reviewed by Codex before merge.
Three pieces:
1. .github/workflows/codex-review.yml
Triggers on pull_request for code-touching paths only. Uses the
openai/codex-action (action ref is a placeholder — adjust to match
the install path: GitHub App vs action). Skips data paths to avoid
wasting review cycles on regenerable insight files.
2. .github/pull_request_template.md
Scaffold for PR descriptions: change summary, scope checklist, risk
surface, tests, out-of-scope notes, verification. Helps Codex
review against intent rather than just diff.
3. scripts/post-ingest.ts
Adds isDataPath() helper and a guard in the auto-git step. If any
non-data file is staged when post-ingest runs (because the user
staged it manually), the auto-commit aborts with a clear message
directing the user to either unstage or open a PR. Preserves the
data-only invariant on auto-commits to main.
Setup follow-up (manual, in repo Settings):
- Add OPENAI_API_KEY secret for the Codex action.
- Enable branch protection on main: require PR + status checks.
- Confirm openai/codex-action ref or install the GitHub App.
Tests: all 464 existing tests pass. No new tests added — the guard's
behavior is exercised by the smoke test of running post-ingest with
nothing staged outside knowledge-base/ (the common path).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2e7cd45aa9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| with: | ||
| openai-api-key: ${{ secrets.OPENAI_API_KEY }} | ||
| # Review focus — tells Codex what to prioritize | ||
| review-instructions: | |
There was a problem hiding this comment.
Provide a prompt to codex-action
openai/codex-action@v1 executes codex exec only when prompt or prompt-file is provided, but this workflow passes review-instructions, which is not a defined input for that action. That means the job can succeed while never running an actual review or posting findings, so PRs appear reviewed when no Codex analysis occurred.
Useful? React with 👍 / 👎.
| - "README.md" | ||
| - "CLAUDE.md" | ||
| - "AGENTS.md" | ||
| paths-ignore: |
There was a problem hiding this comment.
Use a valid path filter for pull_request
This trigger defines both paths and paths-ignore for the same pull_request event. GitHub Actions workflow syntax requires choosing one approach per event, so this configuration is invalid and the Codex review workflow will not trigger as intended. Replace it with a single paths block (using ! exclusions) or only paths-ignore.
Useful? React with 👍 / 👎.
Two issues with the original workflow:
1. Invalid YAML — GitHub rejects `paths` + `paths-ignore` on the same
trigger (Line 38, Col 5: "you may only define one of paths and
paths-ignore for a single event").
2. The workflow was unnecessary in the first place. The user's other
repos use the chatgpt-codex-connector GitHub App, which listens
to pull_request events directly and posts reviews as a bot account.
No workflow file is needed; the openai/codex-action@v1 reference
was speculative and not a real published action.
Fix: delete the workflow. Update the auto-git comment in post-ingest
to point at the App instead. PR template stays as-is since its Codex
references work for either delivery mechanism.
Setup follow-up (manual, in repo Settings):
- Install chatgpt-codex-connector GitHub App on gorajing/zuhn
(probably already authorized at the org level if installed on
other gorajing repos like rezn).
- Branch protection on main still recommended.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e6ab051e48
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| function isDataPath(path: string): boolean { | ||
| // Normalize POSIX-style separators (git output) regardless of host | ||
| const normalized = path.replace(/\\/g, "/"); | ||
| return normalized.startsWith("knowledge-base/"); |
There was a problem hiding this comment.
Handle quoted git paths before data-only classification
isDataPath assumes git diff --cached --name-only emits raw knowledge-base/... paths, but Git can C-quote non-ASCII filenames (default core.quotePath=true), producing entries like "knowledge-base/domains/tmp-\303\261.md". In that case startsWith("knowledge-base/") fails, so a pure data run is misclassified as containing non-data files and auto-git aborts unexpectedly. This regresses ingestion for valid Unicode-named files under knowledge-base/.
Useful? React with 👍 / 👎.
… paths Codex review on PR #1 caught a regression: `git diff --cached --name-only` C-quotes non-ASCII filenames by default (core.quotePath=true), turning `knowledge-base/foo-ñ.md` into `"knowledge-base/foo-\303\261.md"`. The leading quote defeats `isDataPath`'s `startsWith("knowledge-base/")` check, so legitimate Unicode-named data files would be misclassified as non-data and abort the auto-git step on pure data runs. Fix: pass `-z` to git diff for null-terminated raw paths, then split on `\0`. No quoting, no C-escapes. Removes the `.trim()` call too since -z output has no surrounding whitespace. Tests: 464 / 464 passing. The fix is internal to auto-git and does not change the predicate's contract.
What this changes
Establishes Option B: data updates (
knowledge-base/) continue to auto-committo main via post-ingest, while code, briefs, docs, schemas, hooks, and CI
flow through branches + PRs reviewed by the
chatgpt-codex-connectorGitHub App.
Why
Today's pipeline auto-commits everything to main directly, so code changes
bypass review entirely. This PR splits the surface: data stays on the fast
path (regenerable, several hundred file changes per post-ingest run), code
goes through the App's review.
Scope
scripts/post-ingest.tsauto-git guard +isDataPathhelper).github/pull_request_template.md)no workflow file needed)
Risk surface
The auto-git guard fails loud if the user manually stages non-data files
before running post-ingest. This is an intentional refusal that changes
prior behavior (where mixed staging would have been silently committed
to main). Error message points the user at the resolution.
Tests
isDataPathis exercised by every post-ingest run onthe common path. Codex's review caught a Unicode-quoting bug in the first
pass (now fixed via
-znull-terminated git output).Out of scope
main(must be enabled in GitHub Settings).Verification
npm test→ 464 / 464 passingnpx tsc --noEmit scripts/post-ingest.ts→ cleanreview). Codex caught two real bugs across the first two commits:
openai/codex-action@v1acceptedprompt/prompt-file, notreview-instructions→ silent no-op risk. Resolved by removingthe workflow entirely (the App is the right delivery mechanism).
git diff --cached --name-onlyC-quotes non-ASCII paths,defeating
isDataPath's prefix check on Unicode filenames.Fixed via
-znull-terminated output.🤖 Generated with Claude Code