feat(code-audit): add PR auditor as second code-audit slice #18
Merged
`openworkers audit pr <github-url>` extracts atomic claims from a PR description and verdicts each against the actual unified diff. Same pipeline shape as the README auditor (planner → deterministic researcher → checker + trust gate → critic), proving that the `SourceAdapter` abstraction generalises to a second backend.

New modules:
- core/sources/github.py: GitHubAdapter over a unified diff via a PrSpec value object; live fetch (fetch_pr_from_github) is a sibling helper, decoupled from the adapter so tests stay network-free. parse_pr_url + load_pr_fixture as supporting helpers.
- core/orchestrator/pr_flow.py: PrAuditOrchestrator.
- core/orchestrator/audit_prompts.py: shared audit-prompt loader (extracted from readme_flow.py so each new auditor registers its templates in one place).
- prompts/code_audit/pr_planner.md + pr_checker.md: PR-specific claim types (add | remove | fix | refactor | test | behavior | doc | other) and diff-aware verdict rules.

Schema:
- core/schemas_audit.py exposes AuditClaim / AuditClaimList as aliases of ReadmeClaim / ReadmeClaimList so PR code reads cleanly without churning the README slice. AuditReport.target captures the audited artefact id (PR URL, etc.).

Agents:
- providers/code_audit_agents.py adds PrPlannerAgent + PrCheckerAgent alongside the README versions. The same _enforce_trust_gate runs after the LLM responds, so any claim with no diff evidence is forced to `unsupported` regardless of LLM output. AuditCriticAgent is reused as-is.

CLI:
- `openworkers audit pr <url>` (or `--fixture <dir>` for offline runs) with optional `--token` falling back to GITHUB_TOKEN / GH_TOKEN.

Tests:
- tests/fixtures/sample_pr/: canned PR JSON + unified diff with one verified, one drifted, one contradicted, one fabricated (no-evidence) claim.
- tests/code_audit/test_pr_flow.py: URL parser, fixture loader, adapter grep, end-to-end audit with verdict distribution, and an explicit trust-gate-override assertion for the hallucinated WIDGETLIB_DEBUG claim.

Docs: README.md, ROADMAP.md, CHANGELOG.md, AGENTS.md updated with the new slice, shared pipeline section, and PR-audit-flow walkthrough.

Verification: 159/159 tests pass (153 existing + 6 new PR tests), mypy strict clean on new modules, ruff clean on new files, black formatted. CLI smoke-runs in text and JSON modes against the fixture.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
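The trust gate described under "Agents" can be sketched roughly as follows. This is a minimal illustration, not the code in providers/code_audit_agents.py: the `Verdict` dataclass, its field names, and the public `enforce_trust_gate` name are hypothetical stand-ins, and the status strings are assumed from the verdict names used in this description.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Verdict:
    """Hypothetical stand-in for a checker verdict; the real schema may differ."""

    claim: str
    status: str  # assumed values: "verified", "drifted", "contradicted", "unsupported"
    evidence: Tuple[str, ...] = ()  # diff snippets the checker cited, if any


def enforce_trust_gate(verdicts: List[Verdict]) -> List[Verdict]:
    """Force any verdict without diff evidence to "unsupported".

    Runs after the LLM has responded, so the model cannot mark a claim as
    verified unless it also cited something from the diff.
    """
    gated: List[Verdict] = []
    for verdict in verdicts:
        if not verdict.evidence and verdict.status != "unsupported":
            # Override the LLM's status; the claim text itself is left untouched.
            verdict = replace(verdict, status="unsupported")
        gated.append(verdict)
    return gated


if __name__ == "__main__":
    llm_output = [
        Verdict("adds a WIDGETLIB_DEBUG env var", "verified"),  # no evidence cited
        Verdict("adds a retry helper", "verified", ("+def retry():",)),
    ]
    for v in enforce_trust_gate(llm_output):
        print(v.status, "-", v.claim)
```

Keeping the gate as a deterministic post-processing step, rather than a prompt instruction, is what lets a test assert the override for a fabricated claim without depending on model behaviour.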
Summary
Changes
Testing
- `pytest tests/ -v` passes
- `ruff check .` passes
- `black --check .` passes
- `mypy core/ providers/ --strict --ignore-missing-imports` passes
Checklist