Provision Copilot CLI for code-review judge in Claude workflow #701

Merged: gggdttt merged 4 commits into main from private/wenjiefan/claude-codereview-judge-copilot on Jun 26, 2026
@gggdttt (Collaborator) commented Jun 26, 2026

Problem

Running the code-review category through the Claude evaluation workflow fails in the scoring step with:

LLMJudgeError: Copilot CLI not found; cannot run the semantic judge

The code-review semantic judge (judge_verdicts in src/bcbench/evaluate/codereview_judge.py) always runs on Copilot CLI as a fixed, agent-independent judge. But claude-evaluation.yml only installs Claude Code — it never provisions Copilot CLI, the copilot-requests permission, or COPILOT_GITHUB_TOKEN — so judging dies for any Claude code-review run.

This is a pre-existing gap in the Claude workflow, unrelated to the BCQuality work in #696.

Fix

In .github/workflows/claude-evaluation.yml:

  • Add copilot-requests: write to the evaluation job permissions.
  • Install @github/copilot@1.0.57, gated on category == 'code-review' (skipped for bug-fix / test-generation to save time).
  • Pass COPILOT_GITHUB_TOKEN: ${{ github.token }} to the run step and mask it.

This mirrors the judge environment already present in copilot-evaluation.yml.
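As a rough illustration, the three changes listed above might look like the following fragment. This is a sketch, not the merged diff: the job name `evaluation`, the `inputs.category` expression, the step names, and the placeholder run command are assumptions made for the example. Only the permission, the package version, the `code-review` gate, and the `COPILOT_GITHUB_TOKEN` mapping come from the description.

```yaml
# Sketch only. Job, step, and input names are assumptions, not copied from the repository.
jobs:
  evaluation:
    runs-on: ubuntu-latest
    permissions:
      # ...the job's existing permissions stay as they are...
      copilot-requests: write   # added so the judge can call Copilot with the job token
    steps:
      # Installed only when the judge is needed; bug-fix and test-generation skip it.
      - name: Install Copilot CLI (code-review judge)
        if: ${{ inputs.category == 'code-review' }}   # assumed input name
        run: npm install -g @github/copilot@1.0.57

      - name: Run evaluation
        env:
          COPILOT_GITHUB_TOKEN: ${{ github.token }}
          CATEGORY: ${{ inputs.category }}            # assumed input name
        run: |
          # Mask the token so it is redacted if it ever reaches the job log.
          echo "::add-mask::${COPILOT_GITHUB_TOKEN}"
          # Placeholder: the workflow's existing evaluation command goes here.
          echo "running evaluation for category: ${CATEGORY}"
```

Gating the install on the category keeps bug-fix and test-generation runs from paying for an npm install they never use. Passing the category through `env` rather than interpolating it into the script is a choice made for this sketch.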

Testing

Workflow-only change. To validate: run the Claude evaluation workflow with category: code-review (test-run) and confirm the judge step no longer errors.

@gggdttt changed the title from "ci: provision Copilot CLI for code-review judge in Claude workflow" to "Provision Copilot CLI for code-review judge in Claude workflow" Jun 26, 2026
@gggdttt marked this pull request as ready for review June 26, 2026 11:47
@gggdttt enabled auto-merge (squash) June 26, 2026 11:49
Comment thread .github/workflows/claude-evaluation.yml Outdated
Comment thread .github/actions/install-eval-clis/action.yml Outdated
Comment thread .github/workflows/claude-evaluation.yml Outdated
gggdttt and others added 2 commits June 26, 2026 14:55
Co-authored-by: Sun Haoran <haoransun@microsoft.com>
Co-authored-by: Sun Haoran <haoransun@microsoft.com>
@gggdttt merged commit eb546e4 into main Jun 26, 2026
7 checks passed
@gggdttt deleted the private/wenjiefan/claude-codereview-judge-copilot branch June 26, 2026 12:57