fix: improve agent-plugin-review skill to pass 9/9 evals (#779) by christso · Pull Request #783 · EntityProcess/agentv

christso · 2026-03-26T08:31:07Z

Summary

Skill improvements: Added explicit guidance for detecting relative file paths (missing leading /), repeated inputs across test cases, and missing hard gates in multi-phase workflows
Pi-cli env isolation: When a subprovider is explicitly configured, strip competing provider env vars from the spawned process (AZURE_OPENAI_* was overriding --provider openrouter)
Target config: Added subprovider: openrouter to pi-cli target so it uses the intended provider
Lint fix: Fixed pre-existing biome lint errors in workspace setup script
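
For illustration, two of the skill checks above (relative file paths and repeated inputs) can be written as a small function. This is a minimal sketch, not the skill's implementation: the skill itself is written guidance for the reviewing agent. The `EvalTest` shape, the `filePaths` field, and the `reviewEvalTests` name are hypothetical and may not match the real eval YAML schema.

```typescript
// Hypothetical shape for one eval test case; the real agentv eval schema may differ.
interface EvalTest {
  id: string;
  input: string;
  filePaths: string[];
}

interface Finding {
  testId: string;
  message: string;
}

// Flags file paths without a leading "/" and inputs that repeat an earlier test's input.
function reviewEvalTests(tests: EvalTest[]): Finding[] {
  const findings: Finding[] = [];
  // Maps each input text to the id of the first test that used it.
  const firstSeen = new Map<string, string>();

  for (const test of tests) {
    for (const path of test.filePaths) {
      if (!path.startsWith("/")) {
        findings.push({
          testId: test.id,
          message: `relative file path (missing leading /): ${path}`,
        });
      }
    }

    const earlier = firstSeen.get(test.input);
    if (earlier !== undefined) {
      findings.push({ testId: test.id, message: `input repeats test ${earlier}` });
    } else {
      firstSeen.set(test.input, test.id);
    }
  }
  return findings;
}
```

Given two tests that share the same input, where the first also lists one path without a leading slash, this sketch reports one finding for each problem.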

Test results

Before: 0/9 pass (pi-cli was silently using Azure OpenAI, returning empty responses)
After: 9/9 pass, mean score 1.000 (verified across 2 consecutive runs)
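
The "Before" failure came from competing provider env vars leaking into the spawned pi-cli process. A minimal sketch of the isolation idea follows, assuming a simple prefix table. `isolateProviderEnv`, `PROVIDER_ENV_PREFIXES`, the `azure` key, and the `OPENROUTER_` prefix are hypothetical names chosen for illustration; only the `AZURE_OPENAI_*` prefix and the `openrouter` subprovider come from this PR.

```typescript
type Env = Record<string, string | undefined>;

// Assumed mapping from provider name to the env-var prefixes it reads.
// Only AZURE_OPENAI_ is taken from the PR text; the rest is illustrative.
const PROVIDER_ENV_PREFIXES: Record<string, string[]> = {
  azure: ["AZURE_OPENAI_"],
  openrouter: ["OPENROUTER_"],
};

// Returns a copy of env with other providers' variables removed.
// When no subprovider is configured, the env is copied unchanged.
function isolateProviderEnv(env: Env, subprovider?: string): Env {
  if (!subprovider) return { ...env };

  // Collect the prefixes belonging to every provider except the chosen one.
  const competing: string[] = [];
  for (const name of Object.keys(PROVIDER_ENV_PREFIXES)) {
    if (name !== subprovider) competing.push(...PROVIDER_ENV_PREFIXES[name]);
  }

  const result: Env = {};
  for (const key of Object.keys(env)) {
    const isCompeting = competing.some((prefix) => key.startsWith(prefix));
    if (!isCompeting) result[key] = env[key];
  }
  return result;
}
```

The result would be passed as the child process environment, so the explicitly configured provider is the only one whose credentials the spawned process can see.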

Test plan

All 9 agent-plugin-review eval tests pass with --target pi-cli
Verified stability across 2 consecutive full runs
Unit tests pass (1598 tests, 0 failures)
Pre-push hooks pass (build, typecheck, lint, test, validate:examples)

Closes #779

🤖 Generated with Claude Code

…779)

Skill improvements:
- Add explicit checks for file path format (leading slash) and repeated inputs in eval YAML
- Add hard gate detection recipe for multi-phase workflows
- Update workflow-checklist example to use concrete deploy-plan artifact

Pi-cli fix:
- Strip competing provider env vars when subprovider is explicitly configured (AZURE_OPENAI_* vars were overriding --provider flag)
- Add subprovider: openrouter to pi-cli target config

All 9 agent-plugin-review eval tests now pass (was 6/9).

Closes #779

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


cloudflare-workers-and-pages · 2026-03-26T08:31:47Z

Deploying agentv with Cloudflare Pages

Latest commit:	`9480f01`
Status:	✅ Deploy successful!
Preview URL:	https://41830288.agentv.pages.dev
Branch Preview URL:	https://fix-779-plugin-review-evals.agentv.pages.dev


christso and others added 2 commits March 26, 2026 08:28

style: fix lint errors in workspace setup script

9480f01

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

christso merged commit c5c7a11 into main Mar 26, 2026
2 checks passed

christso deleted the fix/779-plugin-review-evals branch March 26, 2026 08:36
