skill-trigger evaluator cannot detect pi-cli skill loading #780
Description
Summary
The skill-trigger evaluator fails for pi-cli because pi's JSONL output doesn't include tool call names in a format the evaluator recognizes.
Evidence
The skill IS loaded (scores improve from ~4/9 to 6/9 when skills are in the workspace), but the evaluator reports:
Skill "agent-plugin-review" not found in 1 tool call(s)
Pi's stream log shows toolcall_start/toolcall_delta/toolcall_end events without the tool name or arguments. The extractToolCalls function in pi-cli.ts looks for type: "tool_use" or type: "toolCall" with a name field in msg.content, but pi may structure tool calls differently.
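To make the mismatch concrete, here is a minimal sketch of the lookup rule as described above. It is not copied from pi-cli.ts: the type shapes, the function name extractNamedToolCalls, and the input/arguments field names are assumptions for illustration only. It shows how a matcher keyed on type and name yields nothing usable when the log only carries toolcall_start/toolcall_delta/toolcall_end events with no name.

```typescript
// Hypothetical shapes inferred from the issue text; the real pi-cli.ts types may differ.
type ContentBlock = { type: string; name?: string; [key: string]: unknown };
type ToolCall = { name: string; input?: unknown };

// Keep only blocks typed "tool_use" or "toolCall" that carry a string name.
function extractNamedToolCalls(content: ContentBlock[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const block of content) {
    const isToolBlock = block.type === "tool_use" || block.type === "toolCall";
    if (isToolBlock && typeof block.name === "string") {
      // "input" and "arguments" are assumed field names for the call payload.
      calls.push({ name: block.name, input: block.input ?? block.arguments });
    }
  }
  return calls;
}

// A block in the recognized format is picked up with its name.
const recognized = extractNamedToolCalls([
  { type: "tool_use", name: "read_file", input: { path: "skills/agent-plugin-review/SKILL.md" } },
]);

// Stream-style events without a name give the matcher nothing to compare against.
const unrecognized = extractNamedToolCalls([
  { type: "toolcall_start" },
  { type: "toolcall_delta" },
  { type: "toolcall_end" },
]);

console.log(recognized.length, unrecognized.length);
```

This only illustrates the matching rule; it does not explain how the single tool call counted in the evaluator message above was produced.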
Current behavior
- skill-trigger works for Claude Code (checks Skill tool use) and Copilot (checks readFile of SKILL.md)
- skill-trigger silently fails for pi-cli: tool calls are detected but the skill name is not found
Expected behavior
The evaluator should detect when pi reads a SKILL.md file (via read_file or similar tool) that matches the skill name.
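One possible shape for that check is sketched below. It assumes the tool calls have already been extracted with a name and an argument object. The function readsSkillFile, the READ_TOOL_NAMES set, and the ToolCall type are hypothetical, and the actual tool name and argument key pi uses are not confirmed by this issue. Because the path key is unknown, the sketch inspects every string argument.

```typescript
// Hypothetical shape for an extracted tool call; not the evaluator's real type.
type ToolCall = { name: string; input?: Record<string, unknown> };

// Assumed read-tool names: "read_file" from this issue, "readFile" from the Copilot check.
const READ_TOOL_NAMES = new Set(["read_file", "readFile"]);

// True when some read-style call targets <anything>/<skillName>/SKILL.md.
function readsSkillFile(calls: ToolCall[], skillName: string): boolean {
  return calls.some((call) => {
    if (!READ_TOOL_NAMES.has(call.name)) return false;
    // The argument key holding the path is unknown, so check all string values.
    const candidates = Object.values(call.input ?? {}).filter(
      (value): value is string => typeof value === "string",
    );
    return candidates.some((candidate) => {
      // Normalize Windows separators, then compare the last two path segments.
      const segments = candidate.replace(/\\/g, "/").split("/").filter(Boolean);
      const n = segments.length;
      return n >= 2 && segments[n - 1] === "SKILL.md" && segments[n - 2] === skillName;
    });
  });
}

const calls: ToolCall[] = [
  { name: "read_file", input: { path: "/workspace/skills/agent-plugin-review/SKILL.md" } },
];
console.log(readsSkillFile(calls, "agent-plugin-review"));
```

Matching on the parent directory name rather than a substring avoids false positives when one skill name is a prefix of another.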
Workaround
Removed skill-trigger assertions from the agentic-engineering eval. Content assertions still validate review quality.
Related
- PR feat: add workspace skills for pi-cli eval execution #776 — eval workspace setup
- Issue Improve agent-plugin-review skill to pass remaining 3 eval tests #779 — skill improvement iteration