skill-trigger evaluator cannot detect pi-cli skill loading #780

@christso

Summary

The skill-trigger evaluator fails for pi-cli because pi's JSONL output doesn't include tool call names in a format the evaluator recognizes.

Evidence

The skill is being loaded: scores improve from ~4/9 to 6/9 when skills are present in the workspace. Despite this, the evaluator reports:

Skill "agent-plugin-review" not found in 1 tool call(s)

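Pi's stream log shows toolcall_start/toolcall_delta/toolcall_end events, but none of them carries the tool name or arguments. The extractToolCalls function in pi-cli.ts looks for content blocks with type: "tool_use" or type: "toolCall" and a name field in msg.content. Pi may structure its tool calls differently, in which case the extracted calls never contain anything the evaluator can match against the skill name.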
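To make the suspected mismatch concrete, here is a minimal sketch of the lookup described above. It is not the actual code in pi-cli.ts: the function name extractToolCallsSketch, the ToolCall and ContentBlock shapes, and the assumption that each JSONL line is a message object with a content array are all hypothetical. It only illustrates why stream events without a name field would yield nothing to match.

```typescript
// Hypothetical shapes; the real types in pi-cli.ts may differ.
interface ToolCall {
  name: string;
  input?: unknown;
}

interface ContentBlock {
  type?: string;
  name?: string;
  input?: unknown;
  arguments?: unknown;
}

// Assumption: each JSONL line is one message object with an optional `content` array.
function extractToolCallsSketch(jsonl: string): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;

    let msg: unknown;
    try {
      msg = JSON.parse(trimmed);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (typeof msg !== "object" || msg === null) continue;

    const content = (msg as { content?: unknown }).content;
    if (!Array.isArray(content)) continue; // e.g. toolcall_start events with no content

    for (const block of content as ContentBlock[]) {
      if (typeof block !== "object" || block === null) continue;
      const isToolBlock = block.type === "tool_use" || block.type === "toolCall";
      // Only blocks that carry a string `name` can be matched against a skill later.
      if (isToolBlock && typeof block.name === "string") {
        calls.push({ name: block.name, input: block.input ?? block.arguments });
      }
    }
  }
  return calls;
}
```

Under these assumptions, a line such as {"type":"toolcall_start"} contributes no tool call at all, and a tool block without a name is dropped, which matches the symptom of the evaluator having nothing to compare against the skill name.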

Current behavior

  • skill-trigger works for Claude Code (checks Skill tool use) and Copilot (checks readFile of SKILL.md)
  • skill-trigger fails silently for pi-cli: tool calls are detected, but the skill name is not found in them

Expected behavior

The evaluator should detect when pi reads a SKILL.md file (via read_file or a similar tool) whose path matches the skill name, mirroring the existing readFile check for Copilot.
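One possible shape for that check is sketched below. Everything in it is an assumption made for illustration: the ObservedToolCall type, the set of read-tool names, the argument keys (path, file_path, filePath), and the convention that a skill lives at <skill-name>/SKILL.md. None of these are confirmed for pi or for the evaluator's real interface.

```typescript
// Hypothetical tool-call shape; not the evaluator's actual type.
interface ObservedToolCall {
  name: string;
  input?: unknown;
}

// Assumption: tool names pi might use for file reads. Adjust to pi's real names.
const READ_TOOL_NAMES = new Set(["read", "read_file", "readFile"]);

// Returns true when the call reads ".../<skillName>/SKILL.md".
function readsSkillFile(call: ObservedToolCall, skillName: string): boolean {
  if (!READ_TOOL_NAMES.has(call.name)) return false;
  if (typeof call.input !== "object" || call.input === null) return false;

  const args = call.input as Record<string, unknown>;
  // Assumption: the path may be stored under any of these keys.
  const rawPath = args["path"] ?? args["file_path"] ?? args["filePath"];
  if (typeof rawPath !== "string") return false;

  // Normalize Windows separators and compare the last two path segments.
  const segments = rawPath.replace(/\\/g, "/").split("/").filter((s) => s !== "");
  const n = segments.length;
  return n >= 2 && segments[n - 1] === "SKILL.md" && segments[n - 2] === skillName;
}

function skillTriggered(calls: ObservedToolCall[], skillName: string): boolean {
  return calls.some((call) => readsSkillFile(call, skillName));
}
```

Comparing whole path segments rather than using a substring test avoids false positives such as a skill named "review" matching a read of agent-plugin-review/SKILL.md.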

Workaround

The skill-trigger assertions have been removed from the agentic-engineering eval for now. Content assertions still validate review quality.

Labels

in-progress: Claimed by an agent, do not duplicate work

Status

Done
