fix(core): extract pi-cli tool calls from streaming events #782
Merged
…trigger Pi CLI emits tool_execution_start/end events in JSONL output, but the provider only extracted tool calls from message content arrays. This caused the skill-trigger evaluator to miss pi's skill file reads. Now extractMessages() also scans for tool_execution_start/end events and injects reconstructed tool calls into assistant messages. Also handles tool_call (snake_case) content type variant. Closes #780 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace target message with a new object instead of casting to bypass readonly constraint, per code review feedback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
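A minimal sketch of the pattern this commit describes: building a replacement message object instead of casting away `readonly` to mutate in place. The `Message` shape and the `withToolCalls` helper are hypothetical names for illustration, not the provider's actual types.

```typescript
// Hypothetical message shape; the real provider types are not shown in this PR text.
interface Message {
  readonly role: string;
  readonly toolCalls?: readonly string[];
}

// Returns a new array in which the message at `index` is replaced by a copy
// carrying `toolCalls`. The input array and its objects are never mutated,
// so no cast is needed to get around the readonly fields.
function withToolCalls(
  messages: readonly Message[],
  index: number,
  toolCalls: readonly string[],
): Message[] {
  return messages.map((message, i) =>
    i === index ? { ...message, toolCalls } : message,
  );
}

const before: readonly Message[] = [{ role: "user" }, { role: "assistant" }];
const after = withToolCalls(before, 1, ["read"]);
```

Copying keeps the type checker's readonly guarantee intact for every other holder of the original message, at the cost of one shallow copy.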
Re-adds the skill-trigger assertion that was removed as a workaround for #780. Now that pi-cli tool call extraction is fixed, the evaluator can detect when pi loads the agent-plugin-review skill. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pi-cli target needs subprovider/model/api_key to produce meaningful output. Without them, pi uses its default which returns empty responses. Also removes workers: 1 from agent-plugin-review eval since all test cases are read-only reviews that can safely run in parallel. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Pi CLI emits `tool_execution_start`/`tool_execution_end` events in its JSONL output, but the provider only extracted tool calls from message `content` arrays (`tool_use`/`toolCall` format). This caused the `skill-trigger` evaluator to report `Skill "X" not found in N tool call(s)` even when pi successfully loaded the skill.
- `extractMessages()` now also scans for `tool_execution_start`/`tool_execution_end` events and injects reconstructed `ToolCall` objects into assistant messages, deduplicating against any tool calls already present in message content.
- Also handles the `tool_call` (snake_case) content type variant and logs `tool_start`/`tool_end` events in the stream logger for better debugging.

Closes #780
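The reconstruction and deduplication described above could look roughly like the following. This is a sketch under stated assumptions, not the code merged in this PR: the event field names (`toolCallId`, `toolName`, `args`, `result`), the `ToolCall` shape, and deduplication by `id` are all illustrative guesses.

```typescript
// Assumed shapes; the actual provider types and pi event fields may differ.
interface ToolCall {
  id?: string;
  tool: string;
  input?: unknown;
  output?: unknown;
}

interface AssistantMessage {
  readonly role: "assistant";
  readonly toolCalls?: readonly ToolCall[];
}

interface PiEvent {
  type: string;
  toolCallId?: string;
  toolName?: string;
  args?: unknown;
  result?: unknown;
}

// Pair each tool_execution_start with its tool_execution_end (matched by id)
// and rebuild one ToolCall per execution, in start order.
function toolCallsFromEvents(events: readonly PiEvent[]): ToolCall[] {
  const byId = new Map<string, ToolCall>();
  for (const event of events) {
    if (!event.toolCallId) continue;
    if (event.type === "tool_execution_start" && event.toolName) {
      byId.set(event.toolCallId, {
        id: event.toolCallId,
        tool: event.toolName,
        input: event.args,
      });
    } else if (event.type === "tool_execution_end") {
      const started = byId.get(event.toolCallId);
      if (started) byId.set(event.toolCallId, { ...started, output: event.result });
    }
  }
  return Array.from(byId.values());
}

// Append only the event-derived calls whose id is not already present in the
// message, returning a new message object rather than mutating the input.
function mergeToolCalls(
  message: AssistantMessage,
  fromEvents: readonly ToolCall[],
): AssistantMessage {
  const existing = message.toolCalls ?? [];
  const seen = new Set(existing.map((call) => call.id));
  const missing = fromEvents.filter(
    (call) => call.id === undefined || !seen.has(call.id),
  );
  return { ...message, toolCalls: existing.concat(missing) };
}

const events: PiEvent[] = [
  { type: "tool_execution_start", toolCallId: "t1", toolName: "read", args: { path: "SKILL.md" } },
  { type: "tool_execution_end", toolCallId: "t1", result: "file body" },
  { type: "tool_execution_start", toolCallId: "t2", toolName: "bash" },
];
const reconstructed = toolCallsFromEvents(events);
const merged = mergeToolCalls(
  { role: "assistant", toolCalls: [{ id: "t1", tool: "read" }] },
  reconstructed,
);
```

In this sketch a call that already appears in the message content wins over its event-derived twin, so content-sourced data is never overwritten.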
Test plan
- Tests for `extractToolCallsFromEvents` and `extractMessages` covering: event-only tool calls, deduplication, multiple tools, synthetic assistant message creation, `turn_end` fallback, and `tool_call` content type
- Run with `openai/gpt-5.1-codex` via OpenRouter. Skill-trigger correctly detects the `read` tool loading `.agents/skills/csv-analyzer/SKILL.md` (score 1.000). Confirmed `tool_execution_start`/`tool_execution_end` events present in JSONL and new `tool_start`/`tool_end` log lines visible.

🤖 Generated with Claude Code
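To illustrate why the evaluator depends on tool calls being extracted, here is a hypothetical skill-trigger check. The matching rule (looking for `/<skill>/SKILL.md` in a tool call's input) and the `checkSkillTrigger` name are assumptions made for this sketch; only the failure message format is taken from the summary above.

```typescript
// Assumed shapes for illustration only; not the evaluator's real API.
interface ToolCall {
  tool: string;
  input?: unknown;
}

interface SkillTriggerResult {
  score: number;
  reason: string;
}

// Scores 1 when any tool call's input references the skill's SKILL.md file,
// otherwise 0 with the "not found" message quoted in the PR summary.
function checkSkillTrigger(
  skill: string,
  toolCalls: readonly ToolCall[],
): SkillTriggerResult {
  const needle = `/${skill}/SKILL.md`;
  const hit = toolCalls.some((call) =>
    JSON.stringify(call.input ?? "").includes(needle),
  );
  return hit
    ? { score: 1, reason: `Skill "${skill}" loaded` }
    : {
        score: 0,
        reason: `Skill "${skill}" not found in ${toolCalls.length} tool call(s)`,
      };
}

const detected = checkSkillTrigger("csv-analyzer", [
  { tool: "read", input: { path: ".agents/skills/csv-analyzer/SKILL.md" } },
]);
const missed = checkSkillTrigger("csv-analyzer", []);
```

With no extracted tool calls the check can only fail, which matches the symptom reported in #780 before event-based extraction was added.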