feat(eval): code grader multimodal — structured Content in CodeGraderInput by christso · Pull Request #844 · EntityProcess/agentv

christso · 2026-03-29T04:35:46Z

Closes #821

Changes

Extends CodeGraderInput with typed Content[] so code graders can inspect structured multimodal output. ContentImage blocks carry file paths (not inline base64).

Added ContentTextSchema, ContentImageSchema, ContentFileSchema, ContentSchema to @agentv/eval
Updated MessageSchema.content from loose unknown to typed string | Content[]
materializeContentForGrader() converts data URIs to temp file paths for the grader payload
Exported all Content schemas from eval package
21 new tests
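
As a rough illustration of what a code grader could do with typed content blocks, the sketch below walks the output messages and collects image paths. The block shapes and field names (`text`, `path`, `media_type`) and the helpers `imagePaths` and `gradeHasPng` are assumptions made for this example; they are not taken from the published @agentv/eval schemas.

```typescript
// Hypothetical shapes: field names are assumptions, not the real @agentv/eval schemas.
type ContentText = { type: "text"; text: string };
type ContentImage = { type: "image"; path: string; media_type?: string };
type ContentFile = { type: "file"; path: string };
type Content = ContentText | ContentImage | ContentFile;

interface Message {
  role: string;
  content: string | Content[];
}

// Collect the file paths of every image block in the output messages.
function imagePaths(output: Message[]): string[] {
  const paths: string[] = [];
  for (const message of output) {
    if (typeof message.content === "string") continue; // text-only message
    for (const block of message.content) {
      if (block.type === "image") paths.push(block.path);
    }
  }
  return paths;
}

// Toy grader (hypothetical): pass when the output contains at least one PNG image.
function gradeHasPng(output: Message[]): { score: number; reason: string } {
  const pngs = imagePaths(output).filter((p) => p.toLowerCase().endsWith(".png"));
  return pngs.length > 0
    ? { score: 1, reason: `found ${pngs.length} PNG image(s)` }
    : { score: 0, reason: "no PNG image in output" };
}
```

Because the image block carries a path rather than inline base64, a grader like this can hand the file to an external tool without decoding anything itself.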

…th industry patterns

- {{output}}, {{input}}, {{expected_output}} now resolve to human-readable text instead of JSON.stringify'd message arrays
- Deprecated _text aliases ({{input_text}}, {{output_text}}, {{expected_output_text}}) still work but emit a stderr warning
- Removed outputText, inputText, expectedOutputText from CodeGraderInput schema; code graders should extract text from Message.content using getTextContent() from @agentv/core
- Removed EnrichedCodeGraderInput type (no longer needed)
- Updated default evaluator template to use new variable names
- Updated prompt-validator to accept both new and deprecated variable names

Closes #825

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
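
The variable behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's renderer: `renderTemplate`, `textOf`, and the way messages are joined into text are assumptions, and `textOf` is only a local stand-in for getTextContent(), whose real behaviour in @agentv/core may differ.

```typescript
type ContentText = { type: "text"; text: string };
type Content = ContentText | { type: "image"; path: string } | { type: "file"; path: string };
type Message = { role: string; content: string | Content[] };

// Local stand-in for getTextContent(); keeps only the text blocks.
function textOf(content: string | Content[]): string {
  if (typeof content === "string") return content;
  return content
    .filter((block): block is ContentText => block.type === "text")
    .map((block) => block.text)
    .join("\n");
}

// Deprecated alias -> current variable name, as listed in the commit message.
const DEPRECATED_ALIASES = new Map<string, string>([
  ["input_text", "input"],
  ["output_text", "output"],
  ["expected_output_text", "expected_output"],
]);

// Hypothetical renderer: replaces {{name}} with readable text and warns on deprecated aliases.
function renderTemplate(
  template: string,
  vars: Record<string, Message[]>,
  warn: (message: string) => void = (message) => console.error(message),
): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (placeholder: string, name: string) => {
    const canonical = DEPRECATED_ALIASES.get(name) ?? name;
    if (canonical !== name) warn(`{{${name}}} is deprecated; use {{${canonical}}}`);
    // Leave unknown variables untouched rather than guessing a value.
    if (!Object.prototype.hasOwnProperty.call(vars, canonical)) return placeholder;
    return (vars[canonical] ?? []).map((m) => textOf(m.content)).join("\n");
  });
}
```

Routing both the new and the deprecated names through one lookup keeps their output identical, so the only observable difference for an old template is the warning.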

Update Claude and Pi providers to preserve non-text content blocks (images) in Message.content instead of discarding them via extractTextContent(). This enables multimodal content to flow from provider response through to evaluators.

Changes:

- Create shared claude-content.ts with toContentArray() and extractTextContent() used by all 3 Claude providers
- Update claude-cli, claude-sdk, claude providers to use structuredContent ?? textContent pattern
- Add toPiContentArray() to pi-utils.ts for Pi provider
- Update pi-coding-agent convertAgentMessage() to preserve structured content
- Add 23 unit tests covering content preservation, backward compat, and end-to-end multimodal flow

Text-only responses still produce plain strings (no unnecessary wrapping). extractTextContent() remains available for backward compatibility.

Closes #818

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
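
One way the `structuredContent ?? textContent` pattern could work is sketched below. The raw block shape (`RawBlock`) and the functions `toContentArraySketch`, `extractTextSketch`, and `toMessageContent` are hypothetical re-implementations written for this example; the actual helpers in claude-content.ts may behave differently.

```typescript
type Content =
  | { type: "text"; text: string }
  | { type: "image"; source: string };

// Hypothetical raw provider block; real provider payloads carry more fields.
type RawBlock = { type: string; text?: string; source?: string };

// Concatenate the text blocks only (the pre-existing, text-only behaviour).
function extractTextSketch(blocks: RawBlock[]): string {
  return blocks
    .filter((b) => b.type === "text" && typeof b.text === "string")
    .map((b) => b.text as string)
    .join("");
}

// Return Content[] only when a non-text block is present, otherwise undefined
// so the caller falls back to a plain string.
function toContentArraySketch(blocks: RawBlock[]): Content[] | undefined {
  const content: Content[] = [];
  let hasNonText = false;
  for (const b of blocks) {
    if (b.type === "text" && typeof b.text === "string") {
      content.push({ type: "text", text: b.text });
    } else if (b.type === "image" && typeof b.source === "string") {
      content.push({ type: "image", source: b.source });
      hasNonText = true;
    }
    // Other block types are ignored in this sketch.
  }
  return hasNonText ? content : undefined;
}

function toMessageContent(blocks: RawBlock[]): string | Content[] {
  const structuredContent = toContentArraySketch(blocks);
  const textContent = extractTextSketch(blocks);
  return structuredContent ?? textContent;
}
```

Returning `undefined` for text-only responses is what lets `??` pick the plain string, which matches the "no unnecessary wrapping" behaviour the commit describes.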

…Input

- Add ContentTextSchema, ContentImageSchema, ContentFileSchema, ContentSchema as Zod discriminated union in packages/eval/src/schemas.ts
- Update MessageSchema.content to accept string | Content[] (typed blocks)
- Add materializeContentForGrader() in code-evaluator.ts:
  - Data URI images decoded and written to temp files (path, not base64)
  - Non-URI images pass source through as path field
  - Text/file blocks unchanged; string content unchanged
  - Lazy temp dir creation for image files, cleaned up in finally block
- Export Content schemas and types from @agentv/eval
- Add comprehensive unit tests for schema validation and materialization
- Add integration tests for CodeEvaluator with multimodal output

Closes #821

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
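
The materialization steps listed above might look roughly like this. It is a synchronous sketch under stated assumptions: the names `materialize` and `withMaterializedContent`, the `source` and `path` field names, the temp-dir prefix, and the file naming are all invented for illustration and are not the code in code-evaluator.ts.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical block shapes before and after materialization.
type InputContent =
  | { type: "text"; text: string }
  | { type: "image"; source: string }
  | { type: "file"; path: string };
type GraderContent =
  | { type: "text"; text: string }
  | { type: "image"; path: string }
  | { type: "file"; path: string };

const DATA_URI = /^data:image\/([a-z0-9.+-]+);base64,(.+)$/i;

function materialize(
  content: string | InputContent[],
  getTempDir: () => string,
): string | GraderContent[] {
  if (typeof content === "string") return content; // string content unchanged
  return content.map((block, index): GraderContent => {
    if (block.type !== "image") return block; // text/file blocks unchanged
    const match = DATA_URI.exec(block.source);
    if (!match) return { type: "image", path: block.source }; // non-URI source passes through
    const [, subtype = "bin", payload = ""] = match;
    const ext = subtype.toLowerCase().replace(/[^a-z0-9]/g, "");
    const file = path.join(getTempDir(), `image-${index}.${ext}`);
    fs.writeFileSync(file, Buffer.from(payload, "base64"));
    return { type: "image", path: file };
  });
}

// Runs the grader callback with materialized content; the temp dir is created
// only if an image needs it and is removed in the finally block.
function withMaterializedContent<T>(
  content: string | InputContent[],
  run: (payload: string | GraderContent[]) => T,
): T {
  const temp: { dir?: string } = {};
  const getTempDir = (): string => {
    if (temp.dir === undefined) {
      temp.dir = fs.mkdtempSync(path.join(os.tmpdir(), "grader-img-"));
    }
    return temp.dir;
  };
  try {
    return run(materialize(content, getTempDir));
  } finally {
    if (temp.dir !== undefined) fs.rmSync(temp.dir, { recursive: true, force: true });
  }
}
```

Creating the directory lazily means text-only payloads never touch the filesystem, and the `finally` block ensures cleanup even when the grader callback throws.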

christso and others added 3 commits March 29, 2026 04:34

christso merged commit 468ff01 into main Mar 29, 2026
1 of 2 checks passed

christso deleted the feat/821-code-grader-mm branch March 29, 2026 04:35

christso merged 3 commits into main from feat/821-code-grader-mm
