feat(pipeline): agent-mode artifacts align with CLI-mode schema (#796)
* feat(pipeline): agent-mode artifacts align with CLI-mode schema
- pipeline run/input: --out now optional, defaults to .agentv/results/runs/eval_<timestamp>
- pipeline bench: index.jsonl now includes scores[], execution_status, response_path to match CLI-mode dashboard schema
- results validate: new command to check run dir naming, index.jsonl fields, artifact presence, and score bounds
- skill: update agent-mode workflow docs to use default --out, add validate step, clarify llm_scores.json -> index.jsonl flow, and note that a user-stated mode overrides .env
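
The index.jsonl checks listed above could look roughly like the following. This is a minimal sketch, not the code in validate.ts: the field names (scores, execution_status, response_path, and the per-score name, type, score, verdict) come from this commit message, while the function name validateIndexLine, the value types, and the [0, 1] score range are assumptions made for illustration.

```typescript
// Hypothetical sketch: field names are taken from the commit message;
// value types and the [0, 1] score range are assumptions.
interface ValidationResult {
  errors: string[];
  warnings: string[];
}

const KNOWN_STATUSES = new Set(["ok", "execution_error"]);

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Validates one line of index.jsonl and collects errors and warnings.
function validateIndexLine(line: string): ValidationResult {
  const result: ValidationResult = { errors: [], warnings: [] };

  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    result.errors.push("line is not valid JSON");
    return result;
  }
  if (!isRecord(parsed)) {
    result.errors.push("entry is not a JSON object");
    return result;
  }

  const scores = parsed["scores"];
  if (!Array.isArray(scores)) {
    result.errors.push("scores must be an array");
  } else {
    scores.forEach((entry: unknown, i: number) => {
      if (!isRecord(entry)) {
        result.errors.push(`scores[${i}] must be an object`);
        return;
      }
      for (const key of ["name", "type", "verdict"]) {
        if (typeof entry[key] !== "string") {
          result.errors.push(`scores[${i}].${key} must be a string`);
        }
      }
      const score = entry["score"];
      // Assumed bounds: scores are expected to lie in [0, 1].
      if (typeof score !== "number" || score < 0 || score > 1) {
        result.errors.push(`scores[${i}].score must be a number in [0, 1]`);
      }
    });
  }

  const status = parsed["execution_status"];
  if (typeof status !== "string") {
    result.errors.push("execution_status must be a string");
  } else if (!KNOWN_STATUSES.has(status)) {
    // Unknown values are reported as warnings, not errors.
    result.warnings.push(`unknown execution_status: ${status}`);
  }

  // The key must always be present; null is an allowed value.
  const responsePath = parsed["response_path"];
  if (
    !("response_path" in parsed) ||
    (responsePath !== null && typeof responsePath !== "string")
  ) {
    result.errors.push("response_path must be a string or null");
  }

  return result;
}
```

Returning warnings separately from errors lets a caller fail a run on malformed entries while still surfacing unrecognised status values.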
* fix: address code review issues for pipeline artifact alignment
1. execution_status: run.ts now writes status into timing.json ('ok' or
'execution_error'), bench.ts reads it back instead of hardcoding 'ok'
2. response_path: use null instead of undefined so the field is always
present in index.jsonl
3. --workers concurrency: implement actual concurrency limiter using
Promise.race instead of unbounded Promise.all
4. validate.ts: validate scores[] entry structure (name, type, score,
verdict) and warn on unknown execution_status values
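
For item 3, a bounded-concurrency loop built on Promise.race might be sketched as follows. This illustrates the general technique under stated assumptions and is not the code in run.ts: the helper name runWithLimit, its signature, and its error handling (stop launching new tasks after the first failure, then rethrow) are hypothetical.

```typescript
// Hypothetical helper: runs `task` over `items` with at most `workers`
// tasks in flight, instead of starting everything via Promise.all.
async function runWithLimit<T, R>(
  items: readonly T[],
  workers: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const limit = Math.max(1, Math.floor(workers));
  const results = new Array<R>(items.length);
  const inFlight = new Set<Promise<void>>();
  let failed = false;
  let firstError: unknown;

  for (let i = 0; i < items.length; i++) {
    const p: Promise<void> = task(items[i])
      .then(
        (value) => {
          results[i] = value; // keep results in input order
        },
        (err) => {
          // Record only the first failure; tracked promises never reject.
          if (!failed) {
            failed = true;
            firstError = err;
          }
        },
      )
      .finally(() => {
        inFlight.delete(p);
      });
    inFlight.add(p);

    // At capacity: wait until any one in-flight task settles.
    if (inFlight.size >= limit) {
      await Promise.race(inFlight);
    }
    if (failed) break; // assumed policy: do not start new work after a failure
  }

  // Drain whatever is still running, then surface the first failure.
  await Promise.all(inFlight);
  if (failed) throw firstError;
  return results;
}
```

A call such as `runWithLimit(testCases, workers, runOne)` (with `testCases` and `runOne` standing in for whatever the pipeline iterates over) would then start at most `workers` tasks at a time.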
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>