feat(web): pipeline filters, drift KPIs, and trace prompt inspector #37
Draft
cursor[bot] wants to merge 1 commit into
- Add InsightFilterCard to agent stats and pipeline timing pages
- Give preset leaderboard full filter parity with model leaderboard
- Show all three confidence drift means on the overview dashboard
- Expose LLM request prompts and parse retries in run trace steps

Co-authored-by: Eamon Boyle <eamonboyle@users.noreply.github.com>
Summary
Adds four product improvements to the LLM debate research dashboard, grounded in existing patterns and data already present in run artifacts.
Features
1. Pipeline insight filters (Agents + Timing)
- `/agents` and `/timing` now use `InsightFilterCard` with the same search/model/preset/fast/date filters as other insight pages
2. Preset leaderboard filter parity
- `/presets` upgraded from fast-mode-only filtering to full `InsightFilterCard` support via `applyIndexFilters`
3. Overview confidence drift KPIs
4. Trace prompt & retry inspector
- `request` and `rawAttempts`
Testing
- `pnpm test`: 172 tests passed
- `pnpm typecheck`: passed
- `pnpm web:typecheck`: passed
- `pnpm web:build`: passed
- `pnpm format:check`: passed
Manual verification
- `/agents` or `/timing`: apply model/preset filters; table counts should update
- `/presets`: compare filter behavior with `/leaderboard`
- `/` with analysis index loaded: confirm three drift KPI cards
- `/runs/[id]`: expand "LLM request (prompt)" on agent steps
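For reviewers checking the filter parity described under Features, the sketch below shows one plausible way a shared search/model/preset/fast/date filter could behave. It is not the repo's `applyIndexFilters`: the `RunIndexEntry` and `InsightFilters` shapes, their field names, and `filterRunsSketch` are all hypothetical and exist only to illustrate the expected semantics (every active filter must match; unset filters are ignored).

```typescript
// Hypothetical index row; field names are illustrative, not taken from the repo.
interface RunIndexEntry {
  id: string;
  title: string;
  model: string;
  preset: string;
  fast: boolean;
  createdAt: string; // ISO 8601 timestamp
}

// Hypothetical filter state; an undefined field means "no constraint".
interface InsightFilters {
  search?: string;
  model?: string;
  preset?: string;
  fast?: boolean;
  from?: string; // inclusive lower bound, ISO 8601
  to?: string; // inclusive upper bound, ISO 8601
}

// Assumed semantics: filters combine with AND, and search is a
// case-insensitive substring match on the title.
function filterRunsSketch(
  runs: RunIndexEntry[],
  f: InsightFilters,
): RunIndexEntry[] {
  const needle = f.search ? f.search.trim().toLowerCase() : "";
  const from = f.from ? Date.parse(f.from) : undefined;
  const to = f.to ? Date.parse(f.to) : undefined;

  return runs.filter((run) => {
    if (needle && !run.title.toLowerCase().includes(needle)) return false;
    if (f.model && run.model !== f.model) return false;
    if (f.preset && run.preset !== f.preset) return false;
    if (f.fast !== undefined && run.fast !== f.fast) return false;
    const t = Date.parse(run.createdAt);
    if (from !== undefined && t < from) return false;
    if (to !== undefined && t > to) return false;
    return true;
  });
}
```

If the real helper follows similar semantics, applying the same filter state on `/presets` and `/leaderboard` should narrow both tables to the same underlying runs, which is what the manual verification steps are meant to confirm.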