feat(web): pipeline filters, drift KPIs, and trace prompt inspector#37

Draft
cursor[bot] wants to merge 1 commit into main from cursor/product-feature-opportunities-1538

@cursor cursor Bot commented Jun 12, 2026

Contributor

Summary

Adds four product improvements to the LLM debate research dashboard, grounded in existing patterns and data already present in run artifacts.

Features

1. Pipeline insight filters (Agents + Timing)

  • /agents and /timing now use InsightFilterCard with the same search/model/preset/fast/date filters as other insight pages
  • Stats and timing tables respect the filtered run subset, with clear empty states when filters match nothing
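A minimal sketch of how a shared filter state could narrow the run subset that the stats and timing tables read from. The `RunSummary` and `InsightFilters` shapes and the `filterRunsSketch` function are hypothetical names for illustration; they are not the project's actual types or its `applyIndexFilters` implementation.

```typescript
// Hypothetical shapes: the real index entries and filter state may differ.
interface RunSummary {
  id: string;
  question: string;
  model: string;
  preset: string;
  fast: boolean;
  createdAt: string; // ISO 8601 timestamp, e.g. "2026-06-12T08:08:00Z"
}

interface InsightFilters {
  search?: string;
  model?: string;
  preset?: string;
  fast?: boolean;
  from?: string; // inclusive day, "YYYY-MM-DD"
  to?: string; // inclusive day, "YYYY-MM-DD"
}

// Returns only the runs that satisfy every filter that is set.
// Unset filters are ignored, so an empty filter object keeps all runs.
function filterRunsSketch(runs: RunSummary[], f: InsightFilters): RunSummary[] {
  const needle = f.search?.trim().toLowerCase();
  return runs.filter((r) => {
    if (needle && !r.question.toLowerCase().includes(needle)) return false;
    if (f.model && r.model !== f.model) return false;
    if (f.preset && r.preset !== f.preset) return false;
    if (f.fast !== undefined && r.fast !== f.fast) return false;
    // ISO dates compare correctly as strings once trimmed to the day.
    const day = r.createdAt.slice(0, 10);
    if (f.from && day < f.from) return false;
    if (f.to && day > f.to) return false;
    return true;
  });
}
```

A page built this way would compute its table rows from the filtered array and render its empty state when that array has length zero.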

2. Preset leaderboard filter parity

  • /presets upgraded from fast-mode-only filtering to full InsightFilterCard support via applyIndexFilters
  • "Filter runs" links preserve active filter context (matching model leaderboard behavior)
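One way a "Filter runs" link could carry the active filter context is to merge the current filters with the row-specific value and serialize the result into the query string. This is only a sketch: `buildFilterRunsHref`, the `/runs` base path, and the parameter names are assumptions, not the PR's actual routing code.

```typescript
// Hypothetical filter bag: keys map to query parameter names.
type FilterParams = Record<string, string | boolean | undefined>;

// Merges the active filters with a per-row override (for example the preset
// of the clicked leaderboard row) and serializes them as a query string.
// Keys are sorted so the same filter state always yields the same URL.
function buildFilterRunsHref(
  basePath: string,
  active: FilterParams,
  override: FilterParams,
): string {
  const merged: FilterParams = { ...active, ...override };
  const pairs = Object.keys(merged)
    .sort()
    .filter((k) => merged[k] !== undefined && merged[k] !== "")
    .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(String(merged[k]))}`);
  return pairs.length > 0 ? `${basePath}?${pairs.join("&")}` : basePath;
}
```

For example, with an active model filter and fast mode on, clicking a row for a hypothetical `balanced` preset would link to `/runs?fast=true&model=model-a&preset=balanced`, so the runs list opens with the same context the leaderboard was showing.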

3. Overview confidence drift KPIs

  • Overview dashboard now shows all three aggregate confidence drift means: solver→revision, revision→synth, and calibrated−synth
  • Reorganizes KPI rows to avoid duplication and surface evidence-plan risk alongside drift metrics
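The three drift means can be sketched as averages of per-run confidence differences. The field names below are hypothetical, and the sign convention for the arrow metrics (later stage minus earlier stage) is an assumption; only calibrated−synth has its direction stated in the description above.

```typescript
// Hypothetical per-run confidence values in [0, 1]; any stage may be missing.
interface RunConfidence {
  solver?: number;
  revision?: number;
  synth?: number;
  calibrated?: number;
}

type Stage = keyof RunConfidence;

// Mean of (to - from) over the runs that have both values.
// Returns null when no run has both, so the KPI card can show a placeholder.
function meanDelta(runs: RunConfidence[], from: Stage, to: Stage): number | null {
  let sum = 0;
  let count = 0;
  for (const run of runs) {
    const a = run[from];
    const b = run[to];
    if (typeof a === "number" && typeof b === "number") {
      sum += b - a;
      count += 1;
    }
  }
  return count === 0 ? null : sum / count;
}

function driftKpis(runs: RunConfidence[]) {
  return {
    solverToRevision: meanDelta(runs, "solver", "revision"),
    revisionToSynth: meanDelta(runs, "revision", "synth"),
    // calibrated − synth: "from" is synth and "to" is calibrated.
    calibratedMinusSynth: meanDelta(runs, "synth", "calibrated"),
  };
}
```

Skipping runs with a missing stage, rather than treating the gap as zero, keeps a partially recorded run from pulling a mean toward zero.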

4. Trace prompt & retry inspector

  • Run trace steps expose collapsible LLM request (prompt) and Parse retries panels when artifact steps include request and rawAttempts
  • Human-readable summary of model, temperature, schema, and message previews before raw JSON
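As an illustration of the inspector's summary step, the sketch below derives the human-readable lines from a trace step and decides which panels to show. The `TraceStep` shape, including `schemaName` and the `messages` array, is hypothetical; the description only confirms that steps may carry `request` and `rawAttempts`. Showing each panel independently when its field is present is also an assumption.

```typescript
// Hypothetical artifact shapes; only `request` and `rawAttempts` are named in the PR.
interface ChatMessage {
  role: string;
  content: string;
}

interface StepRequest {
  model?: string;
  temperature?: number;
  schemaName?: string;
  messages?: ChatMessage[];
}

interface TraceStep {
  request?: StepRequest;
  rawAttempts?: string[];
}

// Collapses whitespace and truncates long message bodies for the summary view.
function preview(text: string, max = 80): string {
  const flat = text.replace(/\s+/g, " ").trim();
  return flat.length > max ? flat.slice(0, max - 1) + "…" : flat;
}

function summarizeStep(step: TraceStep) {
  const lines: string[] = [];
  const req = step.request;
  if (req) {
    if (req.model !== undefined) lines.push(`model: ${req.model}`);
    // Compare against undefined so a temperature of 0 is still shown.
    if (req.temperature !== undefined) lines.push(`temperature: ${req.temperature}`);
    if (req.schemaName !== undefined) lines.push(`schema: ${req.schemaName}`);
    for (const m of req.messages ?? []) lines.push(`[${m.role}] ${preview(m.content)}`);
  }
  return {
    showRequest: req !== undefined,
    showRetries: (step.rawAttempts?.length ?? 0) > 0,
    lines,
  };
}
```

The summary lines would sit above the raw JSON inside the collapsible panel, so a reader can scan the model and prompt without expanding the full payload.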

Testing

  • pnpm test — 172 tests passed
  • pnpm typecheck — passed
  • pnpm web:typecheck — passed
  • pnpm web:build — passed
  • pnpm format:check — passed

Manual verification

  1. Visit /agents or /timing and apply model/preset filters — table counts should update
  2. Visit /presets and compare filter behavior with /leaderboard
  3. Visit / with analysis index loaded — confirm three drift KPI cards
  4. Open any run trace (/runs/[id]) and expand "LLM request (prompt)" on agent steps

- Add InsightFilterCard to agent stats and pipeline timing pages
- Give preset leaderboard full filter parity with model leaderboard
- Show all three confidence drift means on the overview dashboard
- Expose LLM request prompts and parse retries in run trace steps

Co-authored-by: Eamon Boyle <eamonboyle@users.noreply.github.com>
vercel Bot commented Jun 12, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| llm-debate-research | Ready | Preview, Comment | Jun 12, 2026 8:08am |
