Add benchmark aggregates, activity sort, agent filters, and search facets by cursor[bot] · Pull Request #38 · eamonboyle/llm-debate-engine

cursor · 2026-06-13T08:07:55Z

Summary

Implements four product improvements grounded in existing artifact data and UI patterns for the LLM Debate Research dashboard.

Features

1. Benchmark run-level aggregates (detail page)

Surfaces payload.summary metrics that were stored in artifacts but not shown in the UI:

Consensus mean ± stddev
Critique max severity mean ± stddev
Stability mean, stddev, and min–max range
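
As an illustration of how the aggregates listed above could be derived and rendered, here is a minimal sketch. The `Aggregate` shape and the function names are hypothetical and are not taken from the PR, which reads the values from the existing payload.summary. Population standard deviation is an assumption; the artifacts may use the sample form.

```typescript
// Hypothetical shape for one aggregated metric; the real payload.summary
// field names may differ.
interface Aggregate {
  mean: number;
  stddev: number;
  min: number;
  max: number;
}

// Computes mean, population standard deviation, and range for a
// non-empty list of per-run values.
function aggregate(values: number[]): Aggregate {
  if (values.length === 0) {
    throw new Error("aggregate() needs at least one value");
  }
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / values.length;
  return {
    mean,
    stddev: Math.sqrt(variance),
    min: Math.min(...values),
    max: Math.max(...values),
  };
}

// Renders "mean ± stddev" for display.
function formatMeanStddev(a: Aggregate, digits: number = 2): string {
  return `${a.mean.toFixed(digits)} ± ${a.stddev.toFixed(digits)}`;
}

// Renders "min–max" for display.
function formatRange(a: Aggregate, digits: number = 2): string {
  return `${a.min.toFixed(digits)}–${a.max.toFixed(digits)}`;
}
```

Keeping the formatting separate from the computation lets the detail page show precomputed summary values without recalculating them.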

2. Activity feed sort

Adds newest first / oldest first sort to /activity and /api/activity, preserving sort in pagination and exports.
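
The sort behavior described above might look roughly like the following sketch. Only the `sort=oldest` query value appears in this PR; the `ActivityItem` shape, the `page` parameter, the function names, and the choice to omit `sort` from the URL when it is the default are assumptions made for illustration.

```typescript
type SortOrder = "newest" | "oldest";

// Hypothetical activity record; the real artifact shape may differ.
interface ActivityItem {
  id: string;
  createdAt: string; // ISO 8601 timestamp
}

// Any value other than "oldest" falls back to the default, newest first.
function parseSort(raw: string | null): SortOrder {
  return raw === "oldest" ? "oldest" : "newest";
}

// Returns a sorted copy so the caller's array is left untouched.
function sortActivity(items: ActivityItem[], order: SortOrder): ActivityItem[] {
  const direction = order === "oldest" ? 1 : -1;
  return [...items].sort(
    (a, b) => direction * (Date.parse(a.createdAt) - Date.parse(b.createdAt)),
  );
}

// Builds a pagination link that carries the sort order forward.
function pageHref(basePath: string, page: number, order: SortOrder): string {
  const params = new URLSearchParams({ page: String(page) });
  if (order !== "newest") {
    params.set("sort", order);
  }
  return `${basePath}?${params.toString()}`;
}
```

Routing pagination and export links through one helper such as `pageHref` is one way to keep the chosen order from being dropped between pages.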

3. Agent pipeline stats filters

Scopes /agents stats to filtered runs using the same model/preset/fast/date filters as the runs and activity pages. Includes an empty state for when the filters match nothing and a link to browse the matching runs.
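
A rough sketch of how stats could be scoped to the filtered runs follows. The `Run` and `RunFilters` shapes, the `/runs` path, the `from`/`to` date parameters, and the case-insensitive substring match on model are all assumptions for illustration; the PR does not show the shared filter implementation.

```typescript
// Hypothetical filter and run shapes; the real ones may differ.
interface RunFilters {
  model?: string;
  preset?: string;
  fast?: boolean;
  from?: string; // inclusive ISO 8601 lower bound
  to?: string; // inclusive ISO 8601 upper bound
}

interface Run {
  id: string;
  model: string;
  preset: string;
  fast: boolean;
  createdAt: string;
}

// A run matches when it satisfies every filter that is set.
function matchesFilters(run: Run, f: RunFilters): boolean {
  if (f.model !== undefined && !run.model.toLowerCase().includes(f.model.toLowerCase())) {
    return false;
  }
  if (f.preset !== undefined && run.preset !== f.preset) return false;
  if (f.fast !== undefined && run.fast !== f.fast) return false;
  const time = Date.parse(run.createdAt);
  if (f.from !== undefined && time < Date.parse(f.from)) return false;
  if (f.to !== undefined && time > Date.parse(f.to)) return false;
  return true;
}

// Returns the scoped runs, an empty-state flag, and a link to browse them.
function scopeRuns(runs: Run[], f: RunFilters) {
  const scoped = runs.filter((run) => matchesFilters(run, f));
  const params = new URLSearchParams();
  if (f.model !== undefined) params.set("model", f.model);
  if (f.preset !== undefined) params.set("preset", f.preset);
  if (f.fast !== undefined) params.set("fast", String(f.fast));
  if (f.from !== undefined) params.set("from", f.from);
  if (f.to !== undefined) params.set("to", f.to);
  const query = params.toString();
  return {
    runs: scoped,
    isEmpty: scoped.length === 0,
    runsHref: query === "" ? "/runs" : `/runs?${query}`,
  };
}
```

Agent stats would then be computed from the returned `runs`, with `isEmpty` driving the empty state and `runsHref` the browse link.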

4. Search advanced filters

Extends /search with model, preset, fast mode, and date filters. Supports filter-only searches (no text query required).
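
Filter-only search could be handled along these lines. The `model` and `preset` parameter names come from the manual testing URLs in this PR; the `q`, `fast`, `from`, and `to` names, the `SearchRequest` shape, and both function names are assumptions for illustration.

```typescript
// Hypothetical parsed search request; text may be empty.
interface SearchRequest {
  text: string;
  model?: string;
  preset?: string;
  fast?: boolean;
  from?: string;
  to?: string;
}

// Reads a trimmed, non-empty parameter value, or undefined if absent.
function readParam(params: URLSearchParams, key: string): string | undefined {
  const value = params.get(key);
  if (value === null) return undefined;
  const trimmed = value.trim();
  return trimmed === "" ? undefined : trimmed;
}

function parseSearchParams(params: URLSearchParams): SearchRequest {
  const request: SearchRequest = { text: readParam(params, "q") ?? "" };
  const model = readParam(params, "model");
  if (model !== undefined) request.model = model;
  const preset = readParam(params, "preset");
  if (preset !== undefined) request.preset = preset;
  const fast = readParam(params, "fast");
  if (fast === "true" || fast === "1") request.fast = true;
  if (fast === "false" || fast === "0") request.fast = false;
  const from = readParam(params, "from");
  if (from !== undefined) request.from = from;
  const to = readParam(params, "to");
  if (to !== undefined) request.to = to;
  return request;
}

// A search runs when there is a text query or at least one filter.
function hasSearchCriteria(request: SearchRequest): boolean {
  return (
    request.text !== "" ||
    request.model !== undefined ||
    request.preset !== undefined ||
    request.fast !== undefined ||
    request.from !== undefined ||
    request.to !== undefined
  );
}
```

With this shape, a request such as /search?model=gpt carries an empty `text` but still passes `hasSearchCriteria`, so results can be returned without a text query.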

Testing

pnpm test — 176 tests passed
pnpm typecheck — passed
pnpm web:typecheck — passed
pnpm web:build — passed

Manual testing

Open a benchmark detail page (e.g. /benchmarks/benchmark_1771342676099_703f78c9eaf418) — verify "Run-level aggregates" section
Visit /activity?sort=oldest — confirm chronological order reverses
Visit /agents?preset=research_deep — confirm stats scope to filtered runs
Visit /search?model=gpt or /search?preset=standard — confirm filtered results without text query

…cets

- Show consensus, critique severity, and stability aggregates on benchmark detail
- Add oldest-first sort option to activity feed and API
- Scope agent pipeline stats with model/preset/fast/date filters
- Extend search page with model, preset, fast mode, and date filters

Co-authored-by: Eamon Boyle <eamonboyle@users.noreply.github.com>

vercel · 2026-06-13T08:07:57Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
llm-debate-research	Ready	Preview, Comment	Jun 13, 2026 8:08am

vercel Bot deployed to Preview June 13, 2026 08:08

cursor[bot] wants to merge 1 commit into main from cursor/product-feature-opportunities-408b
