selftune-dev · WellDunDun · Mar 14, 2026 · Mar 12, 2026 · Mar 12, 2026 · Mar 12, 2026
@@ -1,5 +1,5 @@
 node_modules/
-bun.lock
+pnpm-lock.yaml
 *.tsbuildinfo
 .context/
 .claude/worktrees/
@@ -16,6 +16,7 @@ Thumbs.db
 coverage/
 tests/sandbox/results/
 .test-data/
+.playwright-cli/
 
 # Internal business strategy (kept locally, not in public repo)
 docs/strategy/

@@ -18,7 +18,7 @@
 | Auto-Activation | `cli/selftune/hooks/auto-activate.ts`, `cli/selftune/activation-rules.ts` | UserPromptSubmit hook with configurable trigger rules | B |
 | Memory & Context | `cli/selftune/memory/writer.ts` | 3-file evolution memory persistence (~/.selftune/memory/) | B |
 | Enforcement Guardrails | `cli/selftune/hooks/evolution-guard.ts`, `cli/selftune/hooks/skill-change-guard.ts` | PreToolUse hooks blocking unguarded SKILL.md edits | B |
-| Dashboard | `cli/selftune/dashboard.ts`, `cli/selftune/dashboard-server.ts`, `dashboard/` | HTML dashboard + live Bun.serve server with SSE | B |
+| Dashboard | `cli/selftune/dashboard.ts`, `cli/selftune/dashboard-server.ts`, `apps/local-dashboard/` | React SPA dashboard + live Bun.serve server with SQLite-backed v2 API | B |
 | Specialized Agents | `.claude/agents/*.md` | Purpose-built agents (diagnosis, pattern, reviewer, integration) | B |
 | Skill | `skill/` | Agent-facing skill (routing table + workflows + references) | B |
 
@@ -91,8 +91,17 @@ cli/selftune/
 ├── evolution-reviewer.md Review proposed skill evolutions
 └── integration-guide.md  Guide project integration setup
 
-dashboard/                HTML dashboard template
-└── index.html            Skill-health-centric SPA with embedded JSON data
+apps/local-dashboard/     React SPA dashboard (Vite + TypeScript + shadcn/ui)
+├── src/
+│   ├── pages/            Overview + SkillReport routes
+│   ├── components/       Sidebar, skill grid, evidence viewer, evolution timeline
+│   ├── hooks/            useOverview (polling), useSkillReport
+│   └── types.ts          TypeScript interfaces matching v2 API payloads
+├── vite.config.ts        Dev proxy → dashboard-server, build to dist/
+└── package.json          React 19, Tailwind v4, shadcn/ui, recharts
+
+dashboard/                Legacy HTML dashboard (served at /legacy/)
+└── index.html            Original embedded-JSON dashboard (v1 endpoints)
 
 templates/                Settings and config templates
 ├── single-skill-settings.json
@@ -130,7 +139,7 @@ tests/sandbox/
 | Monitoring | `cli/selftune/monitoring/` | `watch.ts` | Post-deploy regression detection | Shared, Evolution/audit |
 | Status | `cli/selftune/` | `status.ts` | Skill health summary CLI | Shared, Monitoring, Evolution/audit |
 | Last | `cli/selftune/` | `last.ts` | Last session insight CLI | Shared only |
-| Dashboard | `cli/selftune/` | `dashboard.ts`, `dashboard-server.ts` | HTML dashboard builder + live SSE server | Shared, Monitoring, Evolution/audit |
+| Dashboard | `cli/selftune/`, `apps/local-dashboard/` | `dashboard.ts`, `dashboard-server.ts`, React SPA | React SPA with SQLite-backed v2 API + legacy HTML builder + live server | Shared, Monitoring, Evolution/audit, LocalDB |
 | Agents | `.claude/agents/` | `diagnosis-analyst.md`, `pattern-analyst.md`, `evolution-reviewer.md`, `integration-guide.md` | Specialized Claude Code agents | Reads log schema + config |
 | Skill | `skill/` | `SKILL.md`, `Workflows/*.md`, `references/*.md`, `settings_snippet.json` | Agent-facing routing, workflows, domain knowledge | Reads log schema + config |
 | Sandbox | `tests/sandbox/` | `run-sandbox.ts`, `fixtures/`, `docker/` | Sandbox test harness and Docker integration tests | All modules (test-only) |

@@ -9,6 +9,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
 
 ### Added
 
+- **Local Dashboard SPA** — React + Vite + TypeScript SPA replacing the legacy embedded-HTML dashboard as the default view
+  - Overview page with KPI cards, skill health grid with status filters, evolution feed, unmatched queries
+  - Per-skill drilldown with usage stats, invocation records, evidence viewer, evolution timeline, pending proposals
+  - Collapsible sidebar navigation listing all skills by health status
+  - shadcn/ui component library with dark/light theme toggle and selftune branding
+  - TanStack Query for data fetching with smart caching, background refetch, and instant back-navigation
+  - 15-second background polling against SQLite-backed v2 API endpoints via TanStack Query `refetchInterval` (SSE was removed — SQLite reads are cheap enough for polling)
+  - New components: `EvidenceViewer`, `EvolutionTimeline`, `ActivityTimeline`, `SkillHealthGrid`, `SectionCards`, `InfoTip`
+  - Glossary tooltips on all metric labels (overview KPI cards, skill report KPI cards) explaining what each metric measures
+  - Tab description tooltips on skill report tabs (Evidence, Invocations, Prompts, Sessions, Pending)
+  - Collapsible lifecycle legend in evolution timeline explaining proposal stages (Created, Validated, Deployed, Rejected, Rolled Back)
+  - Evidence context banner explaining the evidence trail concept
+  - Renamed "Per-Entry Results" to "Individual Test Cases" for clarity
+  - Onboarding flow: full empty-state guide for first-time users (3-step setup), dismissible welcome banner for returning users (localStorage-persisted)
+- **SQLite v2 API endpoints** — `GET /api/v2/overview` and `GET /api/v2/skills/:name` backed by materialized SQLite queries (`getOverviewPayload()`, `getSkillReportPayload()`, `getSkillsList()`)
+- **SQL query optimizations** — Replaced `NOT IN` subqueries with `LEFT JOIN + IS NULL`, moved JS-side dedup to SQL `GROUP BY`, added `LIMIT 200` to unbounded evidence queries
+- **SPA serving from dashboard server** — Built SPA served at `/`, legacy HTML dashboard moved to `/legacy/`
 - **Source-truth-driven pipeline** — Transcripts and rollouts are now the authoritative source; `sync` rebuilds repaired overlays from source data rather than relying solely on hook-time capture
 - **Telemetry contract package** — `@selftune/telemetry-contract` workspace package with canonical schema types, validators, versioning, metadata, and golden fixture tests
 - **Test split** — `make test-fast` / `make test-slow` and `bun run test:fast` / `bun run test:slow` for faster development feedback loop
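
The `NOT IN` to `LEFT JOIN + IS NULL` rewrite listed in the changelog entry above can be sketched as follows. This is a minimal illustration, not the query from `queries.ts`: the table name `evolution_audit`, the column names, the `AuditRow` shape, and the rule that a proposal counts as pending until it is deployed, rejected, or rolled back are all assumptions made for the example.

```typescript
// Hypothetical audit row shape; the real schema in schema.ts may differ.
interface AuditRow {
  proposalId: string;
  action: "created" | "validated" | "deployed" | "rejected" | "rolled_back";
}

// Illustrative SQL only (assumed table and column names). The self-join looks
// for a terminal row per proposal; IS NULL keeps proposals that have none, and
// GROUP BY collapses duplicates in SQL instead of a JS-side Set.
const PENDING_SQL = `
  SELECT a.proposal_id
  FROM evolution_audit AS a
  LEFT JOIN evolution_audit AS t
    ON t.proposal_id = a.proposal_id
   AND t.action IN ('deployed', 'rejected', 'rolled_back')
  WHERE a.action IN ('created', 'validated')
    AND t.proposal_id IS NULL
  GROUP BY a.proposal_id
`;

// Assumption: these actions close a proposal.
const TERMINAL = new Set<AuditRow["action"]>(["deployed", "rejected", "rolled_back"]);

// In-memory equivalent of the anti-join, useful for checking the semantics.
function pendingProposalIds(rows: AuditRow[]): string[] {
  const closed = new Set(
    rows.filter((r) => TERMINAL.has(r.action)).map((r) => r.proposalId),
  );
  const pending = new Set<string>();
  for (const r of rows) {
    if (!TERMINAL.has(r.action) && !closed.has(r.proposalId)) {
      pending.add(r.proposalId); // Set membership mirrors GROUP BY dedup
    }
  }
  return [...pending];
}
```

A proposal with both a `created` and a `validated` row appears once in the result, which is the deduplication the changelog describes moving into `GROUP BY`.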

@@ -87,7 +87,7 @@ A continuous feedback loop that makes your skills learn and adapt. Automatically
 - **Per-stage model control** — `--validation-model`, `--proposal-model`, and `--gate-model` flags give fine-grained control over which model runs each evolution stage.
 - **Auto-activation system** — Hooks detect when selftune should run and suggest actions
 - **Enforcement guardrails** — Blocks SKILL.md edits on monitored skills unless `selftune watch` has been run
-- **Live dashboard server** — `selftune dashboard --serve` with SSE auto-refresh and action buttons
+- **React SPA dashboard** — `selftune dashboard` serves a React SPA with skill health grid, per-skill drilldown, evidence viewer, evolution timeline, dark/light theming, and SQLite-backed v2 API (legacy dashboard at `/legacy/`)
 - **Evolution memory** — Persists context, plans, and decisions across context resets
 - **4 specialized agents** — Diagnosis analyst, pattern analyst, evolution reviewer, integration guide
 - **Sandbox test harness** — Comprehensive automated test coverage, including devcontainer-based LLM testing
@@ -110,7 +110,7 @@ A continuous feedback loop that makes your skills learn and adapt. Automatically
 | `selftune import-skillsbench` | Import external eval corpus from [SkillsBench](https://github.com/benchflow-ai/skillsbench) |
 | `selftune badge --skill <name>` | Generate skill health badge SVG |
 | `selftune watch --skill <name>` | Monitor after deploy. Auto-rollback on regression. |
-| `selftune dashboard` | Open the visual skill health dashboard |
+| `selftune dashboard` | Open the React SPA dashboard (SQLite-backed) |
 | `selftune replay` | Backfill data from existing Claude Code transcripts |
 | `selftune doctor` | Health check: logs, hooks, config, permissions |
 

@@ -10,6 +10,13 @@
   - Hosted badge service at `badge.selftune.dev`
   - CLI `contribute --submit` for sharing skill data
 - Agent-first skill restructure (init command, routing + workflows)
+- Local Dashboard SPA:
+  - React + Vite + TypeScript SPA with shadcn/ui and Tailwind v4
+  - Overview page with KPI cards, skill health grid, evolution feed
+  - Per-skill drilldown with evidence viewer, evolution timeline
+  - SQLite v2 API endpoints (`/api/v2/overview`, `/api/v2/skills/:name`)
+  - Dark/light theme toggle with selftune branding
+  - SPA served at `/`, legacy HTML dashboard at `/legacy/`
 
 ## In Progress
 - Multi-agent sandbox expansion
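
The two v2 endpoint paths listed in the roadmap entry above take the skill name as a path segment, so a client probably wants to encode it before building the URL. The helpers below are a small sketch under that assumption; the function names and the `API_BASE` constant are hypothetical and do not come from the SPA source.

```typescript
// Endpoint paths are taken from the roadmap entry; helper names are hypothetical.
const API_BASE = "/api/v2";

// URL for the overview payload.
function overviewUrl(): string {
  return `${API_BASE}/overview`;
}

// URL for a single skill report. encodeURIComponent keeps names containing
// spaces or slashes from being split into extra path segments.
function skillReportUrl(skillName: string): string {
  return `${API_BASE}/skills/${encodeURIComponent(skillName)}`;
}
```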

@@ -0,0 +1 @@
+dist/
@@ -0,0 +1,75 @@
+# Local Dashboard SPA — Handoff
+
+## Architecture
+
+React SPA built with Vite + TypeScript that consumes the **SQLite-backed v2 API endpoints** from the dashboard server. The server materializes JSONL logs into a local SQLite database (`~/.selftune/selftune.db`) and serves pre-aggregated query results.
+
+### Data flow
+
+```text
+JSONL logs → materializeIncremental() → SQLite → getOverviewPayload() / getSkillReportPayload() → /api/v2/* → SPA
+```
+
+## What is implemented
+
+- **Two routes**:
+  - `/` — Overview with KPI section cards (with info tooltips), skill health grid with status filters (healthy/warning/critical/unknown), evolution feed (ActivityTimeline), unmatched queries, onboarding banner (dismissible, localStorage-persisted)
+  - `/skills/:name` — Per-skill drilldown with usage stats (with info tooltips), invocation records, EvidenceViewer (collapsible evidence entries with markdown rendering, context banner), EvolutionTimeline (vertical timeline with pass-rate deltas, lifecycle legend), pending proposals, tab descriptions via hover tooltips
+- **UX helpers**: `InfoTip` component for glossary tooltips on all metrics, lifecycle legend in evolution timeline, evidence context banner, onboarding flow for first-time users
+- **Data layer**: TanStack Query (`@tanstack/react-query`) with smart caching, fetching from v2 endpoints backed by SQLite materialized queries
+  - `GET /api/v2/overview` — combined `getOverviewPayload()` + `getSkillsList()`
+  - `GET /api/v2/skills/:name` — `getSkillReportPayload()` + evolution audit + pending proposals
+- **Live updates**: 15-second polling interval via TanStack Query `refetchInterval` (replaced old SSE approach)
+- **Caching**: `staleTime` of 10s (overview) / 30s (skill report) for instant back-navigation; `gcTime` of 5 minutes; automatic background refetch on window focus
+- **Loading/error/empty/not-found states** on every route
+- **UI framework**: shadcn/ui components with dark/light theme toggle, TanStack Table for data grids
+- **Design**: selftune branding, collapsible sidebar, Tailwind v4
+
+## How to run
+
+```bash
+# Terminal 1: Start the dashboard server
+selftune dashboard --port 7888
+
+# Terminal 2: Start the SPA dev server (proxies /api to port 7888)
+cd apps/local-dashboard
+bun install
+bunx vite
+# → opens at http://localhost:5199
+```
+
+## What was rebased / changed
+
+- **SPA types**: Rewritten to match `queries.ts` payload shapes (`OverviewResponse`, `SkillReportResponse`, `SkillSummary`, `EvidenceEntry`)
+- **API layer**: Now calls `/api/v2/overview` and `/api/v2/skills/:name` instead of `/api/data` + `/api/evaluations/:name`
+- **SSE removed**: Replaced with 15s polling (SQLite reads are cheap; SSE was complex)
+- **Overview page**: Uses `SkillSummary[]` from `getSkillsList()` for skill cards (pre-aggregated pass rate, check count, sessions)
+- **Skill report page**: Single fetch to the v2 endpoint instead of parallel overview + evaluations fetches. Shows evidence entries and evolution audit history per skill
+- **Hooks**: Migrated to TanStack Query — `useOverview` uses `useQuery` with `refetchInterval`, `useSkillReport` uses `useQuery` with smart retry (skips retry on 404). Manual polling, request deduplication, and stale-request guards replaced by TanStack Query built-ins.
+
+## Query optimizations
+
+- **Pending proposals**: Replaced `NOT IN` subquery + JS `Set` dedup with `LEFT JOIN + IS NULL + GROUP BY` in both `queries.ts` and `dashboard-server.ts`
+- **Evidence query bounded**: Added `LIMIT 200` to `getSkillReportPayload()` evidence query (was unbounded)
+- **Indexes**: 16 indexes defined in `schema.ts` covering all frequent filter/join columns (`skill_name`, `session_id`, `proposal_id`, `timestamp`, `query+triggered`)
+
+## What now uses SQLite / materialized queries
+
+- **Overview**: `getOverviewPayload(db)` for evolution, unmatched queries, pending proposals, counts; `getSkillsList(db)` for per-skill aggregated stats
+- **Skill report**: `getSkillReportPayload(db, skillName)` for usage stats, recent invocations, evidence; direct SQL for evolution audit + pending proposals per skill
+- **Server**: `materializeIncremental(db)` runs at startup and re-runs when a v2 endpoint is accessed, at most once every 15s
+
+## What still depends on old dashboard code
+
+- The old v1 endpoints (`/api/data`, `/api/events`, `/api/evaluations/:name`) still work and are used by the legacy `dashboard/index.html`
+- Badge endpoints (`/badge/:name`) and report HTML endpoints (`/report/:name`) use the old `computeStatus` + JSONL reader path
+- Action endpoints (`/api/actions/*`) are unchanged
+
+## What remains before this can become default
+
+1. ~~**Serve built SPA from dashboard-server**~~: Done — `/` serves SPA, old dashboard at `/legacy/`
+2. ~~**Production build**~~: Done — `bun run build:dashboard` in root package.json
+3. **Regression detection**: The SQLite layer doesn't compute regression detection yet — `deriveStatus()` currently only uses pass rate + check count. Add a `regression_detected` column to skill summaries when the monitoring snapshot computation moves to SQLite.
+4. **Monitoring snapshot migration**: Move `computeMonitoringSnapshot()` logic into the SQLite materializer or a query helper (window sessions, false negative rate, baseline comparison)
+5. **Actions integration**: Wire up watch/evolve/rollback buttons in the SPA to `/api/actions/*`
+6. **Migrate badge/report endpoints**: Switch to SQLite-backed queries
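
The polling, caching, and retry settings described in the handoff doc above can be sketched as plain option objects. The timing values (15s polling, 10s and 30s `staleTime`, 5 minute `gcTime`) come from the doc; the `HttpError` class, the query keys, and the retry cap of 3 are assumptions for illustration. The `(failureCount, error) => boolean` shape is the one TanStack Query accepts for its `retry` option, but nothing here imports the library, so this is not the hooks' actual code.

```typescript
// Hypothetical error type carrying the HTTP status of a failed request.
class HttpError extends Error {
  readonly status: number;
  constructor(status: number) {
    super(`HTTP ${status}`);
    this.status = status;
    this.name = "HttpError";
  }
}

const MAX_RETRIES = 3; // assumption: the doc does not state a retry cap

// Skip retries on 404 (a missing skill will not appear by retrying);
// otherwise retry until the cap is reached.
function shouldRetry(failureCount: number, error: unknown): boolean {
  if (error instanceof HttpError && error.status === 404) return false;
  return failureCount < MAX_RETRIES;
}

// Overview: poll every 15s, treat data as fresh for 10s, keep cache for 5 min.
const overviewQueryOptions = {
  queryKey: ["overview"] as const, // hypothetical key
  refetchInterval: 15_000,
  staleTime: 10_000,
  gcTime: 5 * 60_000,
};

// Skill report: fresh for 30s, same cache lifetime, 404-aware retry.
const skillReportQueryOptions = (skillName: string) => ({
  queryKey: ["skill-report", skillName] as const, // hypothetical key
  staleTime: 30_000,
  gcTime: 5 * 60_000,
  retry: shouldRetry,
});
```

Objects of this shape could be spread into a `useQuery` call together with a `queryFn`; keeping them outside the hook makes the timing values easy to check in one place.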