
Design: Heuristic + LLM Dual-Backend for dashcli suggest

Generated by /office-hours on 2026-03-26
Branch: worktree-dashcli-suggest
Repo: agi-bootstrap/dashcli
Status: APPROVED
Mode: Startup

Problem Statement

dashcli's suggest command currently requires ANTHROPIC_API_KEY — an LLM call is the only path from CSV to dashboard spec. This creates a hard adoption barrier: every data scientist who runs dashcli suggest data.csv without an API key configured hits an error and bounces. For a tool positioned as "the entry point that beats Streamlit," requiring an API key is more config than Streamlit, not less.

The heuristic approach eliminates this barrier: column profiling + deterministic spec generation → a useful dashboard from any CSV in under 100ms, zero config, zero cost, fully offline. The LLM approach remains available as an opt-in flag for users who want semantically richer suggestions.

Demand Evidence

  • Direct observation: data scientists at ByteDance/TikTok are vibe-coding ad-hoc dashboards with inconsistent results
  • The "agent-first" gap between Aeolus (exabyte-scale, human-first) and one-off vibe-coded dashboards is structural
  • Streamlit adoption data: 90%+ of Fortune 50 use it for data apps — zero-config is the proven adoption pattern
  • Landscape research: Quesma's experience building Grafana dashboards with AI+CLI confirms that data profiling is firmly in the "deterministic functions just work" category

Status Quo

Current dashcli suggest: Calls Anthropic API → gets 3-5 YAML specs → writes files to disk. Requires API key. Non-deterministic. Not composable (writes files, not stdout). ~5-15s per run.

Streamlit: streamlit run app.py → instant dashboard. Zero config. But: Python-only, not agent-friendly, no declarative spec.

The gap: No tool offers point-at-CSV → deterministic-YAML-on-stdout that agents can compose.

Target User & Narrowest Wedge

Target user: Data scientists using AI coding tools who need quick dashboards for ad-hoc data without overhead.

Narrowest wedge: dashcli suggest data.csv → single YAML spec on stdout in under 100ms, zero config. The fastest path from data to managed dashboard artifact.

Constraints

  1. Heuristic suggest must work with zero environment variables — no API key, no network
  2. Both backends (heuristic and LLM) must output to stdout for composability
  3. Must not break existing dashcli suggest users — the --ai flag provides the LLM path
  4. Column profiling must handle edge cases: no dates, no measures, special characters in column names
  5. Generated specs must pass validateSpec() (which uses DashboardSpec.safeParse(), per the existing codebase pattern)

Premises

  1. Heuristic suggest is the right default — eliminates the API key barrier, makes suggest a deterministic composable building block. Agents already have LLMs; the tool layer should be fast and predictable.
  2. One command, two backends: dashcli suggest data.csv (heuristic, default) and dashcli suggest data.csv --ai (LLM, opt-in). No separate commands.
  3. Column profiling via type + name regex + value pattern matching is sufficient for the default path — structural basics (KPI + bar + line + table), not semantic analysis.
  4. YAML on stdout for both backends — composability over convenience. Multi-document YAML (--- separators) for --ai mode's multiple specs.
  5. The plan's layout algorithm covers the 80% case — KPIs row 0, bar+line row 1, detail table row 2, with graceful degradation.

Approaches Considered

Approach A: Heuristic-Only Replace

Replace LLM suggest entirely with heuristic algorithm. Simplest diff. Loses LLM capability for power users. Completeness: 6/10.

Approach B: Dual-Backend — Heuristic Default + LLM Opt-in (CHOSEN)

Implement heuristic suggest as default. Refactor LLM suggest behind --ai flag. Both output to stdout. Heuristic outputs 1 spec, --ai outputs multiple specs with YAML document markers. Completeness: 9/10.

Approach C: Layered Profile + Suggest

Split into dashcli profile (JSON) and dashcli suggest (YAML). Most agent-composable but adds API surface and loses instant smart dashboards for humans. Completeness: 7/10.

Recommended Approach

Approach B: Dual-Backend — chosen because it covers both the zero-config adoption path (heuristic default) and the power-user path (LLM via --ai), with consistent stdout output for both.

Architecture

src/suggest.ts  — refactored
  ├── Column profiling (NEW)
  │   ├── profileCsv(csvPath) → ProfileResult
  │   │   classifies columns as date/measure/dimension
  │   │   computes date ranges, dimension cardinalities
  │   └── Types: ProfileResult, ColumnClass
  │
  ├── Heuristic spec generation (NEW)
  │   ├── generateSpec(profile, csvBasename) → DashboardSpec
  │   │   deterministic: dates/dims/measures → KPIs + bar + line + table
  │   └── Grid layout algorithm from plan.md
  │
  ├── LLM spec generation (EXISTING, refactored)
  │   ├── suggestAI() — renamed from suggestDashboards(), returns YAML string on stdout
  │   │   (breaking change: OK since v0.1 pre-release, no external consumers)
  │   └── buildSchemaSummary() — unchanged
  │
  └── Orchestrator
      ├── suggest(csvPath) → string (YAML on stdout)
      └── suggestAI(csvPath) → string (multi-doc YAML on stdout)

src/index.ts — modified
  └── suggest command: parse --ai flag, route to heuristic or LLM backend
      Both: process.stdout.write(result)
      --ai requires ANTHROPIC_API_KEY (error if missing)
      Default: no API key needed

Column Classification Algorithm

For each column in CSV:
  1. date   — sqlType === "TEXT" AND (
              name matches /date|time|month|year|created_at|updated_at|timestamp|_at$/i
              OR first 5 non-null values match /^\d{4}-\d{2}-\d{2}/)
  2. measure — sqlType === "INTEGER" or "REAL"
  3. dimension — everything else (TEXT, not date)

Gates and edge cases:

  • Name-based date check requires TEXT type (prevents INTEGER "year" column misclassification).
  • Value-based date detection samples first 5 non-null values; all must match to classify as date.
  • If a column's first data rows are empty/null, SQLite types it as TEXT. The profiler samples beyond the first row to avoid misclassification — check up to 10 non-null values for type inference.
  • Single-row data: If the CSV has only 1 data row, generate a table-only layout (aggregations over a single value are trivial and confusing).
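The classification rules above can be sketched as a small pure function. This is a hedged illustration, not the actual implementation: `ColumnClass` matches the type named in the architecture, but the function signature and sampling shape are assumptions.

```typescript
// Illustrative sketch of the column classification rules; the signature and
// sampling shape are assumptions, only the rules themselves come from the design.
type ColumnClass = "date" | "measure" | "dimension";

const DATE_NAME_RE = /date|time|month|year|created_at|updated_at|timestamp|_at$/i;
const ISO_DATE_RE = /^\d{4}-\d{2}-\d{2}/;

function classifyColumn(
  name: string,
  sqlType: "TEXT" | "INTEGER" | "REAL",
  sampleValues: string[], // first 5 non-null values
): ColumnClass {
  if (sqlType === "TEXT") {
    // Name-based date check is gated on TEXT, so an INTEGER "year" stays a measure.
    if (DATE_NAME_RE.test(name)) return "date";
    // Value-based date check: all sampled values must look like ISO dates.
    if (sampleValues.length > 0 && sampleValues.every((v) => ISO_DATE_RE.test(v))) {
      return "date";
    }
    return "dimension";
  }
  return "measure"; // INTEGER or REAL
}
```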

Layout Algorithm

Row 0: KPIs — 1 per measure, capped at grid width (3-4 columns)
Row 1: Bar (dim[0] × measure[0]) + Line (date[0] × measure[0])
Row 2: Detail table (all dims + all measures)

Graceful degradation:
  - No dimensions → no bar chart, line spans full width
  - No dates → no line chart, bar spans full width
  - No measures → table only (raw SELECT of dimensions)

Grid columns: Math.max(3, Math.min(measures.length, 4))
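The row layout and its degradation rules can be sketched as follows. The `Widget` shape here is an assumption for illustration, not the real DashboardSpec schema; only the row assignments, the KPI cap, and the full-width fallbacks come from the design.

```typescript
// Sketch of the layout algorithm; Widget is a hypothetical shape, not DashboardSpec.
interface Widget { type: string; row: number; span: "half" | "full" }

function layoutWidgets(dates: string[], dims: string[], measures: string[]): Widget[] {
  const widgets: Widget[] = [];
  // Row 0: one KPI per measure, capped at the grid width.
  const gridCols = Math.max(3, Math.min(measures.length, 4));
  measures.slice(0, gridCols).forEach(() =>
    widgets.push({ type: "kpi", row: 0, span: "half" }));
  // Row 1: bar needs a dimension, line needs a date; a lone chart spans full width.
  const hasBar = dims.length > 0 && measures.length > 0;
  const hasLine = dates.length > 0 && measures.length > 0;
  if (hasBar) widgets.push({ type: "bar", row: 1, span: hasLine ? "half" : "full" });
  if (hasLine) widgets.push({ type: "line", row: 1, span: hasBar ? "half" : "full" });
  // Row 2: detail table is always present.
  widgets.push({ type: "table", row: 2, span: "full" });
  return widgets;
}
```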

Filters

  • First date column → date_range filter (min/max as defaults)
  • Each dimension with ≤15 distinct values → dropdown filter (default: "all")

The profiler stores both the distinct count and (for dimensions with ≤15 distinct values) the actual distinct values, so filter defaults can be populated. The 15-value threshold keeps dropdown menus usable; higher cardinality dimensions are excluded from filters.
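A sketch of the filter derivation, assuming a profile shape along the lines implied above; the field names (`distinctCount`, `values`) and the filter object layout are illustrative, while the first-date rule and the 15-value threshold come from the design.

```typescript
// Hypothetical filter derivation; shapes are assumptions, rules are from the design.
interface DimProfile { name: string; distinctCount: number; values?: string[] }
interface DateProfile { name: string; min: string; max: string }

const DROPDOWN_MAX = 15; // cardinality cap for usable dropdowns

function buildFilters(dateCols: DateProfile[], dims: DimProfile[]): object[] {
  const filters: object[] = [];
  // First date column becomes a date_range filter with min/max as defaults.
  if (dateCols.length > 0) {
    const d = dateCols[0];
    filters.push({ type: "date_range", column: d.name, default: [d.min, d.max] });
  }
  // Low-cardinality dimensions become dropdown filters defaulting to "all".
  for (const dim of dims) {
    if (dim.distinctCount <= DROPDOWN_MAX && dim.values) {
      filters.push({ type: "dropdown", column: dim.name, options: dim.values, default: "all" });
    }
  }
  return filters;
}
```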

Output Format

Heuristic (default): Single YAML document on stdout.

LLM (--ai): Multiple YAML documents separated by --- on stdout. Each document is a complete, valid dashboard spec.
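The two output shapes reduce to a single-document emit and a `---`-joined multi-document emit. In this sketch `stringifyYaml` stands in for the `stringify` export of the existing yaml dependency (called with `lineWidth: 0`); the function names are illustrative.

```typescript
// Sketch of the two output shapes; stringifyYaml stands in for yaml's stringify.
type StringifyFn = (spec: object) => string;

function emitHeuristic(spec: object, stringifyYaml: StringifyFn): string {
  // Heuristic mode: a single YAML document on stdout.
  return stringifyYaml(spec);
}

function emitAI(specs: object[], stringifyYaml: StringifyFn): string {
  // --ai mode: one complete spec per document, separated by --- markers.
  return specs.map(stringifyYaml).join("---\n");
}
```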

Key Implementation Details

  • KPI queries alias result as value (viewer expects row.value)
  • SQL identifiers double-quote escaped: "col".replace(/"/g, '""')
  • WHERE clause: all filter placeholders joined with AND
  • Generated spec validated with validateSpec() (uses DashboardSpec.safeParse()) before output
  • source field: ./basename.csv (relative path)
  • YAML via stringify from existing yaml dependency (lineWidth: 0)
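The escaping and KPI-query details above can be sketched together; the doubled-quote escaping is quoted directly from the design, while `kpiQuery`, the `SUM` aggregate, and the table-name parameter are illustrative.

```typescript
// escId follows the escaping rule from the design; kpiQuery is an illustrative shape.
function escId(id: string): string {
  // Double-quote escaping: embedded quotes are doubled, whole identifier quoted.
  return `"${id.replace(/"/g, '""')}"`;
}

function kpiQuery(measure: string, table: string, filterClauses: string[]): string {
  // The viewer reads row.value, so the aggregate must be aliased as value.
  const where = filterClauses.length > 0 ? ` WHERE ${filterClauses.join(" AND ")}` : "";
  return `SELECT SUM(${escId(measure)}) AS value FROM ${escId(table)}${where}`;
}
```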

Exported API

// NEW — heuristic path
export function profileCsv(csvPath: string): ProfileResult
export function generateSpec(profile: ProfileResult, csvBasename: string): DashboardSpec
export function suggest(csvPath: string): string  // synchronous orchestrator → YAML string

// EXISTING — renamed from suggestDashboards(), refactored to return string instead of writing files
// Breaking change: OK since v0.1 pre-release with no external consumers
export async function suggestAI(csvPath: string, options?: SuggestOptions): Promise<string>

Note: The heuristic suggest() is synchronous (pure computation). In index.ts, call it directly (no await). The --ai path via suggestAI() remains async.
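The index.ts routing described above might look like the following. Backends are injected here purely to keep the sketch self-contained; error handling and CLI parsing are simplified, and only the flag gating and the sync-vs-async split come from the design.

```typescript
// Illustrative routing sketch; real index.ts calls the exports directly.
type Backends = {
  suggest: (csvPath: string) => string;          // sync heuristic path
  suggestAI: (csvPath: string) => Promise<string>; // async LLM path
};

async function runSuggest(csvPath: string, aiFlag: boolean, b: Backends): Promise<string> {
  if (aiFlag) {
    // --ai path requires the API key up front; fail fast if it is missing.
    if (!process.env.ANTHROPIC_API_KEY) {
      throw new Error("--ai requires ANTHROPIC_API_KEY");
    }
    return b.suggestAI(csvPath);
  }
  // Default heuristic path: pure computation, no key, no network.
  return b.suggest(csvPath);
}
```

The caller would pass the result straight to `process.stdout.write`.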

Files Changed

| File | Action | Est. Lines |
| --- | --- | --- |
| src/suggest.ts | Major refactor — add heuristic path, refactor LLM to stdout | ~350-400 |
| src/index.ts | Modify — parse --ai flag, route accordingly | ~10 |
| test/suggest.test.ts | Major rewrite — heuristic tests + refactored LLM tests | ~300 |

Open Questions

  1. Stdin piping for serve: dashcli serve currently takes a file path and uses existsSync(). For dashcli suggest data.csv | dashcli serve - to work, serve needs stdin support. This is a separate feature — defer to a follow-up milestone. For now, users write to a file: dashcli suggest data.csv > spec.yaml && dashcli serve spec.yaml.
  2. Multi-doc YAML piping: Can dashcli serve accept multi-document YAML from stdin, or does it need a new --pick flag to select one spec? (Deferred to implementation.)
  3. JSON data sources: The plan focuses on CSV. The heuristic profiler should also handle JSON via the existing loadDataSource() path — but the profiling logic assumes tabular data. Confirm JSON works with the same column classification.
  4. Future dashcli profile command: Approach C's profile primitive could be added later as a separate command without conflicting with this design. Worth considering post-launch.

Success Criteria

  1. dashcli suggest sample/sales.csv outputs valid YAML to stdout with zero env vars — in under 100ms
  2. dashcli suggest sample/sales.csv --ai outputs multi-doc YAML to stdout (requires API key)
  3. Generated spec passes validateSpec() validation (uses DashboardSpec.safeParse())
  4. All existing tests pass + new heuristic tests (20+ test cases per plan.md)
  5. No regression in LLM suggest quality — same prompts, same schema summary

Distribution Plan

Existing: dashcli is a bun-based CLI installed locally. No new distribution artifact needed.

Dependencies

  • None for heuristic path (pure computation on existing data layer)
  • @anthropic-ai/sdk remains a dependency for --ai flag (already installed)

The Assignment

Run dashcli suggest sample/sales.csv with the heuristic implementation and compare the output to what dashcli suggest sample/sales.csv --ai generates. Show the heuristic output to a data scientist colleague and ask: "Would you use this as a starting point, or would you rather write the spec from scratch?" Their answer tells you whether the heuristic covers the 80% case.

Accepted Expansions (from /autoplan review)

  1. ID/cardinality guard for profiler — columns where cardinality equals row count (e.g., user_id, order_id) should be excluded from measures. Also exclude columns matching /_id$/i or /_key$/i name patterns. Prevents SUM(user_id) nonsense in KPIs.

  2. dashcli profile command — expose profileCsv() as dashcli profile data.csv outputting JSON to stdout. ~20 lines in index.ts. The most agent-aligned primitive — agents can read the profile, reason about it, and generate custom specs.

  3. escId() DRY consolidation — extract the 3 copies (suggest.ts, csv.ts, query.ts) into a shared location. Import from one source.

  4. Determinism test — add success criterion: same CSV input produces identical YAML output across runs.

  5. Label humanization — generate chart labels from column names: total_revenue → Total Revenue.

  6. Time-box note — this milestone should take 1-2 days of CC implementation. If the profiler is still being tweaked after that, scope is creeping.
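Two of the accepted expansions (the ID/cardinality guard and label humanization) are small enough to sketch directly. The regex and the cardinality rule come from expansion #1 and the label example from #5; the helper names are hypothetical.

```typescript
// Sketches of expansions #1 and #5; helper names are illustrative.
const ID_NAME_RE = /_id$|_key$/i;

function isMeasureCandidate(name: string, distinctCount: number, rowCount: number): boolean {
  // Exclude ID-like names and columns whose cardinality equals the row count,
  // so KPIs never emit SUM(user_id)-style nonsense.
  if (ID_NAME_RE.test(name)) return false;
  if (rowCount > 0 && distinctCount === rowCount) return false;
  return true;
}

function humanizeLabel(column: string): string {
  // total_revenue -> Total Revenue
  return column
    .split(/[_\s]+/)
    .filter(Boolean)
    .map((w) => w.charAt(0).toUpperCase() + w.slice(1))
    .join(" ");
}
```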

Updated Files Changed

| File | Action | Est. Lines |
| --- | --- | --- |
| src/suggest.ts | Major refactor — heuristic + LLM refactor + ID guard | ~400 |
| src/index.ts | Modify — --ai flag, profile command, routing | ~30 |
| test/suggest.test.ts | Major rewrite — 38 test cases | ~350 |
| src/utils.ts (new) | Extract shared escId() | ~5 |

Decision Audit Trail

| # | Phase | Decision | Principle | Rationale | Rejected |
| --- | --- | --- | --- | --- | --- |
| 1 | CEO | Accept Approach B (dual-backend) | P6 | Validated in /office-hours | A, C |
| 2 | CEO | Mode: SELECTIVE EXPANSION | P1+P2 | Feature enhancement default | |
| 3 | CEO | Defer dashcli profile as separate cmd | P3 | Separate feature | |
| 4 | CEO | Defer stdin serve | P3 | Separate feature | |
| 5 | CEO | Accept escId DRY fix | P2+P4 | In blast radius, trivial | |
| 6 | CEO | Accept ID/cardinality guard | P1+P5 | Critical gap from subagent | |
| 7 | CEO | Accept dashcli profile command | P1+P2 | ~20 lines, agent-aligned | Override #3 |
| 8 | CEO | Defer dashcli data.csv auto-serve | P3+P6 | Requires stdin serve | TODOS.md |
| 9 | CEO | Defer template approach | P3 | Separate milestone | TODOS.md |
| 10 | CEO | Accept determinism test | P5 | Trivial, validates key claim | |
| 11 | CEO | Accept time-box note | P6 | Prevents scope creep | |
| 12 | CEO | Defer competitive analysis expansion | P3 | Informational | TASTE |
| 13 | Design | Accept label humanization | P5 | Prevents generic chart titles | |
| 14 | Eng | Write 38-test plan | P1 | Complete coverage | |
| 15 | Design | Accept chart title templates | P5 | Prevents generic titles | |
| 16 | Design | Accept error handling spec | P1 | Critical gap from subagent | |
| 17 | Design | Accept stderr success line | P5 | Human feedback w/o stdout | |
| 18 | Design | Exclude all-null columns | P1 | Prevents useless filters | |
| 19 | Eng | deriveTableName DRY fix | P2+P4 | Prevents silent bugs | |
| 20 | Eng | Consider profiler.ts split | P5 | Clean seam for profile cmd | TASTE |
| 21 | Eng | Fix csv.ts multi-row sampling | P1 | Root cause of type misclass | |
| 22 | Eng | Zero measures+dims fallback | P1 | Edge case with no output | |
| 23 | Eng | Cap table columns at ~20 | P3 | Prevents unusable tables | |

What I noticed about how you think

  • You didn't accept the binary choice on Premise 1. When I asked "heuristic or LLM?", you said "help me compare the two" and then chose "both." That's product thinking — refusing to accept a false tradeoff when you can have both with minimal extra cost.
  • You're building v0.1.3.0 of a tool nobody outside your team uses yet, but you're already thinking about adoption barriers (API key friction) and composability (stdout piping). You're building infrastructure, not a demo.
  • You chose the 9/10 completeness option over the 6/10 simple option. You're not cutting corners on a feature that's supposed to be the "entry point that beats Streamlit."

GSTACK REVIEW REPORT

| Review | Trigger | Why | Runs | Status | Findings |
| --- | --- | --- | --- | --- | --- |
| CEO Review | /plan-ceo-review | Scope & strategy | 1 | clean | 5 premises confirmed, 1 expansion accepted |
| CEO Voice | autoplan-voices | Independent challenge | 1 | issues_open | 10 findings (1 critical, 4 high), 5 accepted |
| Design Review | /plan-design-review | UI/UX gaps | 1 | clean | Label humanization accepted, all states covered |
| Eng Review | /plan-eng-review | Architecture & tests | 1 | clean | 38-test plan, no critical gaps |

VERDICT: REVIEWED — 14 auto-decisions logged, 1 taste decision surfaced at gate. Plan ready for implementation with accepted expansions.