
AI Report Builder

Guided, AI-powered reporting workflow that turns messy spreadsheet exports into deterministic, decision-ready reports with visible methodology.

Hero screenshot

What this product does

AI Report Builder is a local-first web app for a common operational workflow:

  1. load a CSV or XLSX export
  2. inspect and repair the data safely
  3. build a semantic understanding of the dataset
  4. capture the reporting outcome in plain English
  5. generate a polished report with deterministic metrics, visuals, and export-ready output

The product is intentionally guided. It is not a blank-canvas BI tool and it does not let an LLM invent numbers.

The problem it solves

Spreadsheet-shaped analysis work is usually fragmented across data cleanup, ad hoc BI work, manual chart selection, and slide rewriting. That fragmentation costs time and erodes trust:

  • messy files hide schema and quality issues
  • teams repeat the same repair and reporting steps every time a new export arrives
  • AI summaries often lose the link to real computed values
  • general-purpose BI tools ask users to build the report instead of describing the answer they need

This repo demonstrates a tighter path from upload to trustworthy report.

Who it is for

  • product managers and operators who need a fast answer from a spreadsheet export
  • analysts who want a cleaner first pass before deeper analysis
  • engineering and data leaders evaluating guided analytics workflows
  • recruiters and hiring managers reviewing product, data, and technical execution depth

Why it stands out

  • Outcome-first workflow: users describe the report they need instead of manually building charts.
  • Deterministic analytics boundary: AI helps with inference, planning, and narrative, but not numeric truth.
  • Reviewable repairs: raw data stays separate from the cleaned working layer.
  • Renderer-owned output: the report is structured, exportable, and presentation-ready by default.
  • Trust visibility: methodology, runtime, caveats, and evidence stay attached to the report.

60-second demo path

  1. Launch the app with Run Me.bat or scripts/run-me.ps1.
  2. Load sample datasets/healthcare_showcase_1000.xlsx.
  3. Approve the recommended repairs in Module 2.
  4. In Module 3, keep the payer question: Which payer-plan combinations are driving the largest balance and denial burden?
  5. Generate the report in Module 4.
  6. Open Trust to show the deterministic boundary and active local models.
  7. Export HTML or PDF.

Alternative EHR demo:

  • dataset: sample datasets/clinical_encounter_demo_500.xlsx
  • question: Which patient segments are experiencing the longest ED waits?

Workflow at a glance

  • Upload: guided entry point for loading a structured dataset
  • Profile / fix: profiling, repair recommendations, and working-layer readiness
  • KPI and insight surface: computed metrics and narrative tied to the selected question
  • Report review: renderer-owned report with export controls and trust rail
  • Export: output ready for handoff or sharing

Core workflows

1. Guided data intake

  • upload or load a structured workbook
  • inspect field roles, null burden, and likely grain
  • keep the raw input intact
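
As a rough illustration of what profiling in this step involves, the sketch below infers a role and null burden for each column of a loaded sheet. The function and role names are ours, not the app's API, and the logic is deliberately simplified.

```python
from typing import Any

def profile_columns(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Summarise each column: inferred role, null share, distinct count."""
    report = []
    for col in rows[0].keys():
        values = [r.get(col) for r in rows]
        present = [v for v in values if v is not None]
        distinct = len(set(present))
        if all(isinstance(v, (int, float)) for v in present):
            role = "measure"
        elif distinct == len(values):      # unique per row: likely the grain
            role = "identifier"
        else:
            role = "dimension"
        report.append({"column": col, "role": role,
                       "null_share": round(1 - len(present) / len(values), 3),
                       "distinct": distinct})
    return report

rows = [
    {"claim_id": "A1", "payer": "Aetna", "balance": 120.0},
    {"claim_id": "A2", "payer": "Cigna", "balance": 75.5},
    {"claim_id": "A3", "payer": None, "balance": 310.0},
]
summary = profile_columns(rows)
```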

2. Safe repair planning

  • propose repairs with rationale
  • separate apply-now actions from review-required decisions
  • build a cleaned working layer without mutating the original file
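
The raw-vs-working-layer split can be sketched like this (illustrative, not the repo's actual code): repairs run on a copy, so the original rows survive for review.

```python
import copy

def apply_repairs(raw_rows, repairs):
    """Return a cleaned working layer; `raw_rows` is never mutated."""
    working = copy.deepcopy(raw_rows)
    for repair in repairs:
        for row in working:
            repair(row)
    return working

def fill_missing_payer(row):
    # Example repair: make a missing payer explicit instead of dropping it.
    if row.get("payer") is None:
        row["payer"] = "UNKNOWN"

raw = [{"payer": "Aetna"}, {"payer": None}]
clean = apply_repairs(raw, [fill_missing_payer])
```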

3. Outcome capture and report planning

  • infer candidate reporting questions from the dataset
  • translate a business question into a structured report contract
  • keep the reporting plan constrained by the semantic model
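
One way to picture that constraint (a hedged sketch; field and class names are ours): an AI-drafted report contract is only accepted if every field it references exists in the semantic model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SemanticModel:
    dimensions: frozenset
    measures: frozenset

@dataclass(frozen=True)
class ReportContract:
    question: str
    group_by: str
    metric: str

def validate_contract(contract: ReportContract, model: SemanticModel) -> ReportContract:
    # Reject any contract that steps outside the semantic model.
    if contract.group_by not in model.dimensions:
        raise ValueError(f"unknown dimension: {contract.group_by}")
    if contract.metric not in model.measures:
        raise ValueError(f"unknown measure: {contract.metric}")
    return contract

model = SemanticModel(dimensions=frozenset({"payer", "plan"}),
                      measures=frozenset({"balance", "denials"}))
contract = validate_contract(
    ReportContract("Which payer-plan combinations drive balance?",
                   group_by="payer", metric="balance"),
    model)
```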

4. Deterministic report generation

  • compute metrics, groupings, and visuals deterministically
  • generate narrative and section framing with AI assistance
  • render export-ready HTML and PDF outputs
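
The deterministic side of this step amounts to plain aggregation code; a minimal sketch (names illustrative) of a grouped total, with sorted output so rendering stays reproducible run-to-run:

```python
from collections import defaultdict

def total_by(rows, dimension, measure):
    """Deterministically total `measure` per value of `dimension`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    # Sort descending by total so report output is stable across runs.
    return dict(sorted(totals.items(), key=lambda kv: -kv[1]))

ledger = [
    {"payer": "Aetna", "balance": 120.0},
    {"payer": "Cigna", "balance": 75.5},
    {"payer": "Aetna", "balance": 310.0},
]
totals = total_by(ledger, "payer", "balance")  # {'Aetna': 430.0, 'Cigna': 75.5}
```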

5. Trust and methodology

  • show formulas, caveats, and runtime/provider metadata
  • keep evidence and report outputs attached to the same session

Architecture and data flow

dataset upload
  -> profiling and repair planning
  -> cleaned working layer
  -> semantic model
  -> report contract drafting
  -> deterministic metric computation
  -> report rendering
  -> trust / methodology / export
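
The flow above can be read as a linear orchestration; in this sketch the stage names mirror the diagram, but the lambda bodies are stand-ins, not the repo's implementation.

```python
def run_pipeline(dataset, stages):
    """Thread one artifact through each stage in order."""
    artifact = dataset
    for stage in stages:
        artifact = stage(artifact)
    return artifact

stages = [
    lambda d: {**d, "profiled": True},        # profiling and repair planning
    lambda d: {**d, "working_layer": True},   # cleaned working layer
    lambda d: {**d, "semantic_model": True},  # semantic model
    lambda d: {**d, "contract": True},        # report contract drafting
    lambda d: {**d, "metrics": True},         # deterministic metric computation
    lambda d: {**d, "report": True},          # rendering, trust, and export
]
result = run_pipeline({"name": "healthcare_showcase_1000.xlsx"}, stages)
```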

Repo layout:

  • frontend/ - Next.js guided workflow UI and Playwright coverage
  • backend/ - FastAPI services, contracts, persistence, and tests
  • packages/contracts/ - shared JSON schemas and generated TypeScript types
  • fixtures/ - acceptance scenarios and deterministic test fixtures
  • sample datasets/ - public-safe demo datasets for the main portfolio flow
  • docs/ - architecture, case study, demo script, and portfolio-facing materials

AI capabilities

AI is real and central here, but it is deliberately bounded.

AI is used for:

  • suggesting repair strategies and deciding what needs review
  • drafting the report contract from the user's question and the semantic model
  • generating narrative wording and reviewer-style polishing for the rendered report

AI is not used for:

  • inventing metrics or chart payloads
  • silently rewriting the raw data
  • bypassing formulas, caveats, or trust metadata
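
One way to read that boundary (our sketch, not the repo's code): the model proposes wording with placeholders, and the renderer substitutes deterministically computed values, so no number ever originates from the model.

```python
def render_narrative(template: str, computed: dict) -> str:
    # str.format_map only fills named placeholders; an unknown key raises
    # KeyError, so narrative cannot cite a value that was never computed.
    return template.format_map(computed)

computed = {"top_payer": "Aetna", "top_balance": 430.0}
draft = "The largest outstanding balance sits with {top_payer} at ${top_balance:,.2f}."
line = render_narrative(draft, computed)
```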

The current local-first demo preset is:

  • Module 2: qwen2.5:14b
  • Module 3: qwen2.5:14b
  • Module 4: qwen2.5:32b

Tech stack

  • Frontend: Next.js, TypeScript, React, Playwright
  • Backend: FastAPI, Pydantic, Python, httpx
  • Contracts: JSON Schema plus generated TypeScript types
  • Storage: local filesystem, SQLite, DuckDB
  • Models: OpenAI or Ollama-backed local models, with launcher-controlled provider selection
  • Charts and output: renderer-owned report views with HTML and PDF export
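
Since SQLite is part of the storage layer, the cleaned working layer can be pictured as a local table queried with plain SQL; the schema below is illustrative, not the repo's actual one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE working_layer (payer TEXT, balance REAL)")
conn.executemany("INSERT INTO working_layer VALUES (?, ?)",
                 [("Aetna", 120.0), ("Cigna", 75.5), ("Aetna", 310.0)])
# Deterministic rollup for the payer-balance story.
result_rows = conn.execute(
    "SELECT payer, SUM(balance) AS total FROM working_layer "
    "GROUP BY payer ORDER BY total DESC").fetchall()
# result_rows == [('Aetna', 430.0), ('Cigna', 75.5)]
```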

Run locally

Windows-first launcher

Preferred path:

Run Me.bat

Or:

powershell -ExecutionPolicy Bypass -File .\scripts\run-me.ps1

The launcher handles:

  • dependency/bootstrap checks
  • backend and frontend startup
  • OpenAI vs local Ollama preset selection
  • local model preflight for the demo presets

Manual setup

powershell -ExecutionPolicy Bypass -File .\scripts\bootstrap.ps1

Backend and frontend:

powershell -ExecutionPolicy Bypass -File .\scripts\start-dev.ps1

Common validation commands:

npm.cmd run contracts:generate
backend\.venv\Scripts\python.exe -m pytest backend\tests -q
npm.cmd --prefix frontend run lint
npm.cmd --prefix frontend run typecheck
npm.cmd --prefix frontend run build
powershell -ExecutionPolicy Bypass -File .\scripts\run-tests.ps1

Demo signoff:

powershell -ExecutionPolicy Bypass -File .\scripts\run-demo-signoff.ps1

Demo data

This repo ships with public-safe healthcare demo workbooks:

  • sample datasets/healthcare_showcase_1000.xlsx
    • best for the payer balance and denial story
  • sample datasets/clinical_encounter_demo_500.xlsx
    • best for the ED wait-time story

Recommended questions:

  • Which payer-plan combinations are driving the largest balance and denial burden?
  • Which patient segments are experiencing the longest ED waits?

The healthcare showcase notes are documented in sample datasets/healthcare_showcase_1000.notes.md.

Product decisions and tradeoffs

  • Guided flow over blank canvas: faster to a report, narrower than a BI workbench.
  • Deterministic metrics over model-generated analytics: higher trust, less open-ended flexibility.
  • Local-first runtime and file storage: easier demoability and auditability, less collaboration depth.
  • Fixed report templates over arbitrary chart authoring: better presentation quality, smaller output surface.

Roadmap

  • broader demo datasets beyond healthcare
  • richer follow-up and refinement flows
  • stronger multi-table semantic modeling
  • deeper evaluation harnesses for prompt and report quality
  • packaging improvements for non-Windows local setup

Supporting docs

See docs/ for the architecture overview, case study, demo script, and portfolio-facing materials.
