timotius-devin/decision-os

Decision OS

AI-powered decision analysis for the uncertain. Turn messy choices into structured, reasoned recommendations.

🤔 "I don't know what to eat."

💼 "Should I switch careers?"

🏠 "Is now the right time to move?"

We all face decisions like these — big and small, urgent and slow. Most of the time, we go with gut feel, ask a friend, or get paralyzed by overthinking.

Decision OS gives you a better way. It's like having a team of analysts in your pocket. Ask a quick question for everyday choices, or run a deep multi-agent analysis when the stakes are high. Every recommendation comes with a confidence score, a paper trail, and the ability to track whether it actually worked out.

Because good decisions shouldn't depend on luck.

Two ways to work with it:

  • Quick chat — type a question like "what should I eat?" and get a conversational Q&A that asks clarifying questions, then produces a recommendation. Streaming responses, no forms to fill.
  • Deep analysis — fill in a structured decision brief (goals, constraints, options, stakeholders, risk tolerance) and fire off a 6-agent pipeline that examines your decision from every angle.

The analysis persists. You can track actual outcomes against recommendations, building a personal decision history over time.

How it works

Quick chat

"What should I eat?"
  → "Any dietary restrictions?"
  → "Vegetarian, no nuts"
  → "What cuisine are you in the mood for?"
  → "Something light"
  → "I recommend a Mediterranean grain bowl with..."

No form. No setup. The agent decides when it has enough context and delivers a recommendation with a confidence score. The full conversation is saved and reviewable.
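One way to picture the chat loop is as a reply that is either another clarifying question or a final recommendation. The sketch below is illustrative only: `QaTurn`, `Message`, and `applyTurn` are hypothetical names, and the 0 to 1 confidence scale is an assumption, not something taken from the project's actual Q&A schema.

```typescript
// Hypothetical reply shape: the agent either asks another question or
// finishes with a recommendation and a confidence score (assumed 0..1).
type QaTurn =
  | { kind: "question"; text: string }
  | { kind: "recommendation"; text: string; confidence: number };

interface Message {
  role: "user" | "assistant";
  content: string;
}

// Append the agent's turn to the transcript and report whether the chat is done.
function applyTurn(history: Message[], turn: QaTurn): { history: Message[]; done: boolean } {
  const next: Message[] = [...history, { role: "assistant", content: turn.text }];
  return { history: next, done: turn.kind === "recommendation" };
}

const transcript: Message[] = [{ role: "user", content: "What should I eat?" }];
const afterQuestion = applyTurn(transcript, { kind: "question", text: "Any dietary restrictions?" });
const afterAnswer = applyTurn(
  [...afterQuestion.history, { role: "user", content: "Vegetarian, no nuts" }],
  { kind: "recommendation", text: "A Mediterranean grain bowl", confidence: 0.8 },
);
```

Because every turn is appended to the same transcript, saving that array is enough to make the conversation reviewable later.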

Deep analysis

A sequential pipeline of six specialized agents, each with a focused role and validated output:

| Agent | Role |
| --- | --- |
| Framing | Clarifies the real decision, surfaces missing context and assumptions |
| Options | Lists all options including overlooked alternatives (always adds "do nothing") |
| Trade-off | Scores options against evaluation criteria (1–5 scale), identifies the strongest |
| Risk | Identifies risks per option with likelihood, impact, and mitigation |
| Devil's Advocate | Challenges your current leaning, flags biases and weak assumptions |
| Memo | Synthesizes everything into a final recommendation with confidence score |

Analysis runs asynchronously — submit and come back. The detail page shows live progress per agent, then a full trace of every agent's output plus a formatted decision memo.

Outcome tracking

After a decision plays out, record whether it went as expected. Over time, this becomes a personal record of how well your analysis holds up.
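As a rough illustration of what such a record could support, the sketch below computes the share of tracked decisions that went as expected. The `OutcomeRecord` shape and the `hitRate` helper are hypothetical and are not taken from the project's outcome schema.

```typescript
// Hypothetical outcome record; the real schema may store different fields.
interface OutcomeRecord {
  decisionId: string;
  wentAsExpected: boolean;
}

// Share of recorded decisions that went as expected, or null with no history yet.
function hitRate(records: OutcomeRecord[]): number | null {
  if (records.length === 0) return null;
  const hits = records.filter((record) => record.wentAsExpected).length;
  return hits / records.length;
}

const sampleHistory: OutcomeRecord[] = [
  { decisionId: "move-city", wentAsExpected: true },
  { decisionId: "switch-careers", wentAsExpected: true },
  { decisionId: "buy-car", wentAsExpected: false },
  { decisionId: "take-course", wentAsExpected: true },
];
const sampleRate = hitRate(sampleHistory); // 3 of 4 went as expected
```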

Design Philosophy

Why not just ask a chatbot?

A single LLM call can give you advice, but it has structural blind spots:

| Problem | What happens | Why it matters |
| --- | --- | --- |
| Single perspective | One model, one prompt, one angle | Misses dimensions you didn't think to ask about |
| Confirmation bias | LLMs tend to agree with your stated leaning | Reinforces what you already believe |
| Shallow coverage | One prompt rarely forces deep examination | Risks, trade-offs, and assumptions get glossed over |
| Unstructured output | Freeform text drifts and contradicts itself | Hard to compare, validate, or act on |

The multi-agent approach

Decision OS splits the work across six specialized agents, each with a focused role, structured output schema, and sequential dependency on the previous agent's work. This mimics how a real decision-making team operates — but runs in minutes, not weeks.

Why six agents specifically?

| Agent | Rationale |
| --- | --- |
| Framing | Most bad decisions stem from answering the wrong question. This agent reframes your problem, surfaces hidden assumptions, and identifies missing context before any analysis begins. |
| Options | Humans naturally generate 2–3 options. This agent forces a wider search and always includes "do nothing" as a baseline, preventing premature convergence. |
| Trade-off | Gut feel favors one option. Scoring every option against explicit criteria (1–5 scale) forces comparable, defensible evaluation. |
| Risk | Optimism bias makes us underestimate downsides. This agent systematically identifies risks per option with likelihood, impact, and mitigation strategies. |
| Devil's Advocate | Confirmation bias makes us fall in love with our first idea. This agent attacks your stated leaning, flags cognitive biases, and tests whether your preference holds up under scrutiny. |
| Memo | Raw analysis is overwhelming. This agent synthesizes everything into a single, actionable recommendation with a confidence score: the final deliverable. |
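To make the Trade-off step concrete, here is a minimal sketch of scoring options on a 1–5 scale and picking the strongest. It assumes an unweighted sum across criteria, which may not match how the actual agent aggregates scores; `ScoredOption`, `totalScore`, and `strongestOption` are hypothetical names, and the example criteria are made up.

```typescript
type Score = 1 | 2 | 3 | 4 | 5;

// Hypothetical shape: one 1-5 score per evaluation criterion.
interface ScoredOption {
  name: string;
  scores: Record<string, Score>;
}

// Unweighted sum of an option's scores (an assumption, see the lead-in).
function totalScore(option: ScoredOption): number {
  return Object.values(option.scores).reduce<number>((sum, score) => sum + score, 0);
}

// Returns the highest-scoring option; on a tie the earlier option wins.
function strongestOption(options: ScoredOption[]): ScoredOption | undefined {
  let best: ScoredOption | undefined;
  for (const option of options) {
    if (best === undefined || totalScore(option) > totalScore(best)) best = option;
  }
  return best;
}

const scored: ScoredOption[] = [
  { name: "Switch careers", scores: { growth: 5, stability: 2, income: 4 } },
  { name: "Retrain part-time", scores: { growth: 4, stability: 3, income: 3 } },
  { name: "Do nothing", scores: { growth: 2, stability: 5, income: 3 } },
];
const winner = strongestOption(scored);
```

Keeping "Do nothing" in the same matrix as the other options is what makes it usable as a baseline: it is scored on the same criteria rather than treated as a default.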

Why sequential?

Each agent builds on the structured output of the previous one. Framing sets the table → Options expands the menu → Trade-off scores them → Risk stress-tests each option → Devil's Advocate attacks the logic → Memo synthesizes everything. Running them in parallel would lose this compounding context.
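The compounding context can be sketched as a loop in which each agent receives everything produced so far. This is not the code in `lib/orchestrator.ts`; the `Agent` and `Context` types, the `runPipeline` function, and the stub agents are hypothetical stand-ins that only record which earlier outputs they could see.

```typescript
type Context = Record<string, unknown>;

// Hypothetical agent interface: a name plus an async runner over the shared context.
interface Agent {
  name: string;
  run: (context: Context) => Promise<unknown>;
}

// Run agents strictly in order, adding each output to the context for the next one.
async function runPipeline(agents: Agent[], brief: Context): Promise<Context> {
  let context: Context = { brief };
  for (const agent of agents) {
    const output = await agent.run(context);
    context = { ...context, [agent.name]: output };
  }
  return context;
}

// Stub agents that simply report which keys were available when they ran.
const stubAgent = (name: string): Agent => ({
  name,
  run: async (context) => Object.keys(context),
});

const agentOrder = ["framing", "options", "tradeoff", "risk", "devilsAdvocate", "memo"];
const finished = runPipeline(agentOrder.map(stubAgent), { question: "Should I switch careers?" });
```

In this sketch the memo stub sees the brief plus all five earlier outputs, which is exactly what a parallel fan-out would not provide.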

Why schema validation?

Every agent output is validated against a Zod schema, so downstream agents receive structured, parseable data rather than freeform text that drifts or contradicts itself. If an agent returns malformed JSON or output that fails validation, the system automatically retries with a repair prompt instead of failing silently.
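The project uses Zod for this. To stay dependency-free, the sketch below uses a hand-written check as a stand-in for a Zod schema, purely to show the idea of rejecting malformed output before it reaches the next agent. The `MemoOutput` fields, the 0 to 1 confidence range, and `parseMemo` are assumptions, not the project's real schema.

```typescript
// Hypothetical memo shape; the project's real Zod schema may differ.
interface MemoOutput {
  recommendation: string;
  confidence: number; // assumed to be in the range 0..1
}

type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

// Parse raw model text and check it against the expected shape.
function parseMemo(raw: string): ParseResult<MemoOutput> {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "reply is not valid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, error: "expected a JSON object" };
  }
  const record = data as Record<string, unknown>;
  const recommendation = record["recommendation"];
  const confidence = record["confidence"];
  if (typeof recommendation !== "string") {
    return { ok: false, error: "recommendation must be a string" };
  }
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) {
    return { ok: false, error: "confidence must be a number between 0 and 1" };
  }
  return { ok: true, value: { recommendation, confidence } };
}

const goodMemo = parseMemo('{"recommendation":"Switch careers","confidence":0.7}');
const missingField = parseMemo('{"recommendation":"Switch careers"}');
const notJson = parseMemo("Sure! Here is my advice...");
```

The error strings matter as much as the pass or fail result, since they are what a repair prompt can quote back to the model.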

Why auto-retry?

LLMs occasionally hallucinate, misformat, or skip required fields. The pipeline detects validation failures and retries with a targeted repair prompt, so a malformed reply can be corrected without human intervention.
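A retry loop of this kind might look like the sketch below. It is not the project's `parseAndValidate` helper: `callWithRepair`, the wording of the repair prompt, and the default of two attempts are all assumptions, and the model is replaced by a fake that returns a bad reply first and a good one second. The validator is expected to throw on invalid input, similar to how a Zod schema's `parse` method behaves.

```typescript
// Call the model, validate the reply, and retry with a repair prompt on failure.
// `validate` must throw an Error describing the problem when the reply is invalid.
async function callWithRepair<T>(
  callModel: (prompt: string) => Promise<string>,
  validate: (raw: string) => T,
  prompt: string,
  maxAttempts = 2, // assumed default; the real retry budget is not documented here
): Promise<T> {
  let currentPrompt = prompt;
  let lastError = "no attempts made";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel(currentPrompt);
    try {
      return validate(raw);
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
      // Quote the validation error back so the model can fix that specific problem.
      currentPrompt = `${prompt}\n\nYour previous reply was rejected: ${lastError}\nReturn only corrected JSON.`;
    }
  }
  throw new Error(`Validation failed after ${maxAttempts} attempts: ${lastError}`);
}

// Fake model for illustration: first reply is malformed, second is valid.
const cannedReplies = ['{"confidence": "high"}', '{"confidence": 0.7}'];
const promptsSeen: string[] = [];
const fakeModel = async (prompt: string): Promise<string> => {
  promptsSeen.push(prompt);
  return cannedReplies[promptsSeen.length - 1] ?? "";
};

const parseConfidence = (raw: string): number => {
  const data = JSON.parse(raw) as { confidence?: unknown };
  const confidence = data.confidence;
  if (typeof confidence !== "number") throw new Error("confidence must be a number");
  return confidence;
};

const repaired = callWithRepair(fakeModel, parseConfidence, "Write the memo as JSON.");
```

Throwing after the last attempt, rather than returning a partial result, is what keeps the failure visible to the caller.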

Architecture

graph TD
    A[Landing Page] --> B[Dashboard]
    B --> C["/decisions/new"]
    C --> D["Simple Chat"]
    C --> E["Advanced Form"]
    D --> F["Q&A Agent<br/>(streaming SSE)"]
    F --> G[Result + Save]
    E --> H[Decision Detail]
    H --> I["Run Analysis"]
    I --> J1["Framing Agent"]
    J1 --> J2["Options Agent"]
    J2 --> J3["Trade-off Agent"]
    J3 --> J4["Risk Agent"]
    J4 --> J5["Devil's Advocate"]
    J5 --> J6["Memo Agent"]
    J6 --> K[Decision Memo]
    K --> L[Outcome Tracking]
    G --> B

Tech stack

| Layer | Technology |
| --- | --- |
| Framework | Next.js 14 (App Router) |
| Language | TypeScript |
| Styling | Tailwind CSS + shadcn/ui |
| Database | SQLite (via Prisma ORM) |
| Validation | Zod |
| LLM Provider | Anthropic Claude (configurable) |
| Streaming | Server-Sent Events |

Getting started

Prerequisites

  • Node.js 18+
  • An Anthropic API key (optional — demo mode works without it)

Setup

# Install dependencies
npm install

# Run database migrations
npx prisma migrate dev

# (Optional) Seed with example data
npm run db:seed

# Start the development server
npm run dev

Open http://localhost:3000.

Configure the LLM

Copy .env.example to .env and add your Anthropic API key:

ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514

The app works without a key — use the View Demo button to see a pre-seeded analysis.
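For readers wiring this up themselves, the following sketch shows one way the two variables could be read, with a fallback to demo mode when no key is set. Whether the app resolves its configuration this way is an assumption; `LlmConfig` and `loadLlmConfig` are hypothetical names, and the key shown is a placeholder.

```typescript
// Hypothetical config shape: live mode needs a key, demo mode needs nothing.
type LlmConfig =
  | { mode: "demo" }
  | { mode: "live"; apiKey: string; model: string | undefined };

// Read the two variables from an env-like object (for example process.env in Node).
function loadLlmConfig(env: Record<string, string | undefined>): LlmConfig {
  const apiKey = env["ANTHROPIC_API_KEY"]?.trim();
  if (!apiKey) return { mode: "demo" };
  return { mode: "live", apiKey, model: env["ANTHROPIC_MODEL"] };
}

const demoConfig = loadLlmConfig({});
const liveConfig = loadLlmConfig({
  ANTHROPIC_API_KEY: "sk-ant-placeholder",
  ANTHROPIC_MODEL: "claude-sonnet-4-20250514",
});
```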

Scripts

| Script | Description |
| --- | --- |
| npm run dev | Start development server |
| npm run build | Production build |
| npm run lint | Run ESLint |
| npm run db:migrate | Run Prisma migrations |
| npm run db:seed | Seed database with examples |

Project structure

decision-os/
├── app/
│   ├── page.tsx                     # Landing page
│   ├── layout.tsx                   # Root layout
│   ├── dashboard/page.tsx           # Decision list
│   ├── decisions/
│   │   ├── new/page.tsx             # Simple/Advanced toggle
│   │   └── [id]/page.tsx            # Detail view (tabs or conversation)
│   └── api/
│       ├── decisions/               # CRUD + analyze + outcome + demo
│       └── qa/chat/                 # SSE streaming endpoint
├── components/
│   ├── ui/                          # shadcn/ui primitives
│   ├── simple-chat.tsx              # Chat interface with SSE reader
│   ├── simple-decision-view.tsx     # Read-only conversation view
│   ├── run-analysis-button.tsx      # Loading state for async analysis
│   ├── decision-form.tsx            # Advanced decision form
│   ├── decision-card.tsx            # Dashboard card with mode badge
│   ├── decision-memo.tsx            # Formatted memo output
│   ├── agent-trace.tsx              # Expandable agent output cards
│   ├── analysis-progress.tsx        # Live pipeline progress
│   ├── analysis-error.tsx           # Error state with retry
│   ├── confidence-badge.tsx         # Percentage badge
│   ├── outcome-form.tsx             # Outcome tracking
│   ├── tradeoff-table.tsx           # Scoring matrix
│   └── landing-*.tsx                # Landing page sections
├── lib/
│   ├── agents/                      # Agent runners (6 + qa)
│   │   ├── framing.ts
│   │   ├── options.ts
│   │   ├── tradeoff.ts
│   │   ├── risk.ts
│   │   ├── devils-advocate.ts
│   │   ├── memo.ts
│   │   ├── qa.ts
│   │   └── helpers.ts               # parseAndValidate with auto-retry
│   ├── llm/
│   │   ├── client.ts                # Anthropic SDK wrapper
│   │   └── prompts.ts               # System prompts for all agents
│   ├── schemas/
│   │   ├── agents.ts                # Zod schemas for 6 agents
│   │   ├── qa.ts                    # Zod schemas for Q&A agent
│   │   ├── decision.ts              # Decision form schema
│   │   └── outcome.ts               # Outcome form schema
│   ├── orchestrator.ts              # Agent pipeline runner
│   └── db.ts                        # Prisma client singleton
├── prisma/
│   ├── schema.prisma
│   └── seed.ts
└── types/
    └── index.ts

Contributors

This project was built by Timotius Devin with support from multiple AI systems:

| Contributor | Version | Role |
| --- | --- | --- |
| Timotius Devin | latest | Product vision, architecture decisions, feature prioritization, quality curation |
| Kimi | k2.6 | Support for system architecture, component structure, debugging, and code review |
| DeepSeek | V4 Flash | Support for code generation, logic refinement, and implementation |
| Qwen | 3.6 Plus | Support for feature planning, documentation, and schema design |

While AI models accelerated development, all architectural decisions, feature prioritization, and quality assurance were directed by human oversight.

Disclaimer

⚠️ Decision OS uses AI to generate all analysis, recommendations, and memos. These outputs are probabilistic suggestions based on the information you provide — they are not facts, guarantees, or professional advice. Important considerations:

  • The quality of analysis depends entirely on the completeness and accuracy of your input
  • The confidence score reflects internal consistency of the analysis, not real-world certainty
  • Always apply your own critical thinking and judgment before acting on any recommendation
  • For high-stakes decisions (financial, legal, medical, career, or safety-related), consult qualified professionals
  • AI models may occasionally hallucinate, misinterpret context, or overlook important factors

Use Decision OS as a thinking partner, not an oracle.


MIT License
