OneCorp Multi-Agent Contract Workflow System

A multi-agent system that shepherds property deals from Expression of Interest (EOI) to executed contract, automating contract validation, email workflows, and SLA monitoring.

Quick Start

Visual Dashboard (Recommended for Demos)

The easiest way to see the system in action is through the visual web dashboard:

# Clone the repository
git clone <repository-url>
cd onecorp-mas

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up API keys (create .env file)
echo "ANTHROPIC_API_KEY=your_anthropic_api_key_here" > .env   # Extractor/Router (Claude Haiku 4.5)
echo "DEEPINFRA_API_KEY=your_deepinfra_api_key_here" >> .env # Auditor/Comms (Qwen3-235B)

# Launch the visual dashboard
python run_ui.py

The dashboard opens automatically in your browser at http://localhost:5000. Click "Start Demo" to watch the multi-agent workflow execute in real time.

Dashboard Features:

Real-time workflow progress visualization
Live agent activity indicators
Contract mismatch detection display
Email generation tracking
SLA monitoring with countdown
State transition timeline
Event log with timestamps

Command Line Interface

For headless execution or debugging:

# Run the full demo
python -m src.main --demo

# Or run step-by-step
python -m src.main --step eoi
python -m src.main --step contract-v1
python -m src.main --step contract-v2

# Test SLA overdue scenario
python -m src.main --test-sla

# Reset database
python -m src.main --reset

# Run all tests
pytest tests/ -v

# Run end-to-end integration test
pytest tests/test_end_to_end.py -v

Note: You'll need an Anthropic API key for the Extractor and Router (Claude Haiku 4.5, model id claude-haiku-4-5), and a DeepInfra API key for the Qwen3-235B-based Auditor and Comms LLM paths.

Project Structure

onecorp-mas/
├── CLAUDE.md              # Instructions for Claude Code
├── AGENTS.md              # Agent architecture guidelines
├── PROGRESS.md            # Task progress tracking
├── QUICKSTART.md          # Quick start guide
├── tasks.md               # Implementation task specifications
├── run_ui.py              # Visual dashboard launcher
├── requirements.txt       # Python dependencies
├── agent_docs/            # Implementation guides for each agent
│   ├── extraction.md      # Extractor agent (PDF parsing, field extraction)
│   ├── comparison.md      # Auditor agent (mismatch detection, risk scoring)
│   ├── emails.md          # Router + Comms agents (classification, generation)
│   ├── state-machine.md   # Orchestrator (states, transitions, SLA)
│   └── testing.md         # Test patterns and fixtures
├── assets/                # Visual assets
│   ├── architecture.svg   # System architecture diagram
│   └── workflow_diagram.jpeg
├── docs/                  # Project documentation
│   ├── INDEX.md           # Documentation index
│   ├── architecture.md    # System design document
│   ├── demo-script.md     # Demo walkthrough guide
│   ├── demo-recording.md  # Recording instructions
│   └── visual-ui-guide.md # UI documentation
├── spec/                  # Problem specification
│   ├── MAS_Brief.md       # Full requirements
│   ├── judging-criteria.md
│   └── transcript.md      # Stakeholder context
├── data/                  # Input data files
│   ├── source-of-truth/   # EOI PDF (source document)
│   ├── contracts/         # Contract PDFs (V1, V2)
│   ├── emails/
│   │   ├── incoming/      # Incoming email files to process
│   │   └── templates/     # Output email templates
│   └── emails_manifest.json
├── ground-truth/          # Expected outputs for validation
│   ├── eoi_extracted.json
│   ├── v1_extracted.json
│   ├── v2_extracted.json
│   ├── v1_mismatches.json
│   └── expected_outputs.json
├── src/                   # Implementation
│   ├── main.py            # Main entry point
│   ├── agents/            # LLM agents
│   │   ├── router.py      # Email classification
│   │   ├── extractor.py   # PDF field extraction
│   │   ├── auditor.py     # Contract comparison
│   │   ├── comms.py       # Email generation
│   │   └── prompts/       # Agent prompt templates
│   ├── orchestrator/      # Workflow coordination
│   │   ├── state_machine.py
│   │   ├── deal_store.py  # SQLite persistence
│   │   └── sla_monitor.py
│   ├── ui/                # Visual web dashboard
│   │   ├── app.py         # Flask server with SSE
│   │   └── templates/     # HTML dashboard
│   └── utils/             # Shared utilities
│       ├── pdf_parser.py
│       ├── email_parser.py
│       └── date_resolver.py
├── tests/                 # Unit and integration tests
├── n8n/                   # Workflow export (if using n8n)
└── prompts/               # Additional prompt files

Understanding the System

Start here:

spec/MAS_Brief.md — Full problem specification
spec/transcript.md — Stakeholder context
docs/architecture.md — Agent design, interactions, and generalizability

For implementation:

agent_docs/ — Technical guides for building each agent (pattern-based, not hardcoded)

The Challenge

OneCorp processes property contracts through a complex workflow:

EOI signed → Contract received → Validated → Solicitor approved → DocuSign → Executed

Current pain points:

Single shared inbox for all communications
Manual contract checking (slow, error-prone)
Version control across amendments
SLA tracking based on appointment dates

Agent Architecture

Visual Diagram: See assets/architecture.svg for a complete visual representation of agents, data flows, and control flows.

Agent	Responsibility	Model
Router	Email classification, deal mapping	Claude Haiku 4.5
Extractor	PDF field extraction from EOI/contracts	Claude Haiku 4.5
Auditor	Contract vs EOI comparison, risk scoring	Qwen3-235B via DeepInfra
Comms	Email generation (solicitor, vendor, alerts)	Qwen3-235B via DeepInfra

The Orchestrator (non-LLM) manages state transitions and SLA timers.
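
A minimal sketch of how a deterministic Orchestrator could enforce state transitions, assuming the six stages listed under The Challenge are the states. The `DealState` names, the `ALLOWED` table, and `advance` are hypothetical and are not taken from src/orchestrator/state_machine.py.

```python
from enum import Enum


class DealState(Enum):
    # Hypothetical names mirroring the workflow stages described above.
    EOI_SIGNED = "eoi_signed"
    CONTRACT_RECEIVED = "contract_received"
    VALIDATED = "validated"
    SOLICITOR_APPROVED = "solicitor_approved"
    DOCUSIGN = "docusign"
    EXECUTED = "executed"


# Each state lists the states it may move to. A failed validation is
# modelled here as staying in CONTRACT_RECEIVED until a corrected
# version arrives (an assumption, not the documented behaviour).
ALLOWED = {
    DealState.EOI_SIGNED: {DealState.CONTRACT_RECEIVED},
    DealState.CONTRACT_RECEIVED: {DealState.VALIDATED, DealState.CONTRACT_RECEIVED},
    DealState.VALIDATED: {DealState.SOLICITOR_APPROVED},
    DealState.SOLICITOR_APPROVED: {DealState.DOCUSIGN},
    DealState.DOCUSIGN: {DealState.EXECUTED},
    DealState.EXECUTED: set(),
}


def advance(current: DealState, target: DealState) -> DealState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"Invalid transition: {current.name} -> {target.name}")
    return target
```

Keeping the transition table as plain data means the rule "a contract can't go to the solicitor until validated" is checked in one place and does not depend on any LLM output.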

Demo Scenario

See docs/demo-script.md for a complete 3-minute demo walkthrough.

The dataset includes:

1 EOI (source of truth)
2 contracts (V1 with 5 errors, V2 corrected)
8 emails covering the full workflow

Demo flow:

V1 contract → 5 mismatches detected → Discrepancy alert
V2 contract → Validated → Sent to solicitor
Solicitor approves → Vendor release email
DocuSign flow → Contract executed
SLA test → Remove buyer-signed email → Alert fires
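
The mismatch-detection step in the demo flow can be illustrated with a field-by-field comparison. This is only a sketch: the field names and sample values below are invented for illustration and do not come from the demo dataset, and the real Auditor also does risk scoring and semantic comparison that this omits.

```python
def find_mismatches(eoi: dict, contract: dict) -> list[dict]:
    """Compare every EOI field against the contract and list the differences."""
    mismatches = []
    for field, expected in eoi.items():
        actual = contract.get(field)  # None if the contract omits the field
        if actual != expected:
            mismatches.append({"field": field, "eoi": expected, "contract": actual})
    return mismatches


# Illustrative values only; not taken from the demo dataset.
eoi = {"lot_number": "12", "total_price": 500000, "finance_subject": False}
contract_v1 = {"lot_number": "21", "total_price": 500000, "finance_subject": True}

for m in find_mismatches(eoi, contract_v1):
    print(f"{m['field']}: EOI={m['eoi']!r} contract={m['contract']!r}")
```

An empty result would correspond to the V2 path, where the contract is validated and sent to the solicitor.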

Testing

# Run all tests
pytest tests/ -v

# Run specific test
pytest tests/test_comparison.py -v

# With coverage
pytest tests/ --cov=src --cov-report=html

Ground Truth Files (Test Fixtures)

Files in ground-truth/ are test fixtures for the demo dataset, not runtime data:

File	Purpose
`eoi_extracted.json`	Expected Extractor output for demo EOI
`v1_extracted.json`	Expected Extractor output for demo V1 contract
`v2_extracted.json`	Expected Extractor output for demo V2 contract
`v1_mismatches.json`	Expected Auditor output when comparing V1 to EOI
`expected_outputs.json`	Expected emails/states at each workflow step

Important: Agents should use pattern-based logic, not read these files at runtime. The system must work for ANY property deal, not just the demo.

How This System Meets Judging Criteria

This multi-agent system directly addresses the evaluation criteria outlined in spec/judging-criteria.md:

1. System Design & Architecture

Clear agent separation: 4 specialized LLM agents (Router, Extractor, Auditor, Comms) + deterministic Orchestrator
Structured communication: Agents exchange typed data through the Orchestrator (not ad-hoc prompting)
Visual documentation: Complete architecture diagram showing data/control flows
Stable & reliable: Deterministic state machine ensures predictable behavior

2. Collaboration Between Agents

Meaningful multi-agent workflow: Router classifies → Extractor parses → Auditor validates → Comms generates
Data dependencies: Auditor requires both EOI and contract data from Extractor
Coordinated by Orchestrator: State machine enforces workflow rules (e.g., contract can't go to solicitor until validated)
Emergent intelligence: Validation accuracy improves through multi-stage processing (extraction + comparison + severity assessment)

3. Creativity & Innovation

Hybrid classification: Router uses deterministic pattern matching with LLM fallback for ambiguous cases
Confidence scoring: Extractor assigns confidence to fields; low-confidence triggers re-extraction or human review
Version superseding: Automatic contract version management (V2 supersedes V1)
Semantic comparison: Auditor understands negation ("NOT subject to finance" vs "IS subject to finance")
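
The hybrid classification idea above can be sketched as follows. The patterns, labels, and the `llm_fallback` callable are hypothetical placeholders; the actual Router's rules and prompt are not shown in this README.

```python
import re
from typing import Callable, Optional

# Hypothetical patterns and labels; a real Router would use its own taxonomy.
PATTERNS = [
    (re.compile(r"\bdocusign\b", re.IGNORECASE), "docusign_notification"),
    (re.compile(r"\bcontract\b.*\battached\b", re.IGNORECASE), "contract_received"),
    (re.compile(r"\bapprov(ed|al)\b", re.IGNORECASE), "solicitor_approval"),
]


def classify(subject: str, llm_fallback: Optional[Callable[[str], str]] = None) -> str:
    """Return the label when exactly one pattern matches, else defer to the fallback."""
    matches = {label for pattern, label in PATTERNS if pattern.search(subject)}
    if len(matches) == 1:
        return matches.pop()
    # Zero or several matches: the subject is ambiguous, so ask the fallback.
    if llm_fallback is not None:
        return llm_fallback(subject)
    return "unknown"
```

With this shape, the LLM is only called for subjects the deterministic rules cannot settle, which keeps the common case cheap and reproducible.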

4. Task Performance

Complete end-to-end workflow: EOI → Contract validation → Solicitor approval → DocuSign → Execution
Handles error cases: V1 contract with 5 mismatches is correctly rejected and alert generated
Self-correcting: V2 corrected contract proceeds smoothly through workflow
SLA monitoring: Detects overdue deadlines and generates alerts

5. Real-World Value

Solves stated problem: Addresses OneCorp's pain points (manual checking, version control, SLA tracking)
Production-ready features: SQLite persistence, email template generation, audit trail
Scalable design: Each deal is isolated by ID, so deals can be processed concurrently once the SQLite store is replaced with a server database (see Known Limitations)
Cost-conscious: Uses smaller models for simple tasks (classification) and larger models for reasoning (comparison)

6. Safety & Reliability

Guardrails implemented:
- Confidence threshold (≥0.8) for critical fields (lot number, price, finance terms)
- Human escalation for low-confidence extractions
- Version superseding prevents using outdated contracts
- Only validated contracts sent to solicitor
- SLA alerts prevent deals from stalling
Error handling: PDF parsing failures, LLM API errors, and invalid state transitions are handled gracefully, although recovery is manual (see Known Limitations)
Audit trail: All events logged with timestamps in database
Limitations acknowledged: See the Safety, Guardrails & Limitations section below
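
A minimal sketch of the confidence guardrail, assuming the Extractor returns each field as a value plus a confidence score. The field keys and the shape of `extracted` are assumptions; only the 0.8 threshold and the list of critical fields come from the text above.

```python
CRITICAL_FIELDS = {"lot_number", "total_price", "finance_terms"}  # hypothetical keys
CONFIDENCE_THRESHOLD = 0.8


def fields_needing_review(extracted: dict) -> list[str]:
    """Return critical fields that are missing or below the threshold.

    `extracted` maps field name -> {"value": ..., "confidence": float}.
    """
    flagged = []
    for name in sorted(CRITICAL_FIELDS):
        entry = extracted.get(name)
        if entry is None or entry["confidence"] < CONFIDENCE_THRESHOLD:
            flagged.append(name)
    return flagged
```

A non-empty result would trigger re-extraction or a human review flag instead of letting the deal proceed automatically.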

7. Presentation & User Experience

Visual web dashboard: Real-time workflow visualization with animated agent activity
Clear CLI: --demo, --step, --test-sla, and --reset modes for different use cases
Comprehensive demo script: 3-minute walkthrough in docs/demo-script.md
Live event streaming: Server-Sent Events (SSE) for real-time updates without page refresh
Non-technical friendly: Visual indicators and status badges make the workflow accessible to all audiences
Complete documentation: Architecture docs, implementation guides, test coverage
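
For readers unfamiliar with Server-Sent Events, the wire format is plain text: optional `event:` and `data:` lines, terminated by a blank line. The helper below is a sketch of how a dashboard event could be serialised; the function name and the payload keys are hypothetical and not taken from src/ui/app.py.

```python
import json


def format_sse(event: str, payload: dict) -> str:
    """Serialise one event in the text/event-stream wire format."""
    data = json.dumps(payload)  # single-line JSON, safe for one data: line
    # A blank line terminates each event.
    return f"event: {event}\ndata: {data}\n\n"
```

A server would yield strings like this from a response with the `text/event-stream` content type, and the browser's `EventSource` would receive them without a page refresh.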

Safety, Guardrails & Limitations

Built-in Safety Mechanisms

Confidence-based validation
- Critical fields (lot number, total price, finance terms) require ≥80% extraction confidence
- Low-confidence fields trigger re-extraction or human review flags
- No auto-approval when uncertain
State machine guardrails
- Invalid state transitions are blocked (e.g., can't execute contract before buyer signs)
- Contract version management prevents using superseded versions
- Only validated contracts proceed to solicitor
Human-in-the-loop triggers
- Low-confidence extractions flagged for review
- Discrepancy alerts require human decision on amendments
- SLA overdue alerts escalate to internal team
- Human can intervene at any workflow stage
Audit trail
- All events logged with timestamps
- Complete history of state transitions
- Traceable decision path for debugging
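
The audit trail described above could be kept in a single append-only SQLite table. This is a sketch under assumptions: the table name, columns, and helper functions are hypothetical and do not reflect the actual schema in src/orchestrator/deal_store.py.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the events table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "deal_id TEXT NOT NULL, "
        "event_type TEXT NOT NULL, "
        "detail TEXT, "
        "created_at TEXT NOT NULL)"
    )
    return conn


def log_event(conn: sqlite3.Connection, deal_id: str, event_type: str, detail: str = "") -> None:
    """Append one timestamped event for a deal."""
    timestamp = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO events (deal_id, event_type, detail, created_at) "
            "VALUES (?, ?, ?, ?)",
            (deal_id, event_type, detail, timestamp),
        )


def history(conn: sqlite3.Connection, deal_id: str) -> list:
    """Return a deal's events in insertion order."""
    rows = conn.execute(
        "SELECT event_type, detail, created_at FROM events "
        "WHERE deal_id = ? ORDER BY id",
        (deal_id,),
    )
    return rows.fetchall()
```

Ordering by the autoincrement id rather than by timestamp keeps the history stable even if two events share the same clock reading.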

Known Limitations

LLM dependency
- Requires API access to Claude Haiku 4.5 via Anthropic (Extractor/Router)
- Auditor/Comms LLM paths use Qwen3-235B via DeepInfra (requires DEEPINFRA_API_KEY when enabled)
- Extraction accuracy depends on PDF quality and structure
- Costs scale with number of deals processed
Pattern-based extraction
- Works best with standard contract formats
- May struggle with highly unusual document layouts
- Requires well-formed PDFs (not scanned images without OCR)
Demo scope
- Tested with Australian property contracts
- May need tuning for other jurisdictions or contract types
- SLA rules currently hardcoded (2 business days after appointment)
Scalability considerations
- SQLite database suitable for single-user/demo use
- Production deployment would need PostgreSQL/MySQL for concurrency
- No rate limiting on LLM API calls (could hit quotas on high volume)
Error recovery
- PDF parsing failures halt processing (no fallback OCR)
- LLM API failures require manual retry
- No automatic recovery from database corruption
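
The hardcoded SLA rule mentioned above (2 business days after the appointment) can be sketched like this. The function names are hypothetical, and the sketch treats only Saturday and Sunday as non-business days; public holidays are ignored, which a real deployment would need to handle.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Step forward one day at a time, counting only Monday to Friday."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


def sla_deadline(appointment: date, business_days: int = 2) -> date:
    """Last day on which the SLA is still met."""
    return add_business_days(appointment, business_days)


def is_overdue(appointment: date, today: date, business_days: int = 2) -> bool:
    """True once today is past the SLA deadline."""
    return today > sla_deadline(appointment, business_days)
```

Passing `business_days` as a parameter is one way toward the configurable per-deal SLA rules listed under Recommended Production Enhancements.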

Recommended Production Enhancements

Multi-user authentication and authorization
Database migration to PostgreSQL with connection pooling
Rate limiting and retry logic for LLM API calls
Webhook integration for real-time email processing
Admin dashboard for monitoring deal pipeline
Configurable SLA rules per deal type
Backup/disaster recovery procedures
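
As an illustration of the retry logic suggested above, here is a generic exponential-backoff wrapper. It is a sketch, not part of the current codebase: `call_with_retry` and its parameters are hypothetical, and a production version would likely catch only the provider's transient error types instead of every exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the behaviour can be tested without actually waiting.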

Key Files for Claude Code

If using Claude Code for implementation, start with:

CLAUDE.md — Master instructions and critical rules
agent_docs/ — Detailed implementation guides per agent

License

MIT
