ALIV

Autonomous Local Intelligence Vault

Query hundreds of contracts, leases, and invoices in seconds — entirely offline.
No API keys. No cloud uploads. No NDA violations. Zero data exfiltration.


License: MIT  ·  Phase 6  ·  Tauri 2  ·  Rust  ·  React 18  ·  SQLite  ·  Ollama


▶️ 8-min Demo  ·  📖 Docs  ·  🚀 Quick Start  ·  📊 Benchmarks  ·  📬 Contact


ALIV demo — query, evidence cards, safe file actions

Why ALIV?

Legal and finance teams sit on thousands of sensitive documents (NDAs, lease agreements, invoices, tax records) that cannot be uploaded to a cloud AI service without violating confidentiality obligations or compliance requirements (SOC 2, HIPAA, GDPR). Generic tools ignore this constraint. ALIV does not.

For legal & finance teams

  • Query across 500+ local documents with citations, not guesses
  • Extract structured fields (expiry dates, penalty clauses, vendor names) with confidence scores
  • Automatically rename and organize chaotic file inboxes
  • Every file action is reviewable, approvable, and fully reversible

For engineering evaluators

  • Rust + Tauri 2 — native performance, ~10 MB binary, no Chromium overhead
  • Hybrid retrieval via Reciprocal Rank Fusion (RRF), weighted 55% BM25 lexical and 45% sqlite-vec ANN
  • 6-phase build with mandatory benchmark pass gates and zero open critical defects
  • Approval-by-default safety model: file execution returns a hard error unless the approval gate has been cleared first
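
To make the hybrid retrieval bullet concrete, here is a minimal sketch of weighted Reciprocal Rank Fusion in Python. It assumes the 55/45 split is applied as a per-list weight on each reciprocal-rank term and that the smoothing constant is `k = 60`, a common default; neither detail is confirmed by this README, and `weighted_rrf` is a hypothetical name.

```python
from collections import defaultdict


def weighted_rrf(bm25_ids, ann_ids, w_bm25=0.55, w_ann=0.45, k=60):
    """Fuse two ranked lists of chunk ids (best first) into one ranking.

    Each list contributes weight / (k + rank) to a chunk's fused score.
    The weights and k are assumptions for illustration.
    """
    scores = defaultdict(float)
    for weight, ranking in ((w_bm25, bm25_ids), (w_ann, ann_ids)):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += weight / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical chunk ids returned by the lexical and vector searches.
fused = weighted_rrf(["c1", "c2", "c3"], ["c3", "c1", "c4"])
print(fused)  # ['c1', 'c3', 'c2', 'c4']
```

Chunks that appear in both lists (`c1`, `c3`) rise above chunks found by only one retriever, which is the main reason to fuse ranks rather than raw scores.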

Key Features

⚖️ Legal/Finance Agent

Ask natural language questions across your local document corpus. ALIV assembles a token-budget-aware context window (8,192 tokens) and returns a cited answer using a local LLM.

Query: "Find all agreements expiring in 90 days and extract the penalty clauses."

→ Lease — Acme Commercial (expires 2026-03-01)
  Penalty: $2,800/day holdover  [lease_acme_2025.txt · §5]

→ NDA — Axiom Legal (expires 2026-07-15)
  Penalty: $50,000 per unauthorized disclosure  [nda_axiom_2025.txt · §5]

→ Service Agreement — Delta Advisory (expired 2025-12-31)
  Penalty: $5,000/day for delivery delays  [service_delta_2025.txt · §4]

📂 Autonomous Inbox Organizer

Drop a chaotic folder of 500 files. Rust background threads detect changes instantly, run OCR and extraction, and propose standardized renames — {DocType}_{Vendor}_{YYYY-MM-DD} — then wait for your approval before modifying anything.
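
As an illustration of the `{DocType}_{Vendor}_{YYYY-MM-DD}` pattern, the sketch below builds a proposed filename from already-extracted fields. The function name `propose_name`, the CamelCase normalization of vendor names, and the sample values are assumptions made for this example, not ALIV's actual rename engine.

```python
import re
from datetime import date


def propose_name(doc_type, vendor, doc_date, ext):
    """Return a standardized filename; nothing on disk is touched."""

    def slug(text):
        # Split on anything that is not a letter or digit, then CamelCase.
        parts = re.split(r"[^A-Za-z0-9]+", text.strip())
        return "".join(p[:1].upper() + p[1:] for p in parts if p)

    return f"{slug(doc_type)}_{slug(vendor)}_{doc_date.isoformat()}{ext}"


# Hypothetical extracted fields for a scanned lease.
proposal = propose_name("lease", "Acme Commercial", date(2025, 3, 1), ".pdf")
print(proposal)  # Lease_AcmeCommercial_2025-03-01.pdf
```

The function only returns a proposal, mirroring the rule that nothing is modified until the user approves.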

Safety model: Generate → Risk label → Select → Approve → Execute → Undo

High-risk items are pre-deselected. execute_actions hard-errors if approve_actions was not called first. Every execution batch is logged and reversible.
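
The gate behaviour described above can be modelled in a few lines. This is a simplified Python sketch, not the Rust implementation: the class `ActionBatch`, the exception `ApprovalRequired`, and the in-memory `rename` callback are hypothetical, while `approve_actions` and `execute_actions` borrow the names used in this README.

```python
class ApprovalRequired(Exception):
    """Raised when execution is attempted before approval."""


class ActionBatch:
    def __init__(self, actions):
        # Each action is a dict with "src", "dst", and "risk" ("low" or "high").
        self.actions = actions
        # High-risk items start deselected; the user must opt in explicitly.
        self.selected = {i for i, a in enumerate(actions) if a["risk"] != "high"}
        self.approved = False
        self.undo_log = []

    def approve_actions(self):
        self.approved = True

    def execute_actions(self, rename):
        if not self.approved:
            raise ApprovalRequired("approve_actions must be called first")
        for i in sorted(self.selected):
            action = self.actions[i]
            rename(action["src"], action["dst"])
            # Record the inverse operation so the batch can be reversed.
            self.undo_log.append((action["dst"], action["src"]))

    def undo(self, rename):
        # Reverse in last-in, first-out order.
        while self.undo_log:
            current, original = self.undo_log.pop()
            rename(current, original)


# In-memory stand-in for the filesystem, with hypothetical file names.
files = {"scan001.pdf", "scan002.pdf"}


def rename(src, dst):
    files.remove(src)
    files.add(dst)


batch = ActionBatch([
    {"src": "scan001.pdf", "dst": "Invoice_Acme_2025-01-15.pdf", "risk": "low"},
    {"src": "scan002.pdf", "dst": "Lease_Acme_2025-03-01.pdf", "risk": "high"},
])

try:
    batch.execute_actions(rename)
    blocked = False
except ApprovalRequired:
    blocked = True  # the gate refuses to run without approval

batch.approve_actions()
batch.execute_actions(rename)
after_execute = set(files)
batch.undo(rename)
```

Raising an exception instead of logging a warning means a caller cannot skip the approval step by accident, and the undo log makes each batch reversible.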

🧠 Persistent Memory Layer

Three retention scopes: session, week, permanent. A token-budget assembler selects the most relevant facts, conversation turns, and document summaries before each query. Memory is local SQLite — it survives restarts and never leaves your machine.
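
One plausible shape for a token-budget assembler is a greedy pass over candidates sorted by relevance. The sketch below assumes a whitespace word count as a stand-in for real tokenization and uses a hypothetical `assemble_context` function; the actual ranking and counting rules in ALIV may differ.

```python
def assemble_context(candidates, budget=8192, count_tokens=lambda t: len(t.split())):
    """Pick the most relevant items that fit inside the token budget.

    candidates: list of (relevance, text) pairs drawn from facts,
    conversation turns, and document summaries.
    """
    chosen, used = [], 0
    for relevance, text in sorted(candidates, key=lambda c: c[0], reverse=True):
        cost = count_tokens(text)
        if used + cost <= budget:  # skip anything that would overflow
            chosen.append(text)
            used += cost
    return chosen, used


# Hypothetical memory items with made-up relevance scores and a tiny budget.
items = [
    (0.9, "lease expires 2026-03-01"),
    (0.4, "user prefers CSV export"),
    (0.7, "holdover penalty is $2,800 per day"),
]
context, tokens_used = assemble_context(items, budget=9)
print(context)      # the two highest-relevance items
print(tokens_used)  # 9
```

With a budget of 9 words, the two most relevant items fit exactly and the lowest-ranked item is dropped instead of overflowing the window.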

🔬 Structured Extraction

Five built-in extraction templates: lease summary, contract key terms, invoice fields, penalty clauses, custom. Each field returns a value, a confidence score (0–1), and source citations. Export to JSON or CSV in one click.
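
The per-field output described above (value, confidence, citations) maps naturally onto a small record type. The following Python sketch is an assumed data shape for illustration only: `ExtractedField`, `to_json`, `to_csv`, and the 0.93 confidence value are hypothetical, not ALIV's export schema.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # expected range 0.0 to 1.0
    citations: list = field(default_factory=list)


def to_json(fields):
    return json.dumps([asdict(f) for f in fields], indent=2)


def to_csv(fields):
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["name", "value", "confidence", "citations"])
    for f in fields:
        # Flatten the citation list into one cell.
        writer.writerow([f.name, f.value, f.confidence, "; ".join(f.citations)])
    return buf.getvalue()


# Hypothetical extraction result based on the demo lease.
fields = [
    ExtractedField(
        name="penalty_clause",
        value="$2,800/day holdover",
        confidence=0.93,
        citations=["lease_acme_2025.txt · §5"],
    )
]
json_out = to_json(fields)
csv_out = to_csv(fields)
```

The `csv` module quotes the value automatically because it contains a comma, so the exported row still parses back into four columns.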

📊 Benchmark Suite & Diagnostics

Built-in benchmark automation across ingestion, retrieval, and extraction with versioned pass/fail results. The Diagnostics screen exports a JSON report of DB stats, runtime state, and platform info — built for real support workflows.


Architecture

graph TB
    UI["React 18 + TypeScript\nZustand · React Router v6\nTailwind + Glassmorphism UI"]
    TAURI["Tauri 2 Shell\nRust Backend\ntyped invoke() IPC"]
    SQLITE["SQLite WAL\n+ sqlite-vec ANN\nvec0 virtual tables"]
    PYTHON["Python Bridge\nPyMuPDF · Tesseract OCR\nsentence-transformers"]
    OLLAMA["Ollama\nllama3.2:3b · localhost:11434\n100% local inference"]

    UI -->|"typed invoke wrappers"| TAURI
    TAURI -->|"rusqlite"| SQLITE
    TAURI -->|"std::process::Command\nJSON-lines IPC · 3× retry"| PYTHON
    TAURI -->|"HTTP · streaming"| OLLAMA
    PYTHON -->|"embeddings + parse results"| SQLITE

Key architectural decisions:

| Decision | Rationale |
|---|---|
| Rust + Tauri 2 (not Electron) | ~10 MB binary, native OS APIs, no bundled Chromium |
| sqlite-vec for ANN search | No external vector database; it lives in the same WAL-mode SQLite file |
| Python subprocess bridge | Reuses mature ML stack (Tesseract, sentence-transformers, PyMuPDF) without Rust FFI complexity |
| Hash router | Works correctly from any local file path on Windows and macOS without a server |
| Approval gate as hard error | execute_actions returns an error, not a warning, if approve_actions was skipped |
| Token-budget assembler | Prevents context overflow by ranking facts/history/chunks/summaries within a fixed budget |

Quick Start

Prerequisites: Rust (stable), Node.js 18+, Python 3.10+, Ollama installed

# Clone and install
git clone https://github.com/VenkataAnilKumar/ALIV.git && cd ALIV
pip install -r python/requirements.txt
npm install

# Pull the LLM (~2 GB, one-time)
ollama pull llama3.2:3b

# Launch in dev mode
cargo tauri dev

# Production build → installer in src-tauri/target/release/bundle/
cargo tauri build

Try it immediately with the demo corpus — 5 pre-built legal documents (lease, NDA, invoice, service agreement, email):

# After launch: register this folder as your workspace
./demo/corpus/

Full scripted walkthrough with expected outputs for every screen: demo/DEMO_GUIDE.md


Performance & Engineering Rigor

| Metric | Result | Gate Threshold |
|---|---|---|
| Ingestion success rate | 100% (200/200 files) | ≥ 98% |
| Retrieval relevance (top-5) | 100% | ≥ 85% |
| Extraction precision (critical fields) | 97.4% | ≥ 95% |
| Harmful file-action rate | < 1% | < 1% |
| Open critical defects | 0 | 0 |

Production practices:

  • FilesystemGuard — path canonicalization, workspace boundary check, symlink rejection, traversal detection before every file operation
  • 3-attempt retry with exponential backoff — Python worker cold-start resilience (500ms / 1s)
  • Phase gate system — each phase requires two consecutive benchmark passes before advancing
  • React error boundary — frontend crashes surface a recovery screen with an exportable crash log
  • Structured logging: tauri-plugin-log writes to the platform app-data dir; no telemetry transmitted
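
The FilesystemGuard checks listed above can be approximated in Python as follows. This is a conservative sketch under stated assumptions, not the Rust code: `GuardError` is a hypothetical exception, the function reuses the name `validate_and_canonicalize` from the defect history purely for readability, and absolute paths are accepted only if they already sit under the resolved workspace root.

```python
import tempfile
from pathlib import Path


class GuardError(Exception):
    """Raised when a path fails a safety check."""


def validate_and_canonicalize(workspace, candidate):
    root = Path(workspace).resolve(strict=True)
    raw = Path(candidate)
    path = raw if raw.is_absolute() else root / raw

    # Traversal detection: refuse any ".." component outright.
    if ".." in path.parts:
        raise GuardError("path traversal is not allowed")

    # Workspace boundary check (lexical, against the resolved root).
    if not path.is_relative_to(root):
        raise GuardError("path is outside the workspace")

    # Symlink rejection: inspect every component between target and root.
    probe = path
    while probe != root:
        if probe.is_symlink():
            raise GuardError("symlinks are rejected")
        probe = probe.parent

    return path.resolve()


# Demo inside a throwaway workspace directory.
with tempfile.TemporaryDirectory() as ws:
    (Path(ws) / "inbox").mkdir()
    ok = validate_and_canonicalize(ws, "inbox/scan001.pdf")
    inside = ok.is_relative_to(Path(ws).resolve())
    try:
        validate_and_canonicalize(ws, "../outside.txt")
        rejected = False
    except GuardError:
        rejected = True
```

Running every check before a file operation, and failing closed on any doubt, is what keeps a bad rename proposal from reaching files outside the registered workspace.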
Full Phase 4 defect history

| ID | Defect | Severity | Fix |
|---|---|---|---|
| D001 | embedding_count always 0 in benchmark | Critical | Wired _embed_chunk_batch() into benchmark loop |
| D002 | sqlite-vec migration skipped in Python benchmarks | High | _load_sqlite_vec() loads Python wheel; _populate_vec_index() syncs ANN index |
| D003 | Python worker has no retry on transient timeout | High | run_worker() wraps with 3-attempt retry, 500ms/1s backoff |
| D004 | FilesystemGuard accepted any non-empty string | Critical | Real validate_and_canonicalize() with workspace boundary enforcement |

All 4 closed — docs/PHASE4_DEFECT_BACKLOG.json
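
The retry fix for D003 follows a familiar pattern: three attempts with delays of 500 ms and then 1 s. Below is a minimal Python sketch of that pattern. `run_with_retry` and the `flaky_worker` stand-in are hypothetical, and the choice to retry only on `TimeoutError` is an assumption.

```python
import time


def run_with_retry(task, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call task(); on TimeoutError wait 0.5 s, then 1 s, then give up."""
    for attempt in range(attempts):
        try:
            return task()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)  # exponential backoff


# Simulated worker that times out twice during cold start, then succeeds.
calls = {"n": 0}


def flaky_worker():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("worker not ready")
    return {"status": "ok"}


delays = []  # record the delays instead of sleeping, to keep the demo fast
result = run_with_retry(flaky_worker, sleep=delays.append)
print(result, delays)  # {'status': 'ok'} [0.5, 1.0]
```

Injecting the `sleep` function keeps the backoff schedule testable without real waiting.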


Roadmap

| Phase | Scope | Status |
|---|---|---|
| 0 · Foundation | Tauri skeleton, SQLite schema 0001–0008, benchmark scaffold | ✅ Complete |
| 1 · Ingestion | PDF / DOCX / EML parsing, OCR fallback, incremental indexing | ✅ Complete |
| 2 · Retrieval & Memory | RRF hybrid search, extraction templates, persistent memory layer | ✅ Complete |
| 3 · Safe File Actions | Approval gate, collision-safe rename engine, undo, audit log | ✅ Complete |
| 4 · E2E Hardening | 4 defects closed, 2× consecutive benchmark suite pass | ✅ Complete |
| 5 · UX Polish | 8-screen React frontend, 2026 dark glassmorphism design system | ✅ Complete |
| 6 · Release Candidate | Bundler, installer, onboarding, diagnostics, error boundary | 🔄 In progress |
| Post-6 | Cloud-next routing, BYOK, Gemma 4 (27B MMLU 85.2%), VAULT / LEDGER variants | 📋 Planned |

Contributions welcome — see ALIV_IMPLEMENTATION_BACKLOG.md for open workstreams.


Next Steps

For Legal & Finance Teams

Try ALIV against a real matter folder.

📥 Download installer (Phase 6 RC)  ·  ▶️ 8-minute demo walkthrough  ·  📬 Request a scoped pilot

Deployment support available for firms with 50+ users.

For Engineers & Hiring Managers

Explore the architecture, PRDs, and benchmark results.

📐 Architecture docs  ·  📋 Phase PRDs  ·  📊 Latest benchmark results  ·  🔧 Phase 3–4 implementation spec


System Requirements

| | Minimum | Recommended |
|---|---|---|
| OS | Windows 10 · macOS 11 · Ubuntu 22 | Windows 11 · macOS 14 |
| RAM | 8 GB | 16 GB |
| Disk | 2 GB | 10 GB (larger corpora) |
| Ollama | v0.3+ | Latest |
| Model | llama3.2:3b | mistral:7b or gemma4:27b |

MIT License  ·  Built by Venkata Anil Kumar

⭐ Star this repo if ALIV solves a problem you care about.

local LLM · document intelligence · legal AI · offline RAG · Tauri desktop · sqlite-vec

About

ALIV is a local-first desktop document intelligence system for legal and finance workflows, combining offline ingestion, retrieval, extraction, persistent memory, and preview-first file actions with rollback.