ALIV

Autonomous Local Intelligence Vault

Query hundreds of contracts, leases, and invoices in seconds — entirely offline.
No API keys. No cloud uploads. No NDA violations. Zero data exfiltration.

▶️ 8-min Demo · 📖 Docs · 🚀 Quick Start · 📊 Benchmarks · 📬 Contact

[Demo screenshot: query, evidence cards, safe file actions]

Why ALIV?

Legal and finance teams sit on thousands of sensitive documents (NDAs, lease agreements, invoices, tax records) that often cannot be uploaded to a cloud AI service without breaching confidentiality obligations or compliance requirements (SOC 2, HIPAA, GDPR). Generic tools ignore this constraint. ALIV does not.

For legal & finance teams

Query across 500+ local documents with citations, not guesses
Extract structured fields (expiry dates, penalty clauses, vendor names) with confidence scores
Automatically rename and organize chaotic file inboxes
Every file action is reviewable, approvable, and fully reversible

For engineering evaluators

Rust + Tauri 2 — native performance, ~10 MB binary, no Chromium overhead
Hybrid RRF retrieval: 55% BM25 lexical + 45% sqlite-vec ANN
6-phase build with mandatory benchmark pass gates and zero open critical defects
Approval-by-default safety model: file execution hard-errors if the approval gate has not been cleared
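
The hybrid retrieval bullet can be read as a weighted form of reciprocal rank fusion (RRF). The sketch below is an illustration under assumptions, not ALIV's actual code: the function name `weighted_rrf`, the document ids, and the smoothing constant `k=60` (a common RRF default) are hypothetical. Only the 55/45 weighting comes from this README.

```python
from collections import defaultdict

def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked id lists: each list adds weight / (k + rank) to an id's score."""
    scores = defaultdict(float)
    for source, ranked_ids in rankings.items():
        weight = weights[source]
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical result lists from the lexical and vector searches.
bm25_hits = ["lease_acme", "nda_axiom", "invoice_17"]
ann_hits = ["nda_axiom", "service_delta", "lease_acme"]

fused = weighted_rrf(
    {"bm25": bm25_hits, "ann": ann_hits},
    {"bm25": 0.55, "ann": 0.45},  # weights quoted in this README
)
print(fused)  # ['nda_axiom', 'lease_acme', 'invoice_17', 'service_delta']
```

Because RRF works on ranks rather than raw scores, the BM25 and ANN scores never need to be normalized against each other.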

Key Features

⚖️ Legal/Finance Agent

Ask natural language questions across your local document corpus. ALIV assembles a token-budget-aware context window (8,192 tokens) and returns a cited answer using a local LLM.

Query: "Find all agreements expiring in 90 days and extract the penalty clauses."

→ Lease — Acme Commercial (expires 2026-03-01)
  Penalty: $2,800/day holdover  [lease_acme_2025.txt · §5]

→ NDA — Axiom Legal (expires 2026-07-15)
  Penalty: $50,000 per unauthorized disclosure  [nda_axiom_2025.txt · §5]

→ Service Agreement — Delta Advisory (expired 2025-12-31)
  Penalty: $5,000/day for delivery delays  [service_delta_2025.txt · §4]

📂 Autonomous Inbox Organizer

Drop a chaotic folder of 500 files. Rust background threads detect changes instantly, run OCR and extraction, and propose standardized renames — {DocType}_{Vendor}_{YYYY-MM-DD} — then wait for your approval before modifying anything.
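
As a rough illustration of the {DocType}_{Vendor}_{YYYY-MM-DD} pattern, the Python sketch below builds a proposed filename from already-extracted fields. The helper `propose_name` and its cleaning rules are assumptions made for this example; the README does not specify how ALIV normalizes vendor names.

```python
import re
from datetime import date

def _clean(part):
    """Join alphanumeric words in CamelCase so '_' stays the only field separator."""
    words = re.findall(r"[A-Za-z0-9]+", part)
    return "".join(w[0].upper() + w[1:] for w in words) or "Unknown"

def propose_name(doc_type, vendor, doc_date, ext):
    """Return a proposed filename; nothing on disk is touched here."""
    return f"{_clean(doc_type)}_{_clean(vendor)}_{doc_date.isoformat()}{ext}"

print(propose_name("lease", "acme commercial", date(2025, 3, 1), ".pdf"))
# Lease_AcmeCommercial_2025-03-01.pdf
```

The function only proposes a name, which matches the flow described above: renames are suggested first and applied only after approval.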

Safety model: Generate → Risk label → Select → Approve → Execute → Undo

High-risk items are pre-deselected. execute_actions hard-errors if approve_actions was not called first. Every execution batch is logged and reversible.
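
The gate can be pictured with a small Python model. ALIV's backend is Rust, so this is only a sketch of the behavior described above: the class `ActionBatch`, the exception `ApprovalRequired`, and the action dictionary layout are hypothetical, while the names `approve_actions` and `execute_actions` come from this README.

```python
class ApprovalRequired(Exception):
    """Raised when execution is attempted before any approval."""

class ActionBatch:
    def __init__(self, actions):
        self.actions = list(actions)   # each: {"id", "src", "dst"}
        self.approved_ids = None       # None means approve_actions was never called

    def approve_actions(self, ids):
        known = {a["id"] for a in self.actions}
        unknown = set(ids) - known
        if unknown:
            raise ValueError(f"unknown action ids: {sorted(unknown)}")
        self.approved_ids = set(ids)

    def execute_actions(self, apply):
        # Hard error, not a warning: nothing runs without an explicit approval.
        if self.approved_ids is None:
            raise ApprovalRequired("call approve_actions before execute_actions")
        undo_log = []
        for action in self.actions:
            if action["id"] in self.approved_ids:
                apply(action["src"], action["dst"])
                # Record the inverse move so the batch can be reversed later.
                undo_log.append({"src": action["dst"], "dst": action["src"]})
        return undo_log
```

Passing the file operation in as `apply` keeps the gate logic testable without touching the filesystem; a real caller could pass something like `os.replace`.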

🧠 Persistent Memory Layer

Three retention scopes: session, week, permanent. A token-budget assembler selects the most relevant facts, conversation turns, and document summaries before each query. Memory is local SQLite — it survives restarts and never leaves your machine.
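
One simple way to picture a token-budget assembler is a greedy selection by relevance. The sketch below assumes that strategy; the README does not say which ranking or tokenizer ALIV uses. `Candidate`, `estimate_tokens` (a crude whitespace word count), and `assemble_context` are hypothetical names. The default budget of 8,192 tokens is the figure quoted earlier in this README.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    kind: str         # "fact", "turn", or "summary"
    text: str
    relevance: float  # higher means more relevant to the current query

def estimate_tokens(text):
    """Crude stand-in for a real tokenizer: count whitespace-separated words."""
    return len(text.split())

def assemble_context(candidates, budget=8192):
    """Greedily keep the most relevant items that still fit in the budget."""
    chosen, used = [], 0
    for cand in sorted(candidates, key=lambda c: c.relevance, reverse=True):
        cost = estimate_tokens(cand.text)
        if used + cost <= budget:
            chosen.append(cand)
            used += cost
    return chosen, used
```

An item that does not fit is skipped rather than ending the loop, so a smaller, less relevant item can still use the remaining budget.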

🔬 Structured Extraction

Five built-in extraction templates: lease summary, contract key terms, invoice fields, penalty clauses, custom. Each field returns a value, a confidence score (0–1), and source citations. Export to JSON or CSV in one click.
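
The per-field shape described above (value, confidence, citations) maps naturally onto a small record type. The following Python sketch is an assumption about how such a record could look and be exported; `ExtractedField`, `to_json`, and `to_csv` are hypothetical names, and the confidence value in the usage lines is illustrative only.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field

@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float                       # expected range: 0.0 to 1.0
    citations: list = field(default_factory=list)

def to_json(fields):
    return json.dumps([asdict(f) for f in fields], indent=2)

def to_csv(fields):
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["name", "value", "confidence", "citations"])
    for f in fields:
        # Flatten the citation list into one cell for spreadsheet use.
        writer.writerow([f.name, f.value, f.confidence, "; ".join(f.citations)])
    return buf.getvalue()

fields = [
    ExtractedField("penalty_clause", "$2,800/day holdover", 0.93,
                   ["lease_acme_2025.txt §5"]),  # 0.93 is a made-up score
]
print(to_json(fields))
print(to_csv(fields))
```

The `csv` module quotes the value automatically because it contains a comma, so amounts like "$2,800" survive a round trip.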

📊 Benchmark Suite & Diagnostics

Built-in benchmark automation across ingestion, retrieval, and extraction with versioned pass/fail results. The Diagnostics screen exports a JSON report of DB stats, runtime state, and platform info — built for real support workflows.

Architecture

graph TB
    UI["React 18 + TypeScript\nZustand · React Router v6\nTailwind + Glassmorphism UI"]
    TAURI["Tauri 2 Shell\nRust Backend\ntyped invoke() IPC"]
    SQLITE["SQLite WAL\n+ sqlite-vec ANN\nvec0 virtual tables"]
    PYTHON["Python Bridge\nPyMuPDF · Tesseract OCR\nsentence-transformers"]
    OLLAMA["Ollama\nllama3.2:3b · localhost:11434\n100% local inference"]

    UI -->|"typed invoke wrappers"| TAURI
    TAURI -->|"rusqlite"| SQLITE
    TAURI -->|"std::process::Command\nJSON-lines IPC · 3× retry"| PYTHON
    TAURI -->|"HTTP · streaming"| OLLAMA
    PYTHON -->|"embeddings + parse results"| SQLITE

Key architectural decisions:

| Decision | Rationale |
| --- | --- |
| Rust + Tauri 2 (not Electron) | ~10 MB binary, native OS APIs, no bundled Chromium |
| sqlite-vec for ANN search | No external vector database; it lives in the same WAL-mode SQLite file |
| Python subprocess bridge | Reuses mature ML stack (Tesseract, sentence-transformers, PyMuPDF) without Rust FFI complexity |
| Hash router | Works correctly from any local file path on Windows and macOS without a server |
| Approval gate as hard error | `execute_actions` returns an error, not a warning, if `approve_actions` was skipped |
| Token-budget assembler | Prevents context overflow by ranking facts/history/chunks/summaries within a fixed budget |

Quick Start

Prerequisites: Rust (stable), Node.js 18+, Python 3.10+, Ollama installed

# Clone and install
git clone https://github.com/VenkataAnilKumar/ALIV.git && cd ALIV
pip install -r python/requirements.txt
npm install

# Pull the LLM (~2 GB, one-time)
ollama pull llama3.2:3b

# Launch in dev mode
cargo tauri dev

# Production build → installer in src-tauri/target/release/bundle/
cargo tauri build

Try it immediately with the demo corpus — 5 pre-built legal documents (lease, NDA, invoice, service agreement, email):

# After launch: register this folder as your workspace
./demo/corpus/

Full scripted walkthrough with expected outputs for every screen: demo/DEMO_GUIDE.md

Performance & Engineering Rigor

| Metric | Result | Gate Threshold |
| --- | --- | --- |
| Ingestion success rate | 100% (200/200 files) | ≥ 98% |
| Retrieval relevance (top-5) | 100% | ≥ 85% |
| Extraction precision (critical fields) | 97.4% | ≥ 95% |
| Harmful file-action rate | < 1% | < 1% |
| Open critical defects | 0 | 0 |
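
The thresholds in this table, together with the two-consecutive-pass rule this README describes for phase gates, can be expressed as a short check. This is a sketch under assumptions: the metric key names, `run_passes`, and `may_advance` are hypothetical, and the harmful-action value in the usage lines is a placeholder because the table reports only "< 1%".

```python
# One predicate per gate, mirroring the "Gate Threshold" column.
GATES = {
    "ingestion_success_pct": lambda v: v >= 98.0,
    "retrieval_top5_pct": lambda v: v >= 85.0,
    "extraction_precision_pct": lambda v: v >= 95.0,
    "harmful_action_pct": lambda v: v < 1.0,
    "open_critical_defects": lambda v: v == 0,
}

def run_passes(metrics):
    """A benchmark run passes only if every gate is satisfied."""
    return all(check(metrics[name]) for name, check in GATES.items())

def may_advance(history, required=2):
    """A phase may advance once the latest `required` runs all pass."""
    recent = history[-required:]
    return len(recent) == required and all(run_passes(m) for m in recent)

run = {
    "ingestion_success_pct": 100.0,
    "retrieval_top5_pct": 100.0,
    "extraction_precision_pct": 97.4,
    "harmful_action_pct": 0.5,   # placeholder value, not a reported result
    "open_critical_defects": 0,
}
print(may_advance([run, run]))  # True
```

Requiring consecutive passes means a single failing run resets progress, which guards against advancing on one lucky result.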

Production practices:

FilesystemGuard — path canonicalization, workspace boundary check, symlink rejection, traversal detection before every file operation
3-attempt retry with exponential backoff (500 ms, then 1 s): keeps the Python worker resilient to cold-start timeouts
Phase gate system — each phase requires two consecutive benchmark passes before advancing
React error boundary — frontend crashes surface a recovery screen with an exportable crash log
Structured logging — tauri-plugin-log writes to platform app-data dir; no telemetry transmitted
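
The FilesystemGuard checks listed above can be sketched in Python with `pathlib`. ALIV implements the guard in Rust, so the code below is only an illustration of the same four checks; `GuardError` is a hypothetical exception name and the exact order of checks is an assumption. The function name `validate_and_canonicalize` is the one mentioned in this README's defect history.

```python
from pathlib import Path

class GuardError(Exception):
    """Raised when a path fails validation."""

def validate_and_canonicalize(workspace, candidate):
    root = Path(workspace).resolve(strict=True)   # canonical workspace root
    if not str(candidate).strip():
        raise GuardError("empty path")
    raw = Path(candidate)
    if ".." in raw.parts:                         # traversal detection
        raise GuardError("path traversal segment")
    try:
        # An absolute candidate replaces root here, so it must already be under it.
        relative = (root / raw).relative_to(root)
    except ValueError:
        raise GuardError("outside workspace") from None
    current = root
    for part in relative.parts:                   # symlink rejection, per component
        current = current / part
        if current.is_symlink():
            raise GuardError(f"symlink rejected: {current}")
    resolved = current.resolve()                  # canonicalization
    if not resolved.is_relative_to(root):         # final boundary check
        raise GuardError("outside workspace after resolution")
    return resolved
```

`Path.is_relative_to` needs Python 3.9 or newer, which the Python 3.10+ prerequisite in Quick Start already covers.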

Full Phase 4 defect history

| ID | Defect | Severity | Fix |
| --- | --- | --- | --- |
| D001 | `embedding_count` always 0 in benchmark | Critical | Wired `_embed_chunk_batch()` into benchmark loop |
| D002 | sqlite-vec migration skipped in Python benchmarks | High | `_load_sqlite_vec()` loads Python wheel; `_populate_vec_index()` syncs ANN index |
| D003 | Python worker has no retry on transient timeout | High | `run_worker()` wraps with 3-attempt retry, 500ms/1s backoff |
| D004 | FilesystemGuard accepted any non-empty string | Critical | Real `validate_and_canonicalize()` with workspace boundary enforcement |

All 4 closed — docs/PHASE4_DEFECT_BACKLOG.json
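
The D003 fix (3 attempts with 500 ms and 1 s backoff) follows a standard retry pattern, sketched here in Python. The helper `run_with_retry` and its parameters are hypothetical, and treating `TimeoutError` as the only retryable failure is an assumption; the real `run_worker()` may differ.

```python
import time

def run_with_retry(call, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run `call`, retrying on TimeoutError with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return call()
        except TimeoutError:
            if attempt == attempts - 1:
                raise                           # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)    # 0.5 s, then 1.0 s
```

Injecting `sleep` as a parameter lets a test record the delays instead of actually waiting.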

Roadmap

| Phase | Scope | Status |
| --- | --- | --- |
| 0 · Foundation | Tauri skeleton, SQLite schema 0001–0008, benchmark scaffold | ✅ Complete |
| 1 · Ingestion | PDF / DOCX / EML parsing, OCR fallback, incremental indexing | ✅ Complete |
| 2 · Retrieval & Memory | RRF hybrid search, extraction templates, persistent memory layer | ✅ Complete |
| 3 · Safe File Actions | Approval gate, collision-safe rename engine, undo, audit log | ✅ Complete |
| 4 · E2E Hardening | 4 defects closed, 2× consecutive benchmark suite pass | ✅ Complete |
| 5 · UX Polish | 8-screen React frontend, 2026 dark glassmorphism design system | ✅ Complete |
| 6 · Release Candidate | Bundler, installer, onboarding, diagnostics, error boundary | 🔄 In progress |
| Post-6 | Cloud-next routing, BYOK, Gemma 4 (27B MMLU 85.2%), VAULT / LEDGER variants | 📋 Planned |

Contributions welcome — see ALIV_IMPLEMENTATION_BACKLOG.md for open workstreams.

Next Steps

For Legal & Finance Teams

Try ALIV against a real matter folder.

→ 📥 Download installer (Phase 6 RC) → ▶️ 8-minute demo walkthrough → 📬 Request a scoped pilot

Deployment support available for firms with 50+ users.

For Engineers & Hiring Managers

Explore the architecture, PRDs, and benchmark results.

→ 📐 Architecture docs → 📋 Phase PRDs → 📊 Latest benchmark results → 🔧 Phase 3–4 implementation spec

System Requirements

| Component | Minimum | Recommended |
| --- | --- | --- |
| OS | Windows 10 · macOS 11 · Ubuntu 22 | Windows 11 · macOS 14 |
| RAM | 8 GB | 16 GB |
| Disk | 2 GB | 10 GB (larger corpora) |
| Ollama | v0.3+ | Latest |
| Model | `llama3.2:3b` | `mistral:7b` or `gemma4:27b` |

MIT License · Built by Venkata Anil Kumar

⭐ Star this repo if ALIV solves a problem you care about.

local LLM · document intelligence · legal AI · offline RAG · Tauri desktop · sqlite-vec
