Sinain HUD


Context OS — ambient intelligence for builders. It captures what you see and hear, distills it into a private knowledge graph, and makes that graph accessible from MCP, an HTTP API, a web UI, and a screen-recording-invisible HUD overlay.

Sinain demo

Quick Start · Docs · Privacy · Configuration · Contributing


You, Augmented

Sinain captures your screen and audio continuously, distills the stream into a structured live knowledge graph of facts, entities, and decisions, and exposes that graph through every interface where you might need it.

  • Capture — screen frames + system audio, with `<private>` tag stripping and auto-redaction of credentials and tokens before anything leaves your machine.
  • Distill — facts (atomic claims), entities (people / projects / topics), and decisions (what was chosen and why), extracted by an LLM and integrated by deterministic graph operations. No LLM runs in the integration step, which yields an 82.8% Information Preservation Rate on the LongMemEval benchmark (ICLR 2025, 500 questions), measured against a full-context oracle.
  • Access from anywhere — an MCP server for agents (Claude Code, Codex, Goose, OpenClaude, Junie, Aider), an HTTP API for systems and integrations, a web UI at localhost:9500/knowledge/ui/ for browsing, and a HUD overlay for in-the-moment recall.

The HUD overlay is invisible to screen capture — it never appears in screenshots, recordings, or screen shares.

Agent-Agnostic

Sinain feeds the same screen and audio context to any MCP-compatible agent. Switch agents on the fly — no restart, no context loss.

  • Tested with Claude Code, OpenClaude, Codex, Goose, Junie, and Aider. Any MCP-compatible agent works.
  • Pick agents in the overlay's flash-icon selector — spawn tasks can route to any profile in your roster.
  • Add custom profiles (a personal Claude config, alternate models) by editing agents.json. The roster is the source of truth.
  • Knowledge modules travel with you — export from one machine, import on another.

Privacy Controls

By default, sinain uses cloud APIs (OpenRouter) for transcription and analysis. When you need tighter control, switch privacy modes — no code changes, just one env var.

  • off → standard → strict → paranoid — four modes, set in ~/.sinain/.env.
  • paranoid mode: Ollama + whisper.cpp. Zero network calls at runtime — the embedding model is pre-cached during the setup wizard (sinain setup-embedding), so subsequent runs are fully offline.
  • The HUD overlay is invisible to screen capture (NSWindow.sharingType = .none — see overlay/macos/Runner/AppDelegate.swift:70). Verify it yourself with a split-screen recording in QuickTime, OBS, and Loom: the HUD is absent from all three.

Quick Start

npx @geravant/sinain@latest start

That's it. On first run, sinain will:

  1. Run an interactive setup wizard — transcription backend, API key, agent, privacy mode
  2. Auto-download the overlay app (~17MB), the sck-capture binary (~5MB), the embedding model (~90MB), and Python dependencies
  3. Start all services — sinain-core, sense_client, overlay, and agent

All assets are cached locally after the first install. In paranoid mode, subsequent runs are fully offline — no network calls at runtime.

Pin @latest on every invocation. npx @geravant/sinain (without @latest) caches forever against the unversioned spec — you'd silently keep running an old version for months. Sinain self-updates automatically when stale, but pinning @latest makes it explicit and saves a redundant relaunch.

Re-run the wizard anytime: npx @geravant/sinain@latest start --setup

Prerequisites

  • macOS 12.3+ — Sinain uses ScreenCaptureKit (introduced in 12.3); earlier versions are not supported in this release. Apple Silicon and Intel both work.
  • Node.js 18+ — nodejs.org (LTS recommended)
  • Python 3.10+ — brew install python3 (macOS) or python.org
  • OpenRouter API key (optional for local-only mode) — openrouter.ai
  • Network access during first install — the wizard downloads ~112MB total (overlay app, sck-capture binary, sentence-transformer embedding model). All assets are cached locally; subsequent runs need the network only for cloud LLM API calls (none at all in paranoid mode).

Fully local? No API key needed. Ollama + whisper-cli = zero cloud at runtime. See Running Fully Local.

First install reproducibility? See docs/cold-install-verification.md for a step-by-step guide verified on a fresh user account, including the timing measurement and the failure modes the audit caught and fixed.

macOS Permissions

  1. System Settings → Privacy & Security → Screen Recording — add your Terminal, then quit and reopen Terminal (macOS TCC entitlements apply only to processes started after the grant)
  2. System Settings → Privacy & Security → Microphone — same drill: add Terminal, then restart Terminal

Sinain detects when these permissions are missing and surfaces a clear restart-instruction banner; you'll never get a silent degraded mode.

Managing sinain

npx @geravant/sinain@latest stop             # stop all services
npx @geravant/sinain@latest status           # check what's running
npx @geravant/sinain@latest start --setup    # re-run setup wizard
npx @geravant/sinain@latest start --no-sense # skip screen capture
npx @geravant/sinain@latest start --no-overlay  # headless mode

Always pin @latest — see the note in Quick Start above.

Architecture

┌─── Your Device ─────────────────────────────────────────────────────┐
│                                                                     │
│  sck-capture (Swift)                                                │
│    ├─ system audio (PCM) ──► sinain-core :9500                      │
│    └─ screen frames (JPEG) ──► sense_client ─── POST /sense ──►     │
│                                                      │              │
│                              ┌───────────────────────┘              │
│                              │                                      │
│                         sinain-core                                 │
│                           ├─ audio pipeline → transcription         │
│                           ├─ agent loop → digest + HUD text         │
│                           ├─ knowledge graph (private, on-device)   │
│                           └─ WebSocket feed                         │
│                                  │                                  │
│                                  ▼                                  │
│                           overlay (Flutter)                         │
│                           private, invisible to screen capture      │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Components

| Component | Language | What it does | Docs |
|---|---|---|---|
| sinain-core | TypeScript | Central hub: audio pipeline, agent loop, knowledge graph, WS feed | README |
| overlay | Dart / Swift | Private HUD (macOS), display modes, hotkeys | Hotkeys |
| sense_client | Python | Screen capture, SSIM diff, OCR, privacy filter | sense_client/ |
| sck-capture | Swift | ScreenCaptureKit: system audio + screen frames | tools/sck-capture/ |
| sinain-agent | Bash | Shell harness that connects any agent to sinain-core | sinain-agent/ |
| sinain-knowledge | TypeScript | Curation, playbook, eval, portable knowledge modules | Knowledge System |
| sinain-mcp-server | TypeScript | MCP server exposing sinain tools to agents | sinain-mcp-server/ |

Configuration

Sinain splits config across two files in ~/.sinain/:

  • .env — secrets (API keys) and infrastructure (ports, audio device, privacy mode, analyzer LLM).
  • agents.json — agent roster (default agent, allowed-tools whitelists, analyzer pacing).

Both are created by the setup wizard. To re-run it: npx @geravant/sinain@latest start --setup.

Agents & profiles β†’ agents.json

The agent roster lives in ~/.sinain/agents.json. Each entry is a profile mapping a name to a binary plus a behavior type, with optional env, settings, and model overrides. The overlay's flash-icon selector lets you pick which profile handles spawn tasks at runtime. Custom profiles like pclaude (a personal claude with its own config dir) are first-class — the dispatch decision keys off profile.type, not the profile name.
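
For illustration only — the field names below are guesses, and the authoritative schema lives in docs/AGENT-ROSTER.md — a roster with a custom pclaude profile might look like:

```json
{
  "default": "claude",
  "profiles": {
    "claude":  { "binary": "claude", "type": "claude" },
    "pclaude": {
      "binary": "claude",
      "type": "claude",
      "env": { "CLAUDE_CONFIG_DIR": "~/.pclaude" }
    }
  }
}
```

Because dispatch keys off type, a hypothetical pclaude like this would behave exactly like the stock claude profile while keeping its own config directory.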

See Agent Roster & Profiles for the complete schema, recipes, and routing model.

Context Analysis (HUD summarizer) β†’ .env

The context analysis loop runs every 3–30 seconds, sending recent audio/screen context to an LLM. It produces the short HUD text shown on the overlay plus a richer digest stored in the feed buffer for the knowledge graph.

| Variable | Default | Description |
|---|---|---|
| ANALYSIS_PROVIDER | openrouter | openrouter (cloud) or ollama (local, free) |
| ANALYSIS_MODEL | google/gemini-2.5-flash-lite | Primary model for text analysis |
| ANALYSIS_VISION_MODEL | google/gemini-2.5-flash | Auto-selected when screen images are present |
| ANALYSIS_ENDPOINT | (auto per provider) | Override for custom OpenAI-compatible endpoints |
| ANALYSIS_API_KEY | (from OPENROUTER_API_KEY) | API key; not needed for ollama |
| ANALYSIS_FALLBACK_MODELS | gemini-2.5-flash,... | Comma-separated fallback chain |

Other Key Settings β†’ .env

| Variable | Default | Description |
|---|---|---|
| OPENROUTER_API_KEY | — | Required (unless ANALYSIS_PROVIDER=ollama + local transcription) |
| PRIVACY_MODE | off | off / standard / strict / paranoid |

See docs/CONFIGURATION.md for the full reference.
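
To make the variables above concrete, a minimal ~/.sinain/.env might look like this — the values are illustrative; only the variable names come from the tables above:

```ini
# ~/.sinain/.env (illustrative values)
OPENROUTER_API_KEY=<your-key>
PRIVACY_MODE=standard
ANALYSIS_PROVIDER=openrouter
ANALYSIS_MODEL=google/gemini-2.5-flash-lite
```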

Privacy Modes

| Mode | What it does |
|---|---|
| off | All data flows freely — maximum insight quality |
| standard | Auto-redacts credentials before cloud APIs (wizard default) |
| strict | Only summaries leave your machine — no raw text sent to cloud |
| paranoid | Fully local: Ollama + whisper.cpp. Zero network calls at runtime (embedding model pre-cached at install). |

See Privacy Threat Model for the full design.

Hotkeys

Global hotkeys use Cmd+Shift:

| Shortcut | Action |
|---|---|
| Cmd+Shift+Space | Toggle overlay visibility |
| Cmd+Shift+M | Cycle display mode |
| Cmd+Shift+/ | Open command input |
| Cmd+Shift+H | Quit overlay |

See docs/HOTKEYS.md for all 15 shortcuts.

Running Fully Local

No cloud APIs needed. Local models handle everything:

# 1. Install local transcription
./setup-local-stt.sh

# 2. Install Ollama + vision model
brew install ollama && ollama pull llava

# 3. Start in local mode
./start-local.sh

| Model | Size | Speed | Best for |
|---|---|---|---|
| llava | 4.7 GB | ~2s/frame | General use (recommended) |
| llama3.2-vision | 7.9 GB | ~4s/frame | Best accuracy |
| moondream | 1.7 GB | ~1s/frame | Fastest, lower quality |

Setup Guides

| Setup | Guide |
|---|---|
| Agent Roster & Profiles | docs/AGENT-ROSTER.md — pick agents, add custom profiles |
| Bare Agent | docs/INSTALL-BARE-AGENT.md — the default install path |
| From Source | git clone, cp .env.example ~/.sinain/.env, ./start.sh |

Knowledge System

Sinain builds a persistent knowledge graph from everything it captures — audio transcriptions, screen OCR, and agent interactions. Facts are distilled incrementally (on buffer full and at session end), stored in an EAV triplestore with graph relationships, and retrieved via hybrid search (FTS5, tag-based lookup, and entity-graph backrefs, fused with RRF).

The integration step is fully deterministic — no LLM decides what to store. Every extracted fact is preserved.
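
The rank-fusion step above can be sketched generically. This is textbook Reciprocal Rank Fusion, not sinain's actual code, and the constant k = 60 is the conventional default rather than anything sinain documents:

```typescript
// Reciprocal Rank Fusion (RRF): merge several ranked result lists into one.
// Generic sketch; sinain's real fusion lives in sinain-knowledge.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      // Each list contributes 1/(k + rank + 1) for the items it returns.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// A fact surfaced by all three retrievers beats one that tops a single list:
const fused = rrfFuse([
  ["factA", "factB", "factC"], // FTS5 full-text hits
  ["factB", "factA"],          // tag-based hits
  ["factB", "factD"],          // entity-graph backrefs
]);
console.log(fused[0]); // "factB"
```

The appeal of RRF here is that it needs only ranks, never comparable scores, so BM25 values and graph-walk counts don't have to be normalized against each other.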

npx @geravant/sinain@latest export-knowledge   # export playbook, modules, graph
npx @geravant/sinain@latest import-knowledge ~/sinain-knowledge-export.tar.gz

See Knowledge System docs for architecture details.

Querying knowledge from any MCP agent

Sinain's knowledge graph is exposed to any MCP-aware agent via the bundled MCP server. See Connect Your Coding Agent (MCP) below for setup.
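
Under the hood, a knowledge query from an agent is an ordinary MCP tools/call request over JSON-RPC 2.0. The tool name below is real; the argument shape is a guess (see docs/MCP-CAPABILITIES.md for the actual parameters):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "sinain_knowledge_query",
    "arguments": { "query": "what did we decide about retry backoff?" }
  }
}
```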

Connect Your Coding Agent (MCP)

Sinain ships an MCP server that exposes 15 sinain_* tools — including sinain_knowledge_query, sinain_get_knowledge, sinain_distill_session, sinain_get_context, and sinain_respond — to any MCP-aware agent. Register it once and the agent can read your knowledge graph and surface text on the HUD from any project.

npx @geravant/sinain@latest mcp install

The wizard detects which MCP agents you have installed and registers sinain for the ones you select. Re-runnable any time; idempotent.

| Agent | Setup | Config it touches |
|---|---|---|
| Claude Code | mcp install (auto via wizard) | ~/.claude.json (claude mcp add) |
| Claude Desktop | mcp install (auto via wizard) | ~/Library/Application Support/Claude/claude_desktop_config.json (mac) |
| Cursor | mcp install (auto via wizard) | ~/.cursor/mcp.json |
| Codex | mcp install (auto via wizard) | ~/.codex/config.toml (codex mcp add) |
| Goose | mcp install (auto via wizard) | ~/.config/goose/config.yaml |
| Junie | mcp install (auto via wizard) | ~/.junie/mcp/mcp.json |

Already covered by sinain onboard — step 6 of the advanced flow runs the same registration, and the quickstart asks once if any MCP agent is detected.

Deep Dives

| Topic | Doc |
|---|---|
| Knowledge System | docs/knowledge-system.md |
| Knowledge API (HTTP) | docs/KNOWLEDGE-API.md |
| MCP Integration (setup) | docs/MCP-INTEGRATION.md |
| MCP Capabilities (tools + recipes) | docs/MCP-CAPABILITIES.md |
| LongMemEval Audit | docs/LONGMEMEVAL-AUDIT.md |
| Privacy Threat Model | docs/privacy-protection-design.md |
| Full Configuration | docs/CONFIGURATION.md |
| All Hotkeys | docs/HOTKEYS.md |

Contributing

See CONTRIBUTING.md.

License

MIT
