Heidi CLI: The "Brain" for Your Local AI

Listen, we've all been there. You've got a shiny new LLM running on your laptop, but it's basically a goldfish. It forgets what it did five minutes ago, and it keeps making the same dumb mistakes.

Enter Heidi CLI.

Heidi is a command-center for a Unified Learning Suite. It's not just some fancy wrapper for an API; it's a full-on "Closed-Loop Learning System." Basically, it's a way to turn those generic AI models into specialized, self-improving agents that actually learn from their own successes and failures. It's like a personal trainer, but for your LLMs.

🚀 One-Click Installation

Install Heidi CLI with a single command! The installer will automatically:

Clone the latest version from GitHub
Install all dependencies
Build and install Heidi CLI
Verify the installation

Linux/macOS

# Quick install (one command)
curl -fsSL https://raw.githubusercontent.com/heidi-dang/heidi-cli/main/install | bash

# Or download first
wget https://raw.githubusercontent.com/heidi-dang/heidi-cli/main/install
chmod +x install
./install

Windows (PowerShell)

# Download and run
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/heidi-dang/heidi-cli/main/install.ps1" -OutFile "install.ps1"
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
.\install.ps1

After Installation

# Verify installation
heidi --version

# Quick start
heidi setup
heidi api generate --name "My First Key"
heidi model serve

🎉 That's it! Heidi CLI is now installed and ready to use!

📚 Complete Documentation

📖 Step-by-Step Guide: docs/how-to-use.md

From your first model download to enterprise deployment, our comprehensive guide covers:

✅ Quick Start - Get running in 5 minutes
✅ Setup & Configuration - Configure your environment
✅ Model Management - Download and manage models
✅ HuggingFace Integration - Access 100,000+ models
✅ Token & Cost Tracking - Monitor usage and costs
✅ Analytics & Monitoring - Performance insights
✅ Advanced Features - Power user capabilities
✅ Enterprise Deployment - Production setup
✅ Troubleshooting - Common issues and solutions

🚀 How the Magic Actually Happens (The 5-Phase Loop)

Think of Heidi like a "Perception-Action-Learning" loop. She's got five internal modules that play together like a well-oiled (and slightly sarcastic) machine:

1. The Multi-Model Host (`src/model_host`)

Stop talking to the cloud! Heidi hosts your models right here on your machine (without Ollama and local transformers).

What it does: Gives you a unified, OpenAI-compatible API (/v1/chat/completions).
The "Secret Sauce": You can route requests to different models—like a "stable" one for real work and an "experimental" one for when you're feeling spicy—without ever touching your app code.

2. Runtime Learning & Memory (`src/runtime`)

Heidi doesn't like starting from zero every time.

What it does: Uses a SQLite database for both short-term "what just happened?" and long-term "wait, I remember this!" memory.
The "Secret Sauce": Once a task is done, the Reflection Engine kicks in. It scores how well it did (Reward Scoring) and saves "pro-tips" that worked. Next time you ask something similar, Heidi whispers those successful strategies back into the prompt. It's basically cheating, but legal.

3. The Data Pipeline (`src/pipeline`)

Clean data = happy AI.

What it does: Grabs every single interaction and stuffs them into dated "Run Folders."
The "Secret Sauce": A Curation Engine digests these runs, tosses out the garbage, and applies a Secret Redaction Layer. It scrubs your OpenAI keys, deep-rooted secrets, and embarrassing passwords before they ever touch the retraining loop. Privacy is cool, okay?

4. Registry & Atomic Hot-Swap (`src/registry`)

When you've got enough data, it's time to level up.

What it does: Manages a Model Registry with stable and candidate channels (think of it like "Production" vs. "Beta").
The "Secret Sauce": After retraining, an Eval Harness checks if the new model is actually better or if it's just hallucinating harder. If it passes, Heidi does an Atomic Hot-Swap—reloading the new model in milliseconds with zero downtime.

5. HuggingFace Integration (`src/integrations`)

Discover, download, and manage models from the world's largest AI model hub.

What it does: Seamlessly integrates with HuggingFace Hub for model discovery and management.
The "Secret Sauce": Smart auto-configuration based on model metadata, usage analytics, and intelligent recommendations. Turns generic models into specialized tools with zero manual configuration.

Commands You'll Actually Use

Feature	The Command	What's it for?
Model Hosting	`heidi model serve`	Spins up local server. Easy peasy.
Agent Memory	`heidi memory search`	Digging through the agent's brain for that one thing.
Reflection	`heidi learning reflect`	Forces the agent to think about what it just did.
Data Export	`heidi learning export`	Bags up curated/redacted data for retraining.
Promotion	`heidi learning promote`	Moves a "Candidate" model to "Stable" status.
System Health	`heidi doctor`	Makes sure everything isn't on fire.
HuggingFace Hub	`heidi hf search <query>`	Search models on HuggingFace Hub
HuggingFace Hub	`heidi hf info <model>`	Get detailed model information
HuggingFace Hub	`heidi hf download <model>`	Download model and auto-configure
HuggingFace Hub	`heidi hf list-local`	List downloaded models
HuggingFace Hub	`heidi hf compare <model1> <model2>`	Compare models with recommendations
HuggingFace Hub	`heidi hf batch-download <model1> <model2>`	Download multiple models in parallel
HuggingFace Hub	`heidi hf analytics [model]`	View usage analytics and performance
HuggingFace Hub	`heidi hf remove <model>`	Remove downloaded model

Why does the AI Community even care?

Let's be real, the current AI world is a bit of a mess. Heidi fixes the big headaches:

Privacy is King: Most learning happens in the cloud. Nope. Heidi keeps your training data, your memory, and your weights 100% on your machine. Your company secrets stay your secrets.
Stopping the "Stupid Loop": We've all seen agents make the same mistake twice. Heidi's Redaction & Reflection layers make sure the model actually gets better at your specific job, not just weirder.
MLOps for the rest of us: Usually, you need a team of engineers to build retraining pipelines. Heidi abstracts all that noise into a single CLI tool. Now you can run a professional-grade Model Lab from your bedroom.
Access to Thousands of Models: Why limit yourself to a few models? Heidi's HuggingFace integration gives you access to thousands of models with smart auto-configuration and usage analytics.

How to get this thing running

1. Install the bits:

python -m pip install -e '.[dev]'

2. Check the vitals:

# This makes sure your state/ directories and docs are alive
heidi doctor

3. Fire it up:

# Start the model host and wait for the "Serving" message
heidi model serve

4. Check your status:

heidi status

5. Discover and Download Models:

# Search for models on HuggingFace Hub
heidi hf search "mistral" --limit 5

# Get detailed information about a model
heidi hf info "microsoft/DialoGPT-small"

# Download and auto-configure a model
heidi hf download "microsoft/DialoGPT-small" --add-to-config

# Compare multiple models
heidi hf compare "microsoft/DialoGPT-small" "mistralai/Mistral-7B-Instruct-v0.2"

# Batch download multiple models
heidi hf batch-download "model1" "model2" "model3"

# View usage analytics
heidi hf analytics

# List your downloaded models
heidi hf list-local

Real-World Examples

Model Discovery:

# Find the perfect model for your needs
heidi hf search "coding" --limit 10

# Compare models side-by-side
heidi hf compare "codellama/CodeLlama-7b-Instruct-hf" "WizardLM/WizardCoder-15B-V1.0"

# Get detailed model information
heidi hf info "meta-llama/Llama-2-7b-chat-hf"

Smart Downloads:

# Download with automatic configuration
heidi hf download "microsoft/DialoGPT-small" --add-to-config

# Batch download for efficiency
heidi hf batch-download "microsoft/DialoGPT-small" "meta-llama/Llama-3.2-1B-Instruct"

# Models are stored in ~/.heidi/models/huggingface/
# Automatically configured for immediate use

Usage Analytics:

# Track model performance and usage patterns
heidi hf analytics

# Get detailed analytics for a specific model
heidi hf analytics "microsoft_DialoGPT-small" --days 7

# Export analytics data for analysis
heidi hf analytics --export

Production Deployment:

# Start serving multiple models
heidi model serve

# Models appear in OpenAI-compatible API
curl -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "microsoft_DialoGPT-small", "messages": [{"role": "user", "content": "Hello!"}]}'

# Rich metadata in model listings
curl http://127.0.0.1:8000/v1/models | jq '.data[] | {id, display_name, capabilities, huggingface}'

Architecture Overview

heidi-cli/
├── src/heidi_cli/
│   ├── model_host/          # Multi-model API server
│   │   ├── server.py      # FastAPI server with OpenAI endpoints
│   │   ├── manager.py     # Model routing and request handling
│   │   └── metadata.py    # Rich model metadata management
│   ├── runtime/             # Learning & memory system
│   │   ├── db.py          # SQLite database management
│   │   ├── reflection.py   # Performance analysis
│   │   └── curation.py    # Data cleaning and redaction
│   ├── pipeline/            # Data export pipeline
│   │   └── curation.py    # Smart data curation
│   ├── registry/            # Model versioning and hot-swap
│   │   ├── manager.py     # Model registry management
│   │   ├── eval.py        # Model evaluation
│   │   └── hotswap.py     # Zero-downtime model swapping
│   ├── integrations/        # External integrations
│   │   ├── huggingface.py # HuggingFace Hub integration
│   │   └── analytics.py   # Usage analytics system
│   └── cli.py             # Unified command interface
└── state/                  # Local data storage
    ├── models/              # Downloaded models
    ├── registry/            # Model versions
    ├── memory/              # Agent memory
    ├── logs/                # System logs
    └── analytics/           # Usage analytics database

Key Features

Model Discovery:

Search thousands of models on HuggingFace Hub
Filter by task, size, capabilities, and popularity
Get detailed model information and metadata
Compare models side-by-side with intelligent recommendations

Smart Configuration:

Automatic model configuration based on metadata
Capability detection (chat, coding, vision, embeddings)
Device requirements based on model size
Context length and token optimization

Usage Analytics:

Real-time request tracking and performance metrics
Latency analysis (P95, P99, throughput)
Error monitoring and success rates
Token efficiency calculations
Export functionality for external analysis

Batch Operations:

Parallel model downloads (up to 3 concurrent)
Progress tracking and error handling
Automatic configuration for multiple models
Comprehensive summary reports

API Integration:

OpenAI-compatible endpoints (/v1/models, /v1/chat/completions)
Rich metadata in model listings
Seamless integration with existing tools
Automatic capability detection and routing

Use Cases

Developers:

Test different models for specific tasks
Compare performance across model families
Batch download model collections
Track usage patterns and optimize costs

Enterprises:

Manage model fleets with analytics
Enforce model governance and compliance
Optimize resource allocation
Track ROI on model investments

Researchers:

Access thousands of models for experimentation
Compare model architectures and capabilities
Track performance metrics across models
Export data for academic analysis

Advanced Configuration

Model Storage:

# Models are stored in ~/.heidi/models/huggingface/
# Each model has its own directory with metadata
# HuggingFace cache structure for efficient storage

Analytics Database:

# SQLite database at ~/.heidi/analytics/usage.db
# Tracks requests, performance, errors, and trends
# Thread-safe for concurrent access
# Exportable for external analysis

Configuration:

# Heidi config at ~/.heidi/config/suite.json
# Automatic model configuration
# Capability detection and optimization
# Device and precision settings

Why Heidi CLI?

Privacy First: Your data, your models, your rules. Everything stays local.
Smart Automation: From model discovery to performance tracking, it just works.
Professional Grade: Enterprise-ready features with monitoring and analytics.
Cost Effective: No cloud fees, no subscription traps.
Open Source: Full transparency and extensibility.
HuggingFace Powered: Access to thousands of models with one command.
Complete Documentation: Step-by-step guide for all users.

Heidi is written by humans (mostly) to help machines act more like humans (the smart ones).

📚 Complete Documentation

📖 Step-by-Step Guide: docs/how-to-use.md

From your first model download to enterprise deployment, our comprehensive guide covers:

✅ Quick Start - Get running in 5 minutes
✅ Setup & Configuration - Configure your environment
✅ Model Management - Download and manage models
✅ HuggingFace Integration - Access 100,000+ models
✅ Token & Cost Tracking - Monitor usage and costs
✅ Analytics & Monitoring - Performance insights
✅ Advanced Features - Power user capabilities
✅ Enterprise Deployment - Production setup
✅ Troubleshooting - Common issues and solutions

Quick Start

# 1. Install
pip install heidi-cli

# 2. Setup
heidi setup

# 3. Discover Models
heidi hf search "text-generation" --limit 5

# 4. Download Your Favorite
heidi hf download "microsoft/DialoGPT-small" --add-to-config

# 5. Start Serving
heidi model serve

# 6. Use It!
curl -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "microsoft_DialoGPT-small", "messages": [{"role": "user", "content": "Hello!"}]}'

That's it! You're now running a professional-grade AI model hosting platform with HuggingFace integration.

Name		Name	Last commit message	Last commit date
Latest commit History 346 Commits
.github/workflows		.github/workflows
docs		docs
examples		examples
fixtures/failures		fixtures/failures
schemas		schemas
scripts		scripts
src		src
state		state
tasks		tasks
tests		tests
tools		tools
.env		.env
.env.example		.env.example
.gitignore		.gitignore
.gitmodules		.gitmodules
Dockerfile		Dockerfile
INSTALL.md		INSTALL.md
README.md		README.md
install		install
install.ps1		install.ps1
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Heidi CLI: The "Brain" for Your Local AI

🚀 One-Click Installation

Linux/macOS

Windows (PowerShell)

After Installation

📚 Complete Documentation

🚀 How the Magic Actually Happens (The 5-Phase Loop)

1. The Multi-Model Host (`src/model_host`)

2. Runtime Learning & Memory (`src/runtime`)

3. The Data Pipeline (`src/pipeline`)

4. Registry & Atomic Hot-Swap (`src/registry`)

5. HuggingFace Integration (`src/integrations`)

Commands You'll Actually Use

Why does the AI Community even care?

How to get this thing running

Real-World Examples

Architecture Overview

Key Features

Use Cases

Advanced Configuration

Why Heidi CLI?

📚 Complete Documentation

Quick Start

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Heidi CLI: The "Brain" for Your Local AI

🚀 One-Click Installation

Linux/macOS

Windows (PowerShell)

After Installation

📚 Complete Documentation

🚀 How the Magic Actually Happens (The 5-Phase Loop)

1. The Multi-Model Host (src/model_host)

2. Runtime Learning & Memory (src/runtime)

3. The Data Pipeline (src/pipeline)

4. Registry & Atomic Hot-Swap (src/registry)

5. HuggingFace Integration (src/integrations)

Commands You'll Actually Use

Why does the AI Community even care?

How to get this thing running

Real-World Examples

Architecture Overview

Key Features

Use Cases

Advanced Configuration

Why Heidi CLI?

📚 Complete Documentation

Quick Start

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

1. The Multi-Model Host (`src/model_host`)

2. Runtime Learning & Memory (`src/runtime`)

3. The Data Pipeline (`src/pipeline`)

4. Registry & Atomic Hot-Swap (`src/registry`)

5. HuggingFace Integration (`src/integrations`)

Packages