24 changes: 18 additions & 6 deletions README.md
@@ -256,13 +256,25 @@ agents:

---

## The Governed Swarm Platform

| Project | Role | What It Does |
|---------|------|--------------|
| **ShellForge** | Orchestration | Governed agent runtime — CLI drivers + OpenClaw + local models |
| [**Octi Pulpo**](https://github.com/AgentGuardHQ/octi-pulpo) | Coordination | Swarm brain — shared memory, model routing, budget-aware dispatch |
| [**AgentGuard**](https://github.com/AgentGuardHQ/agentguard) | Governance | Policy enforcement, telemetry, invariants — on every tool call |
| [**AgentGuard Cloud**](https://github.com/AgentGuardHQ/agentguard-cloud) | Observability | SaaS dashboard — session replay, compliance, analytics |

ShellForge orchestrates. Octi Pulpo coordinates. AgentGuard governs.

### Supported Runtimes

| Runtime | What It Adds | Best For |
|---------|-------------|----------|
| **CLI Drivers** | Claude Code, Codex, Copilot, Gemini, Goose | Coding, PRs, commits |
| **[OpenClaw](https://github.com/openclaw/openclaw)** | Browser automation, 100+ skills, web app access | Integrations, NotebookLM, ChatGPT |
| **[NemoClaw](https://github.com/NVIDIA/NemoClaw)** | OpenClaw + NVIDIA OpenShell sandbox + Nemotron | Enterprise, air-gapped, zero-cost local inference |
| **[Ollama](https://ollama.com)** | Local model inference (Metal GPU) | Privacy, zero API cost |

---

164 changes: 123 additions & 41 deletions docs/architecture.md
@@ -2,30 +2,119 @@

## Overview

ShellForge is a single Go binary (~7.5MB) that provides governed AI agent execution. Its core value is **governance** — every agent driver, whether a CLI tool, browser session, or local model, runs through AgentGuard policy enforcement on every action.

## Execution Model

ShellForge supports three classes of agent driver, all governed uniformly:

```
┌─────────────────────────────────────────────────────────────┐
│ CLI Drivers (coding) │
│ Claude Code · Codex · Copilot CLI · Gemini CLI · Goose │
├─────────────────────────────────────────────────────────────┤
│ OpenClaw / NemoClaw (browser + integrations) │
│ Web apps · NotebookLM · ChatGPT · Slack · 100+ skills │
├─────────────────────────────────────────────────────────────┤
│ Local Models (zero cost) │
│ Ollama · Nemotron (via NemoClaw) │
└─────────────────────────────────────────────────────────────┘
                              │ every tool call
                 ═════════════╪═════════════════
                 ║      AgentGuard Kernel      ║
                 ║    allow · deny · audit     ║
                 ═════════════╪═════════════════
                              │ approved
                    Octi Pulpo (coordination)
                 ─────────────┼─────────────────
    Your Environment (files, shell, git, browser, APIs)
```

### CLI Drivers

Purpose-built for code generation. Each uses its own subscription — no API keys needed:

| Driver | Subscription | Best For |
|--------|-------------|----------|
| `claude-code` | Claude Max | Complex reasoning, architecture |
| `codex` | OpenAI Pro | Code generation, refactoring |
| `copilot` | GitHub Pro | PR workflows, code review |
| `gemini-cli` | Google AI Premium | Analysis, multi-modal |
| `goose` | Free (local Ollama) | Air-gapped, zero cost |

### OpenClaw / NemoClaw Runtime

Browser automation and integrations via consumer app subscriptions:

| App | Via | Capability |
|-----|-----|-----------|
| ChatGPT | Browser (Playwright) | Reasoning tasks via existing ChatGPT Plus subscription |
| NotebookLM | Browser (Playwright) | Audio briefings, slide decks, charts, Drive docs |
| Gemini App | Browser (Playwright) | Multi-modal analysis via Google AI Premium |
| Slack, Discord | OpenClaw skills | Messaging, notifications, integrations |

**NemoClaw** (optional, heavier) adds:
- **NVIDIA OpenShell** — kernel-level sandbox (process isolation, not just policy)
- **Nemotron** — local NVIDIA models for zero-cost inference

### Local Models

Zero token cost via Ollama or Nemotron:

| Model | Params | RAM | Best For |
|-------|--------|-----|----------|
| `qwen3:1.7b` | 1.7B | ~1.2 GB | Fast triage, classification |
| `qwen3:8b` | 8B | ~6 GB | Balanced reasoning |
| `qwen3:30b` | 30B | ~19 GB | Production quality |
| Nemotron (via NemoClaw) | Various | GPU | NVIDIA hardware acceleration |
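Talking to a local model goes through Ollama's documented REST API. The sketch below builds (but does not send) a request against the standard `/api/generate` endpoint on `localhost:11434`, using a model name from the table above; the helper name is illustrative, not part of ShellForge's actual `internal/ollama` client.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// generateRequest mirrors the JSON body Ollama's /api/generate endpoint
// expects; Stream=false asks for a single complete JSON response.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"`
}

// newGenerateHTTPRequest builds the HTTP request without sending it,
// so the sketch stays runnable even with no Ollama daemon running.
func newGenerateHTTPRequest(model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(generateRequest{Model: model, Prompt: prompt, Stream: false})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest("POST", "http://localhost:11434/api/generate", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newGenerateHTTPRequest("qwen3:1.7b", "Classify this log line: OOM killed process 4312")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Sending the request with `http.DefaultClient.Do(req)` returns the model's reply as JSON; zero token cost, since inference runs locally.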

## The Governed Swarm Platform

ShellForge is one layer in a three-part platform:

```
┌─────────────────────────────────────────────────────┐
│ ShellForge │
│ Orchestration — forge and run agent swarms │
│ CLI drivers + OpenClaw/NemoClaw + local models │
├─────────────────────────────────────────────────────┤
│ Octi Pulpo │
│ Coordination — shared memory, model routing, │
│ budget-aware dispatch, priority signaling │
├─────────────────────────────────────────────────────┤
│ AgentGuard │
│ Governance — policy enforcement, telemetry, │
│ invariants, compliance │
└─────────────────────────────────────────────────────┘
```

ShellForge orchestrates. Octi Pulpo coordinates. AgentGuard governs.

## Cost-Aware Routing

Octi Pulpo routes tasks to the cheapest capable driver:

| Tier | Driver | Cost | Use When |
|------|--------|------|----------|
| Local | Ollama / Nemotron | $0 | Simple tasks, triage, classification |
| Subscription | Browser → ChatGPT / NotebookLM / Gemini | Already paying | Medium tasks, artifacts, briefings |
| CLI | Claude Code / Codex / Copilot | Already paying | Coding, PRs, commits |
| API | Direct API calls | Per-token | Burst capacity, programmatic access |
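The tier table above can be read as a cheapest-capable-first decision. The sketch below is a minimal illustration of that ordering; the type and function names are invented for this example, and Octi Pulpo's actual `route_recommend` interface is not shown in this document.

```go
package main

import "fmt"

// Tier mirrors the routing table above. All identifiers here are
// illustrative stand-ins, not Octi Pulpo's real API.
type Tier string

const (
	TierLocal        Tier = "local"        // Ollama / Nemotron, $0
	TierSubscription Tier = "subscription" // browser-driven apps, flat rate
	TierCLI          Tier = "cli"          // coding CLIs, flat rate
	TierAPI          Tier = "api"          // per-token, burst capacity
)

// recommendTier picks the cheapest capable tier: burst load goes to
// metered APIs, coding goes to the CLI drivers, simple tasks stay on
// free local models, and everything else uses flat-rate subscriptions.
func recommendTier(complexity float64, isCoding, needBurst bool) Tier {
	switch {
	case needBurst:
		return TierAPI
	case isCoding:
		return TierCLI
	case complexity < 0.3:
		return TierLocal
	default:
		return TierSubscription
	}
}

func main() {
	fmt.Println(recommendTier(0.1, false, false)) // simple triage → local
}
```

The ordering is the whole point: per-token spend only happens when both the free and the already-paid-for tiers are ruled out.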

## Infrastructure Stack

| Layer | Project | What It Does |
|-------|---------|--------------|
| **Infer** | [Ollama](https://ollama.com) | Local LLM inference (Metal GPU on Mac) |
| **Optimize** | [RTK](https://github.com/rtk-ai/rtk) | Token compression — 70-90% reduction on shell output |
| **Execute** | [Goose](https://block.github.io/goose) / [OpenClaw](https://github.com/openclaw/openclaw) | Agent execution + browser automation |
| **Orchestrate** | [Dagu](https://github.com/dagu-org/dagu) | YAML DAG workflows with scheduling and web UI |
| **Coordinate** | [Octi Pulpo](https://github.com/AgentGuardHQ/octi-pulpo) | Swarm coordination via MCP |
| **Govern** | [AgentGuard](https://github.com/AgentGuardHQ/agentguard) | Policy enforcement on every action |
| **Sandbox** | [OpenShell](https://github.com/NVIDIA/OpenShell) | Kernel-level isolation (Docker on macOS) |
| **Scan** | [DefenseClaw](https://github.com/cisco-ai-defense/defenseclaw) | Supply chain scanner — AI Bill of Materials |

## Go Project Layout

```
internal/
├── ollama/ # Ollama HTTP client (chat, generate)
├── agent/ # Native fallback agentic loop
├── tools/ # 5 tool implementations + RTK wrapper
├── engine/        # Pluggable engine interface (Goose, OpenClaw, OpenCode)
├── logger/ # Structured JSON logging
├── scheduler/ # Memory-aware scheduling + cron
├── orchestrator/ # Multi-agent state machine
├── normalizer/ # Canonical Action Representation
├── correction/ # Denial tracking + escalation
├── intent/ # Format-agnostic intent parsing
└── integration/ # RTK, OpenShell, DefenseClaw, TurboQuant, AgentGuard
```

## Engine Architecture

ShellForge uses a pluggable engine system:

1. **Goose** (preferred local driver) — subprocess, native Ollama support, SHELL wrapped via `govern-shell.sh`
2. **OpenClaw** (browser + integrations) — browser automation, web app access, 100+ skills
3. **NemoClaw** (enterprise) — OpenClaw + NVIDIA OpenShell sandbox + Nemotron local models
4. **CLI Drivers** (cloud coding) — Claude Code, Codex, Copilot CLI, Gemini CLI
5. **Native** (fallback) — built-in multi-turn loop with Ollama + tool calling

Engine selection is automatic based on what is installed. Use `shellforge run goose` for local models, or `shellforge agent` for the native loop.
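The preference order above amounts to a first-available scan with a guaranteed fallback. This is a minimal sketch of that idea — the `Engine` interface and method names are illustrative, not the actual types in `internal/engine`.

```go
package main

import "fmt"

// Engine is an illustrative version of the pluggable engine interface;
// the real method set in internal/engine is not shown in this document.
type Engine interface {
	Name() string
	Available() bool
}

// stub simulates an installed (or missing) driver for the example.
type stub struct {
	name      string
	installed bool
}

func (s stub) Name() string    { return s.name }
func (s stub) Available() bool { return s.installed }

// selectEngine walks the preference order and returns the first
// available engine, falling back to the built-in native loop.
func selectEngine(candidates []Engine, native Engine) Engine {
	for _, e := range candidates {
		if e.Available() {
			return e
		}
	}
	return native
}

func main() {
	order := []Engine{
		stub{"goose", false},      // preferred, but not installed here
		stub{"openclaw", true},    // next in line, installed
		stub{"claude-code", true}, // would be used if openclaw were absent
	}
	fmt.Println(selectEngine(order, stub{"native", true}).Name()) // → openclaw
}
```

Because the native loop always exists inside the binary, selection can never fail — it only degrades.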

## Governance Flow

```
User Request → Engine (Goose/OpenClaw/CLI/Native)
→ Tool Call → Governance Check (agentguard.yaml)
→ ALLOW → Execute Tool → Return Result
→ DENY → Log Violation → Correction Feedback → Retry
```
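The ALLOW/DENY branch above can be sketched as a single guarded call: denied actions return correction feedback instead of executing, so the engine can retry with a compliant action. The `Decision`/`Policy` types below are illustrative stand-ins for evaluating `agentguard.yaml`; the real policy schema is not shown in this document.

```go
package main

import "fmt"

// Decision is the outcome of a governance check.
type Decision int

const (
	Allow Decision = iota
	Deny
)

// Policy evaluates one tool call. Real policies are declared in
// agentguard.yaml; this signature is an illustrative simplification.
type Policy func(tool, arg string) Decision

// demoPolicy denies one obviously destructive shell command and allows
// everything else, purely for demonstration.
var demoPolicy Policy = func(tool, arg string) Decision {
	if tool == "shell" && arg == "rm -rf /" {
		return Deny
	}
	return Allow
}

// governedCall implements the flow above: allowed calls execute and
// return their result; denied calls return correction feedback so the
// agent can retry rather than fail silently.
func governedCall(p Policy, exec func(string, string) string, tool, arg string) (string, bool) {
	if p(tool, arg) == Deny {
		return fmt.Sprintf("denied: %s %q violates policy; choose another action", tool, arg), false
	}
	return exec(tool, arg), true
}

func main() {
	exec := func(tool, arg string) string { return "ran: " + arg }
	out, ok := governedCall(demoPolicy, exec, "shell", "git status")
	fmt.Println(ok, out) // true ran: git status
}
```

The key property is that the policy check sits between the model's intent and the environment, so a denial is observable and correctable rather than fatal.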

## Data Flow

1. User invokes `./shellforge qa` (or agent, report, scan)
2. CLI loads `agentguard.yaml` governance policy
3. Detects available engine (Goose > OpenClaw > CLI drivers > Native)
4. Engine sends prompt to Ollama (via RTK for token compression)
5. LLM responds with tool calls
6. Each tool call passes through governance check
7. Allowed tools execute (shell commands wrapped by RTK + OpenShell sandbox)
8. Results compressed by RTK, fed back to LLM
9. Loop continues until task complete or budget exhausted
The format-agnostic intent parser handles tool calls from any LLM output format (tool_calls, JSON blocks, XML tags, function_call).
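Format-agnostic parsing means different surface syntaxes normalize to the same internal action. The sketch below handles just two of the formats listed (a raw JSON object and a minimal XML tag); the `Intent` type and the exact tag shape are invented for illustration, not the real types in `internal/intent`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// Intent is an illustrative normalized tool call — the actual types in
// internal/intent are not shown in this document.
type Intent struct {
	Tool string `json:"tool"`
	Arg  string `json:"arg"`
}

// xmlCall matches a minimal hypothetical XML-style tool call.
var xmlCall = regexp.MustCompile(`<tool name="([^"]+)">([^<]*)</tool>`)

// parseIntent accepts either a raw JSON object or an XML tag and
// normalizes both into the same Intent value.
func parseIntent(output string) (Intent, bool) {
	output = strings.TrimSpace(output)
	var in Intent
	if err := json.Unmarshal([]byte(output), &in); err == nil && in.Tool != "" {
		return in, true
	}
	if m := xmlCall.FindStringSubmatch(output); m != nil {
		return Intent{Tool: m[1], Arg: m[2]}, true
	}
	return Intent{}, false
}

func main() {
	a, _ := parseIntent(`{"tool":"shell","arg":"git status"}`)
	b, _ := parseIntent(`<tool name="shell">git status</tool>`)
	fmt.Println(a == b) // true — two formats, one canonical intent
}
```

Downstream, the governance check only ever sees the canonical form, so policy enforcement is independent of which model (or output style) produced the call.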

## macOS (Apple Silicon) Support

All layers run on Mac M4:
- Ollama uses Metal for GPU acceleration
- RTK, AgentGuard, ShellForge are native arm64 binaries
- OpenShell runs inside Docker/Colima (Linux VM for Landlock)
- DefenseClaw installs via pip or source build
21 changes: 21 additions & 0 deletions docs/roadmap.md
@@ -72,6 +72,27 @@ Foundation types exist (`internal/action/`, `internal/orchestrator/`, `internal/

## Planned

### Phase 7.5 — Octi Pulpo Integration + Browser Drivers

ShellForge orchestrates, Octi Pulpo coordinates, AgentGuard governs. This phase wires the three together.

#### 7.5.1 — Octi Pulpo Coordination
- [ ] Consume Octi Pulpo MCP tools (route_recommend, coord_claim, coord_signal)
- [ ] Budget-aware driver selection — query Octi Pulpo before choosing model/driver
- [ ] Duplicate work prevention via coord_claim (prevents agent stampedes)
- [ ] Driver health signals — broadcast ShellForge agent status to Octi Pulpo

#### 7.5.2 — OpenClaw / NemoClaw Browser Driver
- [ ] OpenClaw as execution runtime for browser-based agents
- [ ] NemoClaw as optional adapter (never a dependency — protect kernel independence)
- [ ] Browser driver support in `shellforge run` (alongside Goose, Claude Code, Copilot, Codex, Gemini)
- [ ] Governed browser actions through AgentGuard kernel

#### 7.5.3 — Ecosystem Wiring
- [ ] ShellForge agents auto-connect to Octi Pulpo MCP server on startup
- [ ] Shared memory across ShellForge-managed agents via Octi Pulpo memory_store/recall
- [ ] Model routing delegation — ShellForge defers to Octi Pulpo route_recommend

### Phase 8 — AgentGuard MCP Server
- [ ] MCP server exposing governed tools
- [ ] Goose → MCP → AgentGuard → execute