diff --git a/README.md b/README.md index c3c51ab..6b64a12 100644 --- a/README.md +++ b/README.md @@ -256,13 +256,25 @@ agents: --- -## The AgentGuard Platform +## The Governed Swarm Platform -| Project | What It Does | -|---------|--------------| -| [**AgentGuard**](https://github.com/AgentGuardHQ/agentguard) | Governance kernel — policy enforcement for any agent driver | -| [**AgentGuard Cloud**](https://github.com/AgentGuardHQ/agentguard-cloud) | SaaS dashboard — observability, session replay, compliance | -| **ShellForge** | Governed local agent runtime — the onramp to AgentGuard | +| Project | Role | What It Does | +|---------|------|--------------| +| **ShellForge** | Orchestration | Governed agent runtime — CLI drivers + OpenClaw + local models | +| [**Octi Pulpo**](https://github.com/AgentGuardHQ/octi-pulpo) | Coordination | Swarm brain — shared memory, model routing, budget-aware dispatch | +| [**AgentGuard**](https://github.com/AgentGuardHQ/agentguard) | Governance | Policy enforcement, telemetry, invariants — on every tool call | +| [**AgentGuard Cloud**](https://github.com/AgentGuardHQ/agentguard-cloud) | Observability | SaaS dashboard — session replay, compliance, analytics | + +ShellForge orchestrates. Octi Pulpo coordinates. AgentGuard governs. + +### Supported Runtimes + +| Runtime | What It Adds | Best For | +|---------|-------------|----------| +| **CLI Drivers** | Claude Code, Codex, Copilot, Gemini, Goose | Coding, PRs, commits | +| **[OpenClaw](https://github.com/openclaw/openclaw)** | Browser automation, 100+ skills, web app access | Integrations, NotebookLM, ChatGPT | +| **[NemoClaw](https://github.com/NVIDIA/NemoClaw)** | OpenClaw + NVIDIA OpenShell sandbox + Nemotron | Enterprise, air-gapped, zero-cost local inference | +| **[Ollama](https://ollama.com)** | Local model inference (Metal GPU) | Privacy, zero API cost | --- diff --git a/docs/architecture.md b/docs/architecture.md index 9a6394d..017ca82 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -2,30 +2,119 @@ ## Overview -ShellForge is a single Go binary (~7.5MB) that provides governed local AI agent execution. Its core value is **governance** — when frameworks like OpenCode or DeepAgents are installed, they provide the agentic loop; ShellForge wraps them with AgentGuard policy enforcement. +ShellForge is a single Go binary (~7.5MB) that provides governed AI agent execution. Its core value is **governance** — every agent driver, whether a CLI tool, browser session, or local model, runs through AgentGuard policy enforcement on every action. -## 8-Layer Stack +## Execution Model + +ShellForge supports three classes of agent driver, all governed uniformly: + +``` +┌─────────────────────────────────────────────────────────────┐ +│ CLI Drivers (coding) │ +│ Claude Code · Codex · Copilot CLI · Gemini CLI · Goose │ +├─────────────────────────────────────────────────────────────┤ +│ OpenClaw / NemoClaw (browser + integrations) │ +│ Web apps · NotebookLM · ChatGPT · Slack · 100+ skills │ +├─────────────────────────────────────────────────────────────┤ +│ Local Models (zero cost) │ +│ Ollama · Nemotron (via NemoClaw) │ +└─────────────────────────────────────────────────────────────┘ + │ every tool call + ═════╪═════════════════ + ║ AgentGuard Kernel ║ + ║ allow · deny · audit║ + ═════╪═════════════════ + │ approved + Octi Pulpo (coordination) + ─────┼───────────────── + │ + Your Environment (files, shell, git, browser, APIs) +``` + +### CLI Drivers + +Purpose-built for code generation. Each uses its own subscription — no API keys needed: + +| Driver | Subscription | Best For | +|--------|-------------|----------| +| `claude-code` | Claude Max | Complex reasoning, architecture | +| `codex` | OpenAI Pro | Code generation, refactoring | +| `copilot` | GitHub Pro | PR workflows, code review | +| `gemini-cli` | Google AI Premium | Analysis, multi-modal | +| `goose` | Free (local Ollama) | Air-gapped, zero cost | + +### OpenClaw / NemoClaw Runtime + +Browser automation and integrations via consumer app subscriptions: + +| App | Via | Capability | +|-----|-----|-----------| +| ChatGPT | Browser (Playwright) | Reasoning tasks via existing OpenAI Plus subscription | +| NotebookLM | Browser (Playwright) | Audio briefings, slide decks, charts, Drive docs | +| Gemini App | Browser (Playwright) | Multi-modal analysis via Google AI Premium | +| Slack, Discord | OpenClaw skills | Messaging, notifications, integrations | + +**NemoClaw** (optional, heavier) adds: +- **NVIDIA OpenShell** — kernel-level sandbox (process isolation, not just policy) +- **Nemotron** — local NVIDIA models for zero-cost inference + +### Local Models + +Zero token cost via Ollama or Nemotron: + +| Model | Params | RAM | Best For | +|-------|--------|-----|----------| +| `qwen3:1.7b` | 1.7B | ~1.2 GB | Fast triage, classification | +| `qwen3:8b` | 8B | ~6 GB | Balanced reasoning | +| `qwen3:30b` | 30B | ~19 GB | Production quality | +| Nemotron (via NemoClaw) | Various | GPU | NVIDIA hardware acceleration | + +## The Governed Swarm Platform + +ShellForge is one layer in a three-part platform: ``` -┌─────────────────────────────────────────────┐ -│ Layer 8: OpenShell (Kernel Sandbox) │ Docker/Colima isolation -├─────────────────────────────────────────────┤ -│ Layer 7: DefenseClaw (Supply Chain) │ Cisco AI BoM Scanner -├─────────────────────────────────────────────┤ -│ Layer 6: Dagu (Orchestration) │ YAML DAG workflows + web UI -├─────────────────────────────────────────────┤ -│ Layer 5: Goose / OpenCode (Execution) │ Primary local agent driver -├─────────────────────────────────────────────┤ -│ Layer 4: AgentGuard (Governance Kernel) │ Policy enforcement -├─────────────────────────────────────────────┤ -│ Layer 3: TurboQuant (Quantization) │ KV cache optimization (optional) -├─────────────────────────────────────────────┤ -│ Layer 2: RTK (Token Compression) │ Auto-compress I/O (optional) -├─────────────────────────────────────────────┤ -│ Layer 1: Ollama (Local LLM) │ Metal GPU on Mac -└─────────────────────────────────────────────┘ +┌─────────────────────────────────────────────────────┐ +│ ShellForge │ +│ Orchestration — forge and run agent swarms │ +│ CLI drivers + OpenClaw/NemoClaw + local models │ +├─────────────────────────────────────────────────────┤ +│ Octi Pulpo │ +│ Coordination — shared memory, model routing, │ +│ budget-aware dispatch, priority signaling │ +├─────────────────────────────────────────────────────┤ +│ AgentGuard │ +│ Governance — policy enforcement, telemetry, │ +│ invariants, compliance │ +└─────────────────────────────────────────────────────┘ ``` +ShellForge orchestrates. Octi Pulpo coordinates. AgentGuard governs. + +## Cost-Aware Routing + +Octi Pulpo routes tasks to the cheapest capable driver: + +| Tier | Driver | Cost | Use When | +|------|--------|------|----------| +| Local | Ollama / Nemotron | $0 | Simple tasks, triage, classification | +| Subscription | Browser → ChatGPT / NotebookLM / Gemini | Already paying | Medium tasks, artifacts, briefings | +| CLI | Claude Code / Codex / Copilot | Already paying | Coding, PRs, commits | +| API | Direct API calls | Per-token | Burst capacity, programmatic access | + +## Infrastructure Stack + +| Layer | Project | What It Does | +|-------|---------|--------------| +| **Infer** | [Ollama](https://ollama.com) | Local LLM inference (Metal GPU on Mac) | +| **Optimize** | [RTK](https://github.com/rtk-ai/rtk) | Token compression — 70-90% reduction on shell output | +| **Execute** | [Goose](https://block.github.io/goose) / [OpenClaw](https://github.com/openclaw/openclaw) | Agent execution + browser automation | +| **Orchestrate** | [Dagu](https://github.com/dagu-org/dagu) | YAML DAG workflows with scheduling and web UI | +| **Coordinate** | [Octi Pulpo](https://github.com/AgentGuardHQ/octi-pulpo) | Swarm coordination via MCP | +| **Govern** | [AgentGuard](https://github.com/AgentGuardHQ/agentguard) | Policy enforcement on every action | +| **Sandbox** | [OpenShell](https://github.com/NVIDIA/OpenShell) | Kernel-level isolation (Docker on macOS) | +| **Scan** | [DefenseClaw](https://github.com/cisco-ai-defense/defenseclaw) | Supply chain scanner — AI Bill of Materials | + ## Go Project Layout ``` @@ -38,8 +127,13 @@ internal/ ├── ollama/ # Ollama HTTP client (chat, generate) ├── agent/ # Native fallback agentic loop ├── tools/ # 5 tool implementations + RTK wrapper -├── engine/ # Pluggable engine interface (OpenCode, DeepAgents) +├── engine/ # Pluggable engine interface (Goose, OpenClaw, OpenCode) ├── logger/ # Structured JSON logging +├── scheduler/ # Memory-aware scheduling + cron +├── orchestrator/ # Multi-agent state machine +├── normalizer/ # Canonical Action Representation +├── correction/ # Denial tracking + escalation +├── intent/ # Format-agnostic intent parsing └── integration/ # RTK, OpenShell, DefenseClaw, TurboQuant, AgentGuard ``` @@ -47,39 +141,27 @@ internal/ ShellForge uses a pluggable engine system: -1. **Goose (Block)** (preferred local driver) — subprocess, native Ollama support, SHELL wrapped via `govern-shell.sh` -2. **OpenCode** (alternative) — subprocess, `--non-interactive` mode, governance-wrapped -3. **DeepAgents** (alternative) — subprocess, Node.js/Python SDK, governance-wrapped -4. **Native** (fallback) — built-in multi-turn loop with Ollama + tool calling - -The engine selection is automatic based on what's installed. Use `shellforge run goose` for local models, or `shellforge agent` for the native loop. +1. **Goose** (preferred local driver) — subprocess, native Ollama support, SHELL wrapped via `govern-shell.sh` +2. **OpenClaw** (browser + integrations) — browser automation, web app access, 100+ skills +3. **NemoClaw** (enterprise) — OpenClaw + NVIDIA OpenShell sandbox + Nemotron local models +4. **CLI Drivers** (cloud coding) — Claude Code, Codex, Copilot CLI, Gemini CLI +5. **Native** (fallback) — built-in multi-turn loop with Ollama + tool calling ## Governance Flow ``` -User Request → Engine (Goose/OpenCode/DeepAgents/Native) +User Request → Engine (Goose/OpenClaw/CLI/Native) → Tool Call → Governance Check (agentguard.yaml) → ALLOW → Execute Tool → Return Result → DENY → Log Violation → Correction Feedback → Retry ``` -## Data Flow - -1. User invokes `./shellforge qa` (or agent, report, scan) -2. CLI loads `agentguard.yaml` governance policy -3. Detects available engine (Goose > OpenCode > DeepAgents > Native) -4. Engine sends prompt to Ollama (via RTK for token compression) -5. LLM responds with tool calls -6. Each tool call passes through governance check -7. Allowed tools execute (shell commands wrapped by RTK + OpenShell sandbox) -8. Results compressed by RTK, fed back to LLM -9. Loop continues until task complete or budget exhausted +The format-agnostic intent parser handles tool calls from any LLM output format (tool_calls, JSON blocks, XML tags, function_call). ## macOS (Apple Silicon) Support -All 8 layers run on Mac M4: +All layers run on Mac M4: - Ollama uses Metal for GPU acceleration -- RTK, AgentGuard, OpenCode are native arm64 binaries -- TurboQuant runs via PyTorch (MPS backend) +- RTK, AgentGuard, ShellForge are native arm64 binaries - OpenShell runs inside Docker/Colima (Linux VM for Landlock) - DefenseClaw installs via pip or source build diff --git a/docs/roadmap.md b/docs/roadmap.md index ab10459..283e75e 100644 --- a/docs/roadmap.md +++ b/docs/roadmap.md @@ -72,6 +72,27 @@ Foundation types exist (`internal/action/`, `internal/orchestrator/`, `internal/ ## Planned +### Phase 7.5 — Octi Pulpo Integration + Browser Drivers + +ShellForge orchestrates, Octi Pulpo coordinates, AgentGuard governs. This phase wires the three together. + +#### 7.5.1 — Octi Pulpo Coordination +- [ ] Consume Octi Pulpo MCP tools (route_recommend, coord_claim, coord_signal) +- [ ] Budget-aware driver selection — query Octi Pulpo before choosing model/driver +- [ ] Duplicate work prevention via coord_claim (prevents agent stampedes) +- [ ] Driver health signals — broadcast ShellForge agent status to Octi Pulpo + +#### 7.5.2 — OpenClaw / NemoClaw Browser Driver +- [ ] OpenClaw as execution runtime for browser-based agents +- [ ] NemoClaw as optional adapter (never a dependency — protect kernel independence) +- [ ] Browser driver support in `shellforge run` (alongside Goose, Claude Code, Copilot, Codex, Gemini) +- [ ] Governed browser actions through AgentGuard kernel + +#### 7.5.3 — Ecosystem Wiring +- [ ] ShellForge agents auto-connect to Octi Pulpo MCP server on startup +- [ ] Shared memory across ShellForge-managed agents via Octi Pulpo memory_store/recall +- [ ] Model routing delegation — ShellForge defers to Octi Pulpo route_recommend + ### Phase 8 — AgentGuard MCP Server - [ ] MCP server exposing governed tools - [ ] Goose → MCP → AgentGuard → execute