AI Agent framework for Ruby. Built on ruby-mana.
Claw turns ruby-mana's embedded LLM engine into a full agent with persistent memory, interactive chat, and session recovery. Think of it as the agent layer on top of mana's execution engine.
gem install ruby-claw
claw launches a full-screen terminal UI (built on Charm Ruby's bubbletea) with 4 zones: top status bar, left chat panel, right status panel, and bottom command bar. All interaction happens in the TUI: type Ruby expressions directly, or use /ask to talk to the AI agent.
- Ruby-first: expressions are evaluated directly
- Natural language is automatically routed to the AI agent
- Streaming output with markdown rendering
- Session persists across restarts
Claw stores memories as human-readable Markdown in .ruby-claw/:
.ruby-claw/
MEMORY.md # Long-term facts (editable!)
session.md # Conversation summary
system_prompt.md # Custom agent personality
values.json # Variable snapshots
definitions.rb # Method definitions
log/
2026-03-29.md # Daily interaction log
traces/
20260405_103000.md # Execution traces
evolution/
20260405_accept.md # Evolution logs
gems/ # Editable gem source (after claw init)
The LLM can remember facts that persist across sessions:
claw> remember that the API uses OAuth2
claw> # ... next session ...
claw> what auth does our API use?
# => "OAuth2 — I remembered this from a previous session"
Variables and method definitions survive across sessions:
claw> a = 42
claw> def greet(name) = "Hello #{name}"
claw> exit
$ claw # restart
claw> a # => 42
claw> greet("world") # => "Hello world"
When the conversation grows large, old messages are automatically summarized in the background.
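The background summarization can be pictured with a small sketch. It is not Claw's implementation: the `Message` struct, the 4-characters-per-token estimate, and the `summarizer` lambda (standing in for the LLM call) are all assumptions, and only the 0.7 pressure threshold and the 4 kept rounds mirror the configuration defaults documented in this README.

```ruby
# Hypothetical sketch: none of these names come from Claw's source.
Message = Struct.new(:role, :text)

# Rough token estimate (assumption: about 4 characters per token).
def estimate_tokens(messages)
  messages.sum { |m| (m.text.length / 4.0).ceil }
end

# Compact once usage exceeds `pressure` of the context window, keeping the
# last `keep_recent` rounds (one round = user message + assistant reply).
# `summarizer` stands in for the LLM call that writes the summary.
def compact_history(messages, context_window:, summarizer:, pressure: 0.7, keep_recent: 4)
  return messages if estimate_tokens(messages) <= context_window * pressure

  keep = keep_recent * 2
  return messages if messages.length <= keep

  old = messages[0...-keep]
  recent = messages.last(keep)
  [Message.new("system", summarizer.call(old))] + recent
end

history = Array.new(12) do |i|
  Message.new(i.even? ? "user" : "assistant", "message #{i} " * 20)
end
summarize = ->(old) { "Summary of #{old.length} earlier messages" }

compacted = compact_history(history, context_window: 500, summarizer: summarize)
puts compacted.length     # 9 (one summary + the last 8 messages)
puts compacted.first.text # Summary of 4 earlier messages
```

Returning the original array untouched when there is no pressure keeps the common path cheap; a real implementation would presumably use the model's tokenizer rather than a character count.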
Temporarily disable memory loading and saving:
Claw.incognito do
~"translate <text> to French, store in <french>"
# No memories loaded, nothing remembered
end
Claw::Memory.incognito? # => true inside the block
With many memories (>20), only the most relevant are injected into prompts.
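How relevance is scored is not described here. As an illustration only, the sketch below ranks memories by keyword overlap with the query; the scoring method and the names `tokenize` and `relevant_memories` are assumptions, while the threshold of 20 and a top-k limit come from this README.

```ruby
# Hypothetical sketch: keyword-overlap scoring is an assumption, not Claw's ranking.
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/).uniq
end

# Return every memory when there are few; otherwise the top_k by word
# overlap with the query.
def relevant_memories(memories, query, threshold: 20, top_k: 10)
  return memories if memories.length <= threshold

  query_words = tokenize(query)
  memories
    .each_with_index
    .map { |memory, index| [memory, (tokenize(memory) & query_words).length, index] }
    .sort_by { |_, score, index| [-score, index] } # ties keep original order
    .first(top_k)
    .map(&:first)
end

memories = Array.new(25) { |i| "note #{i} about deployment" }
memories << "the API uses OAuth2 for auth"

p relevant_memories(memories, "what auth does our API use?", top_k: 3)
# ["the API uses OAuth2 for auth", "note 0 about deployment", "note 1 about deployment"]
```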
Snapshot and roll back the entire agent state (context, memory, variables, filesystem):
claw> /snapshot before-refactor
✓ snapshot #2 created (before-refactor)
claw> # ... make changes ...
claw> /rollback 2
✓ rolled back to snapshot #2
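The snapshot and rollback semantics can be sketched for in-memory state with deep copies. `SnapshotStore` is a hypothetical class, not part of Claw, and it covers only Ruby data; filesystem state is outside the scope of this sketch.

```ruby
# Hypothetical sketch: an in-memory snapshot store built on Marshal deep copies.
class SnapshotStore
  Snapshot = Struct.new(:id, :label, :state)

  def initialize
    @snapshots = []
  end

  # Deep-copy the state so later mutations do not leak into the snapshot.
  def snapshot(state, label = nil)
    copy = Marshal.load(Marshal.dump(state))
    snap = Snapshot.new(@snapshots.length + 1, label, copy)
    @snapshots << snap
    snap.id
  end

  # Return a fresh deep copy of the state stored under the given id.
  def rollback(id)
    snap = @snapshots.find { |s| s.id == id }
    raise ArgumentError, "unknown snapshot: #{id}" unless snap

    Marshal.load(Marshal.dump(snap.state))
  end

  # List [id, label] pairs, oldest first.
  def history
    @snapshots.map { |s| [s.id, s.label] }
  end
end

store = SnapshotStore.new
state = { vars: { a: 42 }, memory: ["the API uses OAuth2"] }

id = store.snapshot(state, "before-refactor")
state[:vars][:a] = 0          # ... make changes ...
state = store.rollback(id)
p state[:vars][:a]            # 42
```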
Slash commands:
| Command | Description |
|---|---|
| /snapshot [label] | Snapshot all resources |
| /rollback <id> | Rollback to a snapshot |
| /diff [id_a id_b] | Show diff between snapshots |
| /history | List all snapshots |
| /status | Show current resource state |
| /evolve | Run a self-evolution cycle |
| /role <name> | Switch agent role/identity |
| /forge <method> | Promote a method to a formal tool |
/plan toggles plan mode. When active, the LLM generates a step-by-step plan without executing any tools. The user reviews the proposed steps, then confirms execution -- which runs in a safe fork so the original state is preserved if anything goes wrong.
Role files are Markdown documents stored in .ruby-claw/roles/. Each role defines an agent identity (system prompt, constraints, tool permissions).
- /role <name> switches the active agent identity at runtime
- claw init creates a default role
claw benchmark run executes the benchmark suite -- 9 built-in tasks spanning the mana, claw, runtime, and evolution layers. Each task runs 3 times, and scoring covers:
- Correctness -- did the agent produce the right result?
- Rounds efficiency -- how many LLM round-trips were needed?
- Token efficiency -- total token usage
- Tool path accuracy -- did the agent call the expected tools in the expected order?
claw benchmark diff <a> <b> compares two benchmark reports side by side. A score regression or 3 consecutive failures automatically triggers an evolution cycle.
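The trigger condition can be expressed in a few lines. This is a sketch under stated assumptions: scores are taken to be floats between 0.0 and 1.0 averaged over each task's runs, and the method names are hypothetical rather than Claw's own.

```ruby
# Hypothetical sketch: names and the 0.0..1.0 score scale are illustrative.

# Average the per-run scores for each task.
def task_scores(runs_by_task)
  runs_by_task.transform_values { |runs| runs.sum / runs.length.to_f }
end

# True when the mean score dropped compared with the previous report.
def regression?(previous, current)
  mean = ->(scores) { scores.values.sum / scores.length.to_f }
  mean.call(current) < mean.call(previous)
end

# True when the last `limit` results are all failures (false values).
def consecutive_failures?(results, limit: 3)
  results.length >= limit && results.last(limit).none?
end

def evolve?(previous, current, recent_results)
  regression?(previous, current) || consecutive_failures?(recent_results)
end

previous = task_scores("average" => [1.0, 1.0, 1.0], "memory" => [1.0, 0.5, 1.0])
current  = task_scores("average" => [1.0, 0.0, 1.0], "memory" => [1.0, 1.0, 1.0])

p regression?(previous, current)                        # true
p consecutive_failures?([true, false, false, false])    # true
```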
runtime.fork_async(prompt:, vars:, role:) spawns a child agent that runs in an isolated thread with deep-copied variables and an optional git worktree for filesystem isolation.
Child lifecycle methods:
- child.join -- block until the child finishes
- child.cancel! -- abort the child
- child.diff -- inspect changes made by the child
- child.merge! -- merge the child's results back into the parent
All operations are thread-safe with Mutex protection.
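The child lifecycle above can be approximated with a thread, a deep copy, and a mutex. `ChildAgent` below is a hypothetical stand-in, not Claw's class: it isolates variables only (no git worktree), and cancellation is cooperative, meaning the work block has to poll `cancelled?`.

```ruby
# Hypothetical sketch of a child agent handle; not Claw's implementation.
class ChildAgent
  def initialize(parent_vars, &work)
    @parent_vars = parent_vars
    @vars = Marshal.load(Marshal.dump(parent_vars)) # deep copy for isolation
    @lock = Mutex.new
    @cancelled = false
    @thread = Thread.new { work.call(@vars, self) }
  end

  def cancelled?
    @lock.synchronize { @cancelled }
  end

  # Cooperative cancellation: the work block is expected to poll `cancelled?`.
  def cancel!
    @lock.synchronize { @cancelled = true }
  end

  def join
    @thread.join
    self
  end

  # Keys whose values differ from the parent's, as { key => [parent, child] }.
  def diff
    join
    @lock.synchronize do
      (@parent_vars.keys | @vars.keys).each_with_object({}) do |key, changes|
        before = @parent_vars[key]
        after = @vars[key]
        changes[key] = [before, after] unless before == after
      end
    end
  end

  # Copy the child's changed values back into the parent's variables.
  def merge!
    changes = diff
    @lock.synchronize do
      changes.each { |key, (_, after)| @parent_vars[key] = after }
    end
    @parent_vars
  end
end

vars = { numbers: [1, 2, 3], average: nil }
child = ChildAgent.new(vars) do |v, _agent|
  v[:average] = v[:numbers].sum / v[:numbers].length.to_f
end

child.join
p vars[:average]   # nil (the parent is untouched until merge!)
child.merge!
p vars[:average]   # 2.0
```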
Every LLM interaction is logged as a Markdown file in .ruby-claw/traces/:
# Task: compute average of numbers
- Model: claude-sonnet-4-20250514
- Steps: 2
- Total tokens: 1100 in / 350 out
- Total latency: 1400ms
## Step 1
- Latency: 800ms
- Tokens: 500 in / 200 out
### Tool calls
- **read_var**(name: "numbers") -> [1, 2, 3]
Claw has a three-layer tool architecture:
- Core tools (always loaded): read_var, write_var, call_func, eval, remember, search_tools, load_tool
- Project tools (on-demand): .ruby-claw/tools/*.rb, indexed at startup, loaded via load_tool
- Hub tools (remote): community tools from a ruby-claw-toolhub, downloaded on demand
Create a project tool:
# .ruby-claw/tools/format_report.rb
class FormatReport
include Claw::Tool
tool_name "format_report"
description "Format raw data into a readable report"
parameter :data, type: "Hash", required: true, desc: "Raw data"
parameter :style, type: "String", required: false, desc: "brief or detailed"
def call(data:, style: "brief")
# ...
end
end
The agent discovers tools via search_tools and loads them via load_tool. Use /forge <method_name> to promote an eval-defined method into a formal tool class.
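To make the index-then-load flow concrete, here is a self-contained sketch. `SketchTool` and `ToolRegistry` are hypothetical stand-ins written for this example; they imitate the shape of the DSL shown above but are not the real `Claw::Tool` module, and the substring search is an assumption.

```ruby
# Hypothetical stand-in for the tool DSL; the real Claw::Tool may differ.
module SketchTool
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def tool_name(name = nil)
      name ? (@tool_name = name) : @tool_name
    end

    def description(text = nil)
      text ? (@description = text) : @description
    end

    def parameters
      @parameters ||= {}
    end

    def parameter(name, type:, required: false, desc: "")
      parameters[name] = { type: type, required: required, desc: desc }
    end
  end
end

# Indexes tool classes by name; a tool is callable only after it is loaded.
class ToolRegistry
  def initialize
    @index = {}
    @loaded = {}
  end

  def index(klass)
    @index[klass.tool_name] = klass
  end

  # Case-insensitive substring match on name and description.
  def search_tools(query)
    needle = query.downcase
    @index.values
          .select { |k| "#{k.tool_name} #{k.description}".downcase.include?(needle) }
          .map(&:tool_name)
  end

  def load_tool(name)
    klass = @index.fetch(name) { raise KeyError, "unknown tool: #{name}" }
    @loaded[name] ||= klass.new
  end

  def call(name, **args)
    tool = @loaded.fetch(name) { raise KeyError, "tool not loaded: #{name}" }
    tool.call(**args)
  end
end

class FormatReport
  include SketchTool
  tool_name "format_report"
  description "Format raw data into a readable report"
  parameter :data, type: "Hash", required: true, desc: "Raw data"
  parameter :style, type: "String", required: false, desc: "brief or detailed"

  def call(data:, style: "brief")
    lines = data.map { |key, value| "#{key}: #{value}" }
    style == "brief" ? lines.join(", ") : lines.join("\n")
  end
end

registry = ToolRegistry.new
registry.index(FormatReport)
p registry.search_tools("report")   # ["format_report"]
registry.load_tool("format_report")
puts registry.call("format_report", data: { users: 3, errors: 0 })
# users: 3, errors: 0
```

Keeping the index separate from the loaded set means unused tools cost only a name and description until the agent asks for them.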
claw console launches a local web UI at http://127.0.0.1:4567 for observability and operations:
- Dashboard — version, tool/memory/snapshot counts
- Prompt Inspector — view and edit the assembled system prompt
- LLM Monitor — real-time event stream via Server-Sent Events
- Trace Explorer — browse execution traces
- Memory Manager — add/remove long-term memories
- Tool Manager — view core tools, load/unload project tools
- Snapshot Manager — create snapshots, rollback state
All data is served via a REST API (/api/status, /api/traces, /api/memory, etc.).
Initialize a project with editable gem source for self-evolution:
claw init
Creates:
.ruby-claw/
gems/
ruby-claw/ # Editable source
ruby-mana/
tools/ # Project tool classes
roles/ # Agent role definitions
benchmarks/ # Benchmark reports
system_prompt.md # Customizable agent personality
MEMORY.md
.git/ # Filesystem snapshots
The agent can improve its own code:
claw> /evolve
⚡ running evolution cycle...
✓ accepted: Improve error message specificity
Flow: read traces → LLM diagnoses improvement → fork runtime → apply change → run tests → keep or rollback.
Evolution logs are written to .ruby-claw/evolution/.
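The keep-or-rollback step of this flow might look like the sketch below. Everything in it is illustrative: `propose` stands in for the LLM diagnosis, `tests` for the test run, and a deep copy of a plain hash stands in for the forked runtime.

```ruby
# Hypothetical sketch of the accept-or-rollback step; all names are illustrative.
# `propose` returns [title, change], where `change` mutates a copy of the state.
# `tests` returns true when the changed state is acceptable.
def evolve(state, propose:, tests:)
  title, change = propose.call(state)
  candidate = Marshal.load(Marshal.dump(state)) # fork: work on a deep copy
  change.call(candidate)

  if tests.call(candidate)
    { status: :accepted, title: title, state: candidate }
  else
    { status: :rejected, title: title, state: state } # original left untouched
  end
rescue StandardError => e
  { status: :rejected, title: "error: #{e.message}", state: state }
end

state = { error_message: "failed" }
propose = lambda do |_state|
  ["Improve error message specificity",
   ->(s) { s[:error_message] = "failed: file not found" }]
end
tests = ->(s) { s[:error_message].include?(":") }

result = evolve(state, propose: propose, tests: tests)
puts "#{result[:status]}: #{result[:title]}"
# accepted: Improve error message specificity
```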
| Command | Description |
|---|---|
| claw | Launch the TUI (default) |
| claw init | Scaffold a new project |
| claw status | Show current resource state |
| claw history | List all snapshots |
| claw rollback <id> | Rollback to a snapshot |
| claw trace [id] | View execution traces |
| claw evolve | Run a self-evolution cycle |
| claw benchmark run | Run the benchmark suite |
| claw benchmark diff <a> <b> | Compare two benchmark reports |
| claw console | Launch the web console UI |
| claw version | Print version |
| claw help | Show help |
Claw.configure do |c|
c.memory_pressure = 0.7 # Compact when tokens > 70% of context window
c.memory_keep_recent = 4 # Keep last 4 conversation rounds during compaction
c.compact_model = nil # nil = use main model for summarization
c.persist_session = true # Save/restore session across restarts
c.memory_top_k = 10 # Max memories to inject when searching
c.on_compact = ->(summary) { puts summary }
c.tools_dir = nil # Custom tools directory (default: .ruby-claw/tools)
c.hub_url = nil # Remote tool hub URL
c.console_port = 4567 # Web console port
end
# Mana config (inherited)
Mana.configure do |c|
c.model = "claude-sonnet-4-6"
c.api_key = "sk-..."
end
Claw extends mana via its tool registration interface, with no monkey-patching:
# Claw registers the "remember" tool into mana's engine
Mana.register_tool(remember_tool_definition) { |input| ... }
# Claw injects long-term memories into mana's system prompt
Mana.register_prompt_section { |context| memory_text }
- ruby-mana = Embedded LLM engine (~"..." syntax, binding manipulation, tool calling)
- ruby-claw = Agent framework (TUI, memory, persistence, knowledge)
Claw depends on mana. You can use mana standalone for embedding LLM in Ruby code, or add claw for interactive agent features.
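The registration-based layering can be illustrated with a toy engine. `SketchEngine` is hypothetical and is not ruby-mana; it only mirrors the shape of the `register_tool` and `register_prompt_section` calls shown earlier, and the tool definition keys and input format are assumptions.

```ruby
# Hypothetical sketch of a host engine exposing registration hooks.
# SketchEngine is not ruby-mana; it only mirrors the shape of those two calls.
module SketchEngine
  @tools = {}
  @prompt_sections = []

  class << self
    # Store a tool definition together with the block that handles its input.
    def register_tool(definition, &handler)
      @tools[definition.fetch(:name)] = { definition: definition, handler: handler }
    end

    # Store a block that contributes text to the system prompt.
    def register_prompt_section(&section)
      @prompt_sections << section
    end

    def call_tool(name, input)
      @tools.fetch(name)[:handler].call(input)
    end

    def system_prompt(base, context = {})
      ([base] + @prompt_sections.map { |s| s.call(context) }).join("\n\n")
    end
  end
end

# An extension layer plugs in without reopening any engine classes.
memories = []

SketchEngine.register_tool({ name: "remember", description: "Store a fact" }) do |input|
  memories << input.fetch("content")
  "Remembered: #{input.fetch('content')}"
end

SketchEngine.register_prompt_section do |_context|
  "Long-term memories:\n" + memories.map { |m| "- #{m}" }.join("\n")
end

puts SketchEngine.call_tool("remember", { "content" => "the API uses OAuth2" })
puts SketchEngine.system_prompt("You are a helpful agent.")
```

Because the engine only ever calls registered blocks, the agent layer can be added or removed without touching engine code, which is the property the "no monkey-patching" claim describes.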
MIT