Ember is a command-line AI coding agent written in Rust. It runs an agentic loop (plan, use tools, observe results, repeat) against any LLM backend you configure. One binary, no Python runtime, no Node.js.
- **Provider-agnostic** – 10 LLM backends out of the box. Switch models mid-session with `/model`. Bring your own OpenAI-compatible endpoint.
- **`/compare` A/B testing** – Send the same prompt to two providers side by side. Compare quality and cost. Pick the winner. No other CLI tool can do this.
- **Smart model cascade** – `--model auto` analyzes prompt complexity and routes simple questions to fast/cheap models and complex tasks to powerful ones. Save money without losing quality.
- **`EMBER.md` project context** – Drop an `EMBER.md` in your project root and Ember loads it as system context, like Claude Code's `CLAUDE.md`.
- **`/undo`** – Revert the last file change made by a tool. Every write is snapshotted.
- **Git-native** – `/commit` and `/diff` right from the REPL. Auto-commit after tool runs.
- **Session forking** – Branch a conversation like a git branch. Explore an alternative approach, then restore to the fork point if it doesn't work out.
- **Voice mode (preview)** – `ember voice` for hands-free coding with speech-to-text and TTS.
- **RAG indexing (preview)** – `ember index .` to embed your codebase for semantic search.
- **Multi-agent orchestration (preview)** – `ember agents run "task" --roles coder,reviewer` for parallel specialized agents.
- **Plugin hooks (preview)** – Intercept any tool call before or after execution. Approve, deny, log, or transform output from your own code.
- **Auto-compaction** – When the context window fills up, the oldest turns are summarised in place. The session continues without interruption.
- **Cost tracking** – Every API call records token counts and a USD estimate. `/cost` shows the running total for the session.
- **Granular permissions** – Restrict which paths a tool may read or write, which commands it may run, and whether writes are allowed at all.
- **MCP support** – Connect external tool servers over stdio, HTTP, or WebSocket. Tools are namespaced and auto-discovered.
- **Plan Mode** – `/plan` toggles a read-only mode where Ember proposes changes without executing; `/execute` runs the plan. Like Gemini CLI's Plan Mode, but better.
- **`.ember/rules/` directory** – Modular rule files instead of one giant `EMBER.md`. Organize by concern: `style.md`, `testing.md`, `security.md`. Auto-merged at load.
- **`/checkpoint` + `/replay`** – Save conversation checkpoints and replay sessions as tutorials. Great for onboarding and code review.
- **`ember bench`** – Built-in benchmarking across providers. Compare quality, latency, and cost in one command. No other tool has this.
- **`ember learn` (preview)** – Tracks your coding preferences and patterns over time. Personalized AI that gets better the more you use it.
- **Semantic caching (preview)** – Similar prompts are served from cache. `/cache` shows stats, `/cache clear` resets.
- **Single binary** – `cargo build --release` produces one ~15 MB executable with no runtime dependencies.
```sh
# Install
cargo install ember-cli
# or: curl -fsSL https://ember.dev/install.sh | sh

# Set your API key
export ANTHROPIC_API_KEY="..."   # or OPENAI_API_KEY, etc.

# One-shot task
ember chat "Refactor this function to use iterators" --tools filesystem

# Interactive mode
ember chat
```

Once in interactive mode:
```text
You: explain what this crate does
Ember: …

/model claude-3-5-sonnet   # switch model
/cost                      # show session cost
/fork before-refactor      # create a branch point
/compact                   # force context compaction
/forks                     # list branches
/restore <id>              # go back
```
The core loop in `ember-core` drives a ReAct-style agent:

```text
user message → LLM call → [tool calls → tool results → LLM call …] → response
```
- Configurable max tool rounds per turn (default: 25)
- Tool timeout per call (default: 120 s)
- Max output tokens per completion (default: 4096)
- Auto-compact when token count exceeds 80% of the context window
The loop is backend-agnostic: `LlmBackend` and `ToolBackend` are traits. Swap in any provider, or a mock for testing.
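A minimal sketch of what that trait seam enables. Only the trait names `LlmBackend` and `ToolBackend` come from Ember; the method signatures and the `MockLlm` type here are illustrative assumptions, not Ember's actual API:

```rust
// Hypothetical shape of the backend seam. The agent loop only ever talks
// to these traits, so a test can substitute a mock for a real provider.
trait LlmBackend {
    /// Send the conversation so far; get the model's next message back.
    fn complete(&self, messages: &[String]) -> Result<String, String>;
}

trait ToolBackend {
    /// Execute a named tool with a raw input payload.
    fn run(&self, tool: &str, input: &str) -> Result<String, String>;
}

/// A mock backend for tests: always returns one canned reply.
struct MockLlm(String);

impl LlmBackend for MockLlm {
    fn complete(&self, _messages: &[String]) -> Result<String, String> {
        Ok(self.0.clone())
    }
}
```

Because the loop holds a `dyn LlmBackend` rather than a concrete provider, unit tests never need network access.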
Branch a conversation at any point. Each fork stores a snapshot of the full turn history.
- `/fork try-different-prompt` → creates a named branch
- `/forks` → lists all forks with turn counts
- `/restore <fork-id>` → replaces current history with the snapshot
Forks are ordered by creation time. The most recently created fork is marked active in the list.
Every turn records input tokens, output tokens, cache-creation tokens, and cache-read tokens. Costs are looked up from a built-in pricing table:
| Model family | Input (per 1M) | Output (per 1M) |
|---|---|---|
| Claude Opus | $15.00 | $75.00 |
| Claude Sonnet | $3.00 | $15.00 |
| Claude Haiku | $1.00 | $5.00 |
| GPT-4o | $2.50 | $10.00 |
| GPT-4o mini | $0.15 | $0.60 |
Anthropic prompt-cache tokens (creation and read) are tracked separately. The `/cost` command shows a per-turn breakdown and the session total.
Unknown models return a zero-cost sentinel; the tracker never panics on an unrecognised model ID.
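The lookup described above can be sketched roughly like this (function names are illustrative, and the substring matching is a simplification of however Ember actually resolves model IDs; prices mirror the table):

```rust
/// USD per 1M tokens: (input, output). Unknown models fall back to the
/// zero-cost sentinel instead of panicking.
fn price_per_mtok(model: &str) -> (f64, f64) {
    match model {
        m if m.contains("opus") => (15.00, 75.00),
        m if m.contains("sonnet") => (3.00, 15.00),
        m if m.contains("haiku") => (1.00, 5.00),
        // Check the "mini" variant before the plain "gpt-4o" substring.
        m if m.contains("gpt-4o-mini") => (0.15, 0.60),
        m if m.contains("gpt-4o") => (2.50, 10.00),
        _ => (0.0, 0.0), // zero-cost sentinel
    }
}

/// Cost estimate for one turn, given its token counts.
fn turn_cost_usd(model: &str, input_tokens: u64, output_tokens: u64) -> f64 {
    let (inp, out) = price_per_mtok(model);
    (input_tokens as f64 * inp + output_tokens as f64 * out) / 1_000_000.0
}
```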
Plugins intercept tool calls at three points in the lifecycle:
| Hook | When | What it can do |
|---|---|---|
| `PreToolUse` | Before execution | Approve or deny the call |
| `PostToolUse` | After success | Replace the tool output |
| `PostToolUseFailure` | After failure | Log errors, trigger fallbacks |
Hooks are registered with a priority (lower runs first). All hooks for an event are called even when one denies; messages from every handler are collected.
```rust
runner.register(HookHandler {
    name: "policy".to_string(),
    events: vec![HookEvent::PreToolUse],
    priority: 0,
    handler: Box::new(|ctx| {
        if ctx.tool_name == "shell" && ctx.tool_input.contains("rm -rf") {
            HookRunResult::deny("destructive shell command blocked")
        } else {
            HookRunResult::allow()
        }
    }),
});
```

Three modes, configurable per tool:

- `Unrestricted` – allow everything (default; no breaking change to existing code)
- `Interactive` – surface a `NeedsApproval` result for every action; the caller handles the prompt
- `Policy` – evaluate actions against per-tool rules
Per-tool rules include:

- `allowed_paths` / `denied_paths` – component-level prefix matching (`/tmp` does not match `/tmp_other/foo`)
- `read_only` – deny all writes regardless of path rules
- `allowed_commands` – whitelist of executable names (bare name or full path, matched by basename)
- `max_execution_time` – per-action timeout
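Component-level prefix matching is what the standard library's `Path::starts_with` already does, so a rule check can be sketched in one line (the function name is illustrative, not Ember's):

```rust
use std::path::Path;

/// Component-wise containment check: "/tmp" allows "/tmp/foo" but not
/// "/tmp_other/foo", because comparison is per path component rather
/// than per string prefix.
fn path_is_under(candidate: &str, allowed_prefix: &str) -> bool {
    Path::new(candidate).starts_with(allowed_prefix)
}
```

A plain `str::starts_with` would wrongly accept `/tmp_other/foo` for the prefix `/tmp`, which is exactly the bug the component-level rule avoids.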
When a conversation grows large, `compact_conversation` replaces the oldest turns with a summary turn and returns metrics:

- `turns_removed` – how many turns were merged
- `original_tokens` / `compacted_tokens` – before/after estimates (4 chars ≈ 1 token heuristic)
- `summary` – the text inserted at the front of the conversation

`keep_recent_turns` (default: 4) and `summary_max_tokens` (default: 2000) are configurable. Compaction only fires when the estimated token count exceeds `compact_threshold × max_context_tokens` (defaults: 0.8 × 100k).
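The trigger math reduces to two small functions; a sketch under the stated defaults, with function names that are illustrative rather than Ember's own:

```rust
/// 4 chars ≈ 1 token heuristic, as used by the compaction metrics.
fn estimate_tokens(text: &str) -> usize {
    text.len() / 4
}

/// Fire compaction once the estimate crosses
/// compact_threshold × max_context_tokens (defaults: 0.8 × 100_000).
fn should_compact(estimated_tokens: usize, compact_threshold: f64, max_context_tokens: usize) -> bool {
    (estimated_tokens as f64) > compact_threshold * (max_context_tokens as f64)
}
```

With the defaults, a conversation estimated at 85k tokens compacts; one at 50k does not.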
The REPL recognises these slash commands:
| Command | Aliases | What it does |
|---|---|---|
| `/help` | `/h` | List all commands |
| `/status` | — | Turns, tokens, cost for this session |
| `/compact` | — | Force context compaction now |
| `/model [name]` | `/m` | Show or change the active model |
| `/permissions [mode]` | `/perm` | Show or change permission mode |
| `/config [section]` | `/cfg` | Display merged configuration |
| `/memory` | `/mem` | Show context window usage |
| `/clear [--yes]` | `/c` | Clear the conversation |
| `/cost` | — | Cost breakdown for this session |
| `/fork [name]` | — | Create a named fork point |
| `/forks` | — | List all forks |
| `/restore <id>` | — | Restore to a fork |
Tab completion is handled by `SlashCompleter`: prefixes are matched against the registry, so `/mo` completes to `/model` and `/me` to `/memory`.
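The matching itself is a plain prefix filter over the registry; a sketch (the function name and registry shape are illustrative, not `SlashCompleter`'s internals):

```rust
/// Return every registered command that begins with the typed prefix.
fn complete(prefix: &str, registry: &[&str]) -> Vec<String> {
    registry
        .iter()
        .filter(|cmd| cmd.starts_with(prefix))
        .map(|cmd| cmd.to_string())
        .collect()
}
```

When exactly one candidate survives, the REPL can substitute it directly; with several, it can cycle or list them.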
The terminal renderer uses pulldown-cmark for Markdown parsing and syntect for syntax highlighting:

- Fenced code blocks with a `── rust ──────` border and 24-bit colour highlighting
- Coloured headings (H1–H6), emphasis, strong, inline code, blockquotes, links
- Animated Braille spinners (`⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏`) with success/failure finish states
- Tool output formatter: header line (`⚡ Running: bash ls -la`), output truncation, error display

All rendering methods accept any `io::Write`, so they are testable without a TTY.
Three config sources are merged in order: User → Project → Local. Later sources override earlier ones. The merge is deep (nested tables are merged, not replaced), and every entry records which config file set it.
```rust
let config = ConfigLoader::new()
    .add_user_config()
    .add_project_config("./ember.toml")
    .add_local_config("./.ember.local.toml")
    .load()?;

// Know which file set a value:
let entry = config.get("model");
println!("{:?}", entry.source); // ConfigSource::Project("./ember.toml")
```

Startup is split into ordered phases so the critical path can be measured and optional phases can be skipped:
```text
CliEntry → ConfigLoad → ProviderInit → PluginDiscovery →
McpSetup → SystemPrompt → ToolRegistry → SessionInit → MainRuntime
```
`BootstrapTimer` records wall-clock time for each phase. `BootstrapPlan::fast_path(&[PluginDiscovery, McpSetup])` skips the two slowest phases for quick one-shot commands.
| Crate | Responsibility |
|---|---|
| `ember-cli` | CLI entry, REPL, TUI rendering, slash commands |
| `ember-core` | Agent runtime, compaction, permissions, forking, config merge, bootstrap |
| `ember-llm` | 10 LLM provider adapters, streaming, token usage |
| `ember-tools` | File ops, shell, web, git, code execution |
| `ember-plugins` | Plugin system, hook pipeline |
| `ember-mcp` | MCP client, multi-transport (stdio/HTTP/WebSocket), tool registry |
| `ember-storage` | Persistent storage, checkpoints |
| `ember-telemetry` | Usage tracking, session logging |
| `ember-browser` | Browser automation (chromiumoxide) |
| `ember-voice` | Voice I/O |
| `ember-web` | Web interface |
| `ember-desktop` | Desktop app (Tauri) |
| `ember-i18n` | Internationalization |
| `ember-enterprise` | Enterprise features |
| Feature | Ember | Claude Code | Codex CLI |
|---|---|---|---|
| Multi-Provider | ✅ 10 providers | ❌ Anthropic only | ❌ OpenAI only |
| Session Forking | ✅ | ❌ | ❌ |
| Plugin Hooks | ✅ Preview | ❌ | ❌ |
| MCP Support | ✅ Preview | ❌ | ❌ |
| Cost Tracking | ✅ Per-turn + session | Basic | Basic |
| Prompt Cache Tracking | ✅ Creation + read | ❌ | N/A |
| Auto-Compaction | ✅ Configurable | ❌ | ❌ |
| Per-Tool Permissions | ✅ Path/command/time | ❌ | ❌ |
| Config Merge (3 levels) | ✅ | ❌ | ❌ |
| Single Binary | ✅ | ❌ | ❌ |
| Open Source | ✅ MIT | Partial | ❌ |
```sh
# From crates.io
cargo install ember-cli

# From source
git clone https://github.com/niklasmarderx/ember
cd ember
cargo build --release
./target/release/ember --version

# Docker
docker run -it --rm ghcr.io/niklasmarderx/ember chat "Hello"
```

API keys are read from environment variables:
```sh
export ANTHROPIC_API_KEY="..."
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="..."
```

Or place them in `.env` at the project root. Run `ember config init` to create a starter config file, `ember config show` to inspect the merged result.
| Provider | Status |
|---|---|
| Anthropic (Claude) | ✅ |
| OpenAI (GPT-4o, o1) | ✅ |
| Google Gemini | ✅ |
| Ollama (local) | ✅ |
| Groq | ✅ |
| DeepSeek | ✅ |
| Mistral | ✅ |
| OpenRouter | ✅ |
| xAI (Grok) | ✅ |
| AWS Bedrock | ✅ |
Any OpenAI-compatible API endpoint also works via the OpenAI provider with a custom base URL.
```sh
git clone https://github.com/niklasmarderx/ember
cd ember
cargo test --workspace
cargo run -p ember-cli -- chat "Hello"
```

Issues and PRs are welcome. See `CONTRIBUTING.md` for guidelines.
MIT – see `LICENSE-MIT`.