A2A-compliant agent server for routing, memory, and multi-model orchestration.
ÆLLI is an A2A-compliant agent server that routes natural-language tasks to the right skill, runs generation-review workflows that prevent a model from approving its own output, and persists project experience into playbooks that improve future sessions. It sits between your IDE, your models, and your engineering knowledge — and handles none of the things that belong in a chat window.
It is not a chatbot. It is the control plane.
- Node.js 18+
- pnpm: package manager (`npm install -g pnpm` if not installed)
- Postgres 16
- Qdrant — vector store for engineering knowledge search
- LiteLLM proxy with your model routes configured
Note: ÆLLI delegates all model calls to LiteLLM. You need a running LiteLLM instance with at least one chat model route and one embedding model route before `pnpm start` can serve skill requests.
git clone https://github.com/raelli/aelli.git
cd aelli
pnpm install
cp .env.example .env
Edit .env and set the three required values:
AELLI_INBOUND_SECRET=<strong-random-secret>
LITELLM_BASE_URL=http://your-litellm-host:4000
LITELLM_API_KEY=<your-litellm-key>
Then:
pnpm test # requires Postgres and Qdrant (see Full stack below)
pnpm start
Expected:
[AELLI] Server running on port 3456
docker network create integrahub_network 2>/dev/null || true
docker compose up -d
This starts ÆLLI on port 3456, the memory service on port 3457, Postgres, and Qdrant. The compose file expects an external Docker network named integrahub_network. Create it once with the first command above.
| Layer | Components |
|---|---|
| Caller | Human · IDE · LiteLLM gateway |
| Core | Express server · A2A protocol · port 3456 |
| Skills | Orchestrator · Dev Advisor · Engineering Knowledge · Multi-Router · Octowiz |
| Storage | Postgres · Qdrant · MemPalace (session experience) |
Each skill is mounted at /a2a/<skill-id> and publishes an agent card at /a2a/<skill-id>/.well-known/agent.json.
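A minimal sketch of agent-card discovery against the path pattern above. The helper names `agentCardUrl` and `fetchAgentCard` are illustrative and not part of ÆLLI; the sketch assumes Node.js 18+, where `fetch` is available globally.

```javascript
// Hypothetical helper: builds the agent-card URL for a mounted skill.
function agentCardUrl(baseUrl, skillId) {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmed}/a2a/${encodeURIComponent(skillId)}/.well-known/agent.json`;
}

// Hypothetical helper: fetches and parses the card (agent cards need no auth).
async function fetchAgentCard(baseUrl, skillId) {
  const res = await fetch(agentCardUrl(baseUrl, skillId));
  if (!res.ok) throw new Error(`agent card request failed: ${res.status}`);
  return res.json();
}
```

With a running server, `fetchAgentCard("http://127.0.0.1:3456", "aelli")` would resolve to the orchestrator's card.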
| Skill | Endpoint | Purpose |
|---|---|---|
| Orchestrator | `/a2a/aelli` | Primary entry point. Routes natural-language tasks to the best registered skill. |
| Dev Advisor | `/a2a/dev-advisor` | Watches coding-session hook events and flags file conflicts, branch drift, and spec deviation. |
| Engineering Knowledge | `/a2a/engineering` | Qdrant-backed semantic search and ingest for engineering knowledge. |
| Multi-Router | `/a2a/aelli-router` | Classifies tasks, routes to coding or review models, prevents a model from approving its own output. |
| Octowiz | `/a2a/octowiz` | Builds model-tier-aware planning and review context bundles for Octowiz / Claude Code workflows. |
/a2a/aelli
Use this when the caller knows what they want but not which agent should do it. The orchestrator keeps a live registry of all mounted skill cards, selects the best skill for the request, and streams the result over SSE.
Operation: aelli — free-form natural-language request
/a2a/dev-advisor
Dev Advisor processes hook events from Octowiz / Claude Code sessions and runs deterministic engineering-hygiene checks:
- file-conflict detection — warns when another session changed a file after the current session started
- branch-drift detection — warns when a branch falls behind its remote tracking ref
- spec-deviation detection — warns when modified files were not mentioned in the triggering prompt
Set AELLI_LLM_ADVICE=true to add an LLM advisory pass on top of the deterministic checks.
Operation: dev-advisor — accepts a hook-event payload
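To make the spec-deviation check concrete, here is one way such a deterministic check could work. This is a sketch under assumptions, not Dev Advisor's actual logic: the function name `findSpecDeviations` and the matching rule (a file counts as mentioned if its full path or its basename appears in the prompt) are hypothetical.

```javascript
const path = require("node:path");

// Returns the modified files that the prompt never mentions,
// matching case-insensitively on full path or basename (assumed rule).
function findSpecDeviations(prompt, modifiedFiles) {
  const text = prompt.toLowerCase();
  return modifiedFiles.filter((file) => {
    const full = file.toLowerCase();
    const base = path.basename(file).toLowerCase();
    return !text.includes(full) && !text.includes(base);
  });
}
```

Because the check is plain string matching, it stays deterministic and cheap; the optional LLM advisory pass would sit on top of a result like this.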
/a2a/engineering
Embeds content through LiteLLM, stores vectors in Qdrant, and returns ranked passages with a synthesized answer. Ingests new documents without restarting ÆLLI.
Operations:
- `engineering:query`: semantic search with ranked passages and synthesized answer
- `engineering:ingest`: ingest a document with title, content, source URL, and optional metadata
/a2a/aelli-router
Classifies tasks by complexity, routes coding work to generation models, and hands review to a separate reviewer model. The generator does not approve its own output.
Workflow types:
| Workflow | Trigger | Flow |
|---|---|---|
| `standard` | default | generate → review loop → complete |
| `complex` | XHIGH / MAX tier | generate → architecture review → deep review → revise loop |
| `high-risk` | `escalate=true` or reasoning-required task | generate → review → validation gate → revise |
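The trigger column can be read as a small decision function. The sketch below is illustrative only: the name `selectWorkflow`, the input shape, and the precedence (high-risk before complex before standard) are assumptions, since the table does not say which trigger wins when several apply.

```javascript
// Hypothetical selector for the three workflow types.
// Assumed precedence: high-risk > complex > standard.
function selectWorkflow({ tier, escalate = false, reasoningRequired = false }) {
  if (escalate || reasoningRequired) return "high-risk";
  if (tier === "XHIGH" || tier === "MAX") return "complex";
  return "standard";
}
```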
All three variants share one bounded review loop. An escalation policy evaluates the
complete loop state every iteration and can silently upgrade the reviewer to the
REASONING tier — triggered by security/architecture keywords, two validation failures,
reviewer disagreement, or an exhausted revision budget. When it fires, the reviewed
progress event carries the reason (escalation: "keyword" | "validation" | "disagreement" | "budget").
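A rough sketch of how an escalation policy of this kind might evaluate loop state. The four reason strings come from the description above; everything else (the function name, the state fields, the keyword list, and the order in which triggers are checked) is assumed for illustration.

```javascript
// Hypothetical keyword list; the real policy's keywords are not documented here.
const ESCALATION_KEYWORDS = /\b(security|architecture)\b/i;

// Returns an escalation reason, or null when the reviewer tier stays as-is.
function evaluateEscalation(state) {
  if (ESCALATION_KEYWORDS.test(state.task)) return "keyword";
  if (state.validationFailures >= 2) return "validation";
  if (state.reviewerDisagreement) return "disagreement";
  if (state.revisionsUsed >= state.revisionBudget) return "budget";
  return null;
}
```

Calling this once per iteration with the complete loop state mirrors the "evaluates every iteration" behavior; a non-null result would be attached to the reviewed progress event as `escalation`.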
Operations:
- `aelli-router:route`: synchronous classification returning `{ router, tier, model, workflow }`
- `aelli-router:workflow`: streaming workflow execution with live progress events
/a2a/octowiz
Builds context bundles for Claude Code and Octowiz. Bundle size is model-tier-aware: COMPACT for lightweight models, STANDARD for normal tasks, FULL for reasoning-tier or high-context work. Bundles can include project playbooks from MemPalace, operational memory, and relevant engineering knowledge.
Operations:
- `octowiz:plan`: planning context bundle
- `octowiz:review`: review context bundle
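As an illustration of tier-aware sizing, the mapping could be expressed as below. The bundle sizes COMPACT, STANDARD, and FULL come from the description above, but the tier names used as inputs and the exact mapping are assumptions made for this sketch.

```javascript
// Hypothetical mapping from a model tier to a bundle size.
// Assumes SIMPLE marks a lightweight model and REASONING / MAX mark
// reasoning-tier or high-context work; every other tier gets STANDARD.
function bundleSizeForTier(tier) {
  const t = String(tier).toUpperCase();
  if (t === "SIMPLE") return "COMPACT";
  if (t === "REASONING" || t === "MAX") return "FULL";
  return "STANDARD";
}
```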
ÆLLI separates memory by responsibility:
| Layer | Backing system | Used for |
|---|---|---|
| Operational memory | aelli-memory service + Postgres | Registry, system state, deterministic key/value memory |
| Knowledge memory | Qdrant + embeddings | Searchable project and engineering knowledge |
| Experience memory | MemPalace + Postgres | Session experiences, reflection, project playbooks |
Truth stays at the source. ÆLLI builds context around GitHub, Confluence, Jira, and other systems; it does not try to become the canonical database for everything.
cp .env.example .env
Required:
| Variable | Purpose |
|---|---|
| `AELLI_INBOUND_SECRET` | Shared secret for protected endpoints via `x-aelli-secret`. |
| `LITELLM_BASE_URL` | LiteLLM gateway for outbound model and embedding calls. |
| `LITELLM_API_KEY` | Bearer token for LiteLLM. |
Storage (for memory features):
| Variable | Purpose |
|---|---|
| `DATABASE_URL` | Postgres connection for MemPalace and PgStore. |
| `MEMORY_SERVICE_URL` | URL of the aelli-memory service (default: `http://localhost:3457`). |
| `AELLI_MEMORY_SECRET` | Shared secret for memory service requests. |
| `QDRANT_URL` | Qdrant server URL (default: `http://localhost:6333`). |
| `GITHUB_WEBHOOK_SECRET` | Enables and verifies `/webhooks/github` when set. |
Model routing overrides:
| Variable | Purpose |
|---|---|
| `LITELLM_MODEL` | Default model route. |
| `LITELLM_EMBED_MODEL` | Embedding model route. |
| `CODING_MODEL_NORMAL` | Coding model for normal tasks. |
| `CODING_MODEL_HIGH` | Coding model for high-complexity tasks. |
| `CODING_MODEL_XHIGH` | Coding model for extra-high-complexity tasks. |
| `CODING_MODEL_MAX` | Coding model for maximum-complexity tasks. |
| `NEMOTRON_MODEL_SIMPLE` | Reviewer model for simple checks. |
| `NEMOTRON_MODEL_MEDIUM` | Reviewer model for medium checks. |
| `NEMOTRON_MODEL_COMPLEX` | Reviewer model for complex checks. |
| `NEMOTRON_MODEL_REASONING` | Reviewer model for reasoning-heavy checks. |
| `NEMOTRON_LITELLM_BASE_URL` | Optional separate LiteLLM base URL for Nemotron calls. |
Feature flags:
| Variable | Default | Purpose |
|---|---|---|
| `AELLI_LLM_ADVICE` | `false` | Enable LLM advisory pass in Dev Advisor on top of deterministic checks. |
| `PORT` | `3456` | HTTP port for the ÆLLI server. |
See .env.example for the full list.
Register each ÆLLI skill as an A2A agent in LiteLLM:
| LiteLLM agent name | Invocation URL |
|---|---|
| `aelli` | `http://<host>:3456/a2a/aelli` |
| `aelli-dev-advisor` | `http://<host>:3456/a2a/dev-advisor` |
| `aelli-engineering` | `http://<host>:3456/a2a/engineering` |
| `aelli-router` | `http://<host>:3456/a2a/aelli-router` |
| `aelli-octowiz` | `http://<host>:3456/a2a/octowiz` |
Agent-card discovery:
http://<host>:3456/a2a/<skill-id>/.well-known/agent.json
| Endpoint | Auth | Description |
|---|---|---|
| `GET /a2a/:skill/.well-known/agent.json` | none | Agent card for a mounted skill. |
| `GET /a2a/:skill/.well-known/agent-card.json` | none | Alternate agent-card path. |
| `POST /a2a/aelli` | `x-aelli-secret` | Orchestrator. |
| `POST /a2a/dev-advisor` | `x-aelli-secret` | Dev Advisor hook-event processing. |
| `POST /a2a/engineering` | `x-aelli-secret` | Engineering query / ingest. |
| `POST /a2a/aelli-router` | `x-aelli-secret` | Route classification and streaming workflows. |
| `POST /a2a/octowiz` | `x-aelli-secret` | Plan / review context bundles. |
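A minimal sketch of an authenticated call to one of the protected skill endpoints. The `x-aelli-secret` header and the `/a2a/<skill>` path come from the table above; the helper name `buildSkillRequest` is hypothetical, and the request body is left to the caller because its exact shape depends on the A2A message format the server expects.

```javascript
// Hypothetical helper: assembles the URL and fetch options for a skill call.
function buildSkillRequest(baseUrl, skillId, secret, payload) {
  return {
    url: `${baseUrl}/a2a/${skillId}`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-aelli-secret": secret, // shared secret from AELLI_INBOUND_SECRET
      },
      body: JSON.stringify(payload), // payload shape is the caller's responsibility
    },
  };
}

// Usage (Node.js 18+): const { url, init } = buildSkillRequest(...); await fetch(url, init);
```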
| Endpoint | Auth | Description |
|---|---|---|
| `GET /a2a/task-queue` | `x-aelli-secret` | SSE stream of pending daemon tasks. |
| `POST /a2a/task-queue/:id/claim` | `x-aelli-secret` | Claim a task lease. |
| `POST /a2a/task-queue/:id/result` | `x-aelli-secret` | Submit task result. |
| `POST /webhooks/github` | HMAC-SHA256 | GitHub push / PR ingest into Engineering Knowledge. |
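For reference, GitHub signs each webhook delivery with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The sketch below shows a typical verification of that header; it is not ÆLLI's own handler, and the function name `verifyGithubSignature` is illustrative.

```javascript
const crypto = require("node:crypto");

// Verifies a GitHub webhook signature against the raw request body.
// Returns false when no secret is configured or the header is missing.
function verifyGithubSignature(secret, rawBody, signatureHeader) {
  if (!secret || typeof signatureHeader !== "string") return false;
  const expected =
    "sha256=" + crypto.createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

The signature must be computed over the raw bytes of the body, before any JSON parsing, or the digests will not match.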
| Endpoint | Auth | Description |
|---|---|---|
| `GET /memory/:key` | `x-aelli-memory-secret` | Read memory value. |
| `PUT /memory/:key` | `x-aelli-memory-secret` | Write memory value. |
The memory service fails closed when AELLI_MEMORY_SECRET is not set.
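"Fails closed" means that a missing secret blocks every request instead of disabling authentication. A sketch of that behavior as Express-style middleware follows; the function name `requireMemorySecret` and the status codes (503 when unconfigured, 401 on mismatch) are assumptions, not the memory service's documented responses.

```javascript
// Hypothetical middleware factory illustrating fail-closed auth.
function requireMemorySecret(configuredSecret) {
  return function memoryAuth(req, res, next) {
    if (!configuredSecret) {
      // No secret configured: reject everything rather than allow open access.
      res.statusCode = 503;
      res.end("memory secret not configured");
      return;
    }
    // Plain comparison for brevity; a constant-time compare is preferable in production.
    if (req.headers["x-aelli-memory-secret"] !== configuredSecret) {
      res.statusCode = 401;
      res.end("unauthorized");
      return;
    }
    next();
  };
}
```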
docker compose up -d
curl -s http://127.0.0.1:3456/a2a/aelli/.well-known/agent.json | jq .name
Expected: "AELLI Orchestrator"
For Dev Advisor end-to-end behavior:
- Start the full stack.
- Open a Claude Code session with the Octowiz plugin enabled.
- Edit a file.
- Submit a prompt that does not mention that file.
- Expect a spec-deviation warning in the session context.
- Stop ÆLLI mid-session — Claude Code should continue without breaking.
At the end of a Claude Code session, ÆLLI records an experience row in aelli_experiences scoped to the repository basename (the last path component of the repository root).
When enough new experiences accumulate, the Reflection Agent:
- reads experiences since the last reflection cursor;
- synthesizes an updated project playbook;
- writes it to the memory service at `agent:aelli:playbook:<scope>`;
- makes that playbook available to future Octowiz context bundles.
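The scope and key naming described above can be sketched in a few lines. The key format and the basename rule come from the text; the helper names `experienceScope` and `playbookKey` are illustrative.

```javascript
const path = require("node:path");

// Scope = last path component of the repository root.
function experienceScope(repoRoot) {
  return path.basename(path.resolve(repoRoot));
}

// Memory-service key under which the playbook for a scope is stored.
function playbookKey(scope) {
  return `agent:aelli:playbook:${scope}`;
}
```

One consequence of basename scoping is that two repositories with the same directory name would share a scope, and therefore a playbook.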
The whole pipeline runs against a single SessionMemoryStore interface — experience
diaries, the reflection cursor, and playbooks all go through one seam, with Postgres
and the memory service as adapters behind it.
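To illustrate the single-seam idea, here is an in-memory adapter and a reflection step written against it. None of the method names are taken from ÆLLI's `SessionMemoryStore`; they are hypothetical, and the methods are synchronous only for brevity (adapters backed by Postgres or an HTTP service would be asynchronous).

```javascript
// Hypothetical in-memory adapter: diaries, cursor, and playbooks behind one object.
class InMemorySessionMemoryStore {
  constructor() {
    this.experiences = [];
    this.cursors = new Map();
    this.playbooks = new Map();
  }
  appendExperience(scope, summary) {
    const id = this.experiences.length + 1; // monotonically increasing id
    this.experiences.push({ id, scope, summary });
    return id;
  }
  experiencesSince(scope, cursor) {
    return this.experiences.filter((e) => e.scope === scope && e.id > cursor);
  }
  getCursor(scope) { return this.cursors.get(scope) ?? 0; }
  setCursor(scope, cursor) { this.cursors.set(scope, cursor); }
  getPlaybook(scope) { return this.playbooks.get(scope) ?? null; }
  putPlaybook(scope, text) { this.playbooks.set(scope, text); }
}

// Reflection step: runs only when at least minNew experiences are unread (assumed threshold).
function reflect(store, scope, synthesize, minNew = 3) {
  const fresh = store.experiencesSince(scope, store.getCursor(scope));
  if (fresh.length < minNew) return false;
  store.putPlaybook(scope, synthesize(store.getPlaybook(scope), fresh));
  store.setCursor(scope, fresh[fresh.length - 1].id);
  return true;
}
```

Because `reflect` touches only the store's methods, swapping the in-memory adapter for a database-backed one would not change the pipeline code.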
This turns repeated engineering work into reusable guidance instead of letting every session rediscover the same problems.
ÆLLI can run a scheduled Nemotron review loop over selected modules in src/router/, src/skills/, and src/mempalace/. The guard script bin/improve-guard.js blocks risky automatic commits — if a proposed diff exceeds the configured safety threshold, the run is written to .octowiz/improve-review-queue.jsonl for human review.
By default ÆLLI calls models through LiteLLM. For local development you can bypass LiteLLM entirely.
mlx_lm server \
  --model mlx-community/DeepSeek-R1-0528-Qwen3-8B-4bit \
  --port 8080
Run your local embedding server on a port other than 3456 (which ÆLLI uses). Port 8081 is a safe default:
aelli-mlx serve --port 8081
AELLI_LLM_BACKEND=local pnpm start
Relevant variables:
| Variable | Default | Purpose |
|---|---|---|
| `AELLI_LOCAL_LLM_URL` | `http://localhost:8080` | Local MLX chat server. |
| `AELLI_LOCAL_EMBED_URL` | `http://localhost:8081` | Local embedding server. |
| `AELLI_LOCAL_MODEL` | falls back to `LITELLM_MODEL` | Optional local chat model override. |
| `AELLI_LOCAL_EMBED_MODEL` | falls back to `LITELLM_EMBED_MODEL` | Optional local embedding model override. |
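The defaults and fallbacks in this table amount to the following resolution logic. This is a sketch of the documented behavior, not the actual configuration code; the function name `resolveLocalBackend` and the returned field names are illustrative.

```javascript
// Resolves local-backend settings from an environment object such as process.env.
function resolveLocalBackend(env) {
  return {
    llmUrl: env.AELLI_LOCAL_LLM_URL || "http://localhost:8080",
    embedUrl: env.AELLI_LOCAL_EMBED_URL || "http://localhost:8081",
    // Model overrides fall back to the LiteLLM route names when unset.
    model: env.AELLI_LOCAL_MODEL || env.LITELLM_MODEL,
    embedModel: env.AELLI_LOCAL_EMBED_MODEL || env.LITELLM_EMBED_MODEL,
  };
}
```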
- Create `src/skills/<name>/index.js` exporting `{ id, card, handle }`.
- Add a `card.json` alongside it describing the skill.
- Register it in `createSkills()` in `src/skills/index.js`.
- Restart ÆLLI.
- Register the new `/a2a/<name>` invocation URL in LiteLLM if it should be callable through the gateway.
Routes and agent-card endpoints are mounted automatically from the skill registry.
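A skeleton of what a skill module might look like, based only on the `{ id, card, handle }` export shape named in the steps above. The `echo` skill, the card fields, the inline card object (standing in for `card.json`), and the `handle` request and response shapes are all hypothetical, and the sketch assumes CommonJS modules.

```javascript
// src/skills/echo/index.js (hypothetical example skill)

// Stand-in for the contents of card.json; field names are illustrative.
const card = {
  name: "Echo",
  description: "Returns the request text unchanged.",
};

// Assumed handler shape: receives a request object, resolves to a result object.
async function handle(request) {
  return { text: String(request.text ?? "") };
}

const echoSkill = { id: "echo", card, handle };

module.exports = echoSkill;
```

Check an existing skill under `src/skills/` for the real `handle` signature before copying this shape.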
From the server hosting ÆLLI, in the repository root:
Update a running compose stack:
git fetch origin
git checkout main
git pull --ff-only origin main
docker compose up -d --build --no-deps --force-recreate aelli
Health check:
docker ps --filter name=aelli
docker inspect aelli \
--format 'status={{.State.Status}} health={{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}'
docker exec aelli sh -lc \
'wget -qSO- --timeout=5 http://127.0.0.1:3456/a2a/aelli/.well-known/agent.json 2>&1 | head -3'
docker logs --timestamps --since 5m aelli
Rollback to a previous commit:
PREV=$(git rev-parse HEAD~1) # or the exact commit SHA
git reset --hard "$PREV"
docker compose up -d --build --no-deps --force-recreate aelli
Bug reports and pull requests are welcome.
- Fork the repo and create a feature branch.
- Install dependencies: `pnpm install`
- Run tests before and after your change: `pnpm test`
- Open a pull request with a clear description of what changed and why.
For larger changes, open an issue first to discuss the approach.
MIT — built by the GFE/IntegraHub team.