ÆLLI

A2A-compliant agent server for routing, memory, and multi-model orchestration.

ÆLLI is an A2A-compliant agent server that routes natural-language tasks to the right skill, runs generation-review workflows that prevent a model from approving its own output, and persists project experience into playbooks that improve future sessions. It sits between your IDE, your models, and your engineering knowledge — and handles none of the things that belong in a chat window.

It is not a chatbot. It is the control plane.

Prerequisites

Node.js 18+
pnpm — package manager (npm install -g pnpm if not installed)
Postgres 16
Qdrant — vector store for engineering knowledge search
LiteLLM proxy with your model routes configured

Note: ÆLLI delegates all model calls to LiteLLM. You need a running LiteLLM instance with at least one chat model and one embedding model route before pnpm start can serve skill requests.

Quick start

git clone https://github.com/raelli/aelli.git
cd aelli
pnpm install
cp .env.example .env

Edit .env — the three required values:

AELLI_INBOUND_SECRET=<strong-random-secret>
LITELLM_BASE_URL=http://your-litellm-host:4000
LITELLM_API_KEY=<your-litellm-key>

Then:

pnpm test          # requires Postgres and Qdrant (see "Full stack with Docker" below)
pnpm start

Expected:

[AELLI] Server running on port 3456

Full stack with Docker

docker network create integrahub_network 2>/dev/null || true
docker compose up -d

This starts ÆLLI on port 3456, the memory service on port 3457, Postgres, and Qdrant. The compose file expects an external Docker network named integrahub_network — create it once with the command above.

How it works

Layer	Components
Caller	Human · IDE · LiteLLM gateway
Core	Express server · A2A protocol · port 3456
Skills	Orchestrator · Dev Advisor · Engineering Knowledge · Multi-Router · Octowiz
Storage	Postgres · Qdrant · MemPalace (session experience)

Each skill is mounted at /a2a/<skill-id> and publishes an agent card at /a2a/<skill-id>/.well-known/agent.json.
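As a rough illustration of the path layout described above, the following sketch builds the skill and agent-card URLs and fetches a card. The helper names (skillUrl, agentCardUrl, fetchAgentCard) are hypothetical and not part of ÆLLI; only the URL layout comes from this README.

```javascript
// Hypothetical helpers; only the URL layout is taken from the README.
function skillUrl(baseUrl, skillId) {
  // Strip trailing slashes so the join never produces "//a2a".
  return `${baseUrl.replace(/\/+$/, "")}/a2a/${encodeURIComponent(skillId)}`;
}

function agentCardUrl(baseUrl, skillId) {
  return `${skillUrl(baseUrl, skillId)}/.well-known/agent.json`;
}

// Fetch and parse a card using the global fetch available in Node.js 18+.
async function fetchAgentCard(baseUrl, skillId) {
  const res = await fetch(agentCardUrl(baseUrl, skillId));
  if (!res.ok) throw new Error(`agent card request failed: HTTP ${res.status}`);
  return res.json();
}
```

With a server running on the default port, `fetchAgentCard("http://localhost:3456", "aelli")` should resolve to the orchestrator's card.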

Skills

Skill	Endpoint	Purpose
Orchestrator	`/a2a/aelli`	Primary entry point. Routes natural-language tasks to the best registered skill.
Dev Advisor	`/a2a/dev-advisor`	Watches coding-session hook events and flags file conflicts, branch drift, and spec deviation.
Engineering Knowledge	`/a2a/engineering`	Qdrant-backed semantic search and ingest for engineering knowledge.
Multi-Router	`/a2a/aelli-router`	Classifies tasks, routes to coding or review models, prevents a model from approving its own output.
Octowiz	`/a2a/octowiz`	Builds model-tier-aware planning and review context bundles for Octowiz / Claude Code workflows.

Orchestrator

/a2a/aelli

Use this when the caller knows what they want but not which agent should do it. The orchestrator keeps a live registry of all mounted skill cards, selects the best skill for the request, and streams the result over SSE.

Operation: aelli — free-form natural-language request
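A minimal sketch of what a call to the orchestrator might look like. The JSON-RPC envelope and the "message/stream" method follow the general A2A convention and are an assumption here; this README does not document the exact payload ÆLLI accepts. buildOrchestratorRequest is a hypothetical helper that only builds the request and does not send it.

```javascript
const { randomUUID } = require("node:crypto");

// Build (but do not send) a request for the orchestrator endpoint.
// ASSUMPTION: the envelope and method name follow the public A2A convention;
// the payload ÆLLI actually accepts may differ.
function buildOrchestratorRequest(baseUrl, secret, text) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/a2a/aelli`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        accept: "text/event-stream", // the result is streamed over SSE
        "x-aelli-secret": secret, // header name from the API reference
      },
      body: JSON.stringify({
        jsonrpc: "2.0",
        id: randomUUID(),
        method: "message/stream",
        params: {
          message: {
            role: "user",
            messageId: randomUUID(),
            parts: [{ kind: "text", text }],
          },
        },
      }),
    },
  };
}
```

The result can be passed straight to fetch, for example `const { url, init } = buildOrchestratorRequest(base, secret, "Summarize open review findings"); const res = await fetch(url, init);`.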

Dev Advisor

/a2a/dev-advisor

Dev Advisor processes hook events from Octowiz / Claude Code sessions and runs deterministic engineering-hygiene checks:

file-conflict detection — warns when another session changed a file after the current session started
branch-drift detection — warns when a branch falls behind its remote tracking ref
spec-deviation detection — warns when modified files were not mentioned in the triggering prompt
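Two of these checks can be sketched as pure functions. This is an illustration under stated assumptions, not Dev Advisor's actual implementation: the function names, the event field names (sessionId, changedAt, startedAt), and the basename-matching heuristic are all hypothetical.

```javascript
const path = require("node:path");

// Spec deviation: modified files that the prompt never mentions.
// ASSUMPTION: a file counts as "mentioned" if its full path or its
// basename appears anywhere in the prompt text.
function findSpecDeviations(prompt, modifiedFiles) {
  const text = prompt.toLowerCase();
  return modifiedFiles.filter((file) => {
    const full = file.toLowerCase();
    const base = path.basename(full);
    return !text.includes(full) && !text.includes(base);
  });
}

// File conflict: files changed by another session after this one started.
// ASSUMPTION: timestamps are comparable numbers (for example epoch millis).
function findFileConflicts(session, changes) {
  return changes
    .filter((c) => c.sessionId !== session.id && c.changedAt > session.startedAt)
    .map((c) => c.file);
}
```

Because both functions are deterministic, the same hook event always produces the same warnings, which is what lets the optional LLM pass sit on top as advice rather than as the source of truth.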

Set AELLI_LLM_ADVICE=true to add an LLM advisory pass on top of the deterministic checks.

Operation: dev-advisor — accepts a hook-event payload

Engineering Knowledge

/a2a/engineering

Embeds content through LiteLLM, stores vectors in Qdrant, and returns ranked passages with a synthesized answer. Ingests new documents without restarting ÆLLI.

Operations:

engineering:query — semantic search with ranked passages and synthesized answer
engineering:ingest — ingest a document with title, content, source URL, and optional metadata
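To make "ranked passages" concrete, here is an in-memory sketch of similarity ranking. In ÆLLI the vector search is delegated to Qdrant, so this code is illustrative only; cosineSimilarity and rankPassages are hypothetical names, and the tiny two-dimensional vectors stand in for real embeddings.

```javascript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every passage against the query vector and keep the best `limit`.
function rankPassages(queryVector, passages, limit = 3) {
  return passages
    .map((p) => ({ title: p.title, score: cosineSimilarity(queryVector, p.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```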

Multi-Router

/a2a/aelli-router

Classifies tasks by complexity, routes coding work to generation models, and hands review to a separate reviewer model. The generator does not approve its own output.

Workflow types:

Workflow	Trigger	Flow
`standard`	default	generate → review loop → complete
`complex`	XHIGH / MAX tier	generate → architecture review → deep review → revise loop
`high-risk`	`escalate=true` or reasoning-required task	generate → review → validation gate → revise
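The trigger column can be read as a small decision function. The sketch below is one possible reading; selectWorkflow is a hypothetical name, and the precedence (high-risk checked before complex) is an assumption, since the table does not say which wins when both triggers apply.

```javascript
// Pick a workflow type from the classification result.
// ASSUMPTION: an explicit escalation or reasoning requirement takes
// precedence over the tier-based "complex" trigger.
function selectWorkflow({ tier, escalate = false, reasoningRequired = false }) {
  if (escalate || reasoningRequired) return "high-risk";
  if (tier === "XHIGH" || tier === "MAX") return "complex";
  return "standard";
}
```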

All three variants share one bounded review loop. An escalation policy evaluates the complete loop state on every iteration and can upgrade the reviewer to the REASONING tier without caller intervention. The triggers are security or architecture keywords, two validation failures, reviewer disagreement, or an exhausted revision budget. When the policy fires, the reviewed progress event carries the reason (escalation: "keyword" | "validation" | "disagreement" | "budget").
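A minimal sketch of such an escalation policy, assuming a loop-state object with the hypothetical fields task, validationFailures, reviewerDisagreement, revisionsUsed, revisionBudget, and reviewerTier. The keyword pattern and the order in which triggers are checked are also assumptions; only the four reason strings and the REASONING tier come from the text above.

```javascript
// ASSUMPTION: the real keyword list is not documented; these two words
// are taken from the "security/architecture keywords" description.
const ESCALATION_KEYWORDS = /\b(security|architecture)\b/i;

// Return the escalation reason for the current loop state, or null.
function evaluateEscalation(state) {
  if (ESCALATION_KEYWORDS.test(state.task)) return "keyword";
  if (state.validationFailures >= 2) return "validation";
  if (state.reviewerDisagreement) return "disagreement";
  if (state.revisionsUsed >= state.revisionBudget) return "budget";
  return null;
}

// Decide which reviewer tier the next iteration should use.
function reviewerTierFor(state) {
  const escalation = evaluateEscalation(state);
  return { tier: escalation ? "REASONING" : state.reviewerTier, escalation };
}
```

Evaluating the whole state on each iteration, rather than reacting to single events, keeps the policy a pure function that is easy to test in isolation.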

Operations:

aelli-router:route — synchronous classification returning { router, tier, model, workflow }
aelli-router:workflow — streaming workflow execution with live progress events

Octowiz

/a2a/octowiz

Builds context bundles for Claude Code and Octowiz. Bundle size is model-tier-aware: COMPACT for lightweight models, STANDARD for normal tasks, FULL for reasoning-tier or high-context work. Bundles can include project playbooks from MemPalace, operational memory, and relevant engineering knowledge.
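The tier-aware sizing could be sketched as follows. selectBundleSize is a hypothetical name, and the mapping from tier names to sizes is an assumption: this README names the three bundle sizes but does not say exactly which tiers count as "lightweight".

```javascript
// Map a model tier to a bundle size.
// ASSUMPTION: "SIMPLE" stands in for lightweight models and "REASONING"
// for reasoning-tier work; highContext is a hypothetical caller flag.
function selectBundleSize({ tier, highContext = false }) {
  if (highContext || tier === "REASONING") return "FULL";
  if (tier === "SIMPLE") return "COMPACT";
  return "STANDARD";
}
```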

Operations:

octowiz:plan — planning context bundle
octowiz:review — review context bundle

Memory layers

ÆLLI separates memory by responsibility:

Layer	Backing system	Used for
Operational memory	`aelli-memory` service + Postgres	Registry, system state, deterministic key/value memory
Knowledge memory	Qdrant + embeddings	Searchable project and engineering knowledge
Experience memory	MemPalace + Postgres	Session experiences, reflection, project playbooks

Truth stays at the source. ÆLLI builds context around GitHub, Confluence, Jira, and other systems; it does not try to become the canonical database for everything.

Configuration

cp .env.example .env

Required:

Variable	Purpose
`AELLI_INBOUND_SECRET`	Shared secret for protected endpoints via `x-aelli-secret`.
`LITELLM_BASE_URL`	LiteLLM gateway for outbound model and embedding calls.
`LITELLM_API_KEY`	Bearer token for LiteLLM.
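A startup check for the required variables might look like the sketch below. loadConfig and the returned field names are hypothetical; the variable names and the defaults for PORT, MEMORY_SERVICE_URL, and QDRANT_URL are the ones listed in this section.

```javascript
const REQUIRED = ["AELLI_INBOUND_SECRET", "LITELLM_BASE_URL", "LITELLM_API_KEY"];

// Validate required variables up front so the server fails at startup
// instead of on the first skill request.
function loadConfig(env = process.env) {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    inboundSecret: env.AELLI_INBOUND_SECRET,
    litellmBaseUrl: env.LITELLM_BASE_URL,
    litellmApiKey: env.LITELLM_API_KEY,
    port: Number(env.PORT || 3456),
    memoryServiceUrl: env.MEMORY_SERVICE_URL || "http://localhost:3457",
    qdrantUrl: env.QDRANT_URL || "http://localhost:6333",
  };
}
```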

Storage and integrations (for memory and webhook features):

Variable	Purpose
`DATABASE_URL`	Postgres connection for MemPalace and PgStore.
`MEMORY_SERVICE_URL`	URL of the `aelli-memory` service (default: `http://localhost:3457`).
`AELLI_MEMORY_SECRET`	Shared secret for memory service requests.
`QDRANT_URL`	Qdrant server URL (default: `http://localhost:6333`).
`GITHUB_WEBHOOK_SECRET`	Enables and verifies `/webhooks/github` when set.

Model routing overrides:

Variable	Purpose
`LITELLM_MODEL`	Default model route.
`LITELLM_EMBED_MODEL`	Embedding model route.
`CODING_MODEL_NORMAL`	Coding model for normal tasks.
`CODING_MODEL_HIGH`	Coding model for high-complexity tasks.
`CODING_MODEL_XHIGH`	Coding model for extra-high-complexity tasks.
`CODING_MODEL_MAX`	Coding model for maximum-complexity tasks.
`NEMOTRON_MODEL_SIMPLE`	Reviewer model for simple checks.
`NEMOTRON_MODEL_MEDIUM`	Reviewer model for medium checks.
`NEMOTRON_MODEL_COMPLEX`	Reviewer model for complex checks.
`NEMOTRON_MODEL_REASONING`	Reviewer model for reasoning-heavy checks.
`NEMOTRON_LITELLM_BASE_URL`	Optional separate LiteLLM base URL for Nemotron calls.

Feature flags and server settings:

Variable	Default	Purpose
`AELLI_LLM_ADVICE`	`false`	Enable LLM advisory pass in Dev Advisor on top of deterministic checks.
`PORT`	`3456`	HTTP port for the ÆLLI server.

See .env.example for the full list.

LiteLLM registration

Register each ÆLLI skill as an A2A agent in LiteLLM:

LiteLLM agent name	Invocation URL
`aelli`	`http://<host>:3456/a2a/aelli`
`aelli-dev-advisor`	`http://<host>:3456/a2a/dev-advisor`
`aelli-engineering`	`http://<host>:3456/a2a/engineering`
`aelli-router`	`http://<host>:3456/a2a/aelli-router`
`aelli-octowiz`	`http://<host>:3456/a2a/octowiz`

Agent-card discovery:

http://<host>:3456/a2a/<skill-id>/.well-known/agent.json

API reference

A2A endpoints

Endpoint	Auth	Description
`GET /a2a/:skill/.well-known/agent.json`	none	Agent card for a mounted skill.
`GET /a2a/:skill/.well-known/agent-card.json`	none	Alternate agent-card path.
`POST /a2a/aelli`	`x-aelli-secret`	Orchestrator.
`POST /a2a/dev-advisor`	`x-aelli-secret`	Dev Advisor hook-event processing.
`POST /a2a/engineering`	`x-aelli-secret`	Engineering query / ingest.
`POST /a2a/aelli-router`	`x-aelli-secret`	Route classification and streaming workflows.
`POST /a2a/octowiz`	`x-aelli-secret`	Plan / review context bundles.

Runtime endpoints

Endpoint	Auth	Description
`GET /a2a/task-queue`	`x-aelli-secret`	SSE stream of pending daemon tasks.
`POST /a2a/task-queue/:id/claim`	`x-aelli-secret`	Claim a task lease.
`POST /a2a/task-queue/:id/result`	`x-aelli-secret`	Submit task result.
`POST /webhooks/github`	HMAC-SHA256	GitHub push / PR ingest into Engineering Knowledge.
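For the HMAC-SHA256 check on the webhook endpoint, a minimal verification sketch follows. GitHub sends the signature in the X-Hub-Signature-256 header as "sha256=" followed by the hex digest of the raw request body. verifyGithubSignature is a hypothetical helper name, and how ÆLLI wires this into its Express routes is not shown in this README.

```javascript
const crypto = require("node:crypto");

// Verify a GitHub webhook signature against the raw (unparsed) body.
function verifyGithubSignature(secret, rawBody, signatureHeader) {
  if (!secret || typeof signatureHeader !== "string") return false;
  const expected =
    "sha256=" + crypto.createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

The signature must be computed over the exact bytes received, so the raw body has to be captured before any JSON parsing middleware rewrites it.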

Memory service

Endpoint	Auth	Description
`GET /memory/:key`	`x-aelli-memory-secret`	Read memory value.
`PUT /memory/:key`	`x-aelli-memory-secret`	Write memory value.

The memory service fails closed when AELLI_MEMORY_SECRET is not set.
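"Fails closed" means a missing secret results in every request being rejected rather than every request being allowed. A sketch of that behavior, assuming a hypothetical authorizeMemoryRequest helper and assumed status codes (503 when unconfigured, 401 on mismatch); the real service may respond differently.

```javascript
const crypto = require("node:crypto");

// Hash both values so timingSafeEqual always sees equal-length buffers.
function sha256(value) {
  return crypto.createHash("sha256").update(String(value)).digest();
}

// Decide whether a memory-service request may proceed.
// ASSUMPTION: status codes are illustrative, not taken from the service.
function authorizeMemoryRequest(configuredSecret, headers) {
  // Fail closed: with no configured secret, nothing is allowed through.
  if (!configuredSecret) return { allowed: false, status: 503 };
  const presented = headers["x-aelli-memory-secret"];
  if (typeof presented !== "string") return { allowed: false, status: 401 };
  const match = crypto.timingSafeEqual(sha256(configuredSecret), sha256(presented));
  return match ? { allowed: true, status: 200 } : { allowed: false, status: 401 };
}
```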

Smoke test

docker compose up -d
curl -s http://127.0.0.1:3456/a2a/aelli/.well-known/agent.json | jq .name

Expected: "AELLI Orchestrator"

For Dev Advisor end-to-end behavior:

Start the full stack.
Open a Claude Code session with the Octowiz plugin enabled.
Edit a file.
Submit a prompt that does not mention that file.
Expect a spec-deviation warning in the session context.
Stop ÆLLI mid-session — Claude Code should continue without breaking.

MemPalace reflection loop

At the end of a Claude Code session, ÆLLI records an experience row in aelli_experiences scoped to the repository basename (the last path component of the repository root).

When enough new experiences accumulate, the Reflection Agent:

reads experiences since the last reflection cursor;
synthesizes an updated project playbook;
writes it to the memory service at agent:aelli:playbook:<scope>;
makes that playbook available to future Octowiz context bundles.

The whole pipeline runs against a single SessionMemoryStore interface — experience diaries, the reflection cursor, and playbooks all go through one seam, with Postgres and the memory service as adapters behind it.
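The loop can be sketched against an in-memory stand-in for the store. Everything here is illustrative: the method names on InMemorySessionStore, the reflect function, the minNew threshold, and the synthesize callback are hypothetical, and the real SessionMemoryStore interface may look different. Only the playbook key format and the basename scoping come from this section.

```javascript
const path = require("node:path");

// Scope is the last path component of the repository root.
function scopeFor(repoRoot) {
  return path.basename(repoRoot);
}

// In-memory adapter standing in for Postgres and the memory service.
class InMemorySessionStore {
  constructor() {
    this.experiences = [];
    this.cursors = new Map();
    this.playbooks = new Map();
  }
  async addExperience(scope, summary) {
    this.experiences.push({ id: this.experiences.length + 1, scope, summary });
  }
  async experiencesSince(scope, cursor) {
    return this.experiences.filter((e) => e.scope === scope && e.id > cursor);
  }
  async getCursor(scope) {
    return this.cursors.get(scope) ?? 0;
  }
  async setCursor(scope, cursor) {
    this.cursors.set(scope, cursor);
  }
  async writePlaybook(scope, text) {
    this.playbooks.set(`agent:aelli:playbook:${scope}`, text);
  }
}

// One reflection pass: returns true if a playbook was written.
async function reflect(store, scope, synthesize, minNew = 3) {
  const cursor = await store.getCursor(scope);
  const fresh = await store.experiencesSince(scope, cursor);
  if (fresh.length < minNew) return false; // not enough new experience yet
  await store.writePlaybook(scope, await synthesize(fresh));
  await store.setCursor(scope, fresh[fresh.length - 1].id);
  return true;
}
```

Advancing the cursor only after the playbook is written means a failed synthesis leaves the experiences eligible for the next pass.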

This turns repeated engineering work into reusable guidance instead of letting every session rediscover the same problems.

Self-improvement loop

ÆLLI can run a scheduled Nemotron review loop over selected modules in src/router/, src/skills/, and src/mempalace/. The guard script bin/improve-guard.js blocks risky automatic commits — if a proposed diff exceeds the configured safety threshold, the run is written to .octowiz/improve-review-queue.jsonl for human review.
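The guard's decision could be sketched like this. The criteria used by bin/improve-guard.js are not documented here, so measuring the threshold in changed lines is an assumption, and countChangedLines and guardDiff are hypothetical names.

```javascript
// Count added and removed lines in a unified diff, skipping file headers.
// Limitation of this sketch: a removed line whose content starts with "--"
// looks like a "---" header and is not counted.
function countChangedLines(diff) {
  return diff
    .split("\n")
    .filter(
      (line) => /^[+-]/.test(line) && !line.startsWith("+++") && !line.startsWith("---")
    ).length;
}

// Decide whether a proposed diff may be committed automatically.
// ASSUMPTION: the safety threshold is a maximum number of changed lines.
function guardDiff(diff, maxChangedLines) {
  const changed = countChangedLines(diff);
  if (changed <= maxChangedLines) return { action: "commit", changed };
  return {
    action: "queue",
    changed,
    // One JSON object per line, ready to append to a .jsonl review queue.
    queueLine: JSON.stringify({ changed, maxChangedLines, diff }) + "\n",
  };
}
```

When the action is "queue", the caller would append queueLine to .octowiz/improve-review-queue.jsonl instead of committing.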

Local development (Apple Silicon)

By default ÆLLI calls models through LiteLLM. For local development you can bypass LiteLLM entirely.

Start the LLM server

mlx_lm server \
  --model mlx-community/DeepSeek-R1-0528-Qwen3-8B-4bit \
  --port 8080

Start the embedding server

Run your local embedding server on a port that is not already taken by ÆLLI (3456), the memory service (3457), or the LLM server started above (8080). Port 8081 is a safe default:

aelli-mlx serve --port 8081

Start ÆLLI in local mode

AELLI_LLM_BACKEND=local pnpm start

Relevant variables:

Variable	Default	Purpose
`AELLI_LOCAL_LLM_URL`	`http://localhost:8080`	Local MLX chat server.
`AELLI_LOCAL_EMBED_URL`	`http://localhost:8081`	Local embedding server.
`AELLI_LOCAL_MODEL`	falls back to `LITELLM_MODEL`	Optional local chat model override.
`AELLI_LOCAL_EMBED_MODEL`	falls back to `LITELLM_EMBED_MODEL`	Optional local embedding model override.
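The fallback behavior in the table can be sketched as a small resolver. resolveBackend and the shape of its return value are hypothetical; the variable names and defaults are the ones listed above.

```javascript
// Resolve chat and embedding endpoints from the environment.
function resolveBackend(env = process.env) {
  if (env.AELLI_LLM_BACKEND !== "local") {
    // Default mode: every model call goes through LiteLLM.
    return {
      backend: "litellm",
      chatUrl: env.LITELLM_BASE_URL,
      embedUrl: env.LITELLM_BASE_URL,
      chatModel: env.LITELLM_MODEL,
      embedModel: env.LITELLM_EMBED_MODEL,
    };
  }
  return {
    backend: "local",
    chatUrl: env.AELLI_LOCAL_LLM_URL || "http://localhost:8080",
    embedUrl: env.AELLI_LOCAL_EMBED_URL || "http://localhost:8081",
    // Local model names fall back to the LiteLLM route names.
    chatModel: env.AELLI_LOCAL_MODEL || env.LITELLM_MODEL,
    embedModel: env.AELLI_LOCAL_EMBED_MODEL || env.LITELLM_EMBED_MODEL,
  };
}
```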

Adding a skill

Create src/skills/<name>/index.js exporting { id, card, handle }.
Add a card.json alongside it describing the skill.
Register it in createSkills() in src/skills/index.js.
Restart ÆLLI.
Register the new /a2a/<name> invocation URL in LiteLLM if it should be callable through the gateway.

Routes and agent-card endpoints are mounted automatically from the skill registry.
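As a rough sketch of the steps above, here is what a trivial skill and the derived routes might look like. The "echo" skill, the card fields, the handle signature, and the mountPaths helper are all hypothetical; this README only states that a skill exports { id, card, handle } and that routes are mounted from the registry.

```javascript
// Contents that would live in src/skills/echo/card.json (fields assumed).
const card = {
  name: "Echo",
  description: "Returns the request text unchanged.",
  version: "0.1.0",
};

// ASSUMPTION: handle receives a request object with a `text` field and
// returns a result object; the real handler contract may differ.
async function handle(request) {
  return { text: `echo: ${request.text}` };
}

// In src/skills/echo/index.js this object would be the module's export.
const echoSkill = { id: "echo", card, handle };

// Derive the paths the server would mount for each registered skill.
function mountPaths(skills) {
  return skills.flatMap((skill) => [
    `/a2a/${skill.id}`,
    `/a2a/${skill.id}/.well-known/agent.json`,
  ]);
}
```

Deriving both paths from the skill id is what keeps the invocation URL and the agent-card URL from drifting apart when a skill is renamed.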

Deployment

Run the commands in this section on the server hosting ÆLLI, from the repository root.

Update a running compose stack:

git fetch origin
git checkout main
git pull --ff-only origin main
docker compose up -d --build --no-deps --force-recreate aelli

Health check:

docker ps --filter name=aelli
docker inspect aelli \
  --format 'status={{.State.Status}} health={{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}'
docker exec aelli sh -lc \
  'wget -qSO- --timeout=5 http://127.0.0.1:3456/a2a/aelli/.well-known/agent.json 2>&1 | head -3'
docker logs --timestamps --since 5m aelli

Rollback to a previous commit:

PREV=$(git rev-parse HEAD~1)   # or the exact commit SHA
git reset --hard "$PREV"
docker compose up -d --build --no-deps --force-recreate aelli

Contributing

Bug reports and pull requests are welcome.

Fork the repo and create a feature branch.
Install dependencies: pnpm install
Run tests before and after your change: pnpm test
Open a pull request with a clear description of what changed and why.

For larger changes, open an issue first to discuss the approach.

License

MIT — built by the GFE/IntegraHub team.

—

ÆLLI · octowiz ↗
