ÆLLI

A2A-compliant agent server for routing, memory, and multi-model orchestration.

A2A  node 18+  release  license: MIT

Quick start  ·  Skills  ·  API  ·  Deploy


ÆLLI is an A2A-compliant agent server that routes natural-language tasks to the right skill, runs generation-review workflows that prevent a model from approving its own output, and persists project experience into playbooks that improve future sessions. It sits between your IDE, your models, and your engineering knowledge — and handles none of the things that belong in a chat window.

It is not a chatbot. It is the control plane.

Prerequisites

Note: ÆLLI delegates all model calls to LiteLLM. You need a running LiteLLM instance with at least one chat model and one embedding model route before pnpm start can serve skill requests.

Quick start

git clone https://github.com/raelli/aelli.git
cd aelli
pnpm install
cp .env.example .env

Edit .env — the three required values:

AELLI_INBOUND_SECRET=<strong-random-secret>
LITELLM_BASE_URL=http://your-litellm-host:4000
LITELLM_API_KEY=<your-litellm-key>

Then:

pnpm test          # requires Postgres and Qdrant (see Full stack below)
pnpm start

Expected:

[AELLI] Server running on port 3456

Full stack with Docker

docker network create integrahub_network 2>/dev/null || true
docker compose up -d

This starts ÆLLI on port 3456, the memory service on port 3457, Postgres, and Qdrant. The compose file expects an external Docker network named integrahub_network — create it once with the command above.

How it works

| Layer | Components |
| --- | --- |
| Caller | Human · IDE · LiteLLM gateway |
| Core | Express server · A2A protocol · port 3456 |
| Skills | Orchestrator · Dev Advisor · Engineering Knowledge · Multi-Router · Octowiz |
| Storage | Postgres · Qdrant · MemPalace (session experience) |

Each skill is mounted at /a2a/<skill-id> and publishes an agent card at /a2a/<skill-id>/.well-known/agent.json.
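As a minimal sketch of the discovery path described above, the helpers below build the agent-card URL for a skill and fetch it. The function names are hypothetical and not part of ÆLLI; only the path layout comes from this README.

```javascript
// Build the agent-card URL for a mounted skill.
// `baseUrl` and `skillId` are supplied by the caller.
function agentCardUrl(baseUrl, skillId) {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${root}/a2a/${encodeURIComponent(skillId)}/.well-known/agent.json`;
}

// Fetch and parse the card (Node 18+ ships a global fetch).
async function fetchAgentCard(baseUrl, skillId) {
  const res = await fetch(agentCardUrl(baseUrl, skillId));
  if (!res.ok) {
    throw new Error(`agent card request failed: HTTP ${res.status}`);
  }
  return res.json();
}
```

Agent-card endpoints need no authentication, so no secret header is sent.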

Skills

| Skill | Endpoint | Purpose |
| --- | --- | --- |
| Orchestrator | /a2a/aelli | Primary entry point. Routes natural-language tasks to the best registered skill. |
| Dev Advisor | /a2a/dev-advisor | Watches coding-session hook events and flags file conflicts, branch drift, and spec deviation. |
| Engineering Knowledge | /a2a/engineering | Qdrant-backed semantic search and ingest for engineering knowledge. |
| Multi-Router | /a2a/aelli-router | Classifies tasks, routes to coding or review models, prevents a model from approving its own output. |
| Octowiz | /a2a/octowiz | Builds model-tier-aware planning and review context bundles for Octowiz / Claude Code workflows. |

Orchestrator

/a2a/aelli

Use this when the caller knows what they want but not which agent should do it. The orchestrator keeps a live registry of all mounted skill cards, selects the best skill for the request, and streams the result over SSE.

Operation: aelli — free-form natural-language request
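This README does not document the request body, so the sketch below is only an illustration of what a call might look like. It assumes the JSON-RPC envelope from the public A2A specification (`message/stream` with a text part); ÆLLI may expect something different. The x-aelli-secret header and the /a2a/aelli path come from this README, and `buildOrchestratorRequest` is a hypothetical helper.

```javascript
const { randomUUID } = require("node:crypto");

// Build fetch() arguments for a free-form orchestrator request.
// ASSUMPTION: the envelope follows the public A2A spec ("message/stream");
// the body ÆLLI actually expects is not documented here.
function buildOrchestratorRequest(baseUrl, secret, text) {
  const body = {
    jsonrpc: "2.0",
    id: randomUUID(),
    method: "message/stream",
    params: {
      message: {
        kind: "message",
        role: "user",
        messageId: randomUUID(),
        parts: [{ kind: "text", text }],
      },
    },
  };
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/a2a/aelli`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        accept: "text/event-stream", // the result is streamed over SSE
        "x-aelli-secret": secret, // value of AELLI_INBOUND_SECRET
      },
      body: JSON.stringify(body),
    },
  };
}
```

A caller would pass the result to fetch, for example `const { url, init } = buildOrchestratorRequest(base, secret, "summarize open PRs")` followed by `fetch(url, init)`, and then read the SSE stream from the response body.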


Dev Advisor

/a2a/dev-advisor

Dev Advisor processes hook events from Octowiz / Claude Code sessions and runs deterministic engineering-hygiene checks:

  • file-conflict detection — warns when another session changed a file after the current session started
  • branch-drift detection — warns when a branch falls behind its remote tracking ref
  • spec-deviation detection — warns when modified files were not mentioned in the triggering prompt
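To make the last check concrete, here is a minimal sketch of a spec-deviation test. It is not the project's implementation: the function name is hypothetical, and it assumes a file counts as mentioned when the prompt contains its relative path or its basename.

```javascript
const path = require("node:path");

// Return the modified files that the prompt never mentions.
// ASSUMPTION: "mentioned" means the prompt contains the file's relative
// path or its basename; the real check may be stricter or looser.
function findSpecDeviations(prompt, modifiedFiles) {
  const haystack = prompt.toLowerCase();
  return modifiedFiles.filter((file) => {
    const full = file.toLowerCase();
    const base = path.basename(file).toLowerCase();
    return !haystack.includes(full) && !haystack.includes(base);
  });
}
```

Because the check is a plain string comparison, it stays deterministic and needs no model call.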

Set AELLI_LLM_ADVICE=true to add an LLM advisory pass on top of the deterministic checks.

Operation: dev-advisor — accepts a hook-event payload


Engineering Knowledge

/a2a/engineering

Embeds content through LiteLLM, stores vectors in Qdrant, and returns ranked passages with a synthesized answer. Ingests new documents without restarting ÆLLI.

Operations:

  • engineering:query — semantic search with ranked passages and synthesized answer
  • engineering:ingest — ingest a document with title, content, source URL, and optional metadata

Multi-Router

/a2a/aelli-router

Classifies tasks by complexity, routes coding work to generation models, and hands review to a separate reviewer model. The generator does not approve its own output.
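The separation rule can be sketched as a small guard. `pickReviewer` and the candidate list are hypothetical; the real router picks models from its tier configuration, which is not shown here.

```javascript
// Pick a reviewer model that differs from the generator.
// `candidates` is a hypothetical ordered list of reviewer routes.
function pickReviewer(generatorModel, candidates) {
  const reviewer = candidates.find((model) => model !== generatorModel);
  if (reviewer === undefined) {
    // Refuse to continue rather than let the generator review itself.
    throw new Error("no reviewer available that differs from the generator");
  }
  return reviewer;
}
```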

Workflow types:

| Workflow | Trigger | Flow |
| --- | --- | --- |
| standard | default | generate → review loop → complete |
| complex | XHIGH / MAX tier | generate → architecture review → deep review → revise loop |
| high-risk | escalate=true or reasoning-required task | generate → review → validation gate → revise |

All three variants share one bounded review loop. On every iteration, an escalation policy evaluates the complete loop state and can automatically upgrade the reviewer to the REASONING tier. Triggers are security or architecture keywords, two validation failures, reviewer disagreement, or an exhausted revision budget. When the policy fires, the reviewed progress event carries the reason (escalation: "keyword" | "validation" | "disagreement" | "budget").
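The workflow triggers and the escalation policy can be sketched as two pure functions. This is an illustration under stated assumptions, not the router's code: the field names on `state`, the keyword list, the order in which triggers are checked, and the precedence of high-risk over complex are all guesses.

```javascript
// Map a classification to one of the three workflow types listed above.
// ASSUMPTION: "high-risk" wins when a task is also XHIGH / MAX tier.
function selectWorkflow({ tier, escalate = false, reasoningRequired = false }) {
  if (escalate || reasoningRequired) return "high-risk";
  if (tier === "XHIGH" || tier === "MAX") return "complex";
  return "standard";
}

// ASSUMPTION: the real keyword list is longer than these two words.
const RISK_KEYWORDS = /\b(security|architecture)\b/i;

// Return the escalation reason for the current loop state, or null.
// All field names on `state` are hypothetical.
function escalationReason(state) {
  if (RISK_KEYWORDS.test(state.taskText)) return "keyword";
  if (state.validationFailures >= 2) return "validation";
  if (state.reviewersDisagree) return "disagreement";
  if (state.revisionsUsed >= state.revisionBudget) return "budget";
  return null;
}
```

Evaluating the whole loop state in one function keeps the policy in a single place, so each iteration asks one question: escalate, and if so, why.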

Operations:

  • aelli-router:route — synchronous classification returning { router, tier, model, workflow }
  • aelli-router:workflow — streaming workflow execution with live progress events

Octowiz

/a2a/octowiz

Builds context bundles for Claude Code and Octowiz. Bundle size is model-tier-aware: COMPACT for lightweight models, STANDARD for normal tasks, FULL for reasoning-tier or high-context work. Bundles can include project playbooks from MemPalace, operational memory, and relevant engineering knowledge.

Operations:

  • octowiz:plan — planning context bundle
  • octowiz:review — review context bundle

Memory layers

ÆLLI separates memory by responsibility:

| Layer | Backing system | Used for |
| --- | --- | --- |
| Operational memory | aelli-memory service + Postgres | Registry, system state, deterministic key/value memory |
| Knowledge memory | Qdrant + embeddings | Searchable project and engineering knowledge |
| Experience memory | MemPalace + Postgres | Session experiences, reflection, project playbooks |

Truth stays at the source. ÆLLI builds context around GitHub, Confluence, Jira, and other systems; it does not try to become the canonical database for everything.


Configuration

cp .env.example .env

Required:

| Variable | Purpose |
| --- | --- |
| AELLI_INBOUND_SECRET | Shared secret for protected endpoints via x-aelli-secret. |
| LITELLM_BASE_URL | LiteLLM gateway for outbound model and embedding calls. |
| LITELLM_API_KEY | Bearer token for LiteLLM. |
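A startup check for the three required values might look like the sketch below. `missingRequiredEnv` is a hypothetical helper, not something ÆLLI is known to ship.

```javascript
const REQUIRED_ENV = ["AELLI_INBOUND_SECRET", "LITELLM_BASE_URL", "LITELLM_API_KEY"];

// List required variables that are unset or blank in `env`.
function missingRequiredEnv(env) {
  return REQUIRED_ENV.filter((name) => !env[name] || env[name].trim() === "");
}
```

Calling `missingRequiredEnv(process.env)` before starting the server gives a clear error message instead of a failed model call later.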

Storage (for memory features):

| Variable | Purpose |
| --- | --- |
| DATABASE_URL | Postgres connection for MemPalace and PgStore. |
| MEMORY_SERVICE_URL | URL of the aelli-memory service (default: http://localhost:3457). |
| AELLI_MEMORY_SECRET | Shared secret for memory service requests. |
| QDRANT_URL | Qdrant server URL (default: http://localhost:6333). |
| GITHUB_WEBHOOK_SECRET | Enables and verifies /webhooks/github when set. |

Model routing overrides:

| Variable | Purpose |
| --- | --- |
| LITELLM_MODEL | Default model route. |
| LITELLM_EMBED_MODEL | Embedding model route. |
| CODING_MODEL_NORMAL | Coding model for normal tasks. |
| CODING_MODEL_HIGH | Coding model for high-complexity tasks. |
| CODING_MODEL_XHIGH | Coding model for extra-high-complexity tasks. |
| CODING_MODEL_MAX | Coding model for maximum-complexity tasks. |
| NEMOTRON_MODEL_SIMPLE | Reviewer model for simple checks. |
| NEMOTRON_MODEL_MEDIUM | Reviewer model for medium checks. |
| NEMOTRON_MODEL_COMPLEX | Reviewer model for complex checks. |
| NEMOTRON_MODEL_REASONING | Reviewer model for reasoning-heavy checks. |
| NEMOTRON_LITELLM_BASE_URL | Optional separate LiteLLM base URL for Nemotron calls. |

Feature flags:

| Variable | Default | Purpose |
| --- | --- | --- |
| AELLI_LLM_ADVICE | false | Enable LLM advisory pass in Dev Advisor on top of deterministic checks. |
| PORT | 3456 | HTTP port for the ÆLLI server. |

See .env.example for the full list.


LiteLLM registration

Register each ÆLLI skill as an A2A agent in LiteLLM:

| LiteLLM agent name | Invocation URL |
| --- | --- |
| aelli | http://<host>:3456/a2a/aelli |
| aelli-dev-advisor | http://<host>:3456/a2a/dev-advisor |
| aelli-engineering | http://<host>:3456/a2a/engineering |
| aelli-router | http://<host>:3456/a2a/aelli-router |
| aelli-octowiz | http://<host>:3456/a2a/octowiz |

Agent-card discovery:

http://<host>:3456/a2a/<skill-id>/.well-known/agent.json

API reference

A2A endpoints

| Endpoint | Auth | Description |
| --- | --- | --- |
| GET /a2a/:skill/.well-known/agent.json | none | Agent card for a mounted skill. |
| GET /a2a/:skill/.well-known/agent-card.json | none | Alternate agent-card path. |
| POST /a2a/aelli | x-aelli-secret | Orchestrator. |
| POST /a2a/dev-advisor | x-aelli-secret | Dev Advisor hook-event processing. |
| POST /a2a/engineering | x-aelli-secret | Engineering query / ingest. |
| POST /a2a/aelli-router | x-aelli-secret | Route classification and streaming workflows. |
| POST /a2a/octowiz | x-aelli-secret | Plan / review context bundles. |

Runtime endpoints

| Endpoint | Auth | Description |
| --- | --- | --- |
| GET /a2a/task-queue | x-aelli-secret | SSE stream of pending daemon tasks. |
| POST /a2a/task-queue/:id/claim | x-aelli-secret | Claim a task lease. |
| POST /a2a/task-queue/:id/result | x-aelli-secret | Submit task result. |
| POST /webhooks/github | HMAC-SHA256 | GitHub push / PR ingest into Engineering Knowledge. |
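For reference, HMAC-SHA256 verification of a GitHub webhook generally looks like the sketch below. It shows the standard technique, not ÆLLI's handler: GitHub sends the digest in the X-Hub-Signature-256 header as `sha256=<hex>`, and `verifyGithubSignature` is a hypothetical name.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

// Verify a GitHub webhook signature.
// `rawBody` must be the exact bytes received, before any JSON parsing.
// `secret` would be the value of GITHUB_WEBHOOK_SECRET.
function verifyGithubSignature(secret, rawBody, signatureHeader) {
  if (!secret || typeof signatureHeader !== "string") return false;
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Comparing with `timingSafeEqual` rather than `===` avoids leaking how many leading characters of the signature matched.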

Memory service

| Endpoint | Auth | Description |
| --- | --- | --- |
| GET /memory/:key | x-aelli-memory-secret | Read memory value. |
| PUT /memory/:key | x-aelli-memory-secret | Write memory value. |

The memory service fails closed when AELLI_MEMORY_SECRET is not set.
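Failing closed means an unset secret rejects every request instead of allowing them all. A minimal sketch of such a check, with `isAuthorized` as a hypothetical helper rather than the service's real middleware:

```javascript
const { timingSafeEqual } = require("node:crypto");

// Decide whether a memory-service request is authorized.
// `configuredSecret` stands for AELLI_MEMORY_SECRET;
// `presentedSecret` stands for the x-aelli-memory-secret header value.
function isAuthorized(configuredSecret, presentedSecret) {
  if (!configuredSecret) return false; // fail closed: no secret, no access
  if (typeof presentedSecret !== "string") return false;
  const a = Buffer.from(configuredSecret);
  const b = Buffer.from(presentedSecret);
  return a.length === b.length && timingSafeEqual(a, b);
}
```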

Smoke test

docker compose up -d
curl -s http://127.0.0.1:3456/a2a/aelli/.well-known/agent.json | jq .name

Expected: "AELLI Orchestrator"

For Dev Advisor end-to-end behavior:

  1. Start the full stack.
  2. Open a Claude Code session with the Octowiz plugin enabled.
  3. Edit a file.
  4. Submit a prompt that does not mention that file.
  5. Expect a spec-deviation warning in the session context.
  6. Stop ÆLLI mid-session — Claude Code should continue without breaking.

MemPalace reflection loop

At the end of a Claude Code session, ÆLLI records an experience row in aelli_experiences scoped to the repository basename (the last path component of the repository root).

When enough new experiences accumulate, the Reflection Agent:

  1. reads experiences since the last reflection cursor;
  2. synthesizes an updated project playbook;
  3. writes it to the memory service at agent:aelli:playbook:<scope>;
  4. makes that playbook available to future Octowiz context bundles.
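The scope and key derivation in the steps above can be sketched as follows. `playbookKey` is a hypothetical helper; the key format and the basename rule come from this section.

```javascript
const path = require("node:path");

// Derive the playbook key for a repository root.
// The scope is the repository basename (last path component).
function playbookKey(repoRoot) {
  const scope = path.basename(repoRoot);
  return `agent:aelli:playbook:${scope}`;
}
```

Under this scheme, two checkouts whose roots share a basename would map to the same key.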

The whole pipeline runs against a single SessionMemoryStore interface — experience diaries, the reflection cursor, and playbooks all go through one seam, with Postgres and the memory service as adapters behind it.

This turns repeated engineering work into reusable guidance instead of letting every session rediscover the same problems.

Self-improvement loop

ÆLLI can run a scheduled Nemotron review loop over selected modules in src/router/, src/skills/, and src/mempalace/. The guard script bin/improve-guard.js blocks risky automatic commits — if a proposed diff exceeds the configured safety threshold, the run is written to .octowiz/improve-review-queue.jsonl for human review.
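The guard's behavior can be illustrated with a small sketch. It assumes the safety threshold is a count of changed lines in a unified diff, which this README does not confirm; bin/improve-guard.js may use a different metric. Both function names are hypothetical.

```javascript
// Count added and removed lines in a unified diff, skipping the
// "+++" / "---" file headers. (A content line that itself starts with
// "--" or "++" would be miscounted; acceptable for a sketch.)
function changedLineCount(diff) {
  return diff.split("\n").filter(
    (line) =>
      (line.startsWith("+") && !line.startsWith("+++")) ||
      (line.startsWith("-") && !line.startsWith("---"))
  ).length;
}

// Decide whether a proposed diff may be committed automatically.
// ASSUMPTION: the threshold is a changed-line count.
function guardDecision(diff, maxChangedLines) {
  const changed = changedLineCount(diff);
  return { changed, action: changed > maxChangedLines ? "queue" : "commit" };
}
```

A "queue" result corresponds to appending the run to .octowiz/improve-review-queue.jsonl for human review.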

Local development (Apple Silicon)

By default ÆLLI calls models through LiteLLM. For local development you can bypass LiteLLM entirely.

Start the LLM server

mlx_lm server \
  --model mlx-community/DeepSeek-R1-0528-Qwen3-8B-4bit \
  --port 8080

Start the embedding server

Run your local embedding server on a port other than 3456 (which ÆLLI uses). Port 8081 is a safe default:

aelli-mlx serve --port 8081

Start ÆLLI in local mode

AELLI_LLM_BACKEND=local pnpm start

Relevant variables:

| Variable | Default | Purpose |
| --- | --- | --- |
| AELLI_LOCAL_LLM_URL | http://localhost:8080 | Local MLX chat server. |
| AELLI_LOCAL_EMBED_URL | http://localhost:8081 | Local embedding server. |
| AELLI_LOCAL_MODEL | falls back to LITELLM_MODEL | Optional local chat model override. |
| AELLI_LOCAL_EMBED_MODEL | falls back to LITELLM_EMBED_MODEL | Optional local embedding model override. |
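The defaults and fallbacks in this table amount to the resolution logic sketched below. `resolveLocalConfig` and its property names are hypothetical; only the variable names and default values come from the table.

```javascript
// Resolve local-mode settings from an environment object.
function resolveLocalConfig(env) {
  return {
    llmUrl: env.AELLI_LOCAL_LLM_URL || "http://localhost:8080",
    embedUrl: env.AELLI_LOCAL_EMBED_URL || "http://localhost:8081",
    // Model overrides fall back to the LiteLLM routes when unset.
    model: env.AELLI_LOCAL_MODEL || env.LITELLM_MODEL,
    embedModel: env.AELLI_LOCAL_EMBED_MODEL || env.LITELLM_EMBED_MODEL,
  };
}
```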

Adding a skill

  1. Create src/skills/<name>/index.js exporting { id, card, handle }.
  2. Add a card.json alongside it describing the skill.
  3. Register it in createSkills() in src/skills/index.js.
  4. Restart ÆLLI.
  5. Register the new /a2a/<name> invocation URL in LiteLLM if it should be callable through the gateway.

Routes and agent-card endpoints are mounted automatically from the skill registry.
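A skeleton for step 1 might look like the sketch below. Only the export shape { id, card, handle } is taken from this README; the card fields, the `handle` signature, and its return value are assumptions for illustration. The card is inlined here to keep the example self-contained, whereas step 2 keeps it in card.json.

```javascript
// Hypothetical src/skills/echo/index.js.
const id = "echo";

// ASSUMPTION: card fields are illustrative; a real card follows the A2A agent-card format.
const card = {
  name: "Echo",
  description: "Returns the text it receives.",
};

// ASSUMPTION: handle() receives a request object and resolves to a result.
async function handle(request) {
  return { text: String(request.text ?? "") };
}

const skill = { id, card, handle };
// In the real module this object would be exported, using whichever
// module system the repository uses.
```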

Deployment

Run the commands in this section on the server hosting ÆLLI, from the repository root.

Update a running compose stack:

git fetch origin
git checkout main
git pull --ff-only origin main
docker compose up -d --build --no-deps --force-recreate aelli

Health check:

docker ps --filter name=aelli
docker inspect aelli \
  --format 'status={{.State.Status}} health={{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}'
docker exec aelli sh -lc \
  'wget -qSO- --timeout=5 http://127.0.0.1:3456/a2a/aelli/.well-known/agent.json 2>&1 | head -3'
docker logs --timestamps --since 5m aelli

Rollback to a previous commit:

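# Pick the commit to roll back to (HEAD~1 here, or substitute the exact commit SHA).
PREV=$(git rev-parse HEAD~1)
# Warning: reset --hard discards uncommitted changes in the working tree.
git reset --hard "$PREV"
# Rebuild and restart only the aelli service from the rolled-back source.
docker compose up -d --build --no-deps --force-recreate aelli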

Contributing

Bug reports and pull requests are welcome.

  1. Fork the repo and create a feature branch.
  2. Install dependencies: pnpm install
  3. Run tests before and after your change: pnpm test
  4. Open a pull request with a clear description of what changed and why.

For larger changes, open an issue first to discuss the approach.

License

MIT — built by the GFE/IntegraHub team.

ÆLLI  ·  octowiz ↗
