A multi-agent AI assistant for Attack & Defense CTF competitions, built on LangGraph. An operator chats with a supervisor agent; security goals are routed to specialist agents that analyze captured traffic, patch vulnerable services, and write and run exploits, with every dangerous action gated behind human approval.
One LangGraph graph (src/graph.py:make_graph, served by the LangGraph Agent
Server):
- supervisor (src/agents/conversational.py, Haiku) -> operator-facing front door. Replies directly to small talk; hands any security goal to the pipeline via its single dispatch_security_task tool, which jumps to classify with Command(goto=..., graph=PARENT).
- classify (src/agents/orchestrator.py, Haiku, structured output) -> splits the request into one scoped sub-question per specialist; an "all services" task is expanded into one task per live service (read from Janus).
- specialists (Sonnet, ReAct via create_agent, prompt caching) -> run in parallel via the Send API, each with a least-privilege MCP toolset:
  - traffic (traffic_agent.py) -> analyzes HTTP traffic captured by Janus (reverse proxy), writes alert/drop rules. Tools: janus-mcp defender view.
  - patch (patch_agent.py) -> minimal source fixes, deployed to the competition VM via git push (post-receive hook rebuilds). Tools: patcher-mcp.
  - exploit (exploit_agent.py) -> writes/tests/runs exploits through exploitfarm (xfarm). Tools: exploiter-mcp + janus-mcp exploit (read-only packets) + patcher-mcp read (read-only source).
- final (orchestrator.py) -> fan-in; synthesizes the specialists' reports into one operator-facing answer.
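
The "all services" expansion done by classify can be illustrated with a small pure function. This is only a sketch under stated assumptions: the `Task` dataclass, the `expand_tasks` helper, and the example service names are hypothetical and not taken from src/agents/orchestrator.py, and the real node emits structured output and fans out through LangGraph rather than returning a plain list.

```python
from dataclasses import dataclass

# The three specialist roles named in the list above.
SPECIALISTS = ("traffic", "patch", "exploit")


@dataclass(frozen=True)
class Task:
    """Hypothetical shape of one scoped sub-question (not the project's real schema)."""
    specialist: str  # one of SPECIALISTS
    service: str     # a concrete service name, or "all"
    question: str    # the sub-question handed to that specialist


def expand_tasks(tasks, live_services):
    """Replace every "all"-services task with one task per live service."""
    expanded = []
    for task in tasks:
        if task.specialist not in SPECIALISTS:
            raise ValueError(f"unknown specialist: {task.specialist}")
        if task.service == "all":
            # One copy of the task per live service, in the order given.
            expanded.extend(
                Task(task.specialist, name, task.question) for name in live_services
            )
        else:
            expanded.append(task)
    return expanded


if __name__ == "__main__":
    plan = [
        Task("patch", "all", "Fix the SQL injection in the login handler."),
        Task("traffic", "shop", "Which requests leak flags?"),
    ]
    # "shop" and "notes" stand in for the live service list read from Janus.
    for task in expand_tasks(plan, ["shop", "notes"]):
        print(task.specialist, task.service)
```

Each element of the expanded list would correspond to one parallel specialist run, so the fan-in step receives one report per (specialist, service) pair.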
A current Mermaid diagram lives in diagrams/graph_overview.mmd.
HumanInTheLoopMiddleware (src/tools/hitl.py) interrupts on the critical
tools only: Janus rule create/update/delete, patch deploy/rollback, exploit
push/start. Approval cards (with the diff / exploit source) render in the Agent
Chat UI; everything else runs autonomously.
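
The gating rule above amounts to a lookup from tool name to "pause for approval or not". The following is a minimal sketch of that idea, not the contents of src/tools/hitl.py: the tool names are hypothetical placeholders for whatever the MCP servers actually expose, and `interrupt_map` is an illustrative helper, not a LangChain API.

```python
# Hypothetical tool names; the real names exposed by the MCP servers may differ.
CRITICAL_TOOLS = frozenset({
    "janus_rule_create", "janus_rule_update", "janus_rule_delete",  # Janus rules
    "patch_deploy", "patch_rollback",                               # patch lifecycle
    "exploit_push", "exploit_start",                                # exploit lifecycle
})


def needs_approval(tool_name):
    """Return True when a call to this tool must wait for the operator."""
    return tool_name in CRITICAL_TOOLS


def interrupt_map(available_tools):
    """Map each available tool name to whether it should interrupt."""
    return {name: needs_approval(name) for name in available_tools}


if __name__ == "__main__":
    # "read_source" is an assumed example of a non-critical, autonomous tool.
    print(interrupt_map(["patch_deploy", "read_source"]))
```

Keeping the critical set as an explicit allowlist of interrupting tools means a newly added read-only tool runs autonomously by default, while any new state-changing tool has to be added to the set deliberately.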
Each capability is a separate FastMCP server; agents consume tag-filtered HTTP endpoints so each agent only ever sees the tools its role allows:
| Server | Purpose | Endpoints |
|---|---|---|
| src/janus/ | Janus traffic + rules | /traffic/mcp, /defender/mcp, /exploit/mcp (port 8765) |
| src/patcher/ | git workspace + SSH deploy to the VM | /mcp, read-only /read/mcp (8766) |
| src/exploiter/ | xfarm authoring + attack lifecycle | /mcp (8767) |
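
As an illustration of the least-privilege wiring, the sketch below resolves the endpoint URLs each specialist would be given. The ports and paths come from the table and the agent list above; the `localhost` host, the `MCP_BASES` and `ROLE_ENDPOINTS` names, and the `endpoints_for` helper are assumptions made for this example (the real base URLs are configured in .env).

```python
# Assumed base URLs; the ports match the table, the host is a placeholder.
MCP_BASES = {
    "janus": "http://localhost:8765",
    "patcher": "http://localhost:8766",
    "exploiter": "http://localhost:8767",
}

# Which (server, path) views each specialist role is allowed to connect to.
ROLE_ENDPOINTS = {
    "traffic": [("janus", "/defender/mcp")],
    "patch": [("patcher", "/mcp")],
    "exploit": [
        ("exploiter", "/mcp"),
        ("janus", "/exploit/mcp"),    # read-only packets
        ("patcher", "/read/mcp"),     # read-only source
    ],
}


def endpoints_for(role, bases=MCP_BASES):
    """Return the full MCP endpoint URLs visible to the given role."""
    try:
        routes = ROLE_ENDPOINTS[role]
    except KeyError:
        raise ValueError(f"unknown role: {role}") from None
    return [bases[server].rstrip("/") + path for server, path in routes]


if __name__ == "__main__":
    for role in ROLE_ENDPOINTS:
        print(role, endpoints_for(role))
```

Because the filtering happens at the endpoint level, an agent that is never handed a URL has no way to discover the tools behind it.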
The graph compiles with checkpointer=None: the Agent Server injects
persistence (in-memory for langgraph dev, Postgres/Redis in the Helm
deployment), which is also what pauses/resumes HITL interrupts.

    cp .env_example .env   # AWS Bedrock creds + model ids + MCP base URLs
    # start the three MCP servers (each has its own .env, README and compose file):
    #   src/janus, src/patcher, src/exploiter
    uv sync --group dev
    uv run langgraph dev   # Agent Server on :2024
    # UI: ../agent-chat-ui (pnpm dev), point it at http://localhost:2024 / graph "conversational"

Tests (pure functions only, no network):

    uv run --group dev pytest -q

deploy-k8s/ contains kustomize bases (namespace + config, Postgres/Redis,
the three MCP servers) and a values file for the official langgraph-cloud
Helm chart (self-hosted Lite). The step-by-step commands used on Minikube are
in deploy-k8s/notes.txt. The agent image is built with
uv run langgraph build.
This project is for authorized CTF competitions only. Outward-facing actions (starting exploits, deploying patches, changing live proxy rules) always require explicit operator approval through the HITL gates.
