An autonomous MUD player using LLM agents for situational awareness and decision-making, with an auto-prompt-engineering loop that improves gameplay over time.
Disclaimer: This project was vibecoded. No guarantees or warranties. Use at your own risk.
All testing was done on tbaMUD, which is based on CircleMUD and DikuMUD. The command syntax and game mechanics assume a DikuMUD-style MUD.
- MH (Memory Head): 6 parallel API calls updating situational awareness (location, inventory, equipment, stats, spells, session summary)
- DH (Decision Head): Chooses the next action based on game state and goals; updates goals after each action
- Critic/Engineer/Editor loop: Every N steps, reviews gameplay and automatically improves the DH prompt
See docs/ARCHITECTURE.md for full details.
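As a rough illustration of the MH fan-out described above, the six updates could be issued concurrently with a thread pool. This is a minimal sketch, not the project's actual code: `update_area` and `run_memory_head` are hypothetical names, and the placeholder worker stands in for a real API call.

```python
from concurrent.futures import ThreadPoolExecutor

# The six awareness areas listed for MH above.
MH_AREAS = ["location", "inventory", "equipment", "stats", "spells", "session_summary"]


def update_area(area: str, mud_output: str) -> str:
    """Hypothetical stand-in for one MH API call; a real version would query the LLM."""
    return f"[{area}] summary of {len(mud_output)} chars of MUD output"


def run_memory_head(mud_output: str) -> dict[str, str]:
    """Run one update per area concurrently and collect the results by area name."""
    with ThreadPoolExecutor(max_workers=len(MH_AREAS)) as pool:
        futures = {area: pool.submit(update_area, area, mud_output) for area in MH_AREAS}
        # result() blocks until that area's call has finished (or re-raises its error).
        return {area: future.result() for area, future in futures.items()}
```

Threads suit this kind of work because each call mostly waits on the network, so the six requests overlap instead of running back to back.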
python -m venv .venv
source .venv/bin/activate # or .venv\Scripts\activate on Windows
pip install -r requirements.txt
cp .env.example .env
# Edit .env: set OPENAI_API_KEY, MUD_HOST, MUD_PORT
# Optionally set MUD_CHARACTER and MUD_PASSWORD for auto-login

- .env: OPENAI_API_KEY, MUD_HOST, MUD_PORT, MUD_CHARACTER, MUD_PASSWORD
- config.yaml: timeouts, paths, model names (including separate models for critic/engineer/editor), orchestrator options
Key config options:
- orchestrator.critic_interval: Run the auto-prompt-engineering loop every N steps (default: 20; set to null to disable)
- openai.model_critic / model_engineer: Smarter model (e.g. gpt-4o) for analysis
- openai.model_editor: Cheaper model (e.g. gpt-4o-mini) for applying edits
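One way the critic_interval setting could gate the loop is a simple modulo check on the step counter. The sketch below is an assumption about the behavior, not the orchestrator's real logic: `should_run_critic` is a hypothetical helper, and it assumes steps are counted from 1 and that the loop fires on exact multiples of the interval.

```python
from typing import Optional


def should_run_critic(step: int, critic_interval: Optional[int]) -> bool:
    """Return True when the critic loop is due after 1-based step `step`."""
    # A null (None) interval disables the loop; non-positive values are treated the same way.
    if critic_interval is None or critic_interval <= 0:
        return False
    return step > 0 and step % critic_interval == 0
```

With the default of 20, this would trigger the loop after steps 20, 40, 60, and so on.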
- Orchestrator (main loop): python main.py [max_steps] (defaults to 10 rounds, then exits; pass 0 for unlimited)
- Interrupt: Press Ctrl+C to stop gracefully.
- Manual override: While running in a TTY, type a command and press Enter to inject it as the next action instead of DH's choice.
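The max_steps convention above (default 10, 0 for unlimited) can be sketched as follows. `parse_max_steps` and `step_numbers` are hypothetical helpers written for illustration; how main.py actually parses its argument is not shown in this README.

```python
import itertools
from typing import Iterator, Optional, Sequence

DEFAULT_MAX_STEPS = 10  # default stated above


def parse_max_steps(argv: Sequence[str]) -> Optional[int]:
    """Return the step limit from argv, or None when 0 (unlimited) is passed."""
    if len(argv) < 2:
        return DEFAULT_MAX_STEPS
    value = int(argv[1])
    if value < 0:
        raise ValueError("max_steps must be 0 or a positive integer")
    return None if value == 0 else value


def step_numbers(max_steps: Optional[int]) -> Iterator[int]:
    """Yield 1-based step numbers, endlessly when max_steps is None."""
    counter = itertools.count(1)
    return counter if max_steps is None else itertools.islice(counter, max_steps)
```

A main loop could then iterate with `for step in step_numbers(parse_max_steps(sys.argv)):` and rely on Ctrl+C to leave the unlimited case.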
- data/logs/orchestrator.log: high-level orchestrator events
- data/logs/gameplay.jsonl: per-step debug log (MH context, action, MUD output, goals)
- data/logs/critic.jsonl: critic diagnoses (what's going well / not going well)
- data/logs/engineer_changes.jsonl: specific prompt changes suggested by the engineer
All logs are reset at the start of each run.
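Because the .jsonl logs hold one JSON object per line, they can be inspected with a few lines of Python. The sketch below assumes nothing about the record schema beyond that; `parse_jsonl` and `load_gameplay_log` are hypothetical helpers, and any field names you look up afterwards should be checked against a real log file.

```python
import json
from typing import Any, Iterable


def parse_jsonl(lines: Iterable[str]) -> list[dict[str, Any]]:
    """Parse JSON Lines input into a list of records, skipping blank lines."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def load_gameplay_log(path: str = "data/logs/gameplay.jsonl") -> list[dict[str, Any]]:
    """Read the per-step gameplay log from disk."""
    with open(path, encoding="utf-8") as handle:
        return parse_jsonl(handle)
```

Since logs are reset at the start of each run, copy a file elsewhere first if you want to compare runs.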
- python scripts/test_memory.py: memory read/write (no deps)
- python scripts/test_mud_client.py: telnet connect (needs MUD_HOST)
- python scripts/test_agents_api.py: MH, DH action, DH goals (needs OPENAI_API_KEY)
- src/mud/: telnet client, buffer, silence detection
- src/agents/: MH, DH, Critic, Engineer, Editor
- src/memory/: memory file read/write
- src/orchestrator.py: main loop including auto-prompt-engineering
- prompts/: prompt templates (mh_*.txt, dh.txt, dh_goals.txt, critic.txt, engineer.txt, editor.txt)
- data/: memory files (.md), logs under data/logs/
- commands.md: persistent, user-populated command reference (never cleared)
- spells.md: updated from kickoff only (e.g. after practice)
- current_location.md, session_summary.md, goals.md, inventory.md, equipment.md, statbar.md: cleared each run, updated by MH/DH
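The split between persistent and per-run memory files might be handled at startup roughly as below. This is a sketch under assumptions: `reset_run_memory` is a hypothetical function, "cleared" is taken to mean truncated to an empty file, and spells.md is left untouched because the list above does not say it is cleared.

```python
import tempfile
from pathlib import Path

# Per-run files named in the list above; commands.md and spells.md are deliberately absent.
CLEARED_EACH_RUN = [
    "current_location.md", "session_summary.md", "goals.md",
    "inventory.md", "equipment.md", "statbar.md",
]


def reset_run_memory(data_dir: Path) -> list[Path]:
    """Empty the per-run memory files and return the paths that were reset."""
    data_dir.mkdir(parents=True, exist_ok=True)
    cleared = []
    for name in CLEARED_EACH_RUN:
        path = data_dir / name
        path.write_text("", encoding="utf-8")  # creates the file or truncates it
        cleared.append(path)
    return cleared


if __name__ == "__main__":
    # Demo in a throwaway directory so no real memory files are touched.
    with tempfile.TemporaryDirectory() as tmp:
        demo_dir = Path(tmp)
        (demo_dir / "commands.md").write_text("look\n", encoding="utf-8")
        reset_run_memory(demo_dir)
        print(sorted(p.name for p in demo_dir.iterdir()))
```

Keeping the cleared names in an explicit list means a new persistent file is never wiped by accident.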