Go server implementing Donald Michie's MENACE (Machine Educable Noughts And Crosses Engine) — a machine-learning Tic-Tac-Toe system. Real-time multiplayer backend where players join rooms via short codes.
The client UI lives in bbdev-portfolio and is served by bbdev-gateway. This repo is the Go API/WebSocket server only.
make run # build + start on :8080
make pretrain # generate menace_weights.gob (5000 games vs perfect player)
make docker-build # build ghcr.io/billybolton/menace-server:latest
make docker-push # build + push to GHCRServer starts on :8080 (override with PORT env var).
Create a new game room. Returns a join code.
{ "opponent": "human" }Opponent options: human, menace, perfect, random.
Response:
{ "code": "a1b2c3", "role": "X" }Get current room state.
Make a move (positions 1-9, left-to-right, top-to-bottom).
{ "position": 5, "role": "X" }Reset the board for a new game in the same room.
WebSocket endpoint for real-time game state updates.
Incoming messages:
{ "type": "move", "data": { "position": 5 } }Outgoing messages:
joined— sent on connect with role assignment and initial statestate— broadcast on every state changeerror— sent on invalid moves
Health check endpoint. Returns {"status": "ok"} when the server is ready.
MENACE (Machine Educable Noughts And Crosses Engine) was invented by Donald Michie in 1961. It learns to play Tic-Tac-Toe through reinforcement — no game knowledge is programmed, only a feedback mechanism.
Each unique board state gets a "matchbox" containing colored beads — one color per legal move. The number of beads for a given move represents the probability of choosing it. Initial bead counts are weighted by position type: corners (8), edges (4), center (2).
A 3×3 board has 8 symmetries: 4 rotations and 2 reflections (plus their rotated variants). Before storing or looking up a state, every transform is applied via a BoardIndexes permutation array. If any transformed board matches an existing state, they are treated as identical. This reduces ~5,500 reachable states down to ~304 unique matchboxes.
When it's MENACE's turn, it finds the matchbox matching the current board (trying all rotations/reflections), then randomly draws a bead weighted by count. Higher bead count = higher probability of that move being chosen.
After each game, MENACE walks its learning path — the sequence of matchboxes it visited — and adjusts bead counts:
| Outcome | Adjustment |
|---|---|
| Win | +3 beads |
| Draw | +1 bead |
| Loss | −1 bead (minimum 1) |
Over thousands of games, winning moves accumulate beads while losing moves decay. The system converges toward strong play purely through trial and error.
MENACE is pre-trained against a Perfect Player (minimax backward induction) for 50,000 games during docker build. The trained bead weights are serialized (menace_weights.gob) and baked into the final image. At startup the server loads the weights instantly. Each room clones these cached weights so every session gets a fresh copy of a highly skilled MENACE without repeating the expensive training. If no weights file is present, the server falls back to training at startup.
menace/
├── main.go Entry point — starts HTTP server on :8080
├── cmd/pretrain/main.go CLI tool: train MENACE vs perfect player
├── internal/
│ ├── game/
│ │ ├── types.go CellValue, GameState, GameOutcome enums
│ │ ├── tictactoe.go Board representation, move logic, win detection
│ │ ├── transformer.go 8 symmetry transforms (rotation + reflection)
│ │ ├── enumerations.go BFS game tree generation with symmetry dedup
│ │ ├── menace_game.go Single matchbox: beads, weighted random pick, outcome update
│ │ ├── menace_player.go MENACE AI: state lookup, move selection, learning path
│ │ ├── perfect_game.go Minimax node with backward induction outcomes
│ │ ├── perfect_player.go Perfect AI: always plays optimally
│ │ ├── random_player.go Random move baseline
│ │ ├── player.go Player interface + base implementation
│ │ └── engine.go PlayGame / TrainMenace helpers
│ └── server/
│ ├── server.go HTTP routes, WebSocket handler, middleware chain
│ ├── room.go Room lifecycle, AI move execution, state snapshots
│ └── middleware.go CORS, rate limiting, security headers
├── Dockerfile.server Multi-stage build + pretrain
├── Makefile Build, test, Docker commands
└── go.mod
- Cached AI players —
PerfectPlayerandMenacePlayerare built once at startup inNewRoomManager(). Each room clones from the cached instance rather than re-enumerating the full game tree. - Level-indexed state lookup — Game states are stored in arrays indexed by turn number (0–9). MENACE only scans the level matching the current turn, avoiding irrelevant states.
- Lazy transform iteration — Symmetry transforms are applied one at a time via an iterator pattern, short-circuiting as soon as a match is found.
- Learning path as linked list — Each move appends to a path of
MenaceGamereferences. At game end, a single traversal adjusts every bead along it. - Pre-allocated game reuse — On reset, the next game board is pre-allocated in a background goroutine so resets are instant.
See CONTRIBUTING.md.