Skip to content

wesmar/Chess

Repository files navigation

♔ Modern Chess Engine

Chess Engine Screenshot

A lightweight, educational chess engine written in modern C++20 with pure WinAPI interface. Perfect for learning how chess engines work without diving through hundreds of thousands of lines of code.

Windows C++ License Build

Why This Project?

Most chess engines are either too simple (lacking modern techniques) or too complex (thousands of files, external dependencies). This project aims to be educational yet powerful - implementing professional chess engine techniques in clean, readable code.

This is a case study in building a high-performance chess system using Data-Oriented Design (DOD) principles, maximizing computational throughput while drastically reducing binary size and eliminating all external library dependencies.

Key Features

🎯 Extreme Optimization

  • No CRT version: ~ 200 KB
  • With CRT version: ~ 500 KB
  • Zero external dependencies - no Qt, SDL, or .NET
  • Portable: Even runs in Windows Recovery Environment (WinRE)!

🧠 Advanced Search Algorithms

The engine employs a comprehensive suite of search techniques:

Core Search Framework

  • Minimax with Alpha-Beta Pruning - efficient move tree search
  • Principal Variation Search (PVS) - optimizes alpha-beta by assuming first move is best
  • Iterative Deepening - finds best move within time limit
  • Aspiration Windows - narrows search window based on previous iteration score for faster cutoffs
  • Quiescence Search - eliminates horizon effect by searching tactical sequences

Pruning Techniques

  • Null Move Pruning (NMP) - skips branches where opponent can't improve even with two consecutive moves; disabled in endgames to avoid zugzwang errors
  • ProbCut - prunes branches where shallow tactical search at higher beta already exceeds the expected bound; uses captures-only search with SEE filter
  • Late Move Reduction (LMR) - reduces depth for quiet moves late in move list using logarithmic formula log(depth) × log(moveIndex) / 2.25, with re-search on alpha improvement
  • Late Move Pruning (LMP) - completely skips late quiet moves at shallow depths
  • Futility Pruning - skips quiet moves at shallow depths when static eval + margin ≤ alpha
  • Reverse Futility Pruning (RFP) - returns early when static eval − margin ≥ beta (static null move pruning)
  • Delta Pruning - in quiescence search, skips captures that can't possibly raise alpha
  • Mate Distance Pruning - stops searching for longer mates when shorter one already found

Search Extensions

  • Check Extensions - extends search depth when side to move is in check
  • Singular Extensions (SE) - extends the TT move when a reduced search with lowered beta confirms it is the only good move in the position; uses excludedMove parameter to prevent recursive SE

Position Caching

  • Zobrist Hashing - lightning-fast incremental position comparison using XOR-sum
  • Transposition Table - avoids re-analyzing positions with striped locking for thread safety; separate ProbeSE() method for Singular Extension reference scores
  • Internal Iterative Deepening (IID) - searches shallowly to find a good TT move when none is available, improving move ordering at cost-effective depth
  • Threefold Repetition Detection - recognizes draw by repetition during search

Move Ordering Heuristics

  • TT Move - transposition table best move searched first
  • SEE (Static Exchange Evaluation) - evaluates capture sequences; winning captures (SEE ≥ 0) ordered before killers, losing captures (SEE < 0) ordered last
  • MVV-LVA (Most Valuable Victim - Least Valuable Attacker) - breaks ties among captures of equal SEE
  • Killer Move Heuristic - remembers quiet moves that caused beta cutoffs at each ply (two slots per ply)
  • Countermove Heuristic - remembers the quiet move that refuted the previous move; provides a third ordering hint after killers
  • History Heuristic - side-specific scoring for quiet moves that historically caused cutoffs, with aging (values halved at each iterative deepening iteration to prioritize recent data)
  • Center Control Bonus - tactical bonus for moves targeting central squares

Parallel Search

  • Root-Parallel Search - multi-threaded search with dynamic load balancing and shared alpha
  • Thread-Local Heuristics - separate killer move and countermove tables per thread to prevent data races

Opening Book

  • Hardcoded Opening Lines - Zobrist-indexed book with random move selection for variety

⚡ High-Performance Memory Architecture

Stack-Allocated Move Generation

  • MoveList Structure - fixed 256-move array on stack, eliminating millions of heap allocations per search
  • Zero malloc() in hot path - entire move generation and search uses stack memory only

Efficient Piece Tracking

  • PieceList per Color - O(1) piece iteration without scanning 64 squares
  • Swap-with-last Removal - O(1) piece list updates during make/unmake

Incremental Updates

  • Incremental Zobrist Key - XOR updates instead of full recomputation
  • Incremental MG/EG Score Maintenance - both middlegame and endgame scores (material + PST) updated on each move, eliminating per-evaluation recomputation
  • Bitboard Occupancy Tracking - hybrid architecture for fast sliding piece queries
  • Pawn Bitboards per Color - m_pawnMasks[2] updated incrementally, enabling O(1) open-file and pawn shield queries in evaluation

Hybrid Board Representation

  • 64-byte Mailbox (Cache-Aligned) - entire board state in single L1 cache line
  • Bitboards for Occupancy - fast sliding piece move generation without expensive lookups
  • ~800KB savings compared to Magic Bitboards approach

📊 Hand-Crafted Evaluation (HCE)

A tuned evaluation function with tapered scoring:

Material & Positional

  • Piece-Square Tables (PST) - separate middlegame and endgame tables for every piece type (pawn, knight, bishop, rook, queen; king already had separate MG/EG tables)
  • Tapered Evaluation - smooth interpolation based on game phase: score = (mgScore × phase + egScore × (256 − phase)) / 256; phase computed from remaining material (Q=4, R=2, B/N=1, max=24 → maps to 0–256)
  • Incremental EG Score - m_egScore maintained alongside m_incrementalScore (MG), restored via MoveRecord on undo — zero extra work per evaluation call
  • Bishop Pair Bonus - +40 centipawns for having both bishops

King Safety (Middlegame)

  • Castling Position Bonus - rewards castled king placement
  • Pawn Shield Analysis - evaluates pawns protecting the king using pawn bitboards
  • Open File Penalty - penalizes missing pawns near king (O(1) via pawn bitboard file mask)

Piece Activity

  • Mobility Evaluation - counts accessible squares for sliding pieces using bitboard occupancy
  • Center Control Bonus - rewards knights and bishops on central squares
  • Exposed Queen Penalty - penalizes queen on attacked square in middlegame
  • Piece Tropism - bonus for pieces close to the enemy king

Pawn Structure

  • Isolated Pawn Penalty - pawns with no friendly pawns on adjacent files
  • Doubled Pawn Penalty - multiple pawns on same file
  • Backward Pawn Penalty - pawns that cannot safely advance

Passed Pawns

  • Advancement Bonus - exponentially increasing value as pawn advances
  • King Distance Factor - bonus when friendly king is close, penalty when enemy king is close
  • Rook Behind Passed Pawn - bonus for rook supporting from behind

Tempo

  • Side to Move Bonus - small advantage for having the move (+8 centipawns)

🤖 NNUE Infrastructure (Prepared)

The engine includes complete NNUE (Efficiently Updatable Neural Network) infrastructure:

  • HybridEvaluator - seamless switching between Classical/NNUE/Auto modes
  • NeuralEvaluator - inference engine with incremental accumulator updates
  • FeatureExtractor - HalfKP-style feature extraction (king-piece relationships)
  • WeightLoader - binary weight file parser (.nnue format)
  • DenseLayer - optimized matrix operations with SIMD support
  • Transformer Architecture - attention-based feature processing

Currently runs in Classical mode. NNUE activation requires trained weight file (nn-small.nnue).

🏗️ Architecture & Design Choices

Memory Architecture: Cache-Locality First

I deliberately chose a 64-byte "Mailbox" array representation over pure bitboards, aligned to L1 cache lines. This decision minimizes cache misses during iterative game tree traversal - the most critical performance bottleneck in chess engines.

Data-Oriented Design

  • 1-byte piece structures (uint8_t) with bitmask encoding
  • Eliminates polymorphism overhead (no v-tables)
  • Deterministic memory layout for predictable performance
  • SIMD-friendly data structures

🛠 Optimization Philosophy (Data-Oriented Design)

This project proves that modern C++20 can be as low-level as C while maintaining safety:

  • Cache Locality: The Board class is aligned to 64 bytes (L1 Cache Line size). Fetching a board state fetches the entire board into the CPU cache in a single cycle
  • Data Packing: Piece class is a 1-byte wrapper around uint8_t with no v-table overhead
  • Stack Allocation: Heavy preference for stack memory over heap allocation to reduce fragmentation and allocation costs
  • Small Footprint: The entire engine logic compiles to <200 KB, making it suitable for embedded environments or recovery tools (WinRE)

🎨 User Interface

  • Unicode Rendering - clean chess pieces (♔♕♖♗♘♙) using native WinAPI vector rendering
  • Full Chess Rules - castling, en passant, pawn promotion
  • Move History - algebraic notation with full game record
  • Undo/Redo - complete move history with board states
  • PGN Support - save and load games in standard format
  • Multiple Game Modes - Human vs Human, Human vs AI, AI vs AI
  • 10 Difficulty Levels - from beginner (random moves) to expert (deep search)
  • Configurable Threading - 1-64 threads for parallel search

Quick Start

Download

Grab the latest release from Releases: Password: github.com

  • Chess_x64.exe - 64-bit version (~500 KB)
  • Chess_x64_minSize.exe - 64-bit minimal (~200 KB)
  • Chess_x86.exe - 32-bit version (~450 KB)
  • Chess_x86_minSize.exe - 32-bit minimal (~160 KB)

Just download and run - no installation needed!

Building from Source

Requirements:

  • Windows 10 or later
  • Visual Studio 2022 or newer with C++20 support (v143+ toolset)
  • Tested with Visual Studio 2026 (v145 toolset)

Build Steps:

# Clone the repository
git clone https://github.com/wesmar/Chess.git
cd Chess

# Build all configurations using build_all.bat
# Uses /BREPRO flag for reproducible builds
build_all.bat

# Binaries will be in the 'bin' folder

Manual Build:

# Open Developer Command Prompt for VS 2022+
# Navigate to project directory

# Build 64-bit release
msbuild Chess.vcxproj /p:Configuration=Release /p:Platform=x64

# Build 64-bit minimal size
msbuild Chess.vcxproj /p:Configuration=Release_MinSize /p:Platform=x64

Project Structure

Chess/
├── Engine/                    # Chess engine core
│   ├── Board.cpp/h           # Board representation and move execution
│   ├── ChessConstants.h      # Core constants and enums
│   ├── Evaluation.cpp/h      # Position evaluation with tapered PST tables
│   ├── Move.cpp/h            # Move representation (32-bit packed)
│   ├── MoveGenerator.cpp/h   # Legal move generation
│   ├── OpeningBook.cpp/h     # Hardcoded opening book
│   ├── Piece.h               # Piece representation (8-bit packed)
│   ├── TranspositionTable.cpp/h # Hash table for position caching
│   └── Zobrist.cpp/h         # Zobrist hashing for positions
├── Engine/Neural/             # NNUE evaluation system
│   ├── HybridEvaluator.h     # Classical/NNUE mode switching
│   ├── NeuralEvaluator.h     # NNUE inference engine
│   ├── FeatureExtractor.h    # HalfKP feature extraction
│   ├── FeatureAccumulator.h  # Incremental accumulator updates
│   ├── DenseLayer.h          # Matrix operations
│   ├── WeightLoader.h        # .nnue file parser
│   └── Transformer.h         # Attention architecture
├── UI/                        # User interface
│   ├── Dialogs/
│   │   ├── GameSettingsDialog.cpp/h  # Tabbed settings dialog
│   │   └── PromotionDialog.cpp/h     # Pawn promotion selector
│   ├── ChessGame.cpp/h       # Game state management and AI
│   ├── main.cpp              # Application entry point
│   ├── VectorRenderer.cpp/h  # Board rendering with Unicode pieces
│   ├── WinApp.cpp/h          # Main window and event handling
│   └── WinUtility.cpp/h      # String conversion utilities
├── UCI/                       # UCI engine implementation
│   ├── UCIMain.cpp           # Console entry point
│   └── UCIEngine.h           # UCI protocol handler
├── Resources/                 # Icons and resource files
│   ├── Icons/app.ico
│   ├── Chess.rc              # Resource definitions
│   └── Resource.h            # Resource ID constants
├── build_all.bat             # Build script for all configurations
├── Chess.vcxproj             # Main GUI application project
├── ChessEngineUCI.vcxproj    # UCI console engine project
└── Chess.slnx                # Visual Studio solution

Learning the Code

If you're new to chess programming, here's a suggested reading order:

  1. Start with Board.cpp - This is where all the chess magic happens (piece movement, rule validation)
  2. Then Evaluation.cpp - Learn how the computer "thinks" about positions
  3. Move to MoveGenerator.cpp - See how legal moves are generated
  4. Finally ChessGame.cpp - Understand the AI's decision-making process (alpha-beta search with parallel search)

Key Concepts Explained

Mailbox vs Bitboards: While bitboards are faster for some operations, the mailbox representation (64-element array) provides better cache locality for iterative search algorithms, which dominate chess engine runtime.

Piece-Square Tables (PST): Arrays that assign values to pieces based on their position. Separate MG and EG tables allow the engine to value the same square differently depending on game phase — for example, an advanced pawn near promotion is worth far more in the endgame.

Alpha-Beta Pruning: If you find a move that's already better than the best option your opponent has, you don't need to check other moves in that branch.

Null Move Pruning: If giving the opponent a free move still doesn't help them, the position is so good we can skip detailed analysis of this branch.

ProbCut: If a shallow tactical search (captures only) already exceeds a higher beta threshold, the full-depth search is extremely unlikely to fall below beta — so prune immediately.

Singular Extensions: If the TT move is significantly better than all other moves (verified by a reduced-depth search with lowered beta), extend its search depth by one. Ensures the engine doesn't underestimate forced lines.

Late Move Reduction: Moves that appear worse (ordered late) are searched to a shallower depth initially. If they turn out to be good, we re-search them at full depth.

Transposition Table: Chess positions can be reached through different move orders. The table remembers positions we've already evaluated, with striped locking for thread-safe parallel access.

Root-Parallel Search: Multiple threads search different root moves simultaneously with dynamic work distribution, avoiding the complexity of shared search trees.

Tapered Evaluation: The same position is worth different amounts in the opening vs. endgame. The engine maintains two incremental scores (MG and EG) and interpolates between them based on remaining material.

Settings

The game creates and saves your preferences in settings.ini (created automatically on first run):

[Display]
FlipBoard=0
ShowCoordinates=1
ShowLegalMoves=1
AnimateMoves=1
AnimationSpeed=300

[Game]
GameMode=1          # 0=HvH, 1=HvC, 2=CvC
AIDifficulty=3      # 1-10
Threads=4           # 1-64 (auto-detected from CPU)
AutoPromoteQueen=1

[Colors]
LightSquare=240,240,245
DarkSquare=70,80,100

Keyboard Shortcuts

  • Ctrl+N - New game
  • Ctrl+Z - Undo move
  • Ctrl+Y - Redo move
  • F - Flip board
  • L - Toggle legal moves display
  • C - Toggle coordinates
  • Esc - Clear selection

Technical Details

Why So Small?

  1. No External Libraries - Pure WinAPI, no MFC/ATL/Qt
  2. Static Linking - Everything in one executable
  3. Smart Optimization - Compiler flags tuned for size
  4. Minimal CRT - Optional CRT-free build
  5. Reproducible Builds - Uses /BREPRO flag for deterministic output
  6. Data-Oriented Design - Compact data structures optimized for cache locality

Performance Characteristics

  • Memory Usage: 16-64 MB for transposition table (configurable)
  • Startup Time: Near-instant (<100ms)
  • Binary Size: 150-500 KB depending on configuration

Architecture Decisions

Why 64-byte Mailbox?

  • Aligns perfectly with L1 cache line size (64 bytes on most CPUs)
  • Single cache miss loads entire board state
  • Simple indexing without bitboard manipulation overhead
  • Better for iterative traversal patterns in alpha-beta search

Why 1-byte Pieces?

  • No polymorphism = no v-table overhead
  • Cache-friendly: entire board fits in L1 cache
  • Bit masking for piece properties (type, color, moved flag)
  • Predictable memory layout

Why Hand-Crafted Evaluation?

  • Neural networks require large weights (megabytes)
  • HCE provides good strength with minimal size
  • Easier to understand and modify for learning
  • Fast evaluation critical for search performance

Why Stack-Allocated MoveList?

  • Eliminates millions of malloc/free calls per second during search
  • Fixed 256-move array covers all legal chess positions (max ~218 moves)
  • Predictable memory access patterns for CPU prefetcher

Why Incremental MG/EG Scores?

  • Tapered eval requires both scores on every node
  • Recomputing from scratch each call would be O(pieces) per node
  • Incremental updates keep it O(1) — just save/restore one integer per move

🚀 Roadmap

Completed ✅

  • Null Move Pruning (NMP)
  • ProbCut (tactical captures-only, SEE filtered)
  • Late Move Reduction (LMR) — logarithmic formula with re-search
  • Late Move Pruning (LMP)
  • Futility Pruning
  • Reverse Futility Pruning (RFP / Static Null Move)
  • Delta Pruning in Quiescence
  • Mate Distance Pruning
  • Check Extensions
  • Singular Extensions with excludedMove parameter
  • Internal Iterative Deepening (IID)
  • Aspiration Windows
  • SEE (Static Exchange Evaluation) — move ordering and QS pruning
  • Killer Move Heuristic (two slots per ply)
  • Countermove Heuristic
  • History Heuristic with aging (halving per iteration)
  • Tapered Evaluation — separate MG/EG PST tables for all piece types
  • Incremental MG + EG scores (zero-cost per evaluation call)
  • Pawn bitboards for O(1) open-file and pawn shield queries
  • Cache-aligned memory structures
  • Stack-allocated MoveList
  • Root-parallel search with dynamic load balancing
  • NNUE infrastructure (HybridEvaluator, NeuralEvaluator, etc.)

Future 📋

  • NNUE weight training and integration
  • Syzygy tablebase support
  • MultiPV analysis mode
  • Texel Tuning for PST and eval weights

Challenge

Can you beat difficulty level 3? If you can consistently win at level 3, you're playing at a decent amateur level. 🏆

FAQ

Q: Why WinAPI and not cross-platform?
A: This is an educational project focused on simplicity and extreme optimization. WinAPI provides everything needed without extra abstraction layers. A cross-platform version would require SDL/SFML, adding complexity and size.

Q: Can it beat me?
A: It has 10 difficulty levels ranging from beginner-friendly (random moves) to challenging (10-ply search with full heuristics). Try different levels to find your match!

Q: Can I modify it for my own learning?
A: Absolutely! That's the point. The code is well-commented and structured for learning. Fork it, break it, improve it!

Q: Does it work on Windows 7/8?
A: The code uses Windows 10+ APIs, but could be adapted for older systems with minor changes to the WinAPI calls.

Q: Why not use a UCI protocol?
A: The project includes both! There's a UCI console engine (ChessEngineUCI.exe) that works with Arena and other GUIs, plus the main standalone WinAPI application. The standalone version is the focus since it's more educational and shows the complete implementation, but advanced users can compile the UCI version to use the engine with their favorite chess GUI.

Q: What about NNUE?
A: The complete NNUE infrastructure is implemented and ready. The engine currently runs in Classical evaluation mode. To enable NNUE, place a trained nn-small.nnue weight file in the application directory.

Contributing

Found a bug? Have an improvement idea? Feel free to:

  1. Open an issue describing the problem/suggestion
  2. Fork the repository
  3. Create a pull request

Please keep changes focused and well-documented!

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

This project was created during Christmas 2025/2026 as an educational chess engine demonstrating Data-Oriented Design principles in modern C++. Special thanks to the chess programming community for their excellent resources:

Author

Marek Wesołowski (WESMAR)
📧 marek@wesolowski.eu.org
🌐 kvc.pl


Made with ♟️ in C++20 | AVE!

About

Lightweight chess engine written in modern C++20 with pure WinAPI. No dependencies, runs even in Windows Recovery Environment.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors