Use compact representation for rust selfplay by jonbinney · Pull Request #369 · jonbinney/deep_rabbit_hole

jonbinney · 2026-05-17T23:25:47Z

I made this change in preparation for adding parallelism and caching to the rust selfplay. The compact states will save a lot of space. In this PR, I also updated the compact representation to handle the full B9W10 case, which takes 24 bytes instead of just 8 for B5W3. This slows down the generation of the policy database a bit, but it isn't terrible. And the resulting policy database parquet files for B5W2 are actually about the same size, I assume because of compression.

There are a ton of changed lines in this PR for three main reasons:

- the game state type shows up in lots of places, so lots of code gets touched
- some things like board rotation and NN feature generation are now implemented for both grid and compact representations
- more tests

I've tested that B5W2 training still works with this PR, and that the policy database stuff still works. I've also uploaded new policy databases to W&B for B5W2 and B5W1.

ActionSelector, Evaluator, MCTS nodes, and game_runner now operate on (u64 data, &QGameMechanics) instead of cloning GameState. MCTS nodes store their data eagerly at creation, removing the lazy get_or_create_game caching path.

Adds compact_state_to_resnet_input and rotate_compact_state mirroring the existing GameState equivalents, plus get_action_mask / apply_action_index / is_game_over / winner helpers on QGameMechanics. game_runner only materializes a GameState for observer callbacks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
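To illustrate the "store data eagerly at creation" idea, here is a minimal sketch. It is not the code from this PR: `CompactState`, `Mechanics`, `Node`, and the toy transition inside `apply_action_index` are all hypothetical stand-ins. It only shows a node receiving its packed state when it is constructed, so nothing has to be rebuilt or cached lazily later.

```rust
/// Hypothetical stand-in for the packed game state (the commit describes a u64).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CompactState(pub u64);

/// Hypothetical stand-in for the shared, read-only game rules object.
pub struct Mechanics;

impl Mechanics {
    /// Toy transition, purely illustrative: real rules would decode the
    /// state, apply the move or wall placement, and re-encode it.
    pub fn apply_action_index(&self, state: CompactState, action: usize) -> CompactState {
        CompactState(state.0.wrapping_mul(31).wrapping_add(action as u64 + 1))
    }
}

/// A search node that owns its state from the moment it is created.
pub struct Node {
    pub state: CompactState,
    pub visits: u32,
    pub children: Vec<(usize, Node)>,
}

impl Node {
    pub fn new(state: CompactState) -> Self {
        Node { state, visits: 0, children: Vec::new() }
    }

    /// Create one child per action; each child's state is computed here,
    /// eagerly, by borrowing the mechanics instead of cloning a full game.
    pub fn expand(&mut self, mechanics: &Mechanics, actions: &[usize]) {
        for &action in actions {
            let child_state = mechanics.apply_action_index(self.state, action);
            self.children.push((action, Node::new(child_state)));
        }
    }
}
```

Because the state is a small `Copy` value, storing it on every node is cheap, which is presumably what makes the eager approach attractive compared with caching a full grid state.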

QGameMechanics owns goal_rows and does not flip them under rotation, so get_action_mask_immut on rotated data treats walls that block the rotated player's path as legal. Use remap_mask on the original mask instead, and add a test that pins this contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
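The fix above amounts to computing the mask once on the unrotated state and then permuting it, rather than recomputing legality on rotated data. A rough sketch of such a permutation for a 180 degree rotation follows. The function name `remap_mask_180` and the action layout (move cells, then horizontal walls, then vertical walls, each row-major) are assumptions made for illustration, not the repository's actual `remap_mask`.

```rust
/// Permute an action mask for a board rotated by 180 degrees.
///
/// Assumed (hypothetical) layout of `mask`:
///   [ board_size^2 move cells | (board_size-1)^2 horizontal walls | (board_size-1)^2 vertical walls ]
/// with each segment stored row-major.
pub fn remap_mask_180(mask: &[bool], board_size: usize) -> Vec<bool> {
    let moves = board_size * board_size;
    let walls = (board_size - 1) * (board_size - 1);
    assert_eq!(mask.len(), moves + 2 * walls, "mask length does not match layout");

    let mut out = Vec::with_capacity(mask.len());
    // A 180 degree rotation sends cell (r, c) to (n-1-r, n-1-c). In a
    // row-major flat index that is the reversed position within the segment.
    // Wall orientation is unchanged by a half turn, so each wall segment is
    // reversed independently.
    out.extend(mask[..moves].iter().rev().copied());
    out.extend(mask[moves..moves + walls].iter().rev().copied());
    out.extend(mask[moves + walls..].iter().rev().copied());
    out
}
```

Since legality is decided before the permutation, the goal rows never need to be flipped, and applying the permutation twice returns the original mask.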

The single-u64 packed state could not hold a 9x9 board (128 wall bits alone exceed u64). Split the layout so the wall bitmap lives in a u128 and the scalar fields in a u64; every accessor stays within one primitive. Policy DB schema switches to FixedSizeBinary(24) with lex byte ordering; PyO3 functions accept/return state as 24-byte bytes buffers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
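A sketch of what a split layout like this could look like is below. The struct name `PackedState`, its field names, the choice to put the wall bitmap first, and the use of big-endian encoding are all assumptions; the commit only states that walls live in a u128, scalars in a u64, and that the serialized form is 24 bytes with lexicographic byte ordering.

```rust
/// Hypothetical two-part packed state: 16 bytes of wall bits plus 8 bytes
/// of scalar fields (positions, walls remaining, player to move, ...).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PackedState {
    pub walls: u128,
    pub scalars: u64,
}

impl PackedState {
    /// Serialize to 24 bytes. Big-endian is assumed here so that comparing
    /// the byte strings lexicographically agrees with comparing
    /// (walls, scalars) as integers.
    pub fn to_bytes(&self) -> [u8; 24] {
        let mut out = [0u8; 24];
        out[..16].copy_from_slice(&self.walls.to_be_bytes());
        out[16..].copy_from_slice(&self.scalars.to_be_bytes());
        out
    }

    /// Inverse of `to_bytes`.
    pub fn from_bytes(bytes: &[u8; 24]) -> Self {
        let mut w = [0u8; 16];
        let mut s = [0u8; 8];
        w.copy_from_slice(&bytes[..16]);
        s.copy_from_slice(&bytes[16..]);
        PackedState {
            walls: u128::from_be_bytes(w),
            scalars: u64::from_be_bytes(s),
        }
    }

    /// Test one wall bit. `index` must be below 128; the access touches
    /// only the u128, never the scalar word.
    pub fn has_wall(&self, index: u32) -> bool {
        (self.walls >> index) & 1 == 1
    }

    /// Set one wall bit. `index` must be below 128.
    pub fn set_wall(&mut self, index: u32) {
        self.walls |= 1u128 << index;
    }
}
```

Keeping each accessor inside a single primitive avoids bit fields that straddle the u128/u64 boundary, and a fixed 24-byte key maps directly onto a fixed-size binary column.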

alejandromarcu

Can't really review the code but what you're doing LGTM

jonbinney and others added 6 commits May 17, 2026 19:21

Fix numba version to <= 0.61

e20c3b2

On 0.65 I'm getting compilation errors for our code.

Add onnx packages to requirements.txt

99ac34d

Needed for rust self-play code.

Fix rust formatting

51d7198

jonbinney marked this pull request as ready for review May 17, 2026 23:48

jonbinney assigned alejandromarcu May 17, 2026

alejandromarcu approved these changes May 18, 2026

jonbinney merged commit d5fcb3a into main May 18, 2026
5 checks passed

jonbinney merged 6 commits into main from jdb/rust-selfplay-use-compact-repr
