Use compact representation for rust selfplay #369
Merged
On 0.65 I'm getting compilation errors for our code.
Needed for rust self-play code.
ActionSelector, Evaluator, MCTS nodes, and game_runner now operate on (u64 data, &QGameMechanics) instead of cloning GameState. MCTS nodes store their data eagerly at creation, removing the lazy get_or_create_game caching path. Adds compact_state_to_resnet_input and rotate_compact_state mirroring the existing GameState equivalents, plus get_action_mask / apply_action_index / is_game_over / winner helpers on QGameMechanics. game_runner only materializes a GameState for observer callbacks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
QGameMechanics owns goal_rows and does not flip them under rotation, so get_action_mask_immut on rotated data treats walls that block the rotated player's path as legal. Use remap_mask on the original mask instead, and add a test that pins this contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
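The idea in this commit (compute the mask on the unrotated state, then permute it) can be sketched as follows. This is only an illustration under stated assumptions: `remap_mask_sketch` and `rotate_index` are hypothetical names, the mask is assumed to cover only the cells of a square board, and rotation is assumed to be a 180° turn. The real `remap_mask` also has to handle wall actions and may have a different signature.

```rust
/// Hypothetical helper: maps a cell index to its position after a 180° board rotation.
pub fn rotate_index(index: usize, board_size: usize) -> usize {
    board_size * board_size - 1 - index
}

/// Hypothetical stand-in for `remap_mask`: legality is decided on the ORIGINAL
/// state, and only the positions of the mask entries are permuted.
pub fn remap_mask_sketch(mask: &[bool], board_size: usize) -> Vec<bool> {
    assert_eq!(mask.len(), board_size * board_size);
    let mut rotated = vec![false; mask.len()];
    for (i, &legal) in mask.iter().enumerate() {
        rotated[rotate_index(i, board_size)] = legal;
    }
    rotated
}
```

Because legality is never recomputed on rotated data, the result does not depend on whether the goal rows were flipped along with the board.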
The single-u64 packed state could not hold a 9x9 board (128 wall bits alone exceed u64). Split the layout so the wall bitmap lives in a u128 and the scalar fields in a u64; every accessor stays within one primitive. Policy DB schema switches to FixedSizeBinary(24) with lex byte ordering; PyO3 functions accept/return state as 24-byte bytes buffers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
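A minimal sketch of such a split layout, for illustration only. The type name `CompactState`, the field order (walls first), and the big-endian encoding are assumptions; the commit only states that the wall bitmap lives in a `u128`, the scalar fields in a `u64`, and that the serialized form is 24 bytes with lexicographic byte ordering.

```rust
/// Hypothetical 24-byte compact state: 128-bit wall bitmap plus 64 bits of scalar fields.
/// Deriving `Ord` compares `walls` first, then `scalars`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct CompactState {
    pub walls: u128,
    pub scalars: u64,
}

impl CompactState {
    /// Serializes to 24 bytes. Big-endian is assumed here so that comparing
    /// the byte strings lexicographically matches comparing (walls, scalars).
    pub fn to_bytes(self) -> [u8; 24] {
        let mut out = [0u8; 24];
        out[..16].copy_from_slice(&self.walls.to_be_bytes());
        out[16..].copy_from_slice(&self.scalars.to_be_bytes());
        out
    }

    /// Inverse of `to_bytes`.
    pub fn from_bytes(bytes: [u8; 24]) -> Self {
        let mut walls = [0u8; 16];
        let mut scalars = [0u8; 8];
        walls.copy_from_slice(&bytes[..16]);
        scalars.copy_from_slice(&bytes[16..]);
        Self {
            walls: u128::from_be_bytes(walls),
            scalars: u64::from_be_bytes(scalars),
        }
    }

    /// Tests one wall bit; `index` must be below 128.
    pub fn has_wall(self, index: u32) -> bool {
        debug_assert!(index < 128);
        (self.walls >> index) & 1 == 1
    }

    /// Returns a copy with one wall bit set; `index` must be below 128.
    pub fn with_wall(self, index: u32) -> Self {
        debug_assert!(index < 128);
        Self { walls: self.walls | (1u128 << index), ..self }
    }
}
```

Each accessor touches only one primitive, and the 16 + 8 = 24 byte encoding lines up with a `FixedSizeBinary(24)` column.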
alejandromarcu
approved these changes
May 18, 2026
alejandromarcu
left a comment
Collaborator
Can't really review the code but what you're doing LGTM
I made this change in preparation for adding parallelism and caching to the rust selfplay. The compact states will save a lot of space. In this PR, I also updated the compact representation to handle the full B9W10 case, which takes 24 bytes instead of just 8 for B5W3. This slows down the generation of the policy database a bit, but it isn't terrible. And the resulting policy database parquet files for B5W2 are actually about the same size, I assume because of compression.
There are a ton of changed lines in this PR for three main reasons:
I've tested that B5W2 training still works with this PR, and that the policy database stuff still works. I've also uploaded new policy databases to W&B for B5W2 and B5W1.