Draft
…search Remove tqdm progress bars, loguru logging, gc.collect calls, EMA, tensorboard, and checkpoint saving from the training loop. Output is now just epoch summaries plus a structured results block for agent parsing.
Replace brute-force bridge detection (deepcopy + is_connected per edge) with nx.bridges() which runs a single O(V+E) DFS. Also fixes a bug for disconnected molecules (e.g. 1esz, which has a 106-atom component + 1 isolated atom). The old code checked nx.is_connected(G2) after removing each edge, but if the graph was already disconnected, *every* edge removal produced a disconnected G2, so the code never hit `continue`. Then the smallest connected component was always the pre-existing isolated atom (size 1), so every edge was filtered by the `len(l) < 2` guard, returning [] even though the large component had 37 valid bridge torsions. nx.bridges() correctly identifies bridge edges within each connected component regardless of the overall graph connectivity. Verified equivalent output on 500 PDBbind ligands (499/500 match; the 1 difference is the bug fix above).
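The failure mode described above can be reproduced on a toy graph. The following is a minimal sketch, not the code from this PR: the function names, the demo graph, and the way the `len(l) < 2` guard is re-applied on top of `nx.bridges()` are all assumptions made for illustration. Only `nx.bridges`, `nx.is_connected`, `nx.connected_components`, and `nx.node_connected_component` are real networkx calls.

```python
import networkx as nx


def bridges_brute_force(G):
    """Sketch of the old approach as described in the commit message (hypothetical code)."""
    found = []
    for u, v in G.edges():
        G2 = G.copy()
        G2.remove_edge(u, v)
        if nx.is_connected(G2):
            continue  # removing this edge did not split the graph, so it is not a bridge
        # On an already disconnected graph this is always the pre-existing fragment.
        smallest = min(nx.connected_components(G2), key=len)
        if len(smallest) < 2:
            continue  # guard that drops single-atom fragments
        found.append(tuple(sorted((u, v))))
    return sorted(found)


def bridges_fast(G):
    """Sketch using nx.bridges(); the side-size filter is an assumed equivalent of the guard."""
    found = []
    for u, v in nx.bridges(G):
        G2 = G.copy()  # copy per bridge is for clarity only, not for speed
        G2.remove_edge(u, v)
        side_u = nx.node_connected_component(G2, u)
        side_v = nx.node_connected_component(G2, v)
        if min(len(side_u), len(side_v)) < 2:
            continue  # one side is a single atom, so there is nothing to rotate
        found.append(tuple(sorted((u, v))))
    return sorted(found)


def build_demo_graph(with_isolated_atom):
    """Two triangles joined by edge (2, 3), a terminal atom 6, optionally an isolated atom 7."""
    G = nx.Graph()
    G.add_edges_from([(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5), (5, 6)])
    if with_isolated_atom:
        G.add_node(7)
    return G


if __name__ == "__main__":
    for flag in (False, True):
        G = build_demo_graph(flag)
        print(flag, bridges_brute_force(G), bridges_fast(G))
```

On the connected graph both functions return `[(2, 3)]`. Once the isolated atom is added, the brute-force sketch returns `[]` because the smallest component is always the isolated atom, while the `nx.bridges()` sketch still returns `[(2, 3)]`.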
Add --matching flag (none/original/improved) to select the ground-truth conformer at training time. The cache now stores both DE-matched and L-BFGS-B-matched conformer positions plus the crystal Mol object. Epoch-end inference uses fresh RDKit conformers for realistic evaluation and reports RMSD against both the matched conformer and the crystal pose. Fix RDKit 2025 EmbedMolecule crash via RemoveStereochemistry, use fused_tp CUDA kernels for the torsion tensor product, and add diagnostic print statements for cache build failures.
When EMA is disabled (the default), the training loop was still allocating shadow params, running per-batch EMA updates, and doing 5 full parameter copies per epoch (store/copy_to/deepcopy/restore/state_dict). This caused visible GPU utilization dips at epoch boundaries. Now the EMA object is simply not created when disabled, making the non-EMA path zero-cost.
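The "do not create the object at all" pattern can be sketched in a few lines. This is a framework-free toy, assuming parameters are plain floats and using hypothetical names (`EMA`, `train`, `use_ema`, `ema_decay`); it is not the training loop from this PR, only an illustration of why the disabled path then costs nothing.

```python
class EMA:
    """Toy exponential moving average over a list of float parameters (hypothetical)."""

    def __init__(self, params, decay):
        self.decay = decay
        self.shadow = list(params)  # one extra copy of every parameter

    def update(self, params):
        d = self.decay
        self.shadow = [d * s + (1.0 - d) * p for s, p in zip(self.shadow, params)]


def train(params, grads_per_step, lr=0.1, use_ema=False, ema_decay=0.9):
    """Plain gradient steps; the EMA object exists only when use_ema is True."""
    ema = EMA(params, ema_decay) if use_ema else None  # no shadow copy when disabled
    for grads in grads_per_step:
        params = [p - lr * g for p, g in zip(params, grads)]
        if ema is not None:
            ema.update(params)  # per-step cost is paid only when enabled
    # Evaluate with the averaged weights if they exist, otherwise with the raw ones.
    eval_params = ema.shadow if ema is not None else params
    return params, eval_params, ema


if __name__ == "__main__":
    print(train([1.0], [[1.0]], use_ema=False))
    print(train([1.0], [[1.0]], use_ema=True)[1])
```

With the `None` check, the disabled path performs no shadow allocation, no per-batch update, and no copy or restore around evaluation.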
Validation inference now supports generating multiple diffusion samples per complex and reporting the best RMSD. Defaults to 1 (existing behavior).
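Best-of-N reporting can be sketched as follows. This is an illustrative stand-in using only the standard library: `sample_fn`, `best_of_n`, and `n_samples` are hypothetical names, the noisy sampler replaces the actual diffusion model, and the RMSD here is a plain coordinate RMSD with no alignment or symmetry handling.

```python
import math
import random


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) tuples."""
    sq = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(sq / len(a))


def best_of_n(sample_fn, reference, n_samples=1):
    """Draw n_samples poses and return (best RMSD, list of all RMSDs)."""
    rmsds = [rmsd(sample_fn(), reference) for _ in range(n_samples)]
    return min(rmsds), rmsds


if __name__ == "__main__":
    rng = random.Random(0)
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]

    def noisy_pose():
        # Stand-in for one sampled pose: the reference plus Gaussian noise.
        return [tuple(c + rng.gauss(0.0, 0.5) for c in atom) for atom in reference]

    best, all_rmsds = best_of_n(noisy_pose, reference, n_samples=5)
    print(f"best RMSD over {len(all_rmsds)} samples: {best:.3f}")
```

With `n_samples=1` the best RMSD is simply the RMSD of the single sample, which matches the existing default behavior described above.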