Fix possible duplicate sample points selected in each iteration of synthetic tasks, and fix the hang in a rollout round after a certain number of iterations #18
Open
le876 wants to merge 1 commit into Bop2000:main from
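Candidate deduplication and filtering of previously sampled points: after np.concatenate, the candidates now go through _round_points → np.unique → _filter_unique, which merges candidates that share the same coordinates into one and removes points that have already been sampled. This fixes the problem of each round selecting duplicate samples.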
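A minimal sketch of this kind of pipeline, not the code from the PR. The function bodies, the `DECIMALS` precision, and the `dedupe_candidates` wrapper are assumptions made for illustration; only the names `_round_points` and `_filter_unique` and the use of `np.concatenate` and `np.unique` come from the description above.

```python
import numpy as np

DECIMALS = 6  # assumed rounding precision; the PR does not state one


def _round_points(points, decimals=DECIMALS):
    """Round coordinates so float noise cannot split one point into two."""
    return np.round(np.asarray(points, dtype=float), decimals)


def _filter_unique(candidates, sampled, decimals=DECIMALS):
    """Drop candidate rows whose rounded coordinates were already sampled."""
    seen = {tuple(row) for row in _round_points(sampled, decimals)}
    mask = np.array([tuple(row) not in seen for row in candidates], dtype=bool)
    return candidates[mask]


def dedupe_candidates(batches, sampled):
    """Hypothetical wrapper: concatenate, round, unique, then filter."""
    merged = np.concatenate(batches, axis=0)
    rounded = _round_points(merged)
    unique = np.unique(rounded, axis=0)  # one row per distinct coordinate
    return _filter_unique(unique, sampled)
```

Rounding before `np.unique` is what lets two candidates that differ only by floating-point noise collapse into a single row; the precision to use depends on the scale of the search space.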
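Eliminating UCT cycles in the selection phase: the original OptTask included both (coordinates, predicted value) in hashing and equality checks, so numerical noise could cause the same coordinates to be treated as different nodes, which presumably produced directed cycles such as A → B → C → A. Node hash / eq now depend only on the coordinates, and the selection path maintains seen_tups; as soon as an already-seen coordinate is encountered, selection falls back to the parent node, so the search structure always remains a tree.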
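The idea can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `Node` is a hypothetical stand-in for OptTask, and `link`, `uct`, and `select_path` are invented helpers. Only the coordinate-only hash / eq and the `seen_tups` set are taken from the description.

```python
import math


class Node:
    """Hypothetical stand-in for OptTask: identity depends on coordinates only."""

    def __init__(self, coords, value=0.0, decimals=6):
        self.tup = tuple(round(float(c), decimals) for c in coords)
        self.value = value  # predicted value; deliberately NOT part of identity
        self.visits = 0
        self.parent = None
        self.children = []

    def __hash__(self):
        return hash(self.tup)

    def __eq__(self, other):
        return isinstance(other, Node) and self.tup == other.tup


def link(parent, child):
    """Attach child to parent (hypothetical helper)."""
    child.parent = parent
    parent.children.append(child)


def uct(node, c=1.0):
    """Standard UCT score; unvisited nodes are explored first."""
    if node.visits == 0:
        return math.inf
    parent_visits = max(node.parent.visits, 1)
    return node.value + c * math.sqrt(math.log(parent_visits) / node.visits)


def select_path(root):
    """Descend by UCT, stopping at the parent if a coordinate repeats."""
    path, seen_tups = [root], {root.tup}
    node = root
    while node.children:
        best = max(node.children, key=uct)
        if best.tup in seen_tups:
            break  # would close a cycle: stay at the parent
        seen_tups.add(best.tup)
        path.append(best)
        node = best
    return path
```

With identity tied to coordinates alone, two nodes whose predicted values differ only by noise compare equal, and the `seen_tups` check guarantees that selection terminates even if the stored links already contain a cycle.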
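Child-node generation deduplicates within the current round only: the original logic carried all historical children into the next round, so every child generation was compared against the entire history of state nodes. Now _local_seen / _initial_visited are updated per round, data_process and most_visit_node both filter out old nodes, and duplicates are checked only within the current round. In addition, _record_node uses a single, consistent rounding so that floating-point noise does not produce duplicates.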
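A rough sketch of per-round deduplication with consistent rounding. The `RoundDeduper` class, its `start_round` method, and the `decimals` default are hypothetical; the names `_local_seen` and `_record_node` are borrowed from the description, but their real signatures and behavior in the PR may differ.

```python
class RoundDeduper:
    """Hypothetical helper: tracks coordinates seen in the current round only."""

    def __init__(self, decimals=6):
        self.decimals = decimals  # assumed precision, shared by every lookup
        self._local_seen = set()

    def _key(self, coords):
        """Single rounding rule used for both recording and checking."""
        return tuple(round(float(c), self.decimals) for c in coords)

    def start_round(self, initial_points=()):
        """Discard the previous round's state and seed this round's set."""
        self._local_seen = {self._key(p) for p in initial_points}

    def _record_node(self, coords):
        """Return True and record coords if they are new in this round."""
        key = self._key(coords)
        if key in self._local_seen:
            return False
        self._local_seen.add(key)
        return True
```

Resetting the set at the start of each round keeps the membership check proportional to the current round's nodes rather than the full history, which is the cost the original logic is described as paying.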