
Map overlap GPUs through infer placement #19

Open
TianyeGGBond wants to merge 1 commit into rlops:zhenyu/miles-mvp-e2e from TianyeGGBond:tianye/f2-map-overlap-gpus

TianyeGGBond (Collaborator) commented May 29, 2026

Context

_wait_for_overlap_engines_offloaded receives physical GPU IDs from the scheduler, while the rollout manager expects local engine indices within the pipeline's inference placement. The previous conversion assumed the inference GPUs were contiguous and derived engine indices by subtracting the first GPU ID, which yields wrong indices when the placement is not contiguous.
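To illustrate the failure mode, here is a minimal sketch comparing the two conversions on a non-contiguous placement. The placement list, the per-engine GPU count, and both function names are hypothetical, and the offset formula is only an assumed form of the previous conversion, not a copy of it.

```python
# Hypothetical inference placement: physical GPU IDs in engine order.
# GPUs 2 and 3 belong to something else, so the placement is not contiguous.
infer_gpus = [0, 1, 4, 5]
gpus_per_engine = 2  # stands in for rollout_num_gpus_per_engine


def engine_index_by_offset(gpu_id: int) -> int:
    """Assumed old behaviour: treat the placement as a contiguous range."""
    return (gpu_id - infer_gpus[0]) // gpus_per_engine


def engine_index_by_placement(gpu_id: int) -> int:
    """Look the GPU up by its position in the placement instead."""
    return infer_gpus.index(gpu_id) // gpus_per_engine


# There are only two engines (indices 0 and 1), yet the offset form
# maps GPU 4 to engine 2, which does not exist.
print(engine_index_by_offset(4))     # 2
print(engine_index_by_placement(4))  # 1
```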

Change

  • add _resolve_overlap_infer_engines under the file's Helpers section
  • resolve overlapping train GPUs through the configured actor_infer placement
  • return explicit overlap_engine_indices and overlap_gpu_ids for the wait logic
  • reject inference mappings whose length is not divisible by rollout_num_gpus_per_engine
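The resolution described in the list above could look roughly like the following. This is a sketch under stated assumptions, not the code in this PR: the function name, its signature, and the choice to skip GPUs that are absent from the inference placement are all hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple


def resolve_overlap_infer_engines(
    overlap_gpu_ids: Iterable[int],
    infer_device_mapping: Sequence[int],
    gpus_per_engine: int,
) -> Tuple[List[int], List[int]]:
    """Map physical GPU IDs to local engine indices via the placement.

    Returns (overlap_engine_indices, overlap_gpu_ids), both sorted.
    """
    if gpus_per_engine <= 0:
        raise ValueError("gpus_per_engine must be positive")
    # Reject placements that cannot be split into whole engines.
    if len(infer_device_mapping) % gpus_per_engine != 0:
        raise ValueError(
            f"placement length {len(infer_device_mapping)} is not divisible "
            f"by gpus_per_engine={gpus_per_engine}"
        )

    # Position of each physical GPU inside the inference placement.
    position: Dict[int, int] = {
        gpu: pos for pos, gpu in enumerate(infer_device_mapping)
    }

    engine_indices: Set[int] = set()
    matched_gpus: Set[int] = set()
    for gpu in overlap_gpu_ids:
        pos = position.get(gpu)
        if pos is None:
            # Assumption: GPUs outside the inference placement are ignored.
            continue
        engine_indices.add(pos // gpus_per_engine)
        matched_gpus.add(gpu)

    return sorted(engine_indices), sorted(matched_gpus)


# GPUs 4 and 5 sit at positions 2 and 3, i.e. engine 1; GPU 7 is not placed.
print(resolve_overlap_infer_engines([4, 5, 7], [0, 1, 4, 5], 2))
# ([1], [4, 5])
```

Looking GPUs up by position rather than by arithmetic on their IDs keeps the result correct for any ordering of the placement, at the cost of building one small dictionary per call.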

Validation

  • python -m py_compile rlix/pipeline/miles_pipeline.py
  • git diff --check

TianyeGGBond force-pushed the tianye/f2-map-overlap-gpus branch 2 times, most recently from ea4cdc2 to 541844b, May 29, 2026 01:08
TianyeGGBond force-pushed the tianye/f2-map-overlap-gpus branch from 541844b to 32a5485, May 29, 2026 01:14
