Configure overlap free memory threshold #21

Open
TianyeGGBond wants to merge 1 commit into rlops:zhenyu/miles-mvp-e2e from TianyeGGBond:tianye/f2-configure-free-memory

@TianyeGGBond
Collaborator

Context

The overlap wait checks nvidia-smi free memory before allowing the train actor to resume CUDA allocations. The previous threshold was fixed at 20 GiB, which is not appropriate for every GPU shape or deployment.
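For readers unfamiliar with the check, here is a minimal sketch of how a free-memory gate based on nvidia-smi could look. It is not the code in rlix/pipeline/miles_pipeline.py: the function names are hypothetical, and requiring every visible GPU to meet the threshold is an assumption made for illustration.

```python
import subprocess

MIB_PER_GIB = 1024


def parse_free_mib(output: str) -> list[int]:
    """Parse one free-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def query_free_gib() -> list[float]:
    """Ask nvidia-smi for the free memory of each visible GPU, in GiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [mib / MIB_PER_GIB for mib in parse_free_mib(result.stdout)]


def enough_free_memory(free_gib: list[float], threshold_gib: float) -> bool:
    """Assumption: every GPU must have at least threshold_gib free.

    An empty reading is treated as "not enough" so the caller keeps waiting.
    """
    return bool(free_gib) and all(g >= threshold_gib for g in free_gib)
```

A wait loop would call `query_free_gib()` periodically and let the train actor proceed once `enough_free_memory(...)` returns True.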

Change

  • read the free-memory threshold from MILES_MIN_FREE_GPU_MEM_GB
  • keep the existing default at 20.0 GiB to preserve current behavior when the env var is unset
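The two points above can be sketched as follows. This is an illustrative reading of the change, not the diff itself: the helper name `min_free_gpu_mem_gib` is hypothetical, and treating an empty value as unset and rejecting negative or non-finite values are assumptions that the PR description does not state.

```python
import math
import os

ENV_VAR = "MILES_MIN_FREE_GPU_MEM_GB"
DEFAULT_MIN_FREE_GIB = 20.0  # previous fixed threshold, kept as the default


def min_free_gpu_mem_gib(environ=None) -> float:
    """Return the free-memory threshold, falling back to the default when unset."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    if raw is None or not raw.strip():
        # Assumption: an empty value behaves like an unset variable.
        return DEFAULT_MIN_FREE_GIB
    value = float(raw)  # raises ValueError on malformed input
    if not math.isfinite(value) or value < 0:
        # Assumption: negative or non-finite thresholds are rejected.
        raise ValueError(f"{ENV_VAR} must be a non-negative number, got {raw!r}")
    return value
```

With this shape, a deployment on smaller GPUs could set, for example, `MILES_MIN_FREE_GPU_MEM_GB=8` before launching, while existing setups that do not set the variable keep the 20.0 threshold.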

Validation

  • python -m py_compile rlix/pipeline/miles_pipeline.py
