Conversation
The fix looks good to me. Thanks for the fix. Let's wait until pytorch/pytorch#149485 is merged.

Hi @shengfukevin, then can this Param PR be merged now?

@shengfukevin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed from 6b86d24 to 113c54c

@TaekyungHeo has updated the pull request. You must reimport the pull request before landing.
Force-pushed from 113c54c to 21efb25

@TaekyungHeo has updated the pull request. You must reimport the pull request before landing.
Force-pushed from 21efb25 to 83e653f

@TaekyungHeo has updated the pull request. You must reimport the pull request before landing.
@shengfukevin please help review. Thanks!

Will get it done this week.

@shengfukevin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary
Fix support for replaying all2all.
Depends on PyTorch pytorch/pytorch#149485.
Test Plan
Constructed a 4-rank case that invokes torch.distributed.all_to_all() and torch.distributed.all_to_all_single(), then dumped the trace and replayed it.
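
To make the test plan concrete, below is a minimal sketch of what such a 4-rank case could look like: each rank issues both collectives under PyTorch's ExecutionTraceObserver so a per-rank trace can be dumped and later replayed. This is not the exact script used for this PR; the output file names, tensor shapes, and the NCCL/4-GPU launch setup are assumptions, and the replay step (feeding the dumped traces to the replay tool) is not shown.

```python
# Minimal sketch, not the exact test used in this PR. Assumes 4 GPUs, the NCCL
# backend, and a launch such as `torchrun --nproc_per_node=4 all2all_case.py`.
# The ExecutionTraceObserver API is from recent PyTorch releases and may differ
# across versions; file names and tensor shapes are illustrative only.
import os

import torch
import torch.distributed as dist
from torch.profiler import ExecutionTraceObserver


def main() -> None:
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])  # expected to be 4 here
    torch.cuda.set_device(rank)
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
    device = torch.device("cuda", rank)

    # Record a per-rank execution trace that the replay tooling can consume.
    et = ExecutionTraceObserver()
    et.register_callback(f"et_rank{rank}.json")  # hypothetical output path
    et.start()

    # all_to_all: a list of per-peer input tensors and per-peer output tensors.
    inputs = [torch.full((4,), float(rank), device=device) for _ in range(world_size)]
    outputs = [torch.empty(4, device=device) for _ in range(world_size)]
    dist.all_to_all(outputs, inputs)

    # all_to_all_single: one flattened input/output tensor split evenly across peers.
    inp = torch.arange(4 * world_size, dtype=torch.float32, device=device)
    out = torch.empty_like(inp)
    dist.all_to_all_single(out, inp)

    torch.cuda.synchronize()
    et.stop()
    et.unregister_callback()
    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Each rank then produces a dumped trace (e.g. et_rank0.json in this sketch) that can be handed to the replay tooling; the exact replay command depends on the tools in this repository and is not reproduced here.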