
FSDP2 example code for tutorial #1343

Merged
weifengpy merged 4 commits into pytorch:main from weifengpy:main
May 9, 2025

@weifengpy
Contributor

Run FSDP2 on a transformer model:

torchrun --nproc_per_node 2 train.py
  • On the first run, it creates a "checkpoints" folder and saves state dicts there
  • On the second run, it loads from the previous checkpoints
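
The first-run/second-run behavior described above can be sketched roughly as follows. This is only an illustration of the save-or-load flow, not the code in this PR: JSON stands in for the model and optimizer state dicts, and the helper name `load_or_init_state` and the file name `state.json` are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_or_init_state(checkpoint_dir: Path, init_state: dict):
    """Return (state, loaded_from_disk).

    Assumption: JSON stands in for real state dicts, and "state.json"
    is a made-up file name used only for this sketch.
    """
    state_file = checkpoint_dir / "state.json"
    if state_file.exists():
        # Later runs: resume from what a previous run saved.
        return json.loads(state_file.read_text()), True
    # First run: create the folder and save the initial state there.
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(init_state))
    return init_state, False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = Path(tmp) / "checkpoints"
        print(load_or_init_state(ckpt, {"step": 0}))   # first run: saves
        print(load_or_init_state(ckpt, {"step": 99}))  # second run: loads
```

In the sketch, the second call ignores its `init_state` argument and returns what the first call saved, which mirrors the "loads from the previous checkpoints" behavior.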

To enable explicit prefetching

torchrun --nproc_per_node 2 train.py --explicit-prefetch

To enable mixed precision

torchrun --nproc_per_node 2 train.py --mixed-precision

To showcase the DCP API

torchrun --nproc_per_node 2 train.py --dcp-api
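
For readers skimming the commands above, here is a minimal sketch of how a script could expose these three switches. The flag names come from the commands in this description; treating them as boolean `store_true` options, the help strings, and the `build_parser` helper are assumptions, and the actual `train.py` may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Assumption: each flag is a simple on/off switch that defaults to off.
    parser = argparse.ArgumentParser(description="FSDP2 example (sketch)")
    parser.add_argument("--explicit-prefetch", action="store_true",
                        help="enable explicit prefetching")
    parser.add_argument("--mixed-precision", action="store_true",
                        help="enable mixed precision")
    parser.add_argument("--dcp-api", action="store_true",
                        help="showcase the DCP API")
    return parser


if __name__ == "__main__":
    # argparse maps "--mixed-precision" to the attribute "mixed_precision".
    args = build_parser().parse_args(["--mixed-precision"])
    print(args.explicit_prefetch, args.mixed_precision, args.dcp_api)
    # prints: False True False
```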

weifengpy added 2 commits May 8, 2025 16:40
@netlify

netlify Bot commented May 8, 2025

Deploy Preview for pytorch-examples-preview canceled.

Name Link
🔨 Latest commit d281dcd
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-examples-preview/deploys/681d4749abc5e40008eda968

@weifengpy weifengpy requested review from mori360 and wconstab May 8, 2025 23:54
Comment thread distributed/FSDP2/README.md Outdated
torchrun --nproc_per_node 2 train.py --mixed-precision
```

To showcse DCP API

typo

Comment thread distributed/FSDP2/README.md Outdated
cd distributed/FSDP2
torchrun --nproc_per_node 2 train.py
```
* For 1st time, it creates a "checkpoints" folder and save state dicts there

save -> saves

@weifengpy weifengpy merged commit 7092296 into pytorch:main May 9, 2025
8 checks passed

3 participants