
DDP for FineTuning #812

Open
psinger-prior wants to merge 10 commits into main from psi/finetuning_v1

Conversation

@psinger-prior
Contributor

@psinger-prior psinger-prior commented Mar 9, 2026

Issue

Closing #809

Motivation and Context


Public API Changes

  • No Public API changes
  • Yes, Public API changes (Details below)

How Has This Been Tested?

Ran the example scripts on single-GPU and multi-GPU nodes.


Checklist

  • The changes have been tested locally.
  • Documentation has been updated (if the public API or usage changes).
  • A changelog entry has been added (see changelog/README.md), or "no changelog needed" label requested.
  • The code follows the project's style guidelines.
  • I have considered the impact of these changes on the public API.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces Distributed Data Parallel (DDP) support for fine-tuning, which is a great enhancement for multi-GPU training. The implementation is thorough, covering distributed sampling, metric synchronization, and efficient model state saving. I've found one area for improvement regarding optimizer initialization in the DDP setup for better consistency and adherence to best practices.
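
For readers unfamiliar with two of the pieces named above, here is a minimal pure-Python sketch of how distributed sampling and metric synchronization typically work. It is not the code from this pull request: `shard_indices` and `synced_mean` are hypothetical helper names. The sharding scheme (pad by wrapping around, then take every `world_size`-th index starting at `rank`) mirrors the unshuffled behaviour of PyTorch's `DistributedSampler`, and `synced_mean` only imitates what a SUM all-reduce over per-rank totals would compute.

```python
import math


def shard_indices(dataset_len: int, world_size: int, rank: int) -> list[int]:
    """Hypothetical helper: indices one rank processes in an epoch (no shuffling)."""
    # Every rank must see the same number of samples, so pad up to a
    # multiple of world_size by wrapping around to the start of the dataset.
    per_rank = math.ceil(dataset_len / world_size)
    total = per_rank * world_size
    indices = list(range(dataset_len))
    indices = (indices * math.ceil(total / dataset_len))[:total]
    # Interleaved split: rank r takes positions r, r + world_size, ...
    return indices[rank:total:world_size]


def synced_mean(per_rank_sums: list[float], per_rank_counts: list[int]) -> float:
    """Hypothetical helper: global mean from per-rank (sum, count) pairs.

    Summing the lists stands in for a SUM all-reduce across processes.
    """
    return sum(per_rank_sums) / sum(per_rank_counts)


if __name__ == "__main__":
    for rank in range(4):
        print(rank, shard_indices(10, 4, rank))
    print(synced_mean([6.0, 2.0], [3, 1]))
```

Reducing sums and counts separately, rather than averaging per-rank means, keeps the result correct when ranks hold different numbers of samples. Note that the wrap-around padding means a few samples can be counted twice in an epoch.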

@psinger-prior psinger-prior marked this pull request as ready for review March 9, 2026 11:42
@psinger-prior psinger-prior requested a review from a team as a code owner March 9, 2026 11:42
@psinger-prior psinger-prior requested review from alanprior and removed request for a team March 9, 2026 11:42
