feat: Add train() and optimize() methods to TrainJobTemplate #347

Open
sujalshah-bit wants to merge 1 commit into kubeflow:main from sujalshah-bit:add_train_optimize_method
@sujalshah-bit

What this PR does / why we need it:
TrainJobTemplate was a passive data container with no way to actually execute jobs. This change lets the template act as an entrypoint to both TrainerClient and OptimizerClient.

  • train() delegates to TrainerClient.train() using the template's pre-configured runtime, initializer, and trainer
  • optimize() delegates to OptimizerClient.optimize() passing self as the trial_template for hyperparameter tuning
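
The delegation described in the two bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `StubTrainerClient`, `StubOptimizerClient`, and `TemplateSketch` are hypothetical stand-ins, and the keyword names passed to the clients are assumptions based on the wording of this description.

```python
from dataclasses import dataclass
from typing import Any, Optional


class StubTrainerClient:
    """Hypothetical stand-in for TrainerClient; records the arguments it receives."""

    def train(self, runtime=None, initializer=None, trainer=None, options=None) -> str:
        self.last_call = {
            "runtime": runtime,
            "initializer": initializer,
            "trainer": trainer,
            "options": options,
        }
        return "trainjob-demo"  # placeholder job name


class StubOptimizerClient:
    """Hypothetical stand-in for OptimizerClient."""

    def optimize(self, trial_template, **kwargs) -> str:
        self.last_template = trial_template
        self.last_kwargs = kwargs
        return "optjob-demo"  # placeholder job name


@dataclass
class TemplateSketch:
    """Simplified template holding the three pre-configured pieces."""

    runtime: Any = None
    initializer: Any = None
    trainer: Any = None

    def train(self, client, options: Optional[list] = None) -> str:
        # Forward the template's own configuration to the injected client.
        return client.train(
            runtime=self.runtime,
            initializer=self.initializer,
            trainer=self.trainer,
            options=options,
        )

    def optimize(self, client, **kwargs) -> str:
        # The template passes itself as the trial template; any tuning
        # arguments (search space, objectives, ...) are forwarded untouched.
        return client.optimize(trial_template=self, **kwargs)
```

Because the client is injected rather than constructed inside the template, the template stays a plain data object and can be exercised with a stub like the ones above.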

TYPE_CHECKING is used to avoid circular imports since TrainerClient and OptimizerClient both already import types.py.
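
For readers unfamiliar with the pattern, here is a minimal sketch of how a `TYPE_CHECKING` guard keeps an import out of the runtime import graph. The module path `hypothetical_sdk.trainer_client` and the class `TemplateSketch` are illustrative assumptions, not the SDK's real layout. The sketch quotes the annotation so that it has no placement requirement; postponed annotation evaluation achieves the same effect for a whole module.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # This branch is seen by static type checkers only. At runtime
    # TYPE_CHECKING is False, so the import never executes and cannot
    # take part in an import cycle. The module path is hypothetical.
    from hypothetical_sdk.trainer_client import TrainerClient


class TemplateSketch:
    def train(self, client: "TrainerClient", options: Optional[list] = None) -> str:
        # The quoted annotation is stored as a string and is not evaluated
        # at definition time, so the name does not need to exist at runtime.
        return client.train(options=options)
```

The trade-off is that the annotated name is unavailable for runtime introspection unless the annotations are resolved explicitly in a context where the real class can be imported.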

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):

Fixes NONE

Checklist:

  • Docs included if any changes are user facing

Copilot AI review requested due to automatic review settings March 3, 2026 01:31
@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kramaranya for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@github-actions

github-actions bot commented Mar 3, 2026

🎉 Welcome to the Kubeflow SDK! 🎉

Thanks for opening your first PR! We're happy to have you as part of our community 🚀

Here's what happens next:

  • If you haven't already, please check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards
  • Our team will review your PR soon! cc @kubeflow/kubeflow-sdk-team

Join the community:

Feel free to ask questions in the comments if you need any help or clarification!
Thanks again for contributing to Kubeflow! 🙏

Copilot AI left a comment

Pull request overview

This PR turns TrainJobTemplate from a passive configuration object into an executable entrypoint by adding convenience methods that delegate to TrainerClient and OptimizerClient.

Changes:

  • Added TrainJobTemplate.train() to submit a TrainJob via an injected TrainerClient.
  • Added TrainJobTemplate.optimize() to submit an OptimizationJob via an injected OptimizerClient (using the template as trial_template).
  • Introduced postponed annotation evaluation (from __future__ import annotations) plus TYPE_CHECKING imports to avoid runtime circular imports.


  # Change it to list: BUILTIN_CONFIGS, once we support more Builtin Trainer configs.
- TORCH_TUNE = BuiltinTrainer.__annotations__["config"].__name__.lower().replace("config", "")
+ TORCH_TUNE = TORCH_TUNE = TorchTuneConfig.__name__.lower().replace("config", "")
Copilot AI Mar 3, 2026

The constant assignment TORCH_TUNE = TORCH_TUNE = ... is redundant and looks accidental; simplify to a single assignment to avoid confusion and potential lint issues.

Suggested change
- TORCH_TUNE = TORCH_TUNE = TorchTuneConfig.__name__.lower().replace("config", "")
+ TORCH_TUNE = TorchTuneConfig.__name__.lower().replace("config", "")
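
As a quick check that the suggestion is behavior-preserving, the sketch below compares the chained and single assignments. `TorchTuneConfig` here is an empty stand-in class defined only for the demonstration; the expression depends on the class name alone.

```python
class TorchTuneConfig:
    """Empty stand-in; only the class name matters for this expression."""


# Chained form as it appears in the diff: the same target is bound twice.
TORCH_TUNE = TORCH_TUNE = TorchTuneConfig.__name__.lower().replace("config", "")
chained_result = TORCH_TUNE

# Single-assignment form from the suggested change.
TORCH_TUNE = TorchTuneConfig.__name__.lower().replace("config", "")
single_result = TORCH_TUNE

# Both forms yield the same value, so the second target is pure noise.
assert chained_result == single_result == "torchtune"
```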

Copilot uses AI. Check for mistakes.
Comment on lines +528 to +533
    def train(
        self,
        client: TrainerClient,
        options: list | None = None,
    ) -> str:
        """Create a TrainJob using this template's configuration.
Copilot AI Mar 3, 2026

The TODO comment above TrainJobTemplate about adding train()/optimize() is now outdated after introducing these methods; remove or update it to avoid misleading future readers.

TrainJobTemplate was a passive data container with no way to
actually execute jobs. This change lets the template act as
an entrypoint to both TrainerClient and OptimizerClient.

- train() delegates to TrainerClient.train() using the
  template's pre-configured runtime, initializer, and trainer
- optimize() delegates to OptimizerClient.optimize() passing
  self as the trial_template for hyperparameter tuning

TYPE_CHECKING is used to avoid circular imports since
TrainerClient and OptimizerClient both already import types.py.

Signed-off-by: Sujal Shah <sujalshah28092004@gmail.com>
@sujalshah-bit sujalshah-bit force-pushed the add_train_optimize_method branch from e32e9e4 to c13ea6b Compare March 3, 2026 01:36

2 participants