
multi-algorithm HPO tuning via HyperparameterTuner.create() #5794

Open
ucegbe wants to merge 2 commits into aws:master from ucegbe:fix/multi-algo-hpo-and-compression-type

Conversation


@ucegbe ucegbe commented Apr 24, 2026

Issue

Multi-algorithm HPO via HyperparameterTuner.create() is broken

When using HyperparameterTuner.create() with a model_trainer_dict containing multiple trainers, calling tuner.tune() fails with:

AttributeError: 'NoneType' object has no attribute 'training_image'

Root cause: _start_tuning_job unconditionally calls _build_training_job_definition, which accesses self.model_trainer. For multi-algo tuners created via HyperparameterTuner.create(), self.model_trainer is None; only self.model_trainer_dict is populated, so the attribute access fails.

Additionally, the SageMaker CreateHyperParameterTuningJob API expects multi-algo jobs to use the TrainingJobDefinitions parameter (a list) rather than TrainingJobDefinition (singular). The existing code only ever passes the singular form.
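To make the API-shape difference concrete, here is an illustrative sketch (plain dicts, not SDK code) of how the singular and plural request parameters differ; the job names and hyperparameter values are made up for illustration:

```python
# Single-algo: one anonymous definition under the singular key.
single_algo_request = {
    "HyperParameterTuningJobName": "my-tuning-job",
    "TrainingJobDefinition": {
        "StaticHyperParameters": {"num_round": "100"},
    },
}

# Multi-algo: a list under the plural key, each entry carrying a
# DefinitionName so the service can tell the algorithms apart.
multi_algo_request = {
    "HyperParameterTuningJobName": "my-tuning-job",
    "TrainingJobDefinitions": [
        {"DefinitionName": "xgboost", "StaticHyperParameters": {"num_round": "100"}},
        {"DefinitionName": "lightgbm", "StaticHyperParameters": {"num_leaves": "31"}},
    ],
}
```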

Changes

All changes are in sagemaker-train/src/sagemaker/train/tuner.py.

_start_tuning_job

  • Added detection of single vs. multi-algo mode by checking whether self.model_trainer is None and self.model_trainer_dict is populated.
  • For single-algo: behavior is unchanged, passes training_job_definition (singular).
  • For multi-algo: calls new _build_training_job_definitions and passes training_job_definitions (plural) in the tuning request. The underlying HyperParameterTuningJob.create in sagemaker.core.resources already accepts both parameters.

_build_training_job_definitions (new method)

  • Iterates over self.model_trainer_dict and builds a list of HyperParameterTrainingJobDefinition objects, one per trainer.
  • Each definition includes:
    • definition_name — the trainer key from the dict
    • tuning_objective — per-trainer objective from objective_metric_name_dict
    • hyper_parameter_ranges — per-trainer ranges from _hyperparameter_ranges_dict
    • static_hyper_parameters — per-trainer static params from static_hyperparameters_dict
    • metric_definitions — per-trainer metrics from metric_definitions_dict
    • Full passthrough of OutputDataConfig (including compression_type), ResourceConfig, StoppingCondition, environment variables, and VPC config from each ModelTrainer.
  • Input data config handling supports all input types: str, dict, list[Channel], list[InputData], and includes internal ModelTrainer channels (code, sm_drivers).
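A minimal sketch of the iteration described above, using a fake trainer class in place of ModelTrainer (all names here are illustrative; the real method also carries resource config, stopping condition, input channels, etc.):

```python
from dataclasses import dataclass

@dataclass
class FakeTrainer:
    """Stand-in for ModelTrainer, for illustration only."""
    training_image: str
    static_hyperparameters: dict

def build_training_job_definitions(trainer_dict, objective_dict, ranges_dict):
    """One definition per trainer, keyed by the trainer name."""
    definitions = []
    for name, trainer in trainer_dict.items():
        definitions.append({
            "definition_name": name,  # the trainer key from the dict
            "tuning_objective": objective_dict[name],
            "hyper_parameter_ranges": ranges_dict[name],
            "static_hyper_parameters": trainer.static_hyperparameters,
            "training_image": trainer.training_image,
        })
    return definitions
```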

Testing

Manual validation: Verified that multi-algorithm HPO tuning with HyperparameterTuner.create() using two ModelTrainer instances (XGBoost and LightGBM) successfully launches a tuning job via tuner.tune(). Single-algo tuning behavior is unchanged.

Unit tests: Added tests/unit/train/test_tuner_multi_algo.py (31 tests) covering:

  • TestStartTuningJobBranching (4 tests): Verifies _start_tuning_job routes to _build_training_job_definition (singular) for single-algo and _build_training_job_definitions (plural) for multi-algo, and that the correct key (training_job_definition vs training_job_definitions) is passed in the API request.
  • TestBuildTrainingJobDefinitions (20 tests): Covers the new multi-algo method — one definition per trainer, correct definition_name, per-trainer training images/objectives/HP ranges/static HPs/resource config/stopping condition/role, all input types (string, dict, InputData list, Channel list), internal channel inclusion, deduplication, metric definitions, VPC config, and environment passthrough.
  • TestCompressionTypePassthrough (7 tests): Verifies compression_type (NONE, GZIP) is correctly carried through in both single-algo and multi-algo OutputDataConfig, and that MagicMock values do not leak through the isinstance guard.
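The isinstance guard that the last group of tests exercises can be sketched like this (the helper name is hypothetical, not the actual tuner.py code); a MagicMock auto-generates mock attributes, which are not str instances and so are filtered out:

```python
from unittest.mock import MagicMock

def resolve_compression_type(output_config):
    """Only pass through genuine string values (e.g. "NONE", "GZIP")."""
    compression = getattr(output_config, "compression_type", None)
    return compression if isinstance(compression, str) else None

class PlainConfig:
    compression_type = "GZIP"
```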

All 31 new tests pass. All pre-existing tuner tests continue to pass.

@ucegbe ucegbe changed the title multi-algorithm HPO tuning via HyperparameterTuner.create() and OutputDataConfig compression_type passthrough multi-algorithm HPO tuning via HyperparameterTuner.create() Apr 24, 2026
