Skip to content

MLflowTracker.log drops log_kwargs; tracker.log requires step despite docstring #4041

@1fanwang

Description

@1fanwang

System Info

accelerate: 1.14.0.dev0 (commit 29e03d185d6c4608472b3b866964c8942c2fa4a3)
mlflow: 3.12.0
python: 3.11.14
platform: macOS-26.3-arm64

Information

  • My own modified scripts

Tasks

  • My own task or dataset (give details below)

Reproduction

Two bugs in src/accelerate/tracking.py, both observable from the public API.

Setup:

import tempfile
from accelerate import Accelerator
from accelerate.tracking import MLflowTracker

tmp = tempfile.TemporaryDirectory()
tracker = MLflowTracker(experiment_name="repro", logging_dir=tmp.name)
accelerator = Accelerator(log_with=tracker)
accelerator.init_trackers(project_name="repro")

Bug 1 — tracker.log() requires step even though the docstring says it's optional

>>> tracker.log({"loss": 0.1})
Traceback (most recent call last):
  ...
TypeError: MLflowTracker.log() missing 1 required positional argument: 'step'

Looking at src/accelerate/tracking.py at upstream/main:

161  def log(self, values: dict, step: Optional[int], **kwargs):        # GeneralTracker
272  def log_images(self, values: dict, step: Optional[int], **kwargs): # TensorBoardTracker
640  def log(self, values: dict, step: Optional[int], **kwargs):        # AimTracker
816  def log(self, values: dict, step: Optional[int]):                  # MLflowTracker (no default, and no **kwargs)

The docstring on each of these four says step is optional:

step (`int`, *optional*):
    The run step. If included, the log will be affiliated with this step.

WandB / CometML / ClearML / DVCLive / TrackioTracker all declare step: Optional[int] = None. The four above are out of sync with both their own docstrings and their sibling trackers in the same file.

Bug 2 — MLflowTracker.log swallows per-tracker kwargs from Accelerator.log(log_kwargs=...)

Accelerator.log(values, log_kwargs={"mlflow": {...}}) is documented to forward provider-specific kwargs to each tracker via tracker.log(values, step=step, **log_kwargs.get(tracker.name, {})) (src/accelerate/accelerator.py:3381). MLflowTracker.log does not accept **kwargs, so:

>>> accelerator.log({"loss": 0.1}, step=1, log_kwargs={"mlflow": {"synchronous": True}})
Traceback (most recent call last):
  ...
TypeError: MLflowTracker.log() got an unexpected keyword argument 'synchronous'

WandB / CometML / etc. all accept and forward **kwargs.

Regression test (failing on upstream/main, passing with the fix)

#4039 adds two regression tests under MLflowTrackingTest. Captured against upstream/main (pre-fix):

$ pytest tests/test_tracking.py::MLflowTrackingTest::test_log_without_step -v --tb=short
tests/test_tracking.py::MLflowTrackingTest::test_log_without_step FAILED
=================================== FAILURES ===================================
___________________ MLflowTrackingTest.test_log_without_step ___________________
tests/test_tracking.py:331: in test_log_without_step
    tracker.log({"loss": 0.1})
src/accelerate/tracking.py:89: in execute_on_main_process
    return PartialState().on_main_process(function)(self, *args, **kwargs)
E   TypeError: MLflowTracker.log() missing 1 required positional argument: 'step'
========================= 1 failed, 1 warning in 3.95s =========================

$ pytest tests/test_tracking.py::MLflowTrackingTest::test_log_accepts_extra_kwargs -v --tb=short
tests/test_tracking.py::MLflowTrackingTest::test_log_accepts_extra_kwargs FAILED
=================================== FAILURES ===================================
_______________ MLflowTrackingTest.test_log_accepts_extra_kwargs _______________
tests/test_tracking.py:346: in test_log_accepts_extra_kwargs
    accelerator.log({"loss": 0.1}, step=1, log_kwargs={"mlflow": {"synchronous": True}})
src/accelerate/accelerator.py:3381: in log
    tracker.log(values, step=step, **log_kwargs.get(tracker.name, {}))
src/accelerate/tracking.py:89: in execute_on_main_process
    return PartialState().on_main_process(function)(self, *args, **kwargs)
E   TypeError: MLflowTracker.log() got an unexpected keyword argument 'synchronous'
========================= 1 failed, 1 warning in 3.08s =========================

With #4039 applied: both PASS.

Expected behavior

  • tracker.log({"loss": 0.1}) (and the same for GeneralTracker.log, TensorBoardTracker.log_images, AimTracker.log) should succeed with step defaulting to None, matching the docstring.
  • accelerator.log({"loss": 0.1}, step=1, log_kwargs={"mlflow": {"synchronous": True}}) should forward synchronous to mlflow.log_metrics rather than raising.

Fix in #4039.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions