Add Auto-FL literature events and local compute budget search by holgerroth · Pull Request #4523 · NVIDIA/NVFlare

holgerroth · 2026-05-05T23:22:19Z

Summary

add Auto-FL literature-review event tracking and progress plotting updates
simplify devcontainer/git setup around local experiment branches and commits
add local-compute search support with --local_train_steps alongside epoch-based training
refresh the Auto-FL README, program guidance, and bundled agent skill/runbook

Validation

python3 -m py_compile research/auto-fl-research/client.py research/auto-fl-research/job.py research/auto-fl-research/scripts/append_result.py research/auto-fl-research/scripts/validate_contract.py
black --check research/auto-fl-research/client.py research/auto-fl-research/job.py
isort --check-only --profile black research/auto-fl-research/client.py research/auto-fl-research/job.py
flake8 research/auto-fl-research
git diff --check
make validate from research/auto-fl-research
make smoke from research/auto-fl-research passed static checks and skipped runtime launch because the host Python has incompatible NVFlare API paths installed

Notes

Runtime experiment artifacts such as local results.tsv are not included in this PR.

holgerroth · 2026-05-05T23:29:20Z

/build

greptile-apps · 2026-05-05T23:34:56Z

Greptile Summary

This PR extends the Auto-FL research harness with three related capabilities: literature-review event tracking (new log_literature_review.py, plateau_watchdog.py, and plotting support), a step-based local compute budget (--local_train_steps) alongside the existing epoch-based mode, and tightened experiment provenance guards in the shell scripts.

Literature-review loop: log_literature_review.py records timed review cycles into results.tsv as status=literature rows; plateau_watchdog.py counts scored-candidate runs since the last material improvement or literature reset and emits a recommendation=literature/continue signal; plot_progress.py renders vertical event markers and annotation labels for these rows and separates literature runtime from candidate runtime in titles and the summary box.
Step-based training: --local_train_steps > 0 replaces the epoch loop with a step-counted iterator that recycles the data loader, calls scheduler.step() per optimizer step, and computes TensorBoard scalars relative to global step rather than global epoch; job.py, mutation_schema.yaml, and run_iteration.sh are updated consistently.
Provenance guards: init_run.sh and run_iteration.sh now hard-fail (exit 2) when run outside a git clone or outside an autoresearch/* branch, instead of silently degrading.
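The step-based mode described above can be pictured with a minimal sketch. It is not the code in client.py: `recycle`, `run_local_steps`, and the `train_step` callback are hypothetical names, and the real loop also handles the scheduler, scaffold, and TensorBoard bookkeeping. The sketch only shows how a fixed step budget can be served by restarting a finite data loader, and how an empty loader is rejected instead of looping forever.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def recycle(loader: Iterable[T]) -> Iterator[T]:
    """Yield batches indefinitely, restarting the loader each time it is exhausted."""
    while True:
        empty = True
        for batch in loader:
            empty = False
            yield batch
        if empty:
            # Guard against an empty loader, which would otherwise spin forever.
            raise ValueError("data loader produced no batches")


def run_local_steps(loader: Iterable[T], local_train_steps: int,
                    train_step: Callable[[T], None]) -> int:
    """Run exactly `local_train_steps` steps and return the number performed."""
    if local_train_steps <= 0:
        raise ValueError("local_train_steps must be > 0 for step-based training")
    steps_done = 0
    for batch in islice(recycle(loader), local_train_steps):
        # Hypothetical callback: forward, backward, optimizer step, scheduler step.
        train_step(batch)
        steps_done += 1
    return steps_done
```

With a three-batch loader and a budget of seven steps, the loader is traversed twice in full and once partially; the returned count is the kind of value a client could report as its number of steps for the round.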

Confidence Score: 5/5

Safe to merge — all changed code paths are well-bounded, and no training or scoring logic is altered in a way that would corrupt existing results.

The step-based training path is properly isolated behind local_train_steps > 0, with matching validation in both job.py and client.py, correct NUM_STEPS_CURRENT_ROUND bookkeeping, and consistent scheduler and scaffold handling. The new literature-event scripts are append-only helpers that do not touch training outcomes. The only non-trivial concern is a cosmetic ambiguity in the repeated_terms diagnostic printed by plateau_watchdog.py, which does not affect the recommendation output the automation consumes.

No files require special attention beyond a close read of the plateau_watchdog.py diagnostic output logic.
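For readers unfamiliar with the plateau check, here is a rough sketch of the counting idea the summary attributes to plateau_watchdog.py. It is an illustration under stated assumptions, not the script itself: the function name, the `patience` and `min_delta` parameters, the "higher score is better" convention, and the `candidate` status value are all hypothetical. Only the `literature` status and the `literature`/`continue` recommendation values come from the summary.

```python
from typing import Iterable, Mapping, Optional, Tuple


def plateau_recommendation(rows: Iterable[Mapping[str, str]],
                           patience: int = 5,
                           min_delta: float = 0.001) -> Tuple[str, int]:
    """Count scored runs since the last material improvement or literature reset."""
    best: Optional[float] = None
    since = 0
    for row in rows:
        if row.get("status") == "literature":
            since = 0  # a literature review resets the plateau counter
            continue
        raw = row.get("score")
        if raw in (None, ""):
            continue  # unscored rows do not count toward the plateau
        score = float(raw)
        if best is None or score > best + min_delta:
            best = score  # material improvement (assumes higher is better)
            since = 0
        else:
            since += 1
    return ("literature" if since >= patience else "continue"), since
```

The function returns both the recommendation and the raw counter, so a caller can log how close the search is to the threshold.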

Important Files Changed

Filename	Overview
research/auto-fl-research/client.py	Adds --local_train_steps for step-based training alongside epoch mode; correctly guards against empty data loaders, updates scheduler T_max, steps scheduler per optimizer step inside the new branch, and adjusts TensorBoard scalars and NUM_STEPS_CURRENT_ROUND.
research/auto-fl-research/job.py	Mirrors the --local_train_steps argument and passes it through to client.py; validation guards match client.py.
research/auto-fl-research/scripts/plateau_watchdog.py	New script detecting search plateaus to recommend literature review; the repeated-term deduplication heuristic has an edge case where equal-count wildcard and specific entries may both appear in output, but this only affects informational diagnostics and not the recommendation field.
research/auto-fl-research/scripts/log_literature_review.py	New script for start/finish/log literature-review timing events; timer state persisted via a tmp-dir JSON keyed on the results parent directory hash; correctly handles missing timer files.
research/auto-fl-research/scripts/plot_progress.py	Adds literature-event vertical markers and annotation labels to the progress plot; select_literature_labels correctly caps at max_labels; separates candidate and literature runtimes in the title and summary box.
research/auto-fl-research/scripts/append_result.py	Adds --init-only mode and makes most fields optional, with post-parse validation that requires all non-score fields for non-init invocations and requires --score unless --status=literature.
research/auto-fl-research/scripts/validate_contract.py	Updates the evaluate-branch contract check from ParamsType.DIFF to FLModel with metrics to match the new metrics-only send in the eval path; new contains_metrics_flmodel helper correctly uses the optional-predicate overload of call_has_keyword.
research/auto-fl-research/scripts/finalize_batch_status.py	Fixes --last N semantics to select the last N candidate rows instead of the last N ledger rows; adds literature to the allowed status set.
research/auto-fl-research/scripts/run_iteration.sh	Adds an autoresearch/* branch guard before launching experiments with result logging, and bumps the default timeout from 600s to 1200s to match the updated budget in mutation_schema.yaml.
research/auto-fl-research/mutation_schema.yaml	Adds local_train_steps to mutable args and budget defaults; relaxes aggregation_epochs from the fixed_within_campaign list now that it is a local-compute knob alongside local_train_steps; doubles run_timeout_seconds to 1200.
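The `--last N` fix noted for finalize_batch_status.py can be illustrated with a small sketch. The helper name and the non-literature status values below are hypothetical; the point is only the difference between slicing the last N ledger rows and selecting the last N candidate rows once `literature` rows are interleaved in the ledger.

```python
from typing import List, Mapping, Sequence


def last_n_candidate_indices(rows: Sequence[Mapping[str, str]], n: int) -> List[int]:
    """Return ledger indices of the last `n` rows that are not literature events."""
    candidate_idx = [i for i, row in enumerate(rows)
                     if row.get("status") != "literature"]
    # Guard n <= 0 explicitly: candidate_idx[-0:] would return the whole list.
    return candidate_idx[-n:] if n > 0 else []


ledger = [
    {"status": "keep"},        # hypothetical candidate status
    {"status": "literature"},
    {"status": "discard"},     # hypothetical candidate status
    {"status": "literature"},
]
selected = last_n_candidate_indices(ledger, 2)
```

Here a plain slice of the last two ledger rows would pick indices 2 and 3, one of which is a literature event, while the candidate-aware selection picks indices 0 and 2.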

Reviews (7): Last reviewed commit: "Merge branch 'main' into codex/auto-fl-l..."

holgerroth · 2026-05-06T00:17:09Z

/build

ZiyueXu77

nice to have literature and plateau watch

holgerroth · 2026-05-06T16:29:38Z

/build

holgerroth · 2026-05-06T16:56:13Z

/build

holgerroth · 2026-05-06T19:24:23Z

/build

holgerroth · 2026-05-06T21:29:07Z

/build

holgerroth · 2026-05-06T21:52:09Z

/build

holgerroth added 13 commits May 4, 2026 13:02

Track Auto-FL literature review events

7c1c3a8

Ignore local devcontainer config

697a152

Fix Auto-FL cross-site eval metrics

7fc8080

Ensure Auto-FL ledger exists before runs

952a6b5

Clarify Auto-FL devcontainer setup entrypoint

916b341

Update Auto-FL experiment README defaults

6ce373e

Use devcontainer template setup before Auto-FL runs

75ccd4f

Clarify Auto-FL devcontainer GPU run args

ec8ff08

Start Auto-FL devcontainer from harness directory

6c5d91f

Run Auto-FL devcontainer from NVFlare root

d0e57bf

Simplify Auto-FL devcontainer git guidance

25d2951

Add Auto-FL local compute budget search

4faf7f9

update figure

d0585d4

holgerroth marked this pull request as ready for review May 5, 2026 23:29

Merge branch 'main' into codex/auto-fl-literature-events

cca1734

holgerroth requested review from ZiyueXu77 and pcnudde May 5, 2026 23:29

greptile-apps Bot reviewed May 5, 2026

Comment thread research/auto-fl-research/scripts/plot_progress.py

Cap Auto-FL literature plot labels

43a6543

holgerroth enabled auto-merge (squash) May 6, 2026 00:17

holgerroth disabled auto-merge May 6, 2026 13:56

Add Auto-FL plateau literature watchdog

a2fd35a

holgerroth enabled auto-merge (squash) May 6, 2026 16:18

Merge branch 'main' into codex/auto-fl-literature-events

555b733

ZiyueXu77 approved these changes May 6, 2026

Merge branch 'main' into codex/auto-fl-literature-events

6eb6f7a

Merge branch 'main' into codex/auto-fl-literature-events

02b0581

Merge branch 'main' into codex/auto-fl-literature-events

28344d2

holgerroth merged commit e275c37 into NVIDIA:main May 6, 2026
24 of 25 checks passed

holgerroth mentioned this pull request May 7, 2026

[2.8] Add Auto-FL literature events and local compute budget search #4542

Draft

holgerroth merged 20 commits into NVIDIA:main from holgerroth:codex/auto-fl-literature-events

2 participants
