
Add transfer inference control-CFG and per-hint defaults#16

Open
trungtpham wants to merge 3 commits into NVIDIA:main from trungtpham:feature/transfer-control-guidance


@trungtpham

Summary

  • Add control-CFG (control_guidance, optional interval) to sampling args and OmniMoTModel.generate_samples_from_batch, including a no-control branch for multi-vision transfer samples.
  • Wire transfer inference in OmniInference: dedicated batch path (batch size 1), skip generic get_sample_data when transfer hints are set, pass control-CFG through generate_transfer_sample.
  • Extend sample args with transfer hint fields on SampleData (edge / blur / depth / seg / wsm), add _TRANSFER_DEFAULTS (guidance, control_guidance, shift; for WSM, 101 frames at 10 fps), and support control-only specs that set control_path without vision_path.
  • Load JSON prompts and negative_prompt_file only for transfer specs.
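The control-CFG item above can be illustrated with a minimal sketch. It assumes the usual classifier-free-guidance extrapolation applied between a prediction made with the control signal and one made on the no-control branch; the PR text does not state the exact formula, so the blend, the function name `control_cfg`, and the `timestep` / `interval` semantics below are all assumptions rather than the behavior of `OmniMoTModel.generate_samples_from_batch`.

```python
from typing import List, Optional, Sequence, Tuple


def control_cfg(
    pred_with_control: Sequence[float],
    pred_no_control: Sequence[float],
    control_guidance: float,
    timestep: Optional[float] = None,
    interval: Optional[Tuple[float, float]] = None,
) -> List[float]:
    """Blend two denoiser predictions with a CFG-style extrapolation.

    Assumed formula (not confirmed by the PR):
        out = no_control + control_guidance * (with_control - no_control)
    """
    # Assumption: outside the optional interval, guidance is switched off and
    # the plain controlled prediction is used.
    if interval is not None and timestep is not None:
        low, high = interval
        if not (low <= timestep <= high):
            return list(pred_with_control)

    # A scale of 1.0 reduces the formula to the controlled prediction,
    # so the no-control branch is not needed.
    if control_guidance == 1.0:
        return list(pred_with_control)

    return [
        nc + control_guidance * (wc - nc)
        for wc, nc in zip(pred_with_control, pred_no_control)
    ]
```

Under this reading, `control_guidance == 1.0` needs no second forward pass, which would be consistent with the test plan only exercising the no-control branch when `control_guidance != 1.0`.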

Test plan

  • Run transfer cookbook / torchrun -m cosmos_framework.scripts.inference with specs/edge.json on Cosmos3-Nano (--no-guardrails).
  • Verify control_guidance != 1.0 runs without fallback warning when control + target vision items are present.
  • Smoke each hint type (edge, blur, depth, seg, wsm) with precomputed control_path only.

Wire control_guidance through args, inference routing, and OmniMoTModel;
expose transfer hints on sample args with tuned per-control defaults and
control_path-only specs; load JSON prompts and negative captions for transfer.
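One way per-hint defaults could be resolved is sketched below: start from the defaults for the selected hint and let any value the user set explicitly override them. This is only an illustration. The helper `resolve_transfer_args`, the key names `num_frames` and `fps`, and the numeric values for `edge` are hypothetical; only the WSM figures (101 frames at 10 fps) come from the PR summary, and the real `_TRANSFER_DEFAULTS` table may be structured differently.

```python
from typing import Any, Dict

# Illustrative table only. The edge numbers are made up for the example;
# the WSM frame count and rate are taken from the PR summary, but the key
# names "num_frames" and "fps" are assumptions.
EXAMPLE_DEFAULTS: Dict[str, Dict[str, Any]] = {
    "edge": {"guidance": 3.0, "control_guidance": 2.0, "shift": 5.0},
    "wsm": {"num_frames": 101, "fps": 10},
}


def resolve_transfer_args(
    hint: str,
    user_args: Dict[str, Any],
    defaults: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Merge per-hint defaults with user-supplied values (hypothetical helper)."""
    if hint not in defaults:
        raise ValueError(f"unknown transfer hint: {hint!r}")
    # Start from a copy so the shared defaults table is never mutated.
    resolved = dict(defaults[hint])
    # Treat None as "not set by the user" and keep the default in that case.
    resolved.update({k: v for k, v in user_args.items() if v is not None})
    return resolved


if __name__ == "__main__":
    print(resolve_transfer_args("wsm", {"fps": None, "guidance": 4.0}, EXAMPLE_DEFAULTS))
```

Treating `None` as unset keeps an explicit user value distinguishable from a missing one, so a spec that only provides `control_path` would still pick up every tuned default for its hint.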
@trungtpham force-pushed the feature/transfer-control-guidance branch from 0c7b571 to 24d78b9 on June 3, 2026 21:36
- dummy_sa = sample_args_list[0].model_copy(
-     update={"output_dir": None, "name": "padding", "num_steps": 1, "guidance": 1.0}
- )
+ dummy_sa = sample_args_list[0].model_copy(update={"output_dir": None, "name": "padding"})
Collaborator

@foreverlms Jun 4, 2026


Why is this reverted? The previous change was made to reduce potentially redundant computation.

    config = None
else:
    model_dict = setup_args.load_model_config_dict()
    if setup_args.vlm_processor_from_checkpoint:
Collaborator


Why is this removed? This is for the cosmos-nano-test-to-image series models.

revision="main",
),
),
# Task-specialized Super variants published as diffusers HF checkpoints.
Collaborator


ditto

@lfengad
Collaborator

lfengad commented Jun 4, 2026

@trungtpham It seems we first need to rebase onto the latest main, as @foreverlms noted. Some recently added features currently appear to have been accidentally reverted. Thanks!

3 participants