[Audio support] merge into main by alex-t-hu · Pull Request #9 · DDVD233/mirl

alex-t-hu · 2025-09-09T22:46:49Z

merge audio support in. i assume Qwen2_5OmniThinkerForConditionalGeneration is in HF so no need for custom logic; keep some debugging stuff here and there (can remove if needed); there is this video budget change in vision_utils.py , that should be a setting?
in rl_dataset , there's a ton of changes, and i try to respect the new audio support which uses self.modalities . there were some intriguing areas like here where self.apply_chat_template_kwargs gets removed
raw_prompt = self.processor.apply_chat_template(
messages, add_generation_prompt=True, tokenize=False, **self.apply_chat_template_kwargs
)

Keane Fix Audio

DDVD233 · 2025-09-09T23:07:36Z

verl/utils/dataset/rl_dataset.py

            raw_prompt = self.processor.apply_chat_template(
                messages, add_generation_prompt=True, tokenize=False, **self.apply_chat_template_kwargs
            )
            multi_modal_data = {}


I think self.apply_chat_template_kwargs is passed here. Is there a place where it's not passed?

DDVD233 added 30 commits August 11, 2025 14:54

Adapt to Our Datasets (#1)

1303a69

Experimental: Add support for audio training

a824ed5

Debug for audios

30e1e06

Debug for audios

732e70b

Add omni support

3892b05

Add omni support

364713f

Add omni support

954a52d

Add omni support

e6ee543

Use torchaudio

313846a

Debug for audio

fbe5a7f

Debug for audio

704a666

Debug for audio

ee0f78f

Debug for audio

3cb5165

Update prompt

902007c

debug

f976def

debug

95735c8

debug

2fdd078

debug

461e995

debug

4b6ee75

debug

551fde3

debug

c3a0f66

debug

3bae315

debug

2552705

debug

27b0dee

debug

78b97e3

debug

ada904d

debug

b0162af

debug

efbab94

Debug

6985493

Reduce batch size / remove kl

0f94e3b

keanepotato and others added 28 commits August 19, 2025 21:20

_

eaa2f40

_

c2e0c3d

_

55d108e

_

650550e

debug_off

b0e7a9b

_

3a73517

_

7d11247

_

d68f375

_

eca7a4a

_

8ecf257

_

8dfbd2a

_

ee9d04c

push modality sampler

bbfca47

push modality sampler

68a07b3

add debug

620f56d

_

1e10146

add to config

0d5d23f

add features

9ceabdc

touch up dataloader

9e21b46

debug write

564b966

debug write

3c74a53

debug counterfactual

67b9387

debug counterfact

bd9ca3f

_

9465b06

_

772a471

_

6a6f11f

Merge pull request #3 from DDVD233/keane

682673d

Keane Fix Audio

Merge branch 'main' into audio_support

694c2d3

alex-t-hu requested a review from DDVD233 September 9, 2025 22:46

DDVD233 reviewed Sep 9, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Audio support] merge into main#9

[Audio support] merge into main#9
alex-t-hu wants to merge 234 commits intomainfrom
audio_support

alex-t-hu commented Sep 9, 2025

Uh oh!

DDVD233 Sep 9, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

alex-t-hu commented Sep 9, 2025

Uh oh!

DDVD233 Sep 9, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants