[FEA]: nemo_retriever: Automatically have .extract() task handle delegating to .extract_audio #1671

@randerzander

Description

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

Significant improvement

Please provide a clear description of problem this feature solves

To ingest audio and video files, the user must set up the ingestor specifically to handle audio extraction:

ingestor = create_ingestor(run_mode="batch")
ingestor = ingestor.files([str(INPUT_AUDIO)]).extract_audio()

Describe the feature, and optionally a solution or implementation and any alternatives

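Ideally, the user should not have to tell the ingestor to handle audio specially via an explicit .extract_audio() task. A single .extract() call should detect audio and video inputs and delegate them to audio extraction itself: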

ingestor = create_ingestor(run_mode="batch")
# the .extract() call should handle routing any audio and video files in INPUT_DIR
ingestor = ingestor.files([str(INPUT_DIR)]).extract()
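
One way the routing could work, as a rough sketch only: partition the input paths by file suffix before dispatching. Nothing below is nemo_retriever API. `split_by_media_type` and `AUDIO_VIDEO_SUFFIXES` are hypothetical names, and the suffix list is an assumption, since the set of supported audio/video formats is not stated in this issue.

```python
from pathlib import Path

# Hypothetical suffix set (assumption): the formats actually supported
# by audio extraction are not listed in this issue.
AUDIO_VIDEO_SUFFIXES = {".mp3", ".wav", ".mp4", ".mov"}


def split_by_media_type(paths):
    """Partition paths into (audio_video, other) by case-insensitive suffix."""
    audio_video, other = [], []
    for p in map(Path, paths):
        if p.suffix.lower() in AUDIO_VIDEO_SUFFIXES:
            audio_video.append(str(p))
        else:
            other.append(str(p))
    return audio_video, other


# Example: mixed inputs are separated so each group can go to its own task.
media, docs = split_by_media_type(["talk.MP3", "report.pdf", "demo.mp4"])
```

Under this sketch, .extract() would send the first list down the existing .extract_audio() path and the second down the regular document path. How the two result sets are merged afterwards is left open here.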

Additional context

No response
