[FEA]: Make it easier to configure different inference options in nemo_retriever library #1669

@randerzander

Description

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

Significant improvement

Please provide a clear description of problem this feature solves

We should make the no-GPU-required experience simpler.

Describe the feature, and optionally a solution or implementation and any alternatives

Currently the user has to specify all of the build.nvidia.com URLs. It would be much nicer if they could set up the ingestor more simply, with something like:

ingestor = create_ingestor(run_mode="batch", inference="build.nvidia.com")

So inference could take several values:

local - use visible local GPUs (and fail with relevant error details if none are found)

build.nvidia.com - use build.nvidia.com hosted inference

nims - self-hosted NIM services. This probably needs an additional endpoints argument, which could be a path to a YAML file defining the endpoints, or a Python dict of NIM endpoints

openrouter - a future need, once more retriever models are hosted on OpenRouter
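As a rough illustration of the dispatch described above, here is a minimal sketch. Everything in it is an assumption, not the real nemo_retriever API: the `create_ingestor` signature, the `Ingestor` stand-in class, the service names, and the placeholder build.nvidia.com URLs are all hypothetical.

```python
# Hypothetical sketch only: create_ingestor, Ingestor, and the endpoint
# names/URLs below are illustrative, not the real nemo_retriever API.
import shutil
from typing import Dict, Optional, Union


class Ingestor:
    """Minimal stand-in for the library's ingestor object."""

    def __init__(self, run_mode: str, endpoints: Dict[str, str]):
        self.run_mode = run_mode
        self.endpoints = endpoints


# Placeholder mapping for hosted build.nvidia.com services; the real
# service names and URLs would come from the library's defaults.
_BUILD_NVIDIA_DEFAULTS = {
    "embedding": "https://build.nvidia.com/<embedding-endpoint>",  # placeholder
    "reranking": "https://build.nvidia.com/<reranking-endpoint>",  # placeholder
}


def _resolve_endpoints(endpoints: Union[str, Dict[str, str]]) -> Dict[str, str]:
    """Accept either a dict of NIM endpoints or a path to a YAML file."""
    if isinstance(endpoints, dict):
        return endpoints
    import yaml  # assumes PyYAML; only needed for the file-path form
    with open(endpoints) as f:
        return yaml.safe_load(f)


def create_ingestor(run_mode: str,
                    inference: str = "local",
                    endpoints: Optional[Union[str, Dict[str, str]]] = None,
                    ) -> Ingestor:
    if inference == "local":
        # Fail early with actionable detail if no GPU tooling is visible.
        if shutil.which("nvidia-smi") is None:
            raise RuntimeError(
                "inference='local' requires visible GPUs, but nvidia-smi was "
                "not found; try inference='build.nvidia.com' or 'nims'."
            )
        return Ingestor(run_mode, {})  # local pipelines need no remote URLs
    if inference == "build.nvidia.com":
        return Ingestor(run_mode, dict(_BUILD_NVIDIA_DEFAULTS))
    if inference == "nims":
        if endpoints is None:
            raise ValueError("inference='nims' requires the 'endpoints' argument")
        return Ingestor(run_mode, _resolve_endpoints(endpoints))
    raise ValueError(f"unknown inference mode: {inference!r}")
```

With a shape like this, the batch example above would stay a one-liner for hosted inference, while self-hosted users pass either a dict or a YAML path via endpoints.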

Additional context

No response
