[Bugfix][Router] Preserve full backend model metadata in /v1/models #927

Open
yzhan1 wants to merge 2 commits into vllm-project:main from yzhan1:codex/debug-892-fork-updated
Conversation

@yzhan1 (Contributor) commented Apr 21, 2026

Summary

This fixes #892 by preserving extra fields returned by backend /v1/models responses instead of rebuilding a minimal router-only model card.

The change stores unknown backend model fields in ModelInfo.extra_fields and serializes them back out from the router's /v1/models endpoint. That keeps metadata such as max_model_len and permission available to clients.
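A minimal sketch of the idea, not the code from this PR: the dataclass below is a hypothetical stand-in for `ModelInfo`, and its explicit fields (`id`, `object`, `created`, `owned_by`) and the method names `from_backend` / `to_response` are assumptions made for illustration. Only the `extra_fields` name comes from the description above.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict


@dataclass
class ModelInfo:
    """Hypothetical stand-in for the router's ModelInfo (field set assumed)."""

    id: str
    object: str = "model"
    created: int = 0
    owned_by: str = "vllm"
    # Backend keys the router does not model explicitly end up here.
    extra_fields: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_backend(cls, payload: Dict[str, Any]) -> "ModelInfo":
        # Every declared field except the catch-all counts as "known".
        known_names = {f.name for f in fields(cls)} - {"extra_fields"}
        known = {k: v for k, v in payload.items() if k in known_names}
        extra = {k: v for k, v in payload.items() if k not in known_names}
        return cls(extra_fields=extra, **known)

    def to_response(self) -> Dict[str, Any]:
        # Start from the extras so explicitly modelled fields win on a key clash.
        out = dict(self.extra_fields)
        out.update(
            id=self.id,
            object=self.object,
            created=self.created,
            owned_by=self.owned_by,
        )
        return out
```

With this shape, a backend card that carries `max_model_len` and `permission` round-trips through `from_backend` and `to_response` without losing either key.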

Validation

  • uv run --group test python -m pytest -q src/tests/test_main_router_models.py
  • uv run --group test python -m pytest -q src/tests
  • Live check against a real Dockerized vLLM backend (vllm/vllm-openai-cpu:latest-arm64):
    • backend /v1/models included max_model_len and permission
    • fork/main router output dropped those fields
    • this branch preserved them (base_missing=['max_model_len', 'permission'], fix_missing=[])
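The `base_missing` / `fix_missing` comparison can be sketched as a key-set difference between the backend's model card and the router's. This is an illustrative reconstruction rather than the script used for the live check; the helper name `missing_fields` and all sample values are assumptions.

```python
from typing import Any, Dict, List


def missing_fields(backend_card: Dict[str, Any], router_card: Dict[str, Any]) -> List[str]:
    """Return backend keys that the router's model card no longer contains."""
    return sorted(set(backend_card) - set(router_card))


# Sample cards with made-up values; only the key names mirror the check above.
backend = {
    "id": "example-model",
    "object": "model",
    "max_model_len": 4096,
    "permission": [],
}
base_router = {"id": "example-model", "object": "model"}  # fields dropped
fixed_router = dict(backend)  # fields preserved

base_missing = missing_fields(backend, base_router)
fix_missing = missing_fields(backend, fixed_router)
```

Comparing only top-level keys keeps the check simple; it would not notice a field whose value changed on the way through the router.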

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request enables the preservation of extra metadata fields in ModelInfo and ensures they are returned by the models endpoint. While the implementation includes new tests, feedback highlights that instantiating ModelCard with these extra fields may cause excessive warning logs due to existing model validators. Additionally, it is recommended to move the known_fields set to a class-level constant to improve performance.
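The `known_fields` recommendation can be illustrated with a class-level constant. This is a hedged sketch only: the class name `ModelInfoFields`, the contents of the set, and the `split` helper are assumptions, since the actual code in `service_discovery.py` is not shown here.

```python
from typing import Any, ClassVar, Dict, FrozenSet, Tuple


class ModelInfoFields:
    """Hypothetical holder for the field split; the names in the set are assumed."""

    # Built once at class creation instead of on every call.
    KNOWN_FIELDS: ClassVar[FrozenSet[str]] = frozenset(
        {"id", "object", "created", "owned_by"}
    )

    @classmethod
    def split(cls, payload: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        """Separate a backend model card into (known, extra) parts."""
        known = {k: v for k, v in payload.items() if k in cls.KNOWN_FIELDS}
        extra = {k: v for k, v in payload.items() if k not in cls.KNOWN_FIELDS}
        return known, extra
```

A `frozenset` also signals that the set is not meant to be mutated at runtime.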

Comment thread src/vllm_router/routers/main_router.py
Comment thread src/vllm_router/service_discovery.py Outdated
@yzhan1 force-pushed the codex/debug-892-fork-updated branch from 6d3e092 to 8d88a03 on April 21, 2026 04:20
Signed-off-by: Yaoming Zhan <yzhan@Mac.attlocal.net>
@yzhan1 force-pushed the codex/debug-892-fork-updated branch from 8d88a03 to 576f488 on April 21, 2026 04:23
Signed-off-by: Yaoming Zhan <yzhan@Mac.attlocal.net>
@yzhan1 marked this pull request as ready for review on April 21, 2026 04:35
Development

Successfully merging this pull request may close these issues.

bug: Server doesn't forward the full header to requests

1 participant