
[Feat] feat(aliases): allow setting reasoning level#912

Open
Killusions wants to merge 1 commit intovllm-project:mainfrom
Killusions:feat/allow-setting-reasoning-level-in-aliases

Conversation


@Killusions Killusions commented Apr 10, 2026

Adds optional reasoning_effort support to static model aliases. The main use case is replacing non-reasoning models with reasoning ones without changing client behaviour: map the old alias to the new model with reasoning_effort=none and clients see no difference. It also makes it easy to expose multiple capability tiers (fast, thorough, etc.) from a single engine.

Aliases can be configured as alias:model|reasoning_effort=high (CLI/JSON) or as a dict with model and reasoning_effort keys (YAML). The router injects the configured value into forwarded requests unless the client already set one.

FIX #911


  • Make sure the code changes pass the pre-commit checks.
  • Sign-off your commit by using -s when doing git commit
  • Try to classify PRs for easy understanding of the type of changes, such as [Bugfix], [Feat], and [CI].
Detailed Checklist

Thank you for your contribution to production-stack! Before submitting the pull request, please ensure the PR meets the following criteria. This helps us maintain the code quality and improve the efficiency of the review process.

PR Title and Classification

Please try to classify PRs for easy understanding of the type of changes. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:

  • [Bugfix] for bug fixes.
  • [CI/Build] for build or continuous integration improvements.
  • [Doc] for documentation fixes and improvements.
  • [Feat] for new features in the cluster (e.g., autoscaling, disaggregated prefill, etc.).
  • [Router] for changes to the vllm_router (e.g., routing algorithm, router observability, etc.).
  • [Misc] for PRs that do not fit the above categories. Please use this sparingly.

Note: If the PR spans more than one category, please include all relevant prefixes.

Code Quality

The PR needs to meet the following code quality standards:

  • Pass all linter checks. Please use pre-commit to format your code. See README.md for installation.
  • The code needs to be well-documented so that future contributors can easily understand it.
  • Please include sufficient tests to ensure the change stays correct and robust. This includes both unit tests and integration tests.

DCO and Signed-off-by

When contributing changes to this project, you must agree to the DCO. Commits must include a Signed-off-by: header which certifies agreement with the terms of the DCO.

Using -s with git commit will automatically add this header.

What to Expect for the Reviews

We aim to address all PRs in a timely manner. If no one reviews your PR within 5 days, please @-mention one of YuhanLiu11, Shaoting-Feng, or ApostaC.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an enhanced alias configuration system that allows for parameter overrides, specifically adding support for 'reasoning_effort'. It implements a new AliasConfig dataclass designed for backward compatibility with existing string-based aliases while enabling more complex configurations via CLI, JSON, or YAML. The routing logic has been updated to inject these configured parameters into backend requests if they are not already provided by the client. Review feedback focuses on improving the robustness of the alias string parsing to handle empty entries and ensuring that null values in YAML configurations are correctly handled to avoid validation errors.

Comment thread src/vllm_router/utils.py Outdated
Comment thread src/vllm_router/parsers/yaml_utils.py Outdated
@Killusions Killusions changed the title feat(aliases): allow setting reasoning level [Feat] feat(aliases): allow setting reasoning level Apr 10, 2026
@Killusions Killusions force-pushed the feat/allow-setting-reasoning-level-in-aliases branch from 2f21394 to d00e070 Compare April 10, 2026 13:31
@Killusions
Contributor Author

@nejch Could you maybe take a look at this?

@Killusions Killusions force-pushed the feat/allow-setting-reasoning-level-in-aliases branch 3 times, most recently from 0783bff to 7b0ff46 Compare April 24, 2026 10:55
Signed-off-by: Linus Schlumberger <linus.schlumberger@siemens.com>
@Killusions Killusions force-pushed the feat/allow-setting-reasoning-level-in-aliases branch from 7b0ff46 to 0b6c5e0 Compare April 24, 2026 11:38
@Killusions
Contributor Author

@ruizhang0101 Thank you for merging my bugfix. Could you also take a look here? I'm open to alternate approaches, but we are currently using this patch internally and would really appreciate having some way to configure reasoning, and especially to disable it.


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feature: per-alias reasoning_effort for backward-compatible model migration

1 participant