Integrate upstream/main into feat/integration #186

Open
RhizoNymph wants to merge 850 commits into feat/integration from chore/integrate-upstream

@RhizoNymph

Owner

What

Merges ~846 commits from upstream/main (through a331589394, 2026-06-18) into the integration fork.

Why

Keeps the steering/capture fork current with upstream. The merge base dated from 2026-05-24, roughly 3.5 weeks behind upstream/main.

Conflict resolution

81 files overlapped; 15 conflicted. Notable resolutions:

  • LLM internals moved upstream: upstream extracted the private methods of LLM into a new OfflineInferenceMixin (vllm/entrypoints/offline_utils.py). Ported the inline-steering hook (_maybe_pack_inline_steering and its _add_request call site) into the mixin and removed the now-duplicate block from llm.py. The constructor's capture-consumer logic auto-merged and is intact.
  • Scheduler: kept _set_request_block_hash_steering_overrides; took upstream's new schedule(throttle_prefills=...) and the current_step increment.
  • GGUF removed from tree: upstream moved GGUF into the external vllm-gguf-plugin. Accepted the deletions and repointed the extra-quant optional dependency at the RhizoNymph/vllm-gguf-plugin fork, which adds gemma4 support.
  • Removed models: accepted upstream's removal of dots1 and internlm2_ve; our only changes there were hooks.
  • Remaining conflicts were additive; kept both sides.
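
For reviewers unfamiliar with the new layout, the shape of the first resolution can be sketched as below. This is a toy illustration only: the class names, the request dictionaries, and the packing logic are all hypothetical stand-ins, and only the method names `_maybe_pack_inline_steering` and `_add_request` come from the description above. The real signatures in offline_utils.py may differ.

```python
from typing import Any


class OfflineInferenceMixinSketch:
    """Hypothetical stand-in for the upstream mixin, not the real class."""

    def _maybe_pack_inline_steering(self, params: dict[str, Any]) -> dict[str, Any]:
        # Assumed behaviour: requests without steering pass through untouched.
        steering = params.get("steering")
        if steering is None:
            return params
        # Assumed behaviour: steering data is replaced by some packed form.
        # The packed layout here is invented purely for illustration.
        packed = dict(params)
        packed["steering"] = {"packed": True, "layers": sorted(steering)}
        return packed

    def _add_request(self, prompt: str, params: dict[str, Any]) -> dict[str, Any]:
        # The hook now lives at the call site inside the mixin, so the
        # concrete class no longer needs its own copy of this block.
        params = self._maybe_pack_inline_steering(params)
        return {"prompt": prompt, "params": params}


class LLMSketch(OfflineInferenceMixinSketch):
    """Inherits the hook; nothing steering-specific is duplicated here."""
```

The point of the sketch is only that the hook and its call site travel together into the mixin, which is why the block in llm.py became a duplicate.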

Runtime-validated on node0 (RTX 3090): engine initializes and generates correctly (Qwen3-4B). The ~67 textually auto-merged files (incl. gpu_model_runner.py) build and run but warrant deeper capture/steering validation.

ZJY0516 and others added 30 commits June 11, 2026 11:36
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
…vllm-project#42331)

`unregister_vllm_metrics()` currently uses "vllm" in `collector._name` to decide
which collectors to remove from the Prometheus registry, so it also removes
metrics registered by other subsystems or downstream extensions such as "vllm_omni:".
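
The over-matching described in this commit message is easy to reproduce with plain strings. The sketch below contrasts the substring test with a stricter prefix test; the `"vllm:"` prefix check is an assumption about what a fix could look like, not a statement of what the upstream commit does, and the metric name `vllm_omni:stage_latency` is made up for the example.

```python
def matches_substring(name: str) -> bool:
    # The check described above: any collector name containing "vllm".
    return "vllm" in name


def matches_prefix(name: str, prefix: str = "vllm:") -> bool:
    # Assumed stricter alternative: only names in the "vllm:" namespace.
    return name.startswith(prefix)


names = [
    "vllm:num_requests_running",
    "vllm_omni:stage_latency",  # hypothetical downstream metric
    "process_cpu_seconds_total",
]

removed_by_substring = [n for n in names if matches_substring(n)]
removed_by_prefix = [n for n in names if matches_prefix(n)]
```

With the substring test, the downstream `vllm_omni:` metric is removed along with the core one; with the prefix test it is left alone.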

Signed-off-by: vraiti <vraiti@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
…vllm-project#36902)

Signed-off-by: Sean Chen <seachen@redhat.com>
Co-authored-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
…l (SM 100) (vllm-project#45251)

Signed-off-by: Wentian Byte <3400259131@qq.com>
…oject#43965)

Co-authored-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
…nector teardown (vllm-project#45206)

Signed-off-by: Dao Le <Dao007forever@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
…#45217)

Signed-off-by: jpwang <jpwang@smail.nju.edu.cn>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
…llm-project#44592)

Signed-off-by: Hsiao-Yuan Chen <hy.c@Hsiao-YuandeMacBook-Pro.local>
Signed-off-by: littlecircle0730 <littlecircle0730@gmail.com>
Signed-off-by: littlecircle0730 <43994952+littlecircle0730@users.noreply.github.com>
Co-authored-by: Hsiao-Yuan Chen <hy.c@Hsiao-YuandeMacBook-Pro.local>
Co-authored-by: Or Ozeri <or@ozery.com>
…x downloads (vllm-project#45308)

Signed-off-by: Ting Sun <suntcrick@gmail.com>
…ns (vllm-project#44383)

Signed-off-by: Sasindharan Sankar <sasindharansankar@email.com>
Co-authored-by: Sasindharan Sankar <sasindharansankar@email.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
…-project#44612)

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Signed-off-by: varun sundar rabindranath <vsundarr@redhat.com>
Co-authored-by: varun sundar rabindranath <vsundarr@redhat.com>
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Signed-off-by: Chris Leonard <chleonar@redhat.com>
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Martin Kukla <martin.kukla@cantab.net>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Dipika Sikka <dsikka@redhat.com>
Co-authored-by: NickLucche <nlucches@redhat.com>
Co-authored-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Co-authored-by: Alec Kohlhoff <134344302+aleckohlhoff@users.noreply.github.com>
Co-authored-by: Porras Huang <20535584+porrashuang@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: scoootscooob <167050519+scoootscooob@users.noreply.github.com>
…40660)

Signed-off-by: allgather <all2allops@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
…llm-project#44893)

Signed-off-by: Rohan Potdar <rohan.potdar@amd.com>
Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
Signed-off-by: Rohan Potdar <66227218+Rohan138@users.noreply.github.com>
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
…stats to the managed Python engine (vllm-project#45300)

Signed-off-by: Will Eaton <weaton@redhat.com>
danisereb and others added 30 commits June 17, 2026 15:24
…45917)

Signed-off-by: Daniel Serebrenik <daserebrenik@nvidia.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
…apture`, 26.8% ~ 27.9% E2E TTFT improvement (vllm-project#45309)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Qiang Li <qiang.li2@amd.com>
vllm-project#45794)

Signed-off-by: wangjiaxin99 <jiaxwang@amd.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
Co-authored-by: Douglas Lehr <91553416+dllehr-amd@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
…ls (vllm-project#45867)

Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com>
…h clear errors (vllm-project#45196)

Signed-off-by: Ting Sun <suntcrick@gmail.com>
…#45849)

Signed-off-by: shanjiaz <hezhao@redhat.com>
Co-authored-by: shanjiaz <hezhao@redhat.com>
…licated KV heads (vllm-project#45879)

Signed-off-by: Oseltamivir <58582368+Oseltamivir@users.noreply.github.com>
Co-authored-by: waynehacking8 <waynehacking8@gmail.com>
…during_capture`" (vllm-project#45309) (vllm-project#45972)

Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
…#45826)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
…lm-project#43958)

Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
Co-authored-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
…ect#39726)

Signed-off-by: Jonathan Chen <chenleejonathan@gmail.com>
Signed-off-by: Jonathan <chenleejonathan@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Signed-off-by: Jee Jee Li <jeejeelee@inferact.ai>
…rs drain (vllm-project#45823)

Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Shengqi Chen <harry-chen@outlook.com>
…or (vllm-project#45905)

Signed-off-by: Alex <alex.tech.lab@outlook.com>
Signed-off-by: AlexHuang <jihuihuang@alexai.com>
Co-authored-by: Or Ozeri <oro@il.ibm.com>
…offloading scheduler (vllm-project#45679)

Signed-off-by: Alex <alex.tech.lab@outlook.com>
Signed-off-by: AlexHuang <jihuihuang@future.com>
Co-authored-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Integrates ~846 upstream commits. Resolves 15 conflicts (steering/capture
hooks vs upstream refactors); ports inline-steering hook into the new
OfflineInferenceMixin (offline_utils.py); accepts upstream removal of GGUF
(now the external vllm-gguf-plugin) and of dots1/internlm2_ve models.
Repoints extra-quant at the RhizoNymph/vllm-gguf-plugin fork (gemma4 support).
The vllm_c rms_norm/fused_add_rms_norm guards claimed support for
weight=None, but torch.ops._C.rms_norm cannot take a None/undefined weight
(fails with 'Not yet supported ScalarType'). Weightless norms (e.g. Gemma4
v_norm, has_weight=False) now correctly fall back to the native impl.
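
A minimal sketch of the fallback pattern this commit describes, written in pure Python so it runs without torch. Every name here (`rms_norm_native`, `fused_kernel_supports`, the `fused` callable) is hypothetical; the real guards in the fork operate on tensors and `torch.ops._C.rms_norm`.

```python
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def rms_norm_native(x: Vector, weight: Optional[Vector], eps: float = 1e-6) -> list[float]:
    # Reference implementation: scale by the reciprocal root mean square,
    # then apply the per-element weight only if one exists.
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    if weight is None:
        return [v * scale for v in x]
    return [v * scale * w for v, w in zip(x, weight)]


def fused_kernel_supports(weight: Optional[Vector]) -> bool:
    # The guard must not claim support for a missing weight,
    # because the fused kernel cannot accept None.
    return weight is not None


def rms_norm(
    x: Vector,
    weight: Optional[Vector],
    eps: float = 1e-6,
    fused: Optional[Callable[[Vector, Vector, float], list[float]]] = None,
) -> list[float]:
    # Use the fused path only when the guard allows it; otherwise fall back.
    if fused is not None and fused_kernel_supports(weight):
        return fused(x, weight, eps)
    return rms_norm_native(x, weight, eps)
```

Calling `rms_norm(x, None, fused=some_kernel)` never reaches `some_kernel`, which mirrors the weightless-norm case (has_weight=False) falling back to the native implementation.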
The SetSteeringRequest.vectors field is intentionally dict[str, Any] (to
admit the packed wire form), so the model does not coerce inner layer keys;
coerce_steering_spec does. Test the actual coercion seam (which had no
direct coverage) instead of obsolete model-level behavior.
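
To make the "coercion seam" concrete, here is a hedged sketch of what coercing inner layer keys could look like. The real `coerce_steering_spec` is not shown in this PR, so the function name `coerce_layer_keys`, the `{hook: {layer: vector}}` shape, and the hook name `post_mlp` are all assumptions chosen for illustration.

```python
from typing import Any


def coerce_layer_keys(vectors: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: turn digit-string layer keys into ints.

    Assumes a {hook_name: {layer_key: vector}} layout. Values that are
    not dicts (for example an already packed wire form) pass through.
    """
    coerced: dict[str, Any] = {}
    for hook, per_layer in vectors.items():
        if not isinstance(per_layer, dict):
            coerced[hook] = per_layer
            continue
        coerced[hook] = {
            int(key) if isinstance(key, str) and key.isdigit() else key: value
            for key, value in per_layer.items()
        }
    return coerced
```

A test aimed at this seam would assert on the output of the coercion function directly, rather than expecting a `dict[str, Any]` model field to rewrite its own keys.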
A single third-party capture-consumer plugin that fails to import (e.g.
one referencing a module not present in this build) previously crashed
_load_entry_points and took down all capture admission. Skip it with a
warning so other consumers keep working.
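
The skip-with-warning behaviour can be sketched as follows. This is not the fork's `_load_entry_points`; the function name `load_consumers` is hypothetical, and it accepts any iterable of objects exposing `.name` and `.load()`, which is the interface of `importlib.metadata.EntryPoint`.

```python
import logging
from typing import Any, Iterable

logger = logging.getLogger(__name__)


def load_consumers(entry_points: Iterable[Any]) -> dict[str, Any]:
    """Load each entry point, skipping any that fail instead of aborting."""
    loaded: dict[str, Any] = {}
    for ep in entry_points:
        try:
            loaded[ep.name] = ep.load()
        except Exception:
            # One broken plugin should not take down the others.
            logger.warning(
                "Skipping capture consumer %r: failed to load",
                ep.name,
                exc_info=True,
            )
    return loaded
```

In a real setup the iterable would typically come from `importlib.metadata.entry_points(group=...)`, with whatever group name the project registers its consumers under.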