fix: Bumps to transformers==5.0.0 #418

Draft
nrfulton wants to merge 1 commit into generative-computing:main from nrfulton:bobby_and_nathan/transformers_v5

Conversation

nrfulton (Member) commented Feb 5, 2026

Bump to transformers v5

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

This PR updates our KV smash code to use transformers v5. This requires moving away from the Legacy Cache implementation. The code here is originally from @csbobby.

This PR is still a draft. There are several changes needed:

  • Investigate the reason for our vllm transformers pin
  • Ask the Docling folks whether they plan to remove their transformers version upper bound; investigate workarounds if not.
  • Update huggingface.py backend and unit tests.
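
One way to start on the first two items is to list which installed packages declare a requirement on transformers at all. The following is a minimal sketch that assumes nothing about the real pins: `pins_on`, `requirement_name`, and `installed_requirements` are hypothetical helper names, it uses only the standard library, and the parsing is deliberately crude (it extracts just the package name from each requirement string).

```python
from __future__ import annotations

import re
from importlib import metadata

# Leading package name of a requirement string such as "transformers<5; extra == 'hf'".
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name: str) -> str:
    """Normalize a package name so that 'Foo_Bar' and 'foo-bar' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(req: str) -> str | None:
    """Return the normalized package name a requirement string refers to."""
    match = _NAME_RE.match(req)
    return normalize(match.group(1)) if match else None


def pins_on(target: str, requirements_by_dist: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each distribution to the requirement strings it declares on `target`."""
    wanted = normalize(target)
    found: dict[str, list[str]] = {}
    for dist_name, reqs in requirements_by_dist.items():
        hits = [req for req in reqs if requirement_name(req) == wanted]
        if hits:
            found[dist_name] = hits
    return found


def installed_requirements() -> dict[str, list[str]]:
    """Collect the declared requirements of every distribution in this environment."""
    result: dict[str, list[str]] = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = (meta["Name"] if meta is not None else None) or "<unknown>"
        result[name] = list(dist.requires or [])
    return result


if __name__ == "__main__":
    # Prints whatever the current environment declares; no output is assumed here.
    for dist_name, reqs in sorted(pins_on("transformers", installed_requirements()).items()):
        print(dist_name, "->", reqs)
```

Run inside the project's virtual environment, this prints each package's transformers specifier, including any upper bound, which shows where a pin comes from but not why it was added.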

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

nrfulton requested a review from guicho271828 on February 5, 2026, 23:19
github-actions bot (Contributor) commented Feb 5, 2026

The PR description has been updated. Please fill out the template for your PR to be reviewed.

nrfulton (Member, Author) commented Feb 5, 2026

@guicho271828 Why do we have a version pin for transformers in the vllm dependency?

mergify bot commented Feb 5, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:
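
To check a title locally before pushing, the rule above can be approximated with Python's `re` module. This is a sketch under assumptions: the pattern is copied from the rule as shown, `follows_conventional_commit` is a hypothetical helper name, and treating `~=` as a regular-expression search is an assumption about how the merge protection evaluates it.

```python
import re

# Pattern copied from the merge protection rule shown above.
TITLE_RULE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:"
)


def follows_conventional_commit(title: str) -> bool:
    """Return True if the title starts with an allowed type, an optional scope, and a colon."""
    return TITLE_RULE.search(title) is not None


if __name__ == "__main__":
    for title in ("fix: Bumps to transformers==5.0.0", "Bump to transformers v5"):
        print(title, "->", follows_conventional_commit(title))
```

The rule constrains only the prefix: anything after the colon is accepted.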

psschwei (Member) commented Feb 6, 2026

Just a heads up, when I tried bumping the transformers version it opened a whole can of worms: it required a vllm bump, which required a bump to outlines, which would've required code changes in the backends.
I didn't dig into it too deeply though (had that as one of my todos)

nrfulton assigned nrfulton and unassigned nrfulton on Feb 6, 2026
nrfulton force-pushed the bobby_and_nathan/transformers_v5 branch from 7de778d to 32fcaab on February 6, 2026, 22:31
nrfulton (Member, Author) commented Feb 6, 2026

> Just a heads up, when I tried bumping the transformers version it opened a whole can of worms: it required a vllm bump, which required a bump to outlines, which would've required code changes in the backends. I didn't dig into it too deeply though (had that as one of my todos)

Yeah.

The vllm/outlines thing seems like something we should be able to work around. @guicho271828 has already looked into removing outlines entirely. Now might be the time to do that if we need to make changes to bump the version.

There's also a transformers v4 dependency in docling; I'm not sure what their status is on supporting the latest transformers, though.

I think those are the only two blockers. They are both pretty annoying blockers, though :(

3 participants