Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
## Summary
Fix #1013.

Transformers v5 introduces a new attribute `rope_parameters` in the model config, containing all RoPE-related parameters, and deprecates the standalone RoPE attributes such as `rope_scaling`, `rope_theta`, etc. Most `TokenizerFast`s are now the default tokenizers in v5, hence the `tokenization_xxx_fast` paths are removed.

This PR:
- replaces deprecated configs with `rope_parameters`
- replaces fast tokenizer paths with the default ones

## Testing Done
- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence

Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
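For reviewers straddling both versions, a minimal sketch of reading RoPE settings under either convention, assuming `rope_parameters` is a plain dict holding all RoPE fields as described above (the checkpoint name and fallback defaults are illustrative):

```python
# Minimal sketch, assuming `rope_parameters` is a dict in transformers v5;
# the v4 fallback names are the deprecated standalone attributes above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("meta-llama/Llama-3.2-1B")  # illustrative checkpoint

rope_parameters = getattr(config, "rope_parameters", None)
if rope_parameters is not None:
    # transformers v5: everything lives in one dict
    rope_theta = rope_parameters.get("rope_theta", 10000.0)
else:
    # transformers v4: deprecated standalone attributes
    rope_theta = getattr(config, "rope_theta", 10000.0)
    rope_scaling = getattr(config, "rope_scaling", None)
```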
## Summary
Follow-up to #1014. Change all occurrences in all convergence tests.

## Testing Done
- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence

Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
## Summary
`position_ids` has been removed from `apply_rotary_pos_emb` in huggingface/transformers#43255.

## Testing Done
```
❯ python3 -m pytest test/transformers/test_rope.py -q
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-1-128-32-32-64] PASSED [ 2%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-2-128-32-32-64] PASSED [ 5%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-1-128-32-8-64] PASSED [ 8%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-2-128-32-8-64] PASSED [ 11%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-3-423-73-213-92] PASSED [ 13%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-3-423-73-155-92] PASSED [ 16%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-1-128-32-32-64] PASSED [ 19%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-2-128-32-32-64] PASSED [ 22%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-1-128-32-8-64] PASSED [ 25%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-2-128-32-8-64] PASSED [ 27%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-3-423-73-213-92] PASSED [ 30%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-3-423-73-155-92] PASSED [ 33%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-1-128-32-32-64] PASSED [ 36%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-2-128-32-32-64] PASSED [ 38%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-1-128-32-8-64] PASSED [ 41%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-2-128-32-8-64] PASSED [ 44%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-3-423-73-213-92] PASSED [ 47%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-3-423-73-155-92] PASSED [ 50%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-1-128-32-32-64] PASSED [ 52%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-2-128-32-32-64] PASSED [ 55%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-1-128-32-8-64] PASSED [ 58%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-2-128-32-8-64] PASSED [ 61%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-3-423-73-213-92] PASSED [ 63%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-3-423-73-155-92] PASSED [ 66%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype0-1e-05-1e-05-1-2-2-2-8] PASSED [ 69%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype0-1e-05-1e-05-1-2-1-2-8] PASSED [ 72%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype0-1e-05-1e-05-9-7-41-41-41] PASSED [ 75%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype1-0.1-1e-05-1-2-2-2-8] PASSED [ 77%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype1-0.1-1e-05-1-2-1-2-8] PASSED [ 80%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype1-0.1-1e-05-9-7-41-41-41] PASSED [ 83%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype0-1e-05-1e-05-1-2-2-2-8] PASSED [ 86%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype0-1e-05-1e-05-1-2-1-2-8] PASSED [ 88%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype0-1e-05-1e-05-9-7-41-41-41] PASSED [ 91%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype1-0.1-1e-05-1-2-2-2-8] PASSED [ 94%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype1-0.1-1e-05-1-2-1-2-8] PASSED [ 97%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype1-0.1-1e-05-9-7-41-41-41] PASSED [100%]
```
- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence

Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
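For context, a hedged sketch of the call-site change this implies. The functional import exists in Liger, but the tensor shapes and the exact post-change signature are assumptions based on the linked upstream PR, not copied from the source:

```python
# Sketch of the signature change (an assumption from upstream PR #43255):
#   before (v4): apply_rotary_pos_emb(q, k, cos, sin, position_ids, unsqueeze_dim=1)
#   after  (v5): apply_rotary_pos_emb(q, k, cos, sin, unsqueeze_dim=1)
# `position_ids` was unused once cos/sin were precomputed per position,
# so callers now pass only the precomputed embeddings.
import torch
from liger_kernel.transformers.rope import liger_rotary_pos_emb

B, H, T, D = 1, 2, 8, 16  # illustrative batch, heads, seq len, head dim
q = torch.randn(B, H, T, D, device="cuda")   # Liger's Triton kernels need a GPU
k = torch.randn(B, H, T, D, device="cuda")
cos = torch.randn(1, T, D, device="cuda")    # assumed (1, seq_len, head_dim) layout
sin = torch.randn(1, T, D, device="cuda")

q_rot, k_rot = liger_rotary_pos_emb(q, k, cos, sin)  # no position_ids argument
```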
## Summary
Update Gemma tokenizer usage in convergence tests for Transformers v5 by removing the deprecated `GemmaTokenizerFast` imports and renaming usages to the supported non-fast tokenizer class. This fixes the `No module named transformers.models.gemma.tokenization_gemma_fast` error when running convergence tests under Transformers v5.

## Details
Transformers v5 moves away from parallel "fast" and "slow" tokenizer implementations and adopts a single tokenizer implementation (see huggingface/transformers#40936).
- Convergence tests were importing and instantiating the fast tokenizer class, causing import errors.
- This change updates both: 1) the import path, and 2) the tokenizer class name used in code (`GemmaTokenizerFast` → `GemmaTokenizer`), following the new Transformers v5 API.

## Testing Done
- Hardware Type: A100-40G-PCIe
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence
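Concretely, the rename looks like the following sketch; the commented-out v4 import is the path removed upstream, and the checkpoint name is illustrative:

```python
# transformers v4: the fast tokenizer lived in its own module
# from transformers.models.gemma.tokenization_gemma_fast import GemmaTokenizerFast
# tokenizer = GemmaTokenizerFast.from_pretrained("google/gemma-2b")

# transformers v5: a single tokenizer class is the default export
from transformers import GemmaTokenizer

tokenizer = GemmaTokenizer.from_pretrained("google/gemma-2b")  # illustrative checkpoint
```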
Contributor
@Tcc0403 In the CI tests, in my opinion we should not be setting any RoPE-based settings in the model. The model will read the defaults based on whatever version of transformers one has, and setting the rope params etc. does not do anything in terms of the test. What do you think?
kashif reviewed Jan 26, 2026
Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
## Summary
Fix multiple failing monkey_patch tests for transformers v5. The tests failed because of huggingface/transformers#41541. These changes are backward compatible.

This change fixes the following failing tests:
```
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_qwen3_vl_moe_for_conditional_generation - AttributeError: 'Qwen3VLMoeTextConfig' object has no attribute 'pad_token_id'
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_qwen3_vl_moe - AttributeError: 'Qwen3VLMoeTextConfig' object has no attribute 'pad_token_id'
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_qwen3_vl_moe_text - AttributeError: 'Qwen3VLMoeTextConfig' object has no attribute 'pad_token_id'
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_llama4_for_conditional_generation - AttributeError: 'Llama4Config' object has no attribute 'pad_token_id'
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_glm4v - AttributeError: 'Glm4vTextConfig' object has no attribute 'pad_token_id'
```
Fixes #1059.

## Testing Done
- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [ ] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence
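A minimal sketch of the kind of backward-compatible lookup these fixes imply, assuming (per the errors above) that v5 keeps `pad_token_id` on the top-level composite config rather than on nested text configs; the helper name is hypothetical:

```python
from typing import Optional

from transformers import PretrainedConfig

def resolve_pad_token_id(
    config: PretrainedConfig, parent: Optional[PretrainedConfig] = None
) -> Optional[int]:
    """Hypothetical helper: read pad_token_id from the config itself
    (v4-style), falling back to a top-level composite config (v5)."""
    pad_token_id = getattr(config, "pad_token_id", None)
    if pad_token_id is None and parent is not None:
        pad_token_id = getattr(parent, "pad_token_id", None)
    return pad_token_id
```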
## Summary
Fixes #1013.
- Remove deprecated `rope_theta` and `rope_scaling` from test configurations
- Retain `mrope_section` in `rope_parameters` for VL models (Qwen2-VL, Qwen2.5-VL), as it is required by Transformers v5's `config.rope_parameters["mrope_section"]` direct-access pattern
- Remove deprecated `rope_parameters` containing only `rope_theta` from non-VL models (relying on model defaults)

## Testing Done
- Hardware Type: RTX3090 (Nvidia Ampere)
- [x] run `python -m pytest test/transformers/test_monkey_patch.py -v` to ensure correctness
- [x] run `make checkstyle` to ensure code style: linter checks pass with no new issues introduced
- [x] run `make test-convergence` to ensure convergence

### test_monkey_patch.py
**5 failed**, unrelated to this PR (MoE experts structure changes in Transformers v5):
- `test_apply_liger_kernel_to_instance_for_mixtral` - `'MixtralDecoderLayer' has no attribute 'block_sparse_moe'`
- `test_apply_liger_kernel_to_instance_for_qwen3_moe` - `'Qwen3MoeExperts' object is not iterable`
- `test_apply_liger_kernel_to_instance_for_glm4v_moe` - `'Glm4vMoeTextNaiveMoe' object is not iterable`
- `test_apply_liger_kernel_to_instance_for_qwen3_next` - `'Qwen3NextExperts' object is not iterable`
- `test_apply_liger_kernel_to_instance_for_hunyuan_v1_moe` - `'HunYuanMoEV1Experts' object is not iterable`

### convergence tests
| Error Type | Affected Models | Root Cause |
|------------|-----------------|------------|
| `AttributeError: '...Config' has no attribute 'pad_token_id'` | Qwen3VLMoe, Glm4v, Llama4, Exaone4 | Transformers v5 config structure changes |
| `RuntimeError: _histc_cuda...deterministic` | Qwen3Moe, GptOss, Glm4vMoe, Qwen3Next, HunYuanV1Moe | CUDA deterministic implementation issue |
| `AssertionError: [Loss] Number of mismatched` | Llama4 | Numerical precision issue |
| `TypeError: 'NoneType' object is not subscriptable` | Llava | Model initialization issue |

## Out of Scope
The following issues require separate PRs:
1. MoE models' `experts` structure changes (no longer iterable in Transformers v5)
2. `pad_token_id` attribute location changes in composite configs
3. CUDA deterministic implementation issues
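To make the retention rule concrete, a tiny sketch of why `mrope_section` must stay in the VL test configs; the dict values below are illustrative, not taken from the actual configurations:

```python
# transformers v5 indexes the unified dict directly, so dropping the key
# would raise at model construction time.
rope_parameters = {
    "rope_type": "default",          # illustrative
    "rope_theta": 1_000_000.0,       # illustrative
    "mrope_section": [16, 24, 24],   # needed for Qwen2-VL-style multimodal RoPE
}

mrope_section = rope_parameters["mrope_section"]  # KeyError if the key is removed
```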
Collaborator
@Mecoli1219 @Tcc0403 Should we fix this branch to make all the tests pass for v4.49.0 and v4.57.6 and then merge it into main? After that we can keep adding new PRs for v5 support in main directly. Once all the issues with v5 are fixed, we can release a new version. Wdyt?
…v5 (#1062)

Fixes #1012.

⚠️ Dependency: This PR depends on #1060. Please review and merge #1060 first.

## Summary
- Fix `AttributeError: 'Qwen2VLConfig' object has no attribute 'hidden_size'` for Qwen2-VL and Qwen2.5-VL models
- Update test configurations to use the new `text_config` structure required by Transformers v5

## Changes
1. **Model code** (`src/liger_kernel/transformers/model/qwen2_vl.py`, `qwen2_5_vl.py`):
   - Changed `self.config.hidden_size` → `self.config.text_config.hidden_size`
   - Changed `self.config.vocab_size` → `self.config.text_config.vocab_size`
2. **Test configurations** (`test/convergence/bf16/test_mini_models.py`, `fp32/test_mini_models.py`):
   - Restructured `mini_qwen2_vl` and `mini_qwen2_5_vl` configurations to use a `text_config` dictionary for text-related parameters

## Background
In Transformers v5, `Qwen2VLConfig` and `Qwen2_5_VLConfig` moved text-related parameters (such as `hidden_size`, `vocab_size`) into a nested `text_config` attribute, following the pattern used by other multimodal models.

## Test plan
- [x] `python -m pytest test/convergence/bf16/test_mini_models.py -k "qwen2_vl or qwen2_5_vl"` passes
- [x] `python -m pytest test/convergence/fp32/test_mini_models.py -k "qwen2_vl or qwen2_5_vl"` passes
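For illustration, a hedged sketch of a version-tolerant accessor for the nested layout described in Background; this helper is hypothetical (the model code above simply switches to the v5 path directly):

```python
# Illustrative helper, not the code in qwen2_vl.py: fall back to the
# top-level attribute when no nested text_config carries it (v4 layout).
def get_text_attr(config, name: str):
    text_config = getattr(config, "text_config", None)
    if text_config is not None and hasattr(text_config, name):
        return getattr(text_config, name)
    return getattr(config, name)

# usage sketch: hidden_size = get_text_attr(model.config, "hidden_size")
```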
Contributor
Sounds great. Let me fix this branch for testing in v4.
## Summary
This PR restores backward compatibility for convergence tests with transformers v4 (v4.49.0 ~ v4.57.6). During the initial development phase for transformers v5 support, backward compatibility was intentionally deprioritized, leading to significant test regressions. This PR fixes those regressions while maintaining a stable foundation for the ongoing v5 integration.

## Related Issues & PRs
- #978
- #994

## Details
The current codebase assumes transformers v5 conventions, which broke compatibility with the v4.x series in two major areas:
1. RoPE parameters: some models are missing RoPE parameters (`rope_scaling`) because the standalone attributes were unified into `rope_parameters` in transformers v5.
2. Tokenizer consistency: v5 and v4 handle the tokenizer interfaces differently. v5's tokenizer selects the appropriate backend, while v4's tokenizer is the Python-based implementation using SentencePiece as the backend.

Key fixes (see the sketch after this section):
- Added conditional logic to provide different RoPE parameters for different transformers versions.
- Enforced `TokenizerFast` usage for transformers < v5 to resolve interface mismatches.

## Testing Done
I ran `python -m pytest test/convergence/*` on different versions of transformers, on the original branch and after making changes. The results are shown below:

| Branch | v4.49.0 | v4.57.6 | v5.0.0 |
|---|---|---|---|
| transformer-5.0.0rc1 | 8 failed, 37 passed, 98 skipped, 1 warning | 42 failed, 92 passed, 9 skipped, 3 warnings | 19 failed, 115 passed, 9 skipped, 29 warnings |
| This PR | 0 failed, 45 passed, 98 skipped, 1 warning | 0 failed, 134 passed, 9 skipped, 19 warnings | 19 failed, 115 passed, 9 skipped, 29 warnings |

All of the failing tests on v5 were inspected carefully; each is identical to a previously thrown error.

- Hardware Type: H100
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence
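A minimal sketch of the conditional logic described under Key fixes, assuming a simple version gate; the constant and helper names are illustrative, not the ones in the test suite:

```python
import transformers
from packaging import version

# Hypothetical version gate; the test suite may gate differently.
TRANSFORMERS_IS_V5 = version.parse(transformers.__version__) >= version.parse("5.0.0")

def rope_config_kwargs(theta: float = 10000.0) -> dict:
    """Return version-appropriate RoPE kwargs for a mini-model config."""
    if TRANSFORMERS_IS_V5:
        # v5: unified dict (key structure assumed from the PR descriptions)
        return {"rope_parameters": {"rope_type": "default", "rope_theta": theta}}
    # v4: deprecated standalone attributes
    return {"rope_theta": theta, "rope_scaling": None}
```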
Collaborator
@Tcc0403 @Mecoli1219 Should we merge this to main now, as #978 is updated?
Contributor
Sure
## Summary
Fix `pad_token_id` in convergence tests for transformers v5 support. Fixes #1081.

## Details
Fixes the following tests for transformers v5.0.0 compatibility while maintaining backward compatibility with < v5.0.0:
```
pytest test/convergence/fp32/test_mini_models.py::test_mini_model[mini_glm4v-32-0.0001-dtype16-1e-08-1e-05-0.005-1e-05-0.005-1e-05] \
  test/convergence/fp32/test_mini_models.py::test_mini_model[mini_exaone4-32-1e-05-dtype30-1e-08-1e-05-0.005-1e-05-0.005-1e-05] \
  test/convergence/fp32/test_mini_models_with_logits.py::test_mini_model[mini_glm4v-32-0.0001-dtype15-1e-08-1e-05-0.005-1e-05-0.005-1e-05] \
  test/convergence/fp32/test_mini_models_with_logits.py::test_mini_model[mini_exaone4-32-1e-05-dtype29-1e-08-1e-05-0.005-1e-05-0.005-1e-05] \
  test/convergence/bf16/test_mini_models.py::test_mini_model[mini_glm4v-32-1e-05-dtype18-0.01-0.02-0.1-0.01-0.01-0.01] \
  test/convergence/bf16/test_mini_models.py::test_mini_model[mini_exaone4-32-1e-05-dtype29-0.01-0.05-0.1-0.01-0.01-0.01] \
  test/convergence/bf16/test_mini_models_with_logits.py::test_mini_model[mini_exaone4-32-1e-05-dtype28-0.01-0.05-0.1-0.01-0.01-0.01] \
  test/convergence/bf16/test_mini_models_with_logits.py::test_mini_model[mini_glm4v-32-1e-05-dtype19-0.01-0.02-0.1-0.01-0.01-0.01]
```

## Testing Done
Yes, on 1xH100.
- Hardware Type: <BLANK>
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence
…for Transformers v5 (#1079)

## Summary
Fix these unit test errors for Transformers v5:
```
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_mixtral - AttributeError: 'MixtralDecoderLayer' object has no attribute 'block_sparse_moe'
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_qwen3_moe - TypeError: 'Qwen3MoeExperts' object is not iterable
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_glm4v_moe - TypeError: 'Glm4vMoeTextNaiveMoe' object is not iterable
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_qwen3_next - TypeError: 'Qwen3NextExperts' object is not iterable
FAILED test/transformers/test_monkey_patch.py::test_apply_liger_kernel_to_instance_for_hunyuan_v1_moe - TypeError: 'HunYuanMoEV1Experts' object is not iterable
```

## Related Issues & PRs
- #978

## Details
The implementation of MoE experts was updated in transformers v5. Originally it was a module list of MLPs, but some models updated their implementation to a single class `<ModelName>Experts`. These classes' implementations are identical, so this PR creates a `LigerExperts` class that utilizes `LigerSiLUMulFunction` to replace the experts of these models (see the sketch after this section):
- [x] mixtral
- [x] qwen3_moe
- [x] glmv4_moe
- [x] qwen3_next
- [x] hunyuan_v1_moe

> [!NOTE]
> The `LigerExperts` in this PR is a workaround to fix the unit tests for v5. We can replace it with a better kernel, as mentioned in #958, in the future.

## Testing Done
Unit tests are added to check the correctness of the original `LigerBlockSparseTop2MLP` implementation for v4 and `LigerExperts` for v5.
- Hardware Type: H100 * 1
- [x] run `make test` to ensure correctness in v5.0.0
- [x] run `make test` to ensure correctness in v4.57.6
- [x] run `make test` to ensure correctness in v4.49.0
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence in v5.0.0 (5 unrelated existing errors)
- [x] run `make test-convergence` to ensure convergence in v4.57.6
- [x] run `make test-convergence` to ensure convergence in v4.49.0
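A hedged sketch of the `LigerExperts` idea: the parameter layout and forward signature below are assumptions modeled on the fused-experts pattern, not the actual class in `src/liger_kernel`; only the use of `LigerSiLUMulFunction` is taken from the PR text:

```python
import torch
import torch.nn as nn

from liger_kernel.ops.swiglu import LigerSiLUMulFunction

class ExpertsSketch(nn.Module):
    """Per-expert SwiGLU MLP with stacked weights (illustrative shapes)."""

    def __init__(self, num_experts: int, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.gate_proj = nn.Parameter(torch.empty(num_experts, hidden_size, intermediate_size))
        self.up_proj = nn.Parameter(torch.empty(num_experts, hidden_size, intermediate_size))
        self.down_proj = nn.Parameter(torch.empty(num_experts, intermediate_size, hidden_size))

    def forward(self, hidden_states: torch.Tensor, expert_idx: int) -> torch.Tensor:
        gate = hidden_states @ self.gate_proj[expert_idx]
        up = hidden_states @ self.up_proj[expert_idx]
        # LigerSiLUMulFunction fuses silu(gate) * up into one Triton kernel
        return LigerSiLUMulFunction.apply(gate, up) @ self.down_proj[expert_idx]
```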
Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
… transformers v5 (#1061)

Fixes #1011.

## Summary
- Fix the `TypeError: 'NoneType' object is not subscriptable` error in the Llava convergence test caused by `image_outputs.hidden_states` returning `None`
- Remove the unnecessary `importlib.reload(modeling_clip)` from the `revert_liger_kernel_to_llava()` function

## Problem
In transformers v5, calling `importlib.reload(modeling_clip)` breaks `CLIPVisionModel`'s `output_hidden_states` functionality. When the Llava convergence test runs:
1. First run (without Liger): creates the model, runs the test, then calls `revert_liger_kernel_to_llava()`, which reloads `modeling_clip`
2. Second run (with Liger): creates a new model, but `CLIPVisionModel.forward()` now returns `hidden_states=None` even when `output_hidden_states=True` is passed

This causes the error at:
```python
selected_image_feature = image_outputs.hidden_states[vision_feature_layer]
# TypeError: 'NoneType' object is not subscriptable
```

## Solution
Remove `importlib.reload(modeling_clip)` from `revert_liger_kernel_to_llava()`. This is safe because:
- Liger kernel does not patch `modeling_clip` when `model=None` (which is the case in convergence tests)
- Only `modeling_llava` and `modeling_llama` need to be reloaded to revert Liger patches

## Test plan
- [x] `python -m pytest test/convergence/bf16/test_mini_models_multimodal.py -k llava` passes
- [x] Verified `hidden_states` is no longer `None` after the fix
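For orientation, a hedged sketch of what the revert helper looks like after the fix; the real helper in the test suite may differ in name, signature, and reload list:

```python
import importlib

def revert_liger_kernel_to_llava_sketch():
    """Illustrative shape of the revert helper after the fix."""
    from transformers.models.llama import modeling_llama
    from transformers.models.llava import modeling_llava

    importlib.reload(modeling_llama)
    importlib.reload(modeling_llava)
    # modeling_clip is deliberately NOT reloaded: under transformers v5,
    # reloading it leaves CLIPVisionModel returning hidden_states=None.
```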
> [!IMPORTANT]
> Do not merge this PR before all issues are resolved!

Testing with three versions: 4.49.0, 4.57.6, and the latest stable version.

> [!NOTE]
> nvi-ci is split into a correctness-test CI and a convergence-test CI to speed up testing in this PR, plus more jobs for testing backward compatibility with transformers v4 (4.49.0 and 4.57.6). Whether to keep this change is yet to be discussed.

## Summary
This is a dev branch for aggregating PRs related to transformers v5 changes.

## Testing Done
- `make test` to ensure correctness
- `make checkstyle` to ensure code style
- `make test-convergence` to ensure convergence