[pull] master from ggml-org:master #1169

Merged

pull[bot] merged 5 commits into LongLeCE:master from ggml-org:master on May 11, 2026

Conversation


pull[bot] commented on May 11, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

danbev and others added 5 commits May 11, 2026 12:07
* convert : add image break token fallback

This commit adds an image_break_token_id fallback for Mistral, covering
the case where the config contains an image_break_token_id of -1:
```console
  "vision_encoder": {
    "image_token_id": 10,
    "image_break_token_id": -1,
    ...
```
But the tokenizer.json has this token:
```console
      "id": 12,
      "content": "[IMG_BREAK]",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
```
If we look in convert_hf_to_gguf.py we have:
```python
        elif self.is_mistral_format:
            # hparams is already vision config here so norm_eps is only defined in global_config.
            self.hparams["norm_eps"] = self.global_config.get("norm_eps", None)
            assert self.hparams["norm_eps"] is not None, "norm_eps not found in params.json"
            if self.use_break_tok:
                self.img_break_tok_id = self.find_vparam(["image_break_token_id"])
```

The motivation for this is that converting this model currently
results in the following error:
```console
load_hparams: model size:         5131.60 MiB
load_hparams: metadata size:      0.15 MiB
clip_init: failed to load model 'models/mmproj-Mistral-Medium-3.5-128B.gguf': operator(): unable to find tensor v.token_embd.img_break

mtmd_init_from_file: error: Failed to load CLIP model from models/mmproj-Mistral-Medium-3.5-128B.gguf

Failed to load vision model from models/mmproj-Mistral-Medium-3.5-128B.gguf
```

With this fallback the model loads successfully.
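
A minimal sketch of what such a fallback could look like, assuming a standalone helper and the `added_tokens` layout shown above (the function name and structure are illustrative, not the actual patch):

```python
import json

def find_image_break_token_id(vision_config: dict, tokenizer_json_path: str) -> int:
    # Hypothetical helper: prefer the id from the vision config, otherwise
    # fall back to looking the token up by content in tokenizer.json.
    tok_id = vision_config.get("image_break_token_id", -1)
    if tok_id is not None and tok_id >= 0:
        return tok_id
    # Config reports -1, so search tokenizer.json's added_tokens for [IMG_BREAK].
    with open(tokenizer_json_path, encoding="utf-8") as f:
        tokenizer = json.load(f)
    for tok in tokenizer.get("added_tokens", []):
        if tok.get("content") == "[IMG_BREAK]":
            return tok["id"]
    raise ValueError("image_break_token_id not found in config or tokenizer.json")
```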

Resolves: #22901

* Revert "convert : add image break token fallback"

This reverts commit 292e40c.

* convert : add image break token fallback

This commit adds an image_break_token_id fallback for Mistral, covering
the case where the config contains an image_break_token_id of -1:
```console
  "vision_encoder": {
    "image_token_id": 10,
    "image_break_token_id": -1,
    ...
```
But the tokenizer.json has this token:
```console
      "id": 12,
      "content": "[IMG_BREAK]",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
```
If we look in convert_hf_to_gguf.py we have:
```python
        elif self.is_mistral_format:
            # hparams is already vision config here so norm_eps is only defined in global_config.
            self.hparams["norm_eps"] = self.global_config.get("norm_eps", None)
            assert self.hparams["norm_eps"] is not None, "norm_eps not found in params.json"
            if self.use_break_tok:
                self.img_break_tok_id = self.find_vparam(["image_break_token_id"])
```

The motivation for this is that converting this model currently
results in the following error:
```console
load_hparams: model size:         5131.60 MiB
load_hparams: metadata size:      0.15 MiB
clip_init: failed to load model 'models/mmproj-Mistral-Medium-3.5-128B.gguf': operator(): unable to find tensor v.token_embd.img_break

mtmd_init_from_file: error: Failed to load CLIP model from models/mmproj-Mistral-Medium-3.5-128B.gguf

Failed to load vision model from models/mmproj-Mistral-Medium-3.5-128B.gguf
```

With this fallback the model loads successfully.

Co-authored-by: Pascal <admin@serveurperso.com>

Resolves: #22901

* convert : allow zero value for img_break_tok_id
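
As a hedged illustration only (the actual diff is not reproduced here), allowing a zero value typically means replacing a truthiness test, which treats token id 0 as missing, with an explicit None check:

```python
img_break_tok_id = 0  # hypothetical example: 0 is a perfectly valid token id

# A truthiness check silently treats id 0 as "not set":
if img_break_tok_id:
    print("break token configured")

# Checking against None keeps 0 as a usable value:
if img_break_tok_id is not None:
    print("break token configured")  # also reached when the id is 0
```
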
Before, we relied on a transitive include from `cub/cub.cuh`, which is
bad practice since cub may not always expose `cuda/iterator`.

This commit updates the command line arguments to use the correct names
and values which are now required.

The motivation for this change is that running the example command as-is
currently generates the following errors:
```console
error while handling argument "--color": error: unknown value for --color: '--sampling-seq'

usage:
-co,   --color [on|off|auto]            Colorize output to distinguish prompt and user input from generations
                                        ('on', 'off', or 'auto', default: 'auto')
                                        'auto' enables colors when output is to a terminal

error while handling argument "-fa": error: unknown value for --flash-attn: '--temp'

usage:
-fa,   --flash-attn [on|off|auto]       set Flash Attention use ('on', 'off', or 'auto', default: 'auto')
                                        (env: LLAMA_ARG_FLASH_ATTN)

error while handling argument "--draft-max": the argument has been removed. use --spec-draft-n-max or --spec-ngram-mod-n-max

usage:
--draft, --draft-n, --draft-max N       the argument has been removed. use --spec-draft-n-max or
                                        --spec-ngram-mod-n-max
                                        (env: LLAMA_ARG_DRAFT_MAX)

error while handling argument "--draft-min": the argument has been removed. use --spec-draft-n-min or --spec-ngram-mod-n-min

usage:
--draft-min, --draft-n-min N            the argument has been removed. use --spec-draft-n-min or
                                        --spec-ngram-mod-n-min
                                        (env: LLAMA_ARG_DRAFT_MIN)
```
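
For reference, the old-to-new flag mapping implied by the usage text above is roughly the following; the values shown are placeholders and the rest of the example command is not reproduced here:

```console
--color         ->  --color on|off|auto        # a value is now required
-fa             ->  -fa on|off|auto            # a value is now required
--draft-max N   ->  --spec-draft-n-max N       # or --spec-ngram-mod-n-max N
--draft-min N   ->  --spec-draft-n-min N       # or --spec-ngram-mod-n-min N
```
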
* [ggml-virtgpu] Add a GHA build check

* Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
pull[bot] locked and limited conversation to collaborators on May 11, 2026
pull[bot] added the ⤵️ pull label on May 11, 2026
pull[bot] merged commit 928b486 into LongLeCE:master on May 11, 2026