Misc. bug: HIP WIN11 loading Qwen3.6-35B-A3B-UD-Q6_K.gguf around 120k context crashes server and even whole system #22135

@vevi33

Name and Version

b8850 llama.cpp server
RX 7800 XT (latest driver)
Ryzen 7 7700X
32 GB DDR5 6000 MHz RAM

Operating systems

Windows

Which llama.cpp modules do you know to be affected?

llama-server

Command line

-m "unsloth\Qwen\Qwen3.6-35B-A3B-UD-Q6_K.gguf"
      --flash-attn on
      --ctx-size 120000
      --fit on
      --fit-target 384
      --threads 16
      --parallel 1
      --no-mmap
      --mlock
      --cache-ram 2048
      --ctx-checkpoints 12
      --temp 0.75
      --repeat-penalty 1.08
      --repeat-last-n 384
      --min-p 0.05
      --top-k 30
      --alias Qwen3.6-35B_L
      --chat-template-kwargs "{\"preserve_thinking\":true}"
      --reasoning on

Problem description & steps to reproduce

Something really odd happens with b8850 and this quant. When I load the model with a 100k context, I see 15.5 GB of VRAM usage and around 16 GB of RAM usage. When I increase the context to 120k, all of my RAM (32 GB) fills up, then the page file as well, and on one occasion the whole system froze. This never happened with any previous models or with older versions.
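To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It assumes, very pessimistically, that the entire 100k footprint (VRAM plus RAM) scales linearly with context size and that VRAM use stays at the reported 15.5 GB. Both are simplifying assumptions of mine, not measurements; model weights do not grow with context, so real growth should be smaller.

```python
# Figures reported for the 100k-context run (GB).
VRAM_AT_100K = 15.5
RAM_AT_100K = 16.0
SYSTEM_RAM = 32.0


def linear_upper_bound(total_gb: float, old_ctx: int, new_ctx: int) -> float:
    """Scale the whole footprint linearly with context (pessimistic assumption)."""
    return total_gb * new_ctx / old_ctx


total_at_100k = VRAM_AT_100K + RAM_AT_100K
total_at_120k = linear_upper_bound(total_at_100k, 100_000, 120_000)

# Assume VRAM use stays where it was and everything extra lands in system RAM.
ram_at_120k = total_at_120k - VRAM_AT_100K

print(f"estimated total at 120k: {total_at_120k:.1f} GB")
print(f"estimated RAM at 120k:   {ram_at_120k:.1f} GB of {SYSTEM_RAM:.0f} GB")
```

Even under this pessimistic assumption the RAM estimate stays well below 32 GB, so the observed exhaustion of RAM and page file does not look like ordinary context scaling.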

The server process exits with 0xc0000005, which is Windows STATUS_ACCESS_VIOLATION.
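For anyone matching this against the log, the decimal exit code and the hex status are the same value. A small sketch of the conversion follows; the helper name `decode_exit_code` is just illustrative.

```python
STATUS_ACCESS_VIOLATION = 0xC0000005  # Windows NTSTATUS value named above


def decode_exit_code(code: int) -> str:
    """Format a process exit code as an unsigned 32-bit hex status."""
    # Masking also handles tools that report the code as a signed 32-bit integer.
    return f"0x{code & 0xFFFFFFFF:08x}"


exit_code = 3221225477  # decimal exit code from the log output
print(decode_exit_code(exit_code))           # 0xc0000005
print(exit_code == STATUS_ACCESS_VIOLATION)  # True
```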

First Bad Commit

No response

Relevant log output

sched_reserve: reserving ...
sched_reserve: resolving fused Gated Delta Net support:
sched_reserve: fused Gated Delta Net (autoregressive) enabled
sched_reserve: fused Gated Delta Net (chunked) enabled
sched_reserve: ROCm0 compute buffer size = 659.77 MiB
sched_reserve: ROCm_Host compute buffer size = 242.52 MiB
sched_reserve: graph nodes = 3729
sched_reserve: graph splits = 74 (with bs=512), 52 (with bs=1)
sched_reserve: reserve took 17654.00 ms, sched copies = 1
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
[WARN] <Qwen3.6-35B_L> ExitError >> exit status 0xc0000005, exit code: 3221225477
[INFO] <Qwen3.6-35B_L> process exited but not StateStopping, current state: starting
[WARN] metrics skipped, HTTP status=502, path=/v1/chat/completions
