Name and Version
latest codebase at b8850
Operating systems
Linux
GGML backends
CUDA
Hardware
9900k/4070
Models
Qwen 3 8B
Problem description & steps to reproduce
If YaRN is used to extend the context length (many Qwen models, such as Qwen3, use YaRN extension), the extension is blocked by this logic in server-context.cpp, which incorrectly caps the context length at the base training context:
int n_ctx_slot = llama_n_ctx_seq(ctx);
if (n_ctx_slot > n_ctx_train) {
    SRV_WRN("the slot context (%d) exceeds the training context of the model (%d) - capping\n", n_ctx_slot, n_ctx_train);
    n_ctx_slot = n_ctx_train;
}
Instead, only a warning should be emitted, as follows:
int n_ctx_slot = llama_n_ctx_seq(ctx);
if (n_ctx_slot > n_ctx_train) {
    SRV_WRN("the slot context (%d) exceeds the training context of the model (%d)\n", n_ctx_slot, n_ctx_train);
}
First Bad Commit
Unknown; whichever commit introduced the cap.
Relevant log output