Enable nothink models - silent responses for Qwen3 nothink models under default thinking config.#34

Open
gherlein wants to merge 2 commits into dimetron:main from gherlein:enable-nothink-models


Problem

Qwen3 -nothink model variants (e.g. qwen3-coder-next-nothink:latest) produced
no visible response in the TUI. Two bugs combined to cause this:

  1. The default thinkingLevel: "medium" config caused every Ollama chat request to
    include think: "medium". That setting overrides the /no_think token baked into
    nothink model templates and forces the model into thinking mode.

  2. In ollamaRunStreaming, the response was built from aggregatedText only. When a
    model responded entirely via thinking tokens, the thinking content was silently
    discarded, producing an empty turn (the "prompts do nothing" symptom).

Fix

GenerateContent: detect nothink in the model name and explicitly set
think: false, so the configured thinking level cannot override the
model's intent.
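
A minimal sketch of the name check described above, not the code from this PR. The helper name resolveThink, its signature, and the case-insensitive match are assumptions for illustration; the real change lives inside GenerateContent and may build the request differently.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveThink is a hypothetical helper (not from the PR). It returns the
// value to send as the request's think field: false for models whose name
// contains "nothink", otherwise the configured thinking level.
func resolveThink(model, configuredLevel string) any {
	// Assumption: match case-insensitively anywhere in the model name.
	if strings.Contains(strings.ToLower(model), "nothink") {
		return false
	}
	return configuredLevel
}

func main() {
	fmt.Println(resolveThink("qwen3-coder-next-nothink:latest", "medium"))
	fmt.Println(resolveThink("qwen3.5:35b-a3b-q4_k_m", "medium"))
}
```

With the two models named in this PR, the first call yields false and the second keeps the configured "medium" level.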

ollamaRunStreaming: when aggregatedText is empty but aggregatedThinking
is not, fall back to surfacing the thinking content as the response. This mirrors
the existing behaviour in ollamaRunNonStreaming and acts as a safety net for any
model that ends up in this state.

Testing

Manually verified with qwen3-coder-next-nothink:latest (previously silent) and
qwen3.5:35b-a3b-q4_k_m (working model, confirmed no regression).
