Enable nothink models - silent responses for Qwen3 nothink models under default thinking config. #34
Open
gherlein wants to merge 2 commits into
Problem
Qwen3 `-nothink` model variants (e.g. `qwen3-coder-next-nothink:latest`) produced no visible response in the TUI. Two bugs combined to cause this:

1. The default `thinkingLevel: "medium"` config caused every Ollama chat request to include `think: "medium"`, which overrides the `/no_think` token baked into nothink model templates and forces the model into thinking mode.
2. In `ollamaRunStreaming`, when a model responded entirely via thinking tokens, the response was built from `aggregatedText` only. The thinking content was silently discarded, producing an empty turn ("prompts do nothing").
Fix
1. `GenerateContent`: detect `nothink` in the model name and explicitly set `think: false`, preventing the configured thinking level from overriding the model's intent.
2. `ollamaRunStreaming`: when `aggregatedText` is empty but `aggregatedThinking` is not, fall back to surfacing the thinking content as the response. This mirrors the existing behaviour in `ollamaRunNonStreaming` and acts as a safety net for any model that ends up in this state.
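
The two changes above can be sketched roughly as follows. This is a minimal illustration in Go, not the code from this PR: the helper names `thinkOption` and `finalResponse` are hypothetical, the case-insensitive match is an assumption, and the real `GenerateContent` and `ollamaRunStreaming` may structure the logic differently.

```go
package main

import (
	"fmt"
	"strings"
)

// thinkOption returns the value to send as the request's "think" field.
// Hypothetical helper: models whose name contains "nothink" get an explicit
// false, so the configured thinking level cannot override the model template.
func thinkOption(model, thinkingLevel string) any {
	// Assumption: the match is case-insensitive.
	if strings.Contains(strings.ToLower(model), "nothink") {
		return false
	}
	if thinkingLevel == "" {
		return nil // no level configured: leave the field unset
	}
	return thinkingLevel
}

// finalResponse picks the text to surface for a completed streaming turn.
// Hypothetical helper: if the model produced only thinking tokens, the
// thinking content is used instead of returning an empty turn.
func finalResponse(aggregatedText, aggregatedThinking string) string {
	if aggregatedText == "" && aggregatedThinking != "" {
		return aggregatedThinking
	}
	return aggregatedText
}

func main() {
	fmt.Println(thinkOption("qwen3-coder-next-nothink:latest", "medium"))
	fmt.Println(thinkOption("qwen3.5:35b-a3b-q4_k_m", "medium"))
	fmt.Println(finalResponse("", "reasoning only"))
	fmt.Println(finalResponse("answer", "reasoning"))
}
```

Keeping both decisions in small pure functions like these would make them easy to cover with unit tests, independent of a running Ollama server.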
Testing
Manually verified with `qwen3-coder-next-nothink:latest` (previously silent) and `qwen3.5:35b-a3b-q4_k_m` (working model, confirmed no regression).