Skip to content

server: Allow continue in thinking (reasoning prefill)#22162

Open
roj234 wants to merge 2 commits intoggml-org:masterfrom
roj234:thinking_prefill
Open

server: Allow continue in thinking (reasoning prefill)#22162
roj234 wants to merge 2 commits intoggml-org:masterfrom
roj234:thinking_prefill

Conversation

@roj234
Copy link
Copy Markdown
Contributor

@roj234 roj234 commented Apr 20, 2026

Overview

Make prefill available for (probably only some) reasoning model

Additional information

Current change may lack the ability to add '\n'

Requirements

@roj234 roj234 force-pushed the thinking_prefill branch from 2a47990 to 0276d4e Compare April 20, 2026 11:31
this also make reasoning-budget works better without prompt
@roj234 roj234 marked this pull request as ready for review April 20, 2026 20:04
@roj234 roj234 requested review from a team as code owners April 20, 2026 20:04
@roj234
Copy link
Copy Markdown
Contributor Author

roj234 commented Apr 20, 2026

by make chat_params.thinking_end_tag from </think> to \n</think> I have found Qwen3.5 can correctly stop thinking after reasoning_budget triggered.

@roj234
Copy link
Copy Markdown
Contributor Author

roj234 commented Apr 20, 2026

fixes #21889 (although must use reasoning_content field rather than raw content)
What is correctly stop thinking (after reasoning_budget triggered):

Before
2026-4-21 4-13-9
After
2026-4-21 4-11-16

Which means Qwen3.5 requires \n</think> to end thinking not just </think>

@roj234
Copy link
Copy Markdown
Contributor Author

roj234 commented Apr 20, 2026

BTW, can you merge (or review) my other PRs 🙏, like OpenRouter-compatible reasoning_budget API

#20069
#20088
#22038

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant