server: Support chat_template_kwargs for /v1/messages #22154

Open
Soreepeong wants to merge 1 commit into ggml-org:master from Soreepeong:anthropic-oai-conv-chat-template-kwargs

Soreepeong commented Apr 20, 2026

Overview

This PR passes chat_template_kwargs through to the chat template when a request is made via the Anthropic-compatible endpoint.

The OpenAI-compatible endpoint /v1/chat/completions already handles chat_template_kwargs, but the Anthropic-compatible endpoint /v1/messages does not. Since the option is nonstandard in the OpenAI API anyway, I thought it would be fine to add the same nonstandard feature to the Anthropic handler too.
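
As a rough illustration of what "passing through" means here, the sketch below converts a simplified Anthropic-style request body into an OpenAI-style chat body and copies chat_template_kwargs unchanged. It is written in Python for brevity and is not the PR's actual server code; the function name anthropic_to_oai_chat and the reduced set of handled fields are assumptions made for the example.

```python
from typing import Any


def anthropic_to_oai_chat(body: dict[str, Any]) -> dict[str, Any]:
    """Convert a simplified Anthropic /v1/messages body to an OpenAI-style chat body.

    Hypothetical helper for illustration only; the real conversion handles
    many more fields (tools, content blocks, stop sequences, and so on).
    """
    messages: list[dict[str, Any]] = []

    # Anthropic carries the system prompt as a top-level field,
    # while the OpenAI chat format expects it as the first message.
    if isinstance(body.get("system"), str):
        messages.append({"role": "system", "content": body["system"]})
    messages.extend(body.get("messages", []))

    oai: dict[str, Any] = {"model": body.get("model"), "messages": messages}
    if "max_tokens" in body:
        oai["max_tokens"] = body["max_tokens"]

    # The nonstandard field is forwarded untouched so that the chat
    # template sees the same kwargs as on the OpenAI-compatible path.
    if "chat_template_kwargs" in body:
        oai["chat_template_kwargs"] = body["chat_template_kwargs"]
    return oai
```

Without the last `if`, the converted body would silently drop the field, which matches the "Before this PR" behavior described under Testing.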

Additional information

Use case

I am using a custom proxy application that maps model-name suffixes to parameter presets. For example, a /think suffix adds {"chat_template_kwargs": {"enable_thinking": true}} to the chat completions or messages request body.
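
A minimal sketch of such a suffix-to-preset rewrite is shown below, assuming the proxy sees the request body as a parsed JSON object. The PRESETS table, the function name apply_suffix_preset, and the choice to let client-supplied fields win over the preset are all hypothetical; only the /think suffix comes from the description above.

```python
import copy
from typing import Any

# Hypothetical suffix -> preset table; only "/think" is mentioned in this PR.
PRESETS: dict[str, dict[str, Any]] = {
    "/think": {"chat_template_kwargs": {"enable_thinking": True}},
}


def apply_suffix_preset(body: dict[str, Any]) -> dict[str, Any]:
    """Strip a known suffix from the model name and merge its preset into the body."""
    out = copy.deepcopy(body)  # never mutate the caller's request
    model = out.get("model")
    if not isinstance(model, str):
        return out

    for suffix, preset in PRESETS.items():
        if model.endswith(suffix):
            # Forward the real model name to the server.
            out["model"] = model[: -len(suffix)]
            # Fields already set by the client take precedence over the preset.
            for key, value in preset.items():
                out.setdefault(key, copy.deepcopy(value))
            break
    return out
```

The same rewritten body can then be sent to either endpoint, which is why the proxy needs /v1/messages to honor chat_template_kwargs as well.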

Testing

The model is configured to have thinking disabled.

Command

curl -v http://127.0.0.1:12345/v1/messages ^
  -H "anthropic-version: 2023-06-01" ^
  -H "content-type: application/json" ^
  --data-raw "{\"model\": \"Qwen3.6-35B-A3B\", \"chat_template_kwargs\": {\"enable_thinking\": true}, \"messages\": [{\"role\": \"user\", \"content\": \"What is 1 + 1?\"}]}"
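
The command above uses Windows cmd line continuations (^) and quote escaping. For other shells, a portable equivalent can be sketched with the Python standard library; the helper name build_messages_request is an assumption, while the URL, headers, and payload mirror the curl invocation.

```python
import json
import urllib.request


def build_messages_request(base_url: str, enable_thinking: bool) -> urllib.request.Request:
    """Build the same POST /v1/messages request as the curl command above."""
    payload = {
        "model": "Qwen3.6-35B-A3B",
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
        "messages": [{"role": "user", "content": "What is 1 + 1?"}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )
```

Sending it requires a running server, for example `urllib.request.urlopen(build_messages_request("http://127.0.0.1:12345", True))`.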

Before this PR

{"id":"chatcmpl-AzD7ukLinv1EGcbg3fX6gV7Sy1AH54tq","type":"message","role":"assistant","content":[{"type":"text","text":"1 + 1 equals **2**."}],"model":"Qwen3.6-35B-A3B","stop_reason":"end_turn","stop_sequence":null,"usage":{"cache_read_input_tokens":0,"input_tokens":20,"output_tokens":9}}

After this PR

{"id":"chatcmpl-O9T75TOmt4gRo5NnzlaVLQcIVuZAOopd","type":"message","role":"assistant","content":[{"type":"thinking","thinking":"Here's a thinking process:\n\n1.  **Analyze User Input:** The user asks \"What is 1 + 1?\"\n2.  **Identify Core Task:** This is a basic arithmetic question.\n3.  **Calculate Result:** 1 + 1 = 2.\n4.  **Formulate Response:** State the answer clearly and concisely. \"1 + 1 equals 2.\"\n5.  **Check for Edge Cases/Context:** None. It's a straightforward math question with no tricks or additional context needed.\n6.  **Final Output Generation:** Provide the direct answer.✅\n","signature":""},{"type":"text","text":"1 + 1 equals **2**."}],"model":"Qwen3.6-35B-A3B","stop_reason":"end_turn","stop_sequence":null,"usage":{"cache_read_input_tokens":0,"input_tokens":18,"output_tokens":144}}
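
The difference between the two responses is the presence of a content block with "type":"thinking". A small check along these lines could confirm the behavior automatically; the helper name thinking_blocks is hypothetical and the function relies only on the response shape shown above.

```python
import json


def thinking_blocks(response_text: str) -> list[str]:
    """Return the text of every "thinking" content block in a /v1/messages response."""
    response = json.loads(response_text)
    return [
        block.get("thinking", "")
        for block in response.get("content", [])
        if isinstance(block, dict) and block.get("type") == "thinking"
    ]
```

An empty list corresponds to the "Before this PR" response, and a non-empty list to the "After this PR" response.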

Requirements

Soreepeong marked this pull request as ready for review on April 20, 2026 07:07
Soreepeong requested a review from a team as a code owner on April 20, 2026 07:07
Soreepeong changed the title from "Support chat_template_kwargs for /v1/messages" to "server: Support chat_template_kwargs for /v1/messages" on Apr 20, 2026