Conversation
Pull request overview
This pull request implements support for returning finish_reason: "tool_calls" in OpenAI API responses when tool calls are generated, addressing issue #3927. Previously, the system always returned finish_reason: "stop" even when tool calls were made, which broke compatibility with coding agents that rely on this field to determine if a tool call is needed.
Changes:
- Added logic to detect tool calls in both streaming and unary responses and set finish_reason accordingly
- Introduced helper functions to map GenAI finish reasons to OpenAI API format with tool_calls support
- Added comprehensive unit tests covering streaming and unary response scenarios with tool calls
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/llm/apis/openai_completions.cpp | Core implementation: added mapFinishReason(), isToolCallsFinishReason(), and hasToolCallsInStreamingDelta() helpers; updated serialization methods to use tool_calls finish reason when appropriate |
| src/llm/visual_language_model/legacy/servable.cpp | Updated to pass toolParserName and reasoningParserName parameters to OpenAIChatCompletionsHandler constructor |
| src/test/http_openai_handler_test.cpp | Added unit tests for streaming and unary responses with tool_calls finish reason; includes test helper for creating llama3 tool call tokens |
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.
```cpp
static std::optional<std::string> mapFinishReason(ov::genai::GenerationFinishReason finishReason, bool hasToolCalls) {
    // GenerationFinishReason::TOOL_CALLS is not available in GenAI yet.
    // Use feature detection based on presence of tool calls as a workaround until GenAI exposes TOOL_CALLS.
```
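The mapping under discussion can be sketched end-to-end as follows. This is a minimal, self-contained sketch, not the exact code from `openai_completions.cpp`: a local stand-in enum is used in place of `ov::genai::GenerationFinishReason` (assumed here to expose `NONE`/`STOP`/`LENGTH`), and the tool-call workaround is the STOP-plus-tool-calls rule described in the diff.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Stand-in for ov::genai::GenerationFinishReason, simplified so the sketch
// compiles on its own (assumed values: NONE / STOP / LENGTH).
enum class GenerationFinishReason { NONE, STOP, LENGTH };

// Map a GenAI finish reason to the OpenAI API string. Since GenAI does not
// expose a TOOL_CALLS reason yet, the presence of parsed tool calls is used
// as a workaround: STOP + tool calls => "tool_calls".
static std::optional<std::string> mapFinishReason(GenerationFinishReason finishReason, bool hasToolCalls) {
    switch (finishReason) {
    case GenerationFinishReason::STOP:
        return hasToolCalls ? std::optional<std::string>("tool_calls")
                            : std::optional<std::string>("stop");
    case GenerationFinishReason::LENGTH:
        return "length";
    default:
        return std::nullopt;  // generation not finished yet
    }
}
```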
Do you know of any task for that in GenAI scope?
@apaniukov is there a task for that in GenAI?
```cpp
#else
const std::string llama3TokenizerPathForHandlerTests = "/ovms/src/test/llm_testing/unsloth/Llama-3.1-8B-Instruct";
#endif
constexpr int64_t llama3BotTokenIdForHandlerTests = 128010;
```
Do we have to hardcode it? Isn't it possible to get that token from the Tokenizer object?
```cpp
#ifdef _WIN32
const std::string llama3TokenizerPathForHandlerTests = getWindowsRepoRootPath() + "\\src\\test\\llm_testing\\unsloth\\Llama-3.1-8B-Instruct";
#else
const std::string llama3TokenizerPathForHandlerTests = "/ovms/src/test/llm_testing/unsloth/Llama-3.1-8B-Instruct";
```
I wonder if our mechanism to replace paths would handle it?
Do we use such ifdefs to select the model path in other places?
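For reference, the `#ifdef` pattern being questioned can be sketched in isolation. `getWindowsRepoRootPath()` is stubbed here so the sketch is self-contained; the real helper lives in the OVMS test utilities and resolves the repository root on Windows runners.

```cpp
#include <cassert>
#include <string>

// Stub for the OVMS test helper; the real implementation queries the
// repository root on Windows runners. The value below is illustrative only.
[[maybe_unused]] static std::string getWindowsRepoRootPath() {
    return "C:\\git\\model_server";
}

// Compile-time selection of the tokenizer path, mirroring the #ifdef in the test file.
static std::string llama3TokenizerPath() {
#ifdef _WIN32
    return getWindowsRepoRootPath() + "\\src\\test\\llm_testing\\unsloth\\Llama-3.1-8B-Instruct";
#else
    return "/ovms/src/test/llm_testing/unsloth/Llama-3.1-8B-Instruct";
#endif
}
```

A runtime path-replacement mechanism, if OVMS has one for tests, could indeed replace this compile-time branch with a single template path.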
```cpp
results.texts = {"dummy"};
std::string serialized = apiHandler->serializeUnaryResponse(results, 1);

// ASSERT_NE(serialized.find("\"finish_reason\":\"tool_calls\""), std::string::npos) << serialized;
```
Remove this, or add more context.
My next PR introduces tool calling for the VLM legacy pipeline. I don't think we need any more context; it is right above the test, see:
// This is unsupported, once we have tool calling for VLM legacy pipeline, change the test
```cpp
if (hasToolCalls && finishReason == ov::genai::GenerationFinishReason::STOP) {
    return "tool_calls";
}
```
This could be prone to behavior changes in the future (if the model generates tool calls plus something more), but it looks fine for now.
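The streaming path needs a similar signal per delta. A minimal sketch of what such detection could look like, assuming the decoded delta text contains the llama3 tool-call marker `<|python_tag|>` (token id 128010, referenced in the new tests); the real `hasToolCallsInStreamingDelta()` in `openai_completions.cpp` may well use token ids or parser state instead, so treat this as illustrative only.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: treat a streamed delta as containing a tool call when
// the llama3 tool-call marker appears in the decoded text. The actual helper
// in openai_completions.cpp may detect this differently (e.g. via token ids).
static bool hasToolCallsInStreamingDelta(const std::string& delta) {
    return delta.find("<|python_tag|>") != std::string::npos;
}
```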
🛠 Summary
#3927
CVS-180728
🧪 Checklist