
Support enable_thinking config for reasoning models (GLM-5, etc.) #983

@liushuangls

Description

Problem

When using a reasoning model such as GLM-5 (ZhiPu) as the VLM backend, the model spends all of its output tokens on reasoning_content and returns an empty content field. This happens because GLM-5 requires enable_thinking: false to disable chain-of-thought reasoning for structured tasks like memory extraction.

Current Behavior

The OpenAIVLM backend has a _supports_enable_thinking() check that only recognizes DashScope (Alibaba Cloud) endpoints:

_DASHSCOPE_HOSTS = {
    "dashscope.aliyuncs.com",
    "dashscope-intl.aliyuncs.com",
}

The LiteLLM backend does not support enable_thinking at all.

For ZhiPu (open.bigmodel.cn) and other OpenAI-compatible providers with reasoning models, there is no way to pass enable_thinking: false through configuration.

Proposed Solution

Add a config option in ov.conf to control thinking behavior, e.g.:

{
  "vlm": {
    "backend": "openai",
    "model": "glm-5",
    "api_base": "https://open.bigmodel.cn/api/coding/paas/v4",
    "enable_thinking": false
  }
}
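A minimal sketch of how the backend could forward the proposed flag (the helper name is hypothetical; the key names mirror the ov.conf example above). The OpenAI Python SDK accepts provider-specific parameters through its extra_body argument, which is used here as the transport:

```python
def build_request_kwargs(vlm_conf: dict) -> dict:
    """Build chat-completion kwargs from the proposed 'vlm' config section.

    Hypothetical helper; key names mirror the ov.conf example above.
    """
    kwargs = {"model": vlm_conf["model"]}
    # Forward enable_thinking only when explicitly configured, so providers
    # that reject unknown parameters keep working unchanged.
    if "enable_thinking" in vlm_conf:
        # The OpenAI Python SDK sends extra_body keys in the request JSON.
        kwargs["extra_body"] = {"enable_thinking": vlm_conf["enable_thinking"]}
    return kwargs

conf = {"backend": "openai", "model": "glm-5", "enable_thinking": False}
print(build_request_kwargs(conf))
# → {'model': 'glm-5', 'extra_body': {'enable_thinking': False}}
```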

This would be more general than hardcoding host allowlists, and would work for any provider that supports the enable_thinking parameter (ZhiPu, DashScope, DeepSeek, etc.).

Alternative: support an extra_body config field to pass arbitrary parameters to the API call.
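The extra_body route could be a thin passthrough, sketched below under the assumption that the vlm config section carries an arbitrary extra_body dict that is merged into the request kwargs and forwarded verbatim to the OpenAI SDK (function and key names are hypothetical):

```python
from typing import Any

def apply_extra_body(call_kwargs: dict[str, Any], vlm_conf: dict[str, Any]) -> dict[str, Any]:
    """Merge a user-supplied extra_body config field into request kwargs.

    Hypothetical passthrough: any keys the user puts under 'extra_body'
    in ov.conf are sent as-is in the request JSON, so no per-provider
    allowlist is needed.
    """
    extra = vlm_conf.get("extra_body")
    if extra:
        # Preserve any extra_body the backend already set, letting the
        # user's config override on key collisions.
        merged = dict(call_kwargs.get("extra_body", {}))
        merged.update(extra)
        call_kwargs = {**call_kwargs, "extra_body": merged}
    return call_kwargs

kwargs = apply_extra_body({"model": "glm-5"}, {"extra_body": {"enable_thinking": False}})
print(kwargs)
# → {'model': 'glm-5', 'extra_body': {'enable_thinking': False}}
```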

Workaround

Currently we patch openai_vlm.py directly to add open.bigmodel.cn to the host allowlist, but the patch is overwritten on every upgrade.

Environment

  • OpenViking: 0.2.12
  • Model: GLM-5 (ZhiPu)
  • Backend: openai (switched from litellm due to same limitation)
  • Python: 3.14
