fix: omit max_tokens from OpenAI requests when None (Azure compatibility)#396

Open
thuantan2060 wants to merge 4 commits into NevaMind-AI:main from parasite2060:main

Conversation

@thuantan2060

📝 Pull Request Summary

Fix Azure OpenAI compatibility by omitting max_tokens from API requests when the value is None. Azure OpenAI rejects null for this field, while standard OpenAI silently accepts it.


✅ What does this PR do?

  • Fixes 5 methods across 2 files that pass max_tokens=None directly to the OpenAI API, which serializes to
    max_tokens: null in the JSON payload.
  • Changes the pattern from always including max_tokens in the request dict to only including it when the value
    is not None.
  • Affected files and methods:
    • src/memu/llm/openai_sdk.py: chat(), summarize(), vision()
    • src/memu/llm/backends/openai.py: build_summary_payload(), build_vision_payload()
  • No behavioral change when max_tokens is explicitly set — only the None/default case is fixed.
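
The fix amounts to building the request kwargs unconditionally and adding max_tokens only when it has a value. A minimal sketch of that pattern (the function and parameter names here are illustrative, not the repository's actual code):

```python
def build_chat_payload(model, messages, max_tokens=None, temperature=0.7):
    """Build an OpenAI chat-completions payload, omitting max_tokens when unset.

    Serializing "max_tokens": null is tolerated by api.openai.com but
    rejected with a 400 by Azure OpenAI, so the key is only added when
    a concrete value is provided.
    """
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }
    if max_tokens is not None:  # omit the key entirely instead of sending null
        payload["max_tokens"] = max_tokens
    return payload
```

With the default, the key is absent from the serialized request; with an explicit value, behavior is unchanged.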

🤔 Why is this change needed?

  • Azure OpenAI's API returns a 400 Bad Request when max_tokens: null is included in chat completion
    requests:
    Invalid type for 'max_tokens': expected an unsupported value, but got null instead.
  • Standard OpenAI's API silently ignores null and uses its default, masking the issue.
  • This prevents memU from working with Azure OpenAI deployments (via the OPENAI_BASE_URL override), which is a
    common enterprise setup.
  • Other backends (doubao, openrouter) and the http_client already handle this correctly with
    if max_tokens is not None guards; this PR aligns the OpenAI SDK client and the OpenAI HTTP backend with
    that same pattern.

🔍 Type of Change

Please check what applies:

  • Bug fix
  • New feature
  • Documentation update
  • Refactor / cleanup
  • Other (please explain)

✅ PR Quality Checklist

  • PR title follows the conventional format (feat:, fix:, docs:)
  • Changes are limited in scope and easy to review
  • Documentation updated where applicable
  • No breaking changes (or clearly documented)
  • Related issues or discussions linked

📌 Optional

  • Edge cases considered
    • When max_tokens is explicitly set to an integer: behavior unchanged (included in payload)
    • When max_tokens is None (default): now omitted from the payload instead of sending null
    • Other backends (doubao, openrouter, http_client): already use this pattern, no changes needed
  • Follow-up tasks mentioned
    • The lazyllm_client.py backend may have the same issue but was not modified since it uses a
      different SDK interface; worth auditing separately

thuantan2060 and others added 4 commits March 30, 2026 21:33
Azure OpenAI rejects max_tokens=null in chat completion requests.
Only include max_tokens in the request payload when it has a value.

Affected methods:
- OpenAISDKClient.chat()
- OpenAISDKClient.summarize()
- OpenAISDKClient.vision()
- OpenAILLMBackend.build_summary_payload()
- OpenAILLMBackend.build_vision_payload()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Builds a linux-x86_64 wheel on every push to main and uploads it as the
'memu-wheel' artifact with 7-day retention, for cross-repo consumption
by the memu-server Docker CI pipeline.

Bump GitHub Actions versions:
- actions/checkout v4 → v6
- astral-sh/setup-uv v7 → v8
- actions/upload-artifact v6 → v7