[codex] add GLM 5.2 and OpenAI-compatible endpoint mode#44

Merged
CollierKing merged 2 commits into main from feat/add-glm-5-2 on Jun 22, 2026

@CollierKing CollierKing commented Jun 21, 2026

Collaborator

Summary

Adds two related Workers AI chat-model updates for langchain-cloudflare:

  • Adds explicit support for @cf/zai-org/glm-5.2 across model behavior, examples, REST integration coverage, and Worker integration coverage.
  • Adds endpoint_format="openai_compatible" to ChatCloudflareWorkersAI so REST callers can route chat requests through Cloudflare's OpenAI-compatible chat completions endpoint.

The default remains endpoint_format="workers_ai", which preserves the existing /ai/run/{model} behavior. The OpenAI-compatible mode is REST-only; Worker bindings still use env.AI.run() and now raise a clear error if that mode is requested with a binding.
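The routing rule described above can be sketched roughly as follows. This is an illustration only, not the library's implementation: resolve_chat_url, its has_binding flag, and the choice of ValueError are hypothetical, and the URL shapes are assumed from Cloudflare's public REST layout.

```python
from typing import Literal

EndpointFormat = Literal["workers_ai", "openai_compatible"]

# Assumed public Cloudflare REST base; the account ID is supplied by the caller.
API_BASE = "https://api.cloudflare.com/client/v4/accounts"


def resolve_chat_url(
    account_id: str,
    model: str,
    endpoint_format: EndpointFormat = "workers_ai",
    has_binding: bool = False,
) -> str:
    """Return the REST URL for a chat request (hypothetical helper)."""
    if endpoint_format == "openai_compatible":
        if has_binding:
            # Bindings go through env.AI.run(), so there is no REST URL to swap.
            # The exception type here is an assumption.
            raise ValueError(
                "endpoint_format='openai_compatible' is REST-only and cannot "
                "be used with a Worker binding"
            )
        # The model name travels in the request body, not in the path.
        return f"{API_BASE}/{account_id}/ai/v1/chat/completions"
    if endpoint_format == "workers_ai":
        # Default: the native run endpoint, with the model in the path.
        return f"{API_BASE}/{account_id}/ai/run/{model}"
    raise ValueError(f"unknown endpoint_format: {endpoint_format!r}")
```

Failing early when a binding is combined with the OpenAI-compatible mode avoids silently falling back to env.AI.run(), which would hide the mismatch from the caller.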

Current Status

CI follow-up commits fixed the first runner failures and the separate Worker example dependency issue:

  • 2d6a114 fixes mypy under CI's newer floating dependency resolution and newer langchain-tests model-override conformance.
  • 54fef21 fixes the Worker example dependency layout that triggered the separate Dependabot security-update error on main.
  • The Worker example no longer declares direct langchain in [project.dependencies]; pywrangler sync keeps only Pyodide-compatible dependencies, and scripts/setup_pyodide_deps.sh remains responsible for the known-working create_agent vendored stack.
  • Worker create_agent integration tests now fail on a 501 response instead of skipping, so an unavailable create_agent is no longer hidden.
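The fail-instead-of-skip change in the last bullet might look something like the sketch below. The helper name require_agent_available and its arguments are hypothetical; the actual tests in this PR may be structured differently.

```python
def require_agent_available(status_code: int, body: str = "") -> None:
    """Fail the test when the Worker answers 501 for create_agent.

    Hypothetical helper: an earlier version of such a check might have
    called pytest.skip() here, which would hide a missing create_agent.
    """
    if status_code == 501:
        raise AssertionError(
            f"create_agent is unavailable in the Worker (HTTP 501): {body}"
        )
```

A test would call this helper on the Worker response before asserting on the agent output, so a missing vendored stack shows up as a failure rather than a skip.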

AI Gateway update: the account now has the test-ai-gateway gateway configured. Targeted Worker binding and REST AI Gateway tests pass against it.

Why

Cloudflare's native Workers AI run endpoint and its OpenAI-compatible chat completions endpoint can behave differently for multimodal and structured-output workflows. This PR exposes the endpoint choice directly without changing existing defaults.

GLM-5.2 is also now registered with model-specific parameter handling so it keeps supported OpenAI-compatible fields like max_tokens and tool_choice while stripping unsupported fields like top_k and repetition_penalty.
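As a rough illustration of model-specific parameter handling, the sketch below strips the unsupported fields named above and passes everything else through. The UNSUPPORTED_PARAMS table and filter_params function are hypothetical names, not the registry this PR adds.

```python
from typing import Any

# Hypothetical per-model deny-list; only the GLM-5.2 entries come from the
# PR description above.
UNSUPPORTED_PARAMS: dict[str, frozenset[str]] = {
    "@cf/zai-org/glm-5.2": frozenset({"top_k", "repetition_penalty"}),
}


def filter_params(model: str, params: dict[str, Any]) -> dict[str, Any]:
    """Drop request fields the given model is known not to accept."""
    blocked = UNSUPPORTED_PARAMS.get(model, frozenset())
    # Models without an entry pass through unchanged.
    return {key: value for key, value in params.items() if key not in blocked}
```

For example, calling filter_params with max_tokens, tool_choice, top_k, and repetition_penalty for @cf/zai-org/glm-5.2 would keep only max_tokens and tool_choice.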

User Impact

Users can now instantiate:

ChatCloudflareWorkersAI(
    model="@cf/moonshotai/kimi-k2.6",
    endpoint_format="openai_compatible",
)

Existing code keeps the default behavior without changes, which can also be requested explicitly:

ChatCloudflareWorkersAI(
    model="@cf/moonshotai/kimi-k2.6",
    endpoint_format="workers_ai",
)

Validation

  • make lint - passed (ruff check, ruff format --diff, mypy)
  • make test - 110 passed, 2 skipped
  • pre-commit run --all-files - passed
  • uv run pytest --disable-socket --allow-unix-socket tests/unit_tests/test_chat_models.py -q - 57 passed, 2 skipped (run before the CI follow-up test was added)
  • uv run pytest tests/integration_tests/test_workersai_models.py::TestOpenAICompatibleEndpoint -v -s - 3 passed
  • uv run pytest tests/integration_tests/test_worker_integration.py::TestWorkerAIGateway -v -s - 3 passed
  • AI_GATEWAY=test-ai-gateway uv run pytest tests/integration_tests/test_workersai_models.py::TestAIGatewayHeaders -v -s - 3 passed
  • cd examples/workers && uv run pywrangler sync --force - passed
  • cd examples/workers && ./scripts/setup_pyodide_deps.sh - passed; installs the known-working LangChain 1.0 Worker stack
  • Representative Worker create_agent coverage on Qwen - 3 passed:
    • TestWorkerAgentTools::test_agent_tools[@cf/qwen/qwen3-30b-a3b-fp8]
    • TestWorkerAgentStructuredOutput::test_agent_structured_output[@cf/qwen/qwen3-30b-a3b-fp8]
    • TestWorkerAgentStructuredJsonSchema::test_agent_structured_json_schema[@cf/qwen/qwen3-30b-a3b-fp8]
  • Full REST integration: uv run pytest tests/integration_tests/test_workersai_models.py -v -s - 128 passed, 64 skipped, 1 failed
  • Full Worker integration before test-ai-gateway was configured: uv run pytest tests/integration_tests/test_worker_integration.py -v -s - 158 passed, 6 skipped, 5 failed

Known Unrelated Failures

Full REST integration has the known Mistral structured-output batch failure where one result returns None.

Full Worker integration has known/environment-sensitive model-output failures:

  • Mistral structured-output batch returned None
  • gpt-oss-20b structured-output batch returned None

A targeted Worker rerun reproduced the Mistral and gpt-oss-20b structured-output batch failures while gpt-oss-120b passed, which points to model-output quality rather than this endpoint-routing change.

@CollierKing CollierKing marked this pull request as ready for review June 21, 2026 22:49
@CollierKing CollierKing merged commit bd3d10d into main Jun 22, 2026
9 checks passed