[codex] add GLM 5.2 and OpenAI-compatible endpoint mode by CollierKing · Pull Request #44 · cloudflare/langchain-cloudflare

CollierKing · 2026-06-21T22:35:54Z

Summary

Adds two related Workers AI chat-model updates for langchain-cloudflare:

Adds explicit support for @cf/zai-org/glm-5.2 across model behavior, examples, REST integration coverage, and Worker integration coverage.
Adds endpoint_format="openai_compatible" to ChatCloudflareWorkersAI so REST callers can route chat requests through Cloudflare's OpenAI-compatible chat completions endpoint.

The default remains endpoint_format="workers_ai", which preserves the existing /ai/run/{model} behavior. The OpenAI-compatible mode is REST-only; Worker bindings still use env.AI.run() and now raise a clear error if that mode is requested with a binding.
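As a rough illustration of the routing described above, the sketch below maps each endpoint_format value to a REST URL and rejects the OpenAI-compatible mode when a binding is present. The helper names (validate_endpoint_format, build_chat_url), the has_binding flag, and the choice of ValueError are assumptions made for this example and are not taken from the PR's code; the URL shapes follow Cloudflare's public REST paths.

```python
from typing import Literal

EndpointFormat = Literal["workers_ai", "openai_compatible"]

API_BASE = "https://api.cloudflare.com/client/v4/accounts"


def validate_endpoint_format(endpoint_format: str, has_binding: bool) -> None:
    """Reject unknown formats and REST-only formats used with a Worker binding.

    Hypothetical helper: the real class may validate differently.
    """
    if endpoint_format not in ("workers_ai", "openai_compatible"):
        raise ValueError(f"Unknown endpoint_format: {endpoint_format!r}")
    if endpoint_format == "openai_compatible" and has_binding:
        # Bindings call env.AI.run(), so there is no REST URL to switch.
        raise ValueError(
            "endpoint_format='openai_compatible' is REST-only; "
            "Worker bindings use env.AI.run()"
        )


def build_chat_url(
    account_id: str,
    model: str,
    endpoint_format: EndpointFormat = "workers_ai",
) -> str:
    """Return the REST URL a chat request would be sent to (REST callers only)."""
    validate_endpoint_format(endpoint_format, has_binding=False)
    if endpoint_format == "openai_compatible":
        # The model name travels in the request body, not in the path.
        return f"{API_BASE}/{account_id}/ai/v1/chat/completions"
    return f"{API_BASE}/{account_id}/ai/run/{model}"
```

In this sketch the default path keeps the model in the URL, while the OpenAI-compatible path is model-independent, which is why only REST callers can switch between them.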

Current Status

CI follow-up commits fixed the first runner failures and the separate Worker example dependency issue:

2d6a114 fixes mypy failures under CI's newer floating dependency resolution and restores conformance with the newer langchain-tests model-override checks.
54fef21 fixes the Worker example dependency layout that triggered the separate Dependabot security-update error on main.
The Worker example no longer declares direct langchain in [project.dependencies]; pywrangler sync keeps only Pyodide-compatible dependencies, and scripts/setup_pyodide_deps.sh remains responsible for the known-working create_agent vendored stack.
Worker create_agent integration tests now fail on 501 instead of skipping, so unavailable create_agent is no longer hidden.
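The last point, failing on 501 rather than skipping, can be sketched as a small status check. This is a minimal illustration only: assert_agent_available is a hypothetical helper, and the actual tests in test_worker_integration.py may be structured differently.

```python
def assert_agent_available(status_code: int) -> None:
    """Fail loudly when the Worker reports create_agent as unavailable.

    Hypothetical helper. The earlier behavior would have been roughly
    `pytest.skip(...)` on 501, which hides a missing create_agent stack.
    """
    if status_code == 501:
        raise AssertionError(
            "Worker returned 501: create_agent is unavailable; "
            "check the vendored Pyodide dependencies"
        )
    if status_code != 200:
        raise AssertionError(f"Unexpected status from Worker: {status_code}")
```

A test would call this right after the HTTP request to the Worker, so a missing create_agent stack shows up as a red test instead of a skipped one.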

AI Gateway update: the account now has the test-ai-gateway gateway configured. Targeted Worker binding and REST AI Gateway tests pass against it.

Why

Cloudflare's native Workers AI run endpoint and its OpenAI-compatible chat completions endpoint can behave differently for multimodal and structured-output workflows. This PR exposes the endpoint choice directly without changing existing defaults.

GLM-5.2 is also now registered with model-specific parameter handling so it keeps supported OpenAI-compatible fields like max_tokens and tool_choice while stripping unsupported fields like top_k and repetition_penalty.
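A minimal sketch of that kind of model-specific parameter handling is shown below, assuming a simple per-model deny-list. The UNSUPPORTED_PARAMS registry and filter_params function are hypothetical names for illustration; the PR's actual registration mechanism is not shown in this description.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical registry: parameters to strip before sending a request.
# Only the GLM-5.2 entry is grounded in the description above.
UNSUPPORTED_PARAMS: Dict[str, FrozenSet[str]] = {
    "@cf/zai-org/glm-5.2": frozenset({"top_k", "repetition_penalty"}),
}


def filter_params(model: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params without the fields the model does not support."""
    blocked = UNSUPPORTED_PARAMS.get(model, frozenset())
    return {key: value for key, value in params.items() if key not in blocked}


request_params = filter_params(
    "@cf/zai-org/glm-5.2",
    {"max_tokens": 256, "tool_choice": "auto", "top_k": 40, "repetition_penalty": 1.1},
)
# max_tokens and tool_choice survive; top_k and repetition_penalty are dropped.
```

Models without an entry in the registry pass through unchanged in this sketch, which mirrors the goal of not altering behavior for existing models.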

User Impact

Users can now instantiate:

from langchain_cloudflare.chat_models import ChatCloudflareWorkersAI

ChatCloudflareWorkersAI(
    model="@cf/moonshotai/kimi-k2.6",
    endpoint_format="openai_compatible",
)

and continue using the existing default behavior without changes:

ChatCloudflareWorkersAI(
    model="@cf/moonshotai/kimi-k2.6",
    endpoint_format="workers_ai",
)

Validation

make lint - passed (ruff check, ruff format --diff, mypy)
make test - 110 passed, 2 skipped
pre-commit run --all-files - passed
uv run pytest --disable-socket --allow-unix-socket tests/unit_tests/test_chat_models.py -q - 57 passed, 2 skipped before the CI follow-up test was added
uv run pytest tests/integration_tests/test_workersai_models.py::TestOpenAICompatibleEndpoint -v -s - 3 passed
uv run pytest tests/integration_tests/test_worker_integration.py::TestWorkerAIGateway -v -s - 3 passed
AI_GATEWAY=test-ai-gateway uv run pytest tests/integration_tests/test_workersai_models.py::TestAIGatewayHeaders -v -s - 3 passed
cd examples/workers && uv run pywrangler sync --force - passed
cd examples/workers && ./scripts/setup_pyodide_deps.sh - passed; installs the known-working LangChain 1.0 Worker stack
Representative Worker create_agent coverage on Qwen - 3 passed:
- TestWorkerAgentTools::test_agent_tools[@cf/qwen/qwen3-30b-a3b-fp8]
- TestWorkerAgentStructuredOutput::test_agent_structured_output[@cf/qwen/qwen3-30b-a3b-fp8]
- TestWorkerAgentStructuredJsonSchema::test_agent_structured_json_schema[@cf/qwen/qwen3-30b-a3b-fp8]
Full REST integration: uv run pytest tests/integration_tests/test_workersai_models.py -v -s - 128 passed, 64 skipped, 1 failed
Full Worker integration before test-ai-gateway was configured: uv run pytest tests/integration_tests/test_worker_integration.py -v -s - 158 passed, 6 skipped, 5 failed

Known Unrelated Failures

Full REST integration has the known Mistral structured-output batch failure where one result returns None.

Full Worker integration has known/environment-sensitive model-output failures:

Mistral structured-output batch returned None
gpt-oss-20b structured-output batch returned None

A targeted Worker rerun reproduced the Mistral and gpt-oss-20b structured-output batch failures while gpt-oss-120b passed, which points to model-output quality rather than this endpoint-routing change.

Collier King added 2 commits June 21, 2026 17:35

add glm 5.2 and openai endpoint mode

e2d5997

fix chat model ci conformance

2d6a114

CollierKing marked this pull request as ready for review June 21, 2026 22:49

CollierKing merged commit bd3d10d into main Jun 22, 2026
9 checks passed

This was referenced Jun 22, 2026

[codex] fix Worker create_agent vendoring #45

Merged

Expose OpenAI-compatible chat completions endpoint mode for Workers AI chat models #43

Closed

CollierKing merged 2 commits into main from feat/add-glm-5-2
