fix(codex): allow image_generation tool advertisement on backend responses path #439

Open
ihazgithub wants to merge 2 commits into Soju06:main from ihazgithub:fix/codex-responses-tools-compat

Conversation

@ihazgithub
Contributor

Summary

Current Codex clients advertise a top-level image_generation tool on /backend-api/codex/responses requests, for example:

{"type":"image_generation","output_format":"png"}

codex-lb currently routes backend Codex Responses payloads through shared request validation, which rejects that tool and surfaces:

Invalid request payload, param: "tools"

This change keeps the fix narrow and backend-Codex-specific. It strips only that top-level image_generation advertisement on the backend Codex responses path before shared validation, while leaving public /v1/* validation unchanged.
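The sanitization step can be sketched roughly as follows. This is an illustrative stand-in, not the actual codex-lb code: the helper name and exact payload shape are assumptions, but the filtering logic matches the behavior described above.

```python
from typing import Any


def strip_backend_codex_image_generation(payload: dict[str, Any]) -> dict[str, Any]:
    """Drop only top-level image_generation tool advertisements.

    Sketch only: the real codex-lb helper may differ. All other tools
    (function, custom, web_search, namespace) pass through untouched.
    """
    tools = payload.get("tools")
    if not isinstance(tools, list):
        return payload
    kept = [
        tool
        for tool in tools
        if not (isinstance(tool, dict) and tool.get("type") == "image_generation")
    ]
    if len(kept) != len(tools):
        # Return a shallow copy so the original request payload is not mutated.
        payload = {**payload, "tools": kept}
    return payload


# Example: a payload like the failing Codex request keeps its other tools.
payload = {
    "model": "gpt-5.1",
    "tools": [
        {"type": "image_generation", "output_format": "png"},
        {"type": "web_search"},
    ],
}
sanitized = strip_backend_codex_image_generation(payload)
```

Because the strip happens before shared validation, the remaining payload is still subject to the same `tools` checks as before.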

What Changed

  • Added backend-Codex-specific tool sanitization before shared Responses validation
  • Applied that sanitization to both:
    • HTTP /backend-api/codex/responses
    • websocket /backend-api/codex/responses response.create handling
  • Preserved the existing global unsupported-tool policy for public /v1/*
  • Added focused regression coverage for backend HTTP, backend websocket, and unchanged /v1/responses rejection behavior
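On the websocket path, the same sanitization has to reach the request nested inside a response.create message. A minimal sketch of that idea, with the message field names assumed for illustration rather than taken from the codex-lb source:

```python
from typing import Any


def sanitize_response_create(message: dict[str, Any]) -> dict[str, Any]:
    """Apply image_generation stripping to a websocket response.create message.

    Sketch only: the real message schema and nesting may differ.
    """
    if message.get("type") != "response.create":
        return message
    request = message.get("response", {})
    tools = request.get("tools")
    if isinstance(tools, list):
        request = {
            **request,
            "tools": [
                t
                for t in tools
                if not (isinstance(t, dict) and t.get("type") == "image_generation")
            ],
        }
    return {**message, "response": request}


msg = {
    "type": "response.create",
    "response": {
        "tools": [
            {"type": "image_generation", "output_format": "png"},
            {"type": "function", "name": "f"},
        ]
    },
}
clean = sanitize_response_create(msg)
```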

Scope

This change does not:

  • remove image_generation from the global unsupported tool list
  • loosen public /v1/responses validation
  • change public /v1/chat/completions behavior

It affects only top-level entries in the backend-Codex tools array and preserves other advertised tools such as function, custom, web_search, and namespace.

Validation

Focused coverage added or updated:

  • test_normalize_responses_request_payload_strips_backend_codex_image_generation_tools
  • test_normalize_responses_request_payload_without_codex_compat_still_rejects_image_generation
  • test_backend_responses_strip_image_generation_tool_advertisement
  • test_backend_responses_websocket_strips_image_generation_tool_advertisement
  • test_v1_responses_rejects_builtin_tools now also pins param == "tools"

Commands run:

  • .venv/bin/python -m pytest tests/unit/test_proxy_utils.py -k "normalize_responses_request_payload"
  • .venv/bin/python -m pytest tests/integration/test_openai_compat_features.py -k "test_v1_responses_rejects_builtin_tools or test_backend_responses_allows_web_search or test_backend_responses_strip_image_generation_tool_advertisement"
  • .venv/bin/python -m pytest tests/integration/test_proxy_websocket_responses.py -k "test_backend_responses_websocket_proxies_upstream_and_persists_log or test_backend_responses_websocket_strips_image_generation_tool_advertisement"
  • openspec validate --specs

I also re-validated the real captured Codex payload that originally failed; it now normalizes successfully while preserving the client’s function, custom, web_search, and namespace tools.

Contributor Author

Context / reviewer note

I wanted to add a little plain-English context for why I dug into this path.

This change came from a real compatibility problem I hit while using Codex through codex-lb, not from a theoretical cleanup. After getting past the earlier proxy-auth/locality issue, the current Codex client was still failing on /backend-api/codex/responses with:

  • Invalid request payload, param: "tools"

I captured and inspected the actual client payload and found that it validated cleanly except for one top-level tool advertisement:

  • {"type":"image_generation","output_format":"png"}

Everything else in the same payload that mattered for my workflow was still valid, including the client's function, custom, web_search, and namespace tools.

That is why this PR is intentionally narrow. Rather than changing the global unsupported-tool policy, it only strips that top-level image_generation advertisement on the backend Codex responses path before shared validation. Public /v1/* behavior remains unchanged.

So the goal here is not to broaden tool support generally. The goal is to restore backend Codex client compatibility for the real payload being sent today, while keeping the shared public validation behavior as strict as it was before.

I also validated the fix against the real captured payload after the change: it now normalizes successfully with the expected tools preserved and only the blocking top-level image_generation advertisement removed.

Contributor Author

Follow-up fix for the backend HTTP regression in the first version of this PR.

The initial compatibility change unintentionally changed /backend-api/codex/responses validation semantics on the HTTP path. I switched that route from direct ResponsesRequest binding to raw-payload normalization so it could strip the top-level Codex image_generation tool advertisement before validation, but I passed openai_compat=True there.

That caused payloads with missing required fields like instructions to normalize through V1ResponsesRequest instead of failing the way the backend HTTP route did before. In practice, requests like {"model":"gpt-5.1","input":[]} started returning 200 instead of 400.

This follow-up keeps the narrow image_generation compatibility fix, but restores the previous strict required-field behavior on /backend-api/codex/responses by validating the sanitized HTTP payload with ResponsesRequest semantics again. Websocket behavior and public /v1/* behavior remain as intended.
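The regression can be illustrated with simplified stand-ins for the two validation modes. The field names below follow the PR description, but the validators are plain-Python sketches, not the real ResponsesRequest/V1ResponsesRequest models, and the exact required-field set is assumed for illustration:

```python
from typing import Any


def validate_strict(payload: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for strict ResponsesRequest binding on the backend HTTP path."""
    missing = [f for f in ("model", "input", "instructions") if f not in payload]
    if missing:
        # The backend route should keep rejecting these payloads with a 400.
        raise ValueError(f"missing required fields: {missing}")
    return payload


def validate_lenient(payload: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for the V1ResponsesRequest path that openai_compat=True hit:
    it fills gaps instead of failing, which is what produced the 200s."""
    return {"instructions": None, **payload}


# The payload class that wrongly started returning 200 in the first version.
bad = {"model": "gpt-5.1", "input": []}
try:
    validate_strict(bad)
    strict_ok = True
except ValueError:
    strict_ok = False
```

The follow-up amounts to running the sanitized HTTP payload through the strict path again, so `bad` fails as it did before the PR.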

Contributor Author

Local validation update:

  • Built and ran this branch locally as codex-lb:local-codex-fixes
  • Reused the existing codex-lb-data volume and the same bootstrap/proxy CIDR env config
  • Verified /health on 2455 returns {"status":"ok"}
  • Confirmed the current Codex desktop app now works end-to-end against the local Docker build
  • Logs show successful POST /backend-api/codex/responses requests and accepted websocket connections after the fix

One small ops note: the image still exposes port 1455, but the current container entrypoint only starts the FastAPI app on 2455, which matches the README examples and the runtime logs. The 1455 port behavior therefore appears unrelated to the Codex compatibility fix itself.
