Memory leak in parse_response usage of pydantic #3068

Open
avilaton wants to merge 1 commit into openai:main from avilaton:fix/responses-parse-memory-leak

avilaton commented Apr 7, 2026

  • I understand that this repository is auto-generated and my pull request may not be merged

Changes being requested

Address a memory leak in parse_response, detected when using the OpenAI client in a web server context (gunicorn, gevent).

Additional context & links

Problem

client.responses.parse() can trigger sustained memory growth on pydantic >= 2.11 (see #1181).

The issue is that parse_response() used subscripted runtime generic aliases (for example ParsedResponse[T]) when calling construct_type_unchecked(). In pydantic v2, this can cause repeated runtime generic specialization/schema work in a hot path.
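
To make the distinction between a subscripted alias and the bare class concrete, here is a minimal sketch using only the standard library's typing module. It is an analogy, not pydantic's implementation: `Box` and `T` are hypothetical names, and pydantic's generic models do additional work on subscription that plain `typing.Generic` does not. The sketch only shows that a subscripted generic such as `Box[T]` is a separate runtime object from the class `Box`.

```python
from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar("T")  # hypothetical type variable, standing in for TextFormatT


class Box(Generic[T]):
    """Hypothetical generic class, standing in for ParsedResponse."""

    def __init__(self, value: T) -> None:
        self.value = value


# Subscripting produces an alias object at runtime, not the class itself.
alias = Box[T]

print(alias is Box)              # is the alias the same object as the class?
print(get_origin(alias) is Box)  # the class is recoverable as the alias's origin
print(get_args(alias) == (T,))   # the alias carries the type argument
```

Any code that receives the alias rather than the class has to resolve or specialize it before use, which is the kind of per-call work the fix below tries to avoid.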

Fix

Use the non-subscripted runtime classes in parse_response() when calling construct_type_unchecked():

  • ParsedResponseOutputText[TextFormatT] → ParsedResponseOutputText
  • ParsedResponseOutputMessage[TextFormatT] → ParsedResponseOutputMessage
  • ParsedResponse[TextFormatT] → ParsedResponse

Why this is safe

construct_type_unchecked() constructs models loosely and does not require runtime generic specialization for correctness here. parse_text() still produces the typed parsed payload, and the return type for callers remains ParsedResponse[TextFormatT].
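
The safety argument can be sketched with a toy example. Everything here is an assumption for illustration: `Parsed`, `construct_loosely`, and `parse` are hypothetical stand-ins, not the library's classes or helpers, and the loose constructor is far simpler than the real one. The point is only that a constructor which skips validation builds the same object whether it is handed the bare class or a subscripted alias, while `cast` keeps the generic return type visible to callers.

```python
from typing import Any, Dict, Generic, Type, TypeVar, cast, get_origin

T = TypeVar("T")


class Parsed(Generic[T]):
    """Hypothetical generic container, standing in for ParsedResponse."""

    text: str
    parsed: Any


def construct_loosely(type_: Any, data: Dict[str, Any]) -> Any:
    """Hypothetical loose constructor: no validation, type arguments unused."""
    # Resolve a subscripted alias back to its runtime class; a bare class passes through.
    cls = get_origin(type_) or type_
    obj = cls.__new__(cls)
    obj.__dict__.update(data)
    return obj


def parse(data: Dict[str, Any], text_format: Type[T]) -> "Parsed[T]":
    # text_format only informs the static return type in this sketch.
    # The bare class is passed at runtime; the cast restores the generic type.
    return cast("Parsed[T]", construct_loosely(Parsed, data))


payload = {"text": "{}", "parsed": {}}
from_bare = construct_loosely(Parsed, payload)
from_alias = construct_loosely(Parsed[int], payload)

# Both routes yield instances of the same runtime class with the same fields.
print(type(from_bare) is type(from_alias) is Parsed)
print(from_bare.__dict__ == from_alias.__dict__)
```

`typing.cast` does nothing at runtime, so the only thing that changes for callers in this sketch is which object is handed to the constructor.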

Tests

  • Added a regression test in tests/lib/responses/test_parsing.py that fails if parse_response() routes through _validate_non_model_type.
  • Added an opt-in memory characterization test (OPENAI_RUN_MEMORY_TESTS=1) in tests/lib/responses/test_parsing.py.
  • Ran responses suites against the mock server: 158 passed.
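
For readers who want to reproduce the measurement, an opt-in memory characterization check could look roughly like the sketch below. This is not the test added in this PR: `memory_growth_bytes`, the placeholder workload, and the 1 MB threshold are all assumptions chosen for illustration, and only the `OPENAI_RUN_MEMORY_TESTS` variable name comes from the description above. It uses the standard library's `tracemalloc`, which tracks Python-level allocations only.

```python
import os
import tracemalloc
from typing import Callable


def memory_growth_bytes(fn: Callable[[], object], warmup: int = 50, iterations: int = 500) -> int:
    """Return the change in traced Python allocations across repeated calls to fn."""
    for _ in range(warmup):  # let caches and lazy initialization settle first
        fn()
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            fn()
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


def test_no_sustained_growth() -> None:
    # Opt-in gate, using the environment variable named in the PR description.
    if os.environ.get("OPENAI_RUN_MEMORY_TESTS") != "1":
        return  # a real suite would use its test framework's skip mechanism instead
    # Placeholder workload; a real test would call parse_response() here.
    growth = memory_growth_bytes(lambda: {"output": [{"text": "{}"}]})
    assert growth < 1_000_000, f"unexpected growth: {growth} bytes"
```

A warmup phase matters because one-time caching is expected; the signal of a leak is growth that continues after the caches are populated.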

@avilaton avilaton requested a review from a team as a code owner April 7, 2026 18:51
@avilaton avilaton changed the title from "fix(responses): avoid runtime generic specialization in parse_response" to "Memory leak in parse_response usage of pydantic" Apr 7, 2026
@savvasp-123

This is a real issue

afurm commented Apr 13, 2026

cast(Any, ParsedResponseOutputText) bypasses the type checker entirely, making it impossible to catch accidental type mismatches in refactors. A safer pattern would be to define non-generic aliases at the module level — e.g., _ParsedResponseOutputTextBase = ParsedResponseOutputText.__pydantic_generic_metadata__['args'][0] — and use those directly. This preserves type-checker coverage while avoiding the subscripted generic path that triggers the pydantic overhead. Is there a reason cast(Any, ...) was preferred over a named type alias here?

@avilaton
Author

> cast(Any, ParsedResponseOutputText) bypasses the type checker entirely [...] Is there a reason cast(Any, ...) was preferred over a named type alias here?

No preference. We are just trying to contain a memory leak that caused us a lot of headaches, which are much worse than any type-checking error we might have wanted to avoid. Any solution to the problem works for us; we are only consumers of the library and wanted to raise the issue. Thanks for replying. We did see at least 3 memory-related issues, and this one made it to production for us before we noticed the OOMKilled errors on our pods.

savvasp-123 commented Apr 13, 2026

I tried this PR and it didn't seem to work for me. Perhaps I did something incorrectly, but there are definitely memory leak issues with the async Responses .parse().
