[codex] Make seeded tool IDs deterministic#115

Open
audreyt wants to merge 1 commit into antirez:main from audreyt:codex/upstream-deterministic-tool-ids

@audreyt audreyt commented May 13, 2026

Summary

Seeded requests were still not fully reproducible when the model emitted tool calls, because ds4-server generated missing OpenAI/Anthropic tool call IDs with random bytes after decoding. That meant otherwise identical seeded runs could differ in call_... / toolu_... identifiers, making decision traces noisy to replay and diff.

This PR changes how missing tool IDs are generated: requests with a positive seed now derive deterministic IDs from the request context, API style, tool index, and tool name. Unseeded requests keep the existing random ID path. OpenAI live tool streaming uses the same deterministic ID path, so the ID in the streamed tool-call start delta matches the ID in the final parsed tool call.
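
As an illustration only, the derivation described above could look like the following Python sketch. It is not the code in this PR. The function name `tool_call_id`, the SHA-256 hash, the 24-character suffix, and the string form of the request context are all assumptions made for the example. Whether the seed itself feeds the hash is not stated here, so the sketch uses it only as a switch between the two paths.

```python
import hashlib
import secrets

# Hypothetical prefixes, chosen to match the call_... / toolu_... shapes above.
PREFIXES = {"openai": "call_", "anthropic": "toolu_"}


def tool_call_id(seed, request_context, api_style, tool_index, tool_name):
    """Return a tool-call ID: deterministic when seed > 0, random otherwise."""
    prefix = PREFIXES[api_style]
    if seed <= 0:
        # Unseeded path: IDs stay random.
        return prefix + secrets.token_hex(12)

    h = hashlib.sha256()
    for field in (request_context, api_style, str(tool_index), tool_name):
        data = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    # Truncate the hex digest to keep the ID short (length is an assumption).
    return prefix + h.hexdigest()[:24]
```

Because the ID depends only on its inputs, a streaming path and a final parsing path that call such a function with the same arguments would agree without sharing any state.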

Impact

  • Positive-seed runs produce stable tool-call IDs across repeat executions.
  • Unseeded (time-based) direct use of ds4-server keeps random IDs.
  • IDs already supplied by the model or client are still preserved.
  • Reproducible local wrappers can now make answers, tool calls, and tool results easier to audit and compare.
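
The three ID behaviours listed above can be sketched as one fill-in pass over parsed tool calls. This is a hedged Python illustration, not the server's code: `fill_missing_ids`, the dict shape with `name` and optional `id` keys, the `call_` prefix, and the hash construction are hypothetical.

```python
import hashlib
import secrets


def fill_missing_ids(tool_calls, seed, request_context):
    """Return copies of tool_calls in which every entry has an 'id'.

    tool_calls is a list of dicts with a 'name' key and an optional 'id' key
    (a hypothetical shape chosen for this sketch).
    """
    filled = []
    for index, call in enumerate(tool_calls):
        call = dict(call)  # copy so the caller's data is not mutated
        if not call.get("id"):
            if seed > 0:
                # Seeded: derive the ID from stable inputs only.
                material = f"{request_context}\x00{index}\x00{call['name']}"
                digest = hashlib.sha256(material.encode("utf-8")).hexdigest()
                call["id"] = "call_" + digest[:24]
            else:
                # Unseeded: keep a random ID.
                call["id"] = "call_" + secrets.token_hex(12)
        # An ID that was already present is left untouched.
        filled.append(call)
    return filled
```

Running such a pass twice with the same positive seed and context yields identical lists, which is the property that makes repeat traces diff cleanly.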

Validation

Passing locally on Apple M5 Max:

  • ./ds4_test --server
  • ./ds4_test --tool-call-quality

I also ran the full make test against this upstream-based branch. It currently fails in my checkout because of pre-existing long-context/logprob vector expectation mismatches with the local model/fixtures, while tool-call-quality, metal-kernels, and server pass. The new deterministic-ID test is part of the --server group above.

@audreyt audreyt marked this pull request as ready for review May 13, 2026 14:08