From 9da6623af76f41968cc0525109e45084c81a1ab6 Mon Sep 17 00:00:00 2001
From: OmarAlJarrah
Date: Mon, 15 Jun 2026 22:09:59 +0300
Subject: [PATCH 1/2] test: pin and document that a body-less non-idempotent request is not retried
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The retry-safety gate keys body-less requests off method idempotency
rather than off the absence of a body: with no body the request is
retried only when its method is idempotent, and with a body only when
the body is replayable. A consequence that was easy to miss is that a
bare POST (a body-less, non-idempotent request, e.g. a
trigger/activate-style endpoint) is NOT retried even though it carries
no payload to re-send: replaying it could duplicate a side effect the
server may already have applied.

This behavior was intended but undocumented and untested. Make it
explicit:

- Spell out the body-less branch in the KDoc of DefaultRetryStep
  (isRetrySafe and the "Body replayability" section) and the
  recovery-aware RetryStep (canRetry), stating that body-less retry
  safety keys off method idempotency.
- Tighten the retry idempotency note in docs/pipelines.md to match the
  actual per-axis gate and call out the bare-POST case.
- Add regression tests: a body-less POST against a retryable 503
  results in exactly one attempt (no retry), with a body-less PUT
  control proving the same branch retries an idempotent method.

KDoc, docs, and tests only; no public API or behavior change.
--- docs/pipelines.md | 11 ++-- .../http/pipeline/steps/DefaultRetryStep.kt | 36 ++++++++----- .../sdk/core/pipeline/step/retry/RetryStep.kt | 8 +-- .../core/http/pipeline/steps/RetryStepTest.kt | 53 +++++++++++++++++++ 4 files changed, 88 insertions(+), 20 deletions(-) diff --git a/docs/pipelines.md b/docs/pipelines.md index 4130dc23..36f220a0 100644 --- a/docs/pipelines.md +++ b/docs/pipelines.md @@ -361,9 +361,14 @@ It retries only when the outcome is a `Failure` whose throwable is classified re `RetrySettings.retryableStatuses`. - A `NetworkException` (a transport failure with no response on the wire — always retryable). -Idempotency is enforced independently of classification: a request is eligible only when its -method is in `RetrySettings.retryableMethods` **or** its body is replayable. Non-idempotent -methods (`POST`/`PATCH`) with a non-replayable body are never re-sent. +Idempotency is enforced independently of classification, keyed off whether the request carries +a body. A request **with a body** is eligible only when its body is replayable (a non-replayable +body cannot be re-sent — the second `writeTo` would trip its consume-once guard). A request +**with no body** is eligible only when its method is in `RetrySettings.retryableMethods`. +Body-less retry safety keys off method idempotency, not off the absence of a body: a body-less +non-idempotent request — a bare `POST`/`PATCH` to a trigger / activate-style endpoint — is +therefore never re-sent, even though there is no payload to replay, because the server may +already have applied the side effect. Waits between attempts use a `ScheduledExecutorService` plus `CompletableFuture.get`, never `Thread.sleep`, so virtual-thread carriers can unmount during the delay. 
An interrupt restores diff --git a/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/DefaultRetryStep.kt b/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/DefaultRetryStep.kt index 9ecb37c3..380436cf 100644 --- a/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/DefaultRetryStep.kt +++ b/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/DefaultRetryStep.kt @@ -46,15 +46,20 @@ import java.util.concurrent.ThreadLocalRandom * * ## Body replayability * - * Eligibility is gated on re-sendability. A body-less request is retried only when its method - * is idempotent; a body-bearing request is retried only when its body is replayable — a - * non-replayable body physically cannot be re-sent (the second `writeTo` trips the body's - * consume-once guard and surfaces as a confusing wrapped [IllegalStateException] that masks the - * real failure). A replayable body is re-sendable regardless of method; making the re-sent - * request idempotent (for a non-idempotent method, e.g. via an idempotency key) is the caller's - * responsibility. When the request is not re-sendable the loop runs exactly one attempt and - * returns the response (or rethrows the exception) as-is. This mirrors - * `pipeline.step.retry.RetryStep.canRetry`. + * Eligibility is gated on re-sendability, keyed off whether the request carries a body: + * - **No body** — retried only when the method is idempotent ([IDEMPOTENT_METHODS]). Body-less + * retry safety keys off method idempotency, NOT off the absence of a body, so a body-less + * non-idempotent request — e.g. a bare `POST` to a trigger / activate-style endpoint — is NOT + * retried even though there is no payload to re-send: replaying it could duplicate the side + * effect the server may already have applied. + * - **Has a body** — retried only when the body is replayable. 
A non-replayable body physically + * cannot be re-sent (the second `writeTo` trips the body's consume-once guard and surfaces as + * a confusing wrapped [IllegalStateException] that masks the real failure). A replayable body + * is re-sendable regardless of method; making the re-sent request idempotent (for a + * non-idempotent method, e.g. via an idempotency key) is the caller's responsibility. + * + * When the request is not re-sendable the loop runs exactly one attempt and returns the response + * (or rethrows the exception) as-is. This mirrors `pipeline.step.retry.RetryStep.canRetry`. * * ## Delay precedence (highest to lowest) * @@ -246,11 +251,14 @@ public open class DefaultRetryStep /** * Returns `true` when [request] may be re-sent. A body-less request is retry-safe only - * when its method is idempotent; a body-bearing request is retry-safe only when its body - * is replayable — a non-replayable body cannot be re-sent (the second - * `RequestBody.writeTo` trips the body's consume-once guard and surfaces as a confusing - * wrapped [IllegalStateException]). Making a re-sent body-bearing request idempotent is - * the caller's responsibility. Mirrors `pipeline.step.retry.RetryStep.canRetry`. + * when its method is idempotent ([IDEMPOTENT_METHODS]) — the gate keys off method + * idempotency, not off the absence of a body, so a body-less non-idempotent request (a + * bare `POST`) is NOT retry-safe even though there is no payload to re-send. A body-bearing + * request is retry-safe only when its body is replayable — a non-replayable body cannot be + * re-sent (the second `RequestBody.writeTo` trips the body's consume-once guard and + * surfaces as a confusing wrapped [IllegalStateException]). Making a re-sent body-bearing + * request idempotent is the caller's responsibility. Mirrors + * `pipeline.step.retry.RetryStep.canRetry`. 
*/ private fun isRetrySafe(request: Request): Boolean { val body = request.body ?: return request.method in IDEMPOTENT_METHODS diff --git a/sdk-core/src/main/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryStep.kt b/sdk-core/src/main/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryStep.kt index 8d50c8ec..3161785e 100644 --- a/sdk-core/src/main/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryStep.kt +++ b/sdk-core/src/main/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryStep.kt @@ -344,9 +344,11 @@ public class RetryStep /** * Returns true when [request] is safe to retry. A body-less request is retry-safe only - * when its method is in [RetrySettings.retryableMethods] (idempotent); a body-bearing - * request is retry-safe only when its body is replayable — a non-replayable body cannot - * be re-sent (the second `writeTo` trips the consume-once guard). Ensuring a re-sent + * when its method is in [RetrySettings.retryableMethods] (idempotent) — the gate keys off + * method idempotency, not off the absence of a body, so a body-less non-idempotent request + * (a bare `POST`) is NOT retried even though there is no payload to re-send. A body-bearing + * request is retry-safe only when its body is replayable — a non-replayable body cannot be + * re-sent (the second `writeTo` trips the consume-once guard). Ensuring a re-sent * body-bearing request is idempotent is the caller's responsibility. 
*/ private fun canRetry(request: Request): Boolean { diff --git a/sdk-core/src/test/kotlin/org/dexpace/sdk/core/http/pipeline/steps/RetryStepTest.kt b/sdk-core/src/test/kotlin/org/dexpace/sdk/core/http/pipeline/steps/RetryStepTest.kt index 12fd9b1b..ac8b22c3 100644 --- a/sdk-core/src/test/kotlin/org/dexpace/sdk/core/http/pipeline/steps/RetryStepTest.kt +++ b/sdk-core/src/test/kotlin/org/dexpace/sdk/core/http/pipeline/steps/RetryStepTest.kt @@ -685,6 +685,59 @@ class RetryStepTest { assertEquals(2, fake.callCount, "replayable POST must retry") } + @Test + fun `body-less POST is NOT retried on a retryable response`() { + // Body-less retry safety keys off METHOD idempotency, not off the absence of a body. A + // bare POST (no payload to re-send) is non-idempotent, so it must NOT be retried even on a + // retryable status — a second POST could duplicate a side effect the server already + // applied. The 503 is returned as-is after exactly one attempt. This exercises the + // body == null branch of isRetrySafe on a non-idempotent method. + val fake = + FakeHttpClient() + .enqueue { status(503) } + .enqueue { status(200) } // must never be reached + + val pipeline = + HttpPipelineBuilder(fake) + .append(DefaultRetryStep(HttpRetryOptions(maxRetries = 3), zeroDelayClock())) + .build() + + val request = + Request.builder() + .method(Method.POST) + .url("https://api.example.com/x") + .build() + + val response = pipeline.send(request) + assertEquals(503, response.status.code) + assertEquals(1, fake.callCount, "body-less POST must not be retried — POST is non-idempotent") + } + + @Test + fun `body-less PUT IS retried because PUT is idempotent`() { + // Control for the body == null branch: with no body the gate falls through to method + // idempotency. PUT is idempotent, so a body-less PUT is retry-safe and retries normally. 
+ val fake = + FakeHttpClient() + .enqueue { status(503) } + .enqueue { status(200) } + + val pipeline = + HttpPipelineBuilder(fake) + .append(DefaultRetryStep(HttpRetryOptions(maxRetries = 3), zeroDelayClock())) + .build() + + val request = + Request.builder() + .method(Method.PUT) + .url("https://api.example.com/x") + .build() + + val response = pipeline.send(request) + assertEquals(200, response.status.code) + assertEquals(2, fake.callCount, "body-less PUT must retry — PUT is idempotent") + } + @Test fun `non-replayable PUT body is NOT retried even though PUT is idempotent`() { // Both gates are required: PUT is idempotent, but a non-replayable body physically From 6d0ac91e1f47884954d04223a692c99fa3fd5b30 Mon Sep 17 00:00:00 2001 From: OmarAlJarrah Date: Tue, 16 Jun 2026 21:23:51 +0300 Subject: [PATCH 2/2] test: mirror body-less retry-eligibility cases in the recovery-pipeline RetryStep suite The http.pipeline DefaultRetryStep suite pins that a body-less request's retry eligibility keys off method idempotency (bare POST not retried, body-less PUT retried). The recovery-oriented pipeline.step.retry.RetryStep gate has the same rule but only covered body-bearing requests (non-replayable POST/PUT). Add the two body-less cases there so both gates carry the regression, and cross-reference the mirrored cases in both suites. --- AUDIT.md | 149 ++ docs/openai-java-deep-dive.md | 1514 +++++++++++++++++ .../core/http/pipeline/steps/RetryStepTest.kt | 4 +- .../core/pipeline/step/retry/RetryStepTest.kt | 45 + 4 files changed, 1711 insertions(+), 1 deletion(-) create mode 100644 AUDIT.md create mode 100644 docs/openai-java-deep-dive.md diff --git a/AUDIT.md b/AUDIT.md new file mode 100644 index 00000000..75fe2d5d --- /dev/null +++ b/AUDIT.md @@ -0,0 +1,149 @@ +# Codebase Audit v2 — dexpace java-sdk (deep pass) + +Audit v2 date: 2026-06-10. 
Audited at HEAD **`1f233bd`** ("fix: correctness and resource-safety fixes across the HTTP stack"), which landed in response to audit v1 (run at `a78a693`). This pass goes deeper into every v1 item: it verifies each fix's mechanism line-by-line, hunts for fix-introduced bugs, re-traces the residual paths, and reviews the new tests for coverage of each claim. No code was modified by the audit. + +**Verification method:** static deep-trace of every changed file plus the surrounding call graph; review of all 16 new/extended test files for coverage of each fix claim. The test suite was **not executed** in this environment (sandbox has JDK 11 only, no Gradle distribution/dependency cache; the build needs provisioned 8/11/21 toolchains). Where a claim would need a runtime check, it is marked Needs-verification. + +**Headline:** all 18 v1 dispositions verified against the diff — 15 fully fixed, M3 mitigated-as-designed, M5 half-fixed (values yes, names no), M7 resolved-as-documented-limitation. The deep pass found **8 new findings (V2-1 … V2-8)**, all in or adjacent to the fix commit itself; two are Medium. L6′ (the stray-token challenge parse) remains open by design. + +--- + +## 1. System model (unchanged from v1, abbreviated) + +HTTP-client toolkit: `sdk-core` owns immutable models, the `Io.installProvider` seam, two pipeline layers (stage-based `http.pipeline`, recovery-based `pipeline`); transports (OkHttp/JDK) and async adapters plug in via SPIs. Invariants: consume-once bodies unless replayable; response bodies must be closed; blocking calls honor interrupts (`InterruptedIOException`); retries only re-send replayable bodies; body logging is now **bounded** capture (post-fix). + +--- + +## 2. 
Item-by-item deep verification of v1 findings at `1f233bd` + +### H1 — unbounded body-logging capture / streaming hang → **FIXED, with residuals V2-1, V2-4, V2-5, V2-8** +- **Fix mechanism (verified):** `LoggableResponseBody` rewritten around a `maxCaptureBytes` bounded drain (`drainAndCache`, `LoggableResponseBody.kt:231-281`): within-cap bodies are fully captured + delegate closed (repeatable `peek()` views, old behavior); over-cap bodies retain the live delegate as `liveTail` and `source()` returns a one-shot `PrefixThenTailSource` (prefix replay + live tail, `:290-311`) guarded single-use by a CAS (`tailHandedOut`, `:142-147`). `TeeSink` gained a `tapLimit` budget (`mirrorPrefix`, `TeeSink.kt`) mirrored across the typed-write, `drainScratch`, and `outputStream()` paths — the full payload always reaches the primary sink. Both instrumentation steps now construct `bounded(...)` wrappers at `bodyPreviewMaxBytes` (`DefaultInstrumentationStep.kt:151-156`, `DefaultAsyncInstrumentationStep.kt:241-256`), and the async step **skips wrapping entirely for `contentLength() < 0`** so the completion thread can never block on a streaming producer (`:249`). Public constructors keep the unbounded default — no API break. `HttpInstrumentationOptions` and the body-logging doc were rewritten to match. +- **Test coverage:** `LoggableResponseBodyTest` (+73 lines: over-cap streams full body, prefix repeatability, partial-failure), `TeeSinkTest` (+87: budget across all three write paths), `InstrumentationStepTest`/`AsyncInstrumentationStepTest` (+77: unknown-length skip). +- **Residuals:** the **sync** step has no unknown-length skip (→ **V2-1**); the new drain loop lacks a `read == 0` guard (→ **V2-4**); over-cap consumption can double-close the delegate source (→ **V2-5**); log-field semantics changed silently (→ **V2-8**). 
Boundary note: a body exactly equal to the cap exits the loop without observing EOF, so it takes the over-cap one-shot path where it could have stayed repeatable — correct output, minor behavioral wart. + +### H2 — retry re-sends one-shot bodies on idempotent methods → **FIXED, verified** +- Both gates now read `body == null → method ∈ idempotent-set; body != null → body.isReplayable()` (`RetryStep.kt` `canRetry`, `DefaultRetryStep.kt` `isRetrySafe` — single-line change each, same shape). Deep-checked the semantic corners: PUT+one-shot no longer retried (the bug); PUT/POST+replayable retried (unchanged, with the "caller owns idempotency-key" caveat now documented); body-less POST not retried; `NetworkException` retries also flow through the gate, correctly refusing physically-unsendable bodies even when the server never saw the request. `RetryStepTest` files grew +163/+36 lines including the exact "idempotent method + one-shot body + retryable status" scenario v1 flagged as untested. + +### H3 — throwing response/recovery step leaks the open response → **FIXED; one contract gap remains** +- `ResponsePipeline.applyResponseSteps` captures the in-hand response before the step runs and `closeQuietly(inResponse, t)` on throw, attaching close failures as suppressed; `invokeRecovery` closes a `Success`'s response when the recovery step throws. Verified close idempotency makes the throw-path close safe even if the step already closed. +- **Remaining gap (carried, documented here):** a recovery step that *returns* a `Failure` for a `Success` input (transform, not throw) still drops the response unclosed — that is necessarily the step author's responsibility (the framework can't know whether the step transferred ownership), but the `ResponseRecoveryStep` KDoc does not yet say so. One-line doc fix. 
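
The close-on-throw shape verified under H3 can be illustrated with a minimal sketch. This is not the SDK's `ResponsePipeline`: `FakeResponse` and `applyStep` are hypothetical names, and `closeQuietly` here is a stand-in written for this example that only mirrors the described behavior (close the in-hand response when a step throws, and attach any close failure as suppressed instead of letting it replace the original error).

```kotlin
import java.io.Closeable

// Hypothetical stand-in for an open response that must be closed.
class FakeResponse(private val failOnClose: Boolean = false) : Closeable {
    var closed = false
        private set

    override fun close() {
        closed = true
        if (failOnClose) throw IllegalStateException("close failed")
    }
}

// Closes [resource]; a close failure is attached to [primary] as a suppressed
// exception so the step's original failure stays the one that propagates.
fun closeQuietly(resource: Closeable, primary: Throwable) {
    try {
        resource.close()
    } catch (closeFailure: Throwable) {
        primary.addSuppressed(closeFailure)
    }
}

// Runs [step] on the in-hand response; if the step throws, the response is
// closed before the throwable is rethrown.
fun <R : Closeable> applyStep(inResponse: R, step: (R) -> R): R =
    try {
        step(inResponse)
    } catch (t: Throwable) {
        closeQuietly(inResponse, t)
        throw t
    }

fun main() {
    val response = FakeResponse(failOnClose = true)
    val thrown = runCatching {
        applyStep(response) { throw IllegalArgumentException("step failed") }
    }.exceptionOrNull()
    println("closed=${response.closed}, error=${thrown?.message}, suppressed=${thrown?.suppressed?.size}")
}
```

Note that the sketch covers only the throw path. A step that returns a different outcome without throwing is not observed by `applyStep`, which is the ownership gap the H3 entry asks the KDoc to document.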
+ +### H4 — JDK streaming publisher: shared pipe + eager writer → **FIXED for replayable bodies; fix introduced V2-2 and V2-3** +- **Fix mechanism (verified):** `streamingPublisher` (`BodyPublishers.kt:164-184`) now returns `ofInputStream { newSubscriptionStream(replayable) }` — a **fresh pipe pair + fresh writer task per supplier invocation** (`:201-227`), so the JDK's per-subscription supplier calls (407 proxy-auth retry, H2 GOAWAY replay) each get a complete body. The returned stream is a `KillSwitchInputStream` (`:255-286`) whose `close()` cancels the writer `Future` with interrupt **before** closing both pipe ends — the cancel is what unblocks a writer parked in `PipedOutputStream.write` (closing `pipeOut` alone would not). The writer task restores the interrupt flag per repo convention. Tests: `streamingPublisherSurvivesResubscription`, `abandonedStreamingPublisherDoesNotStrandWriter`, `streamingBodyRoundTripsThroughTransport` (`JdkHttpTransportTest.kt:307-371`) cover the two v1 failure modes directly. +- **But:** to make every subscription re-readable, a non-replayable body is first coerced via `toReplayable()` — i.e. fully buffered in memory (→ **V2-2**), and the `IOException` fallback path is itself buggy (→ **V2-3**). + +### H5 — uninterruptible `join()` in blocking bridges → **FIXED, verified** +- Both `asBlocking()` (`AsyncHttpClient.kt`) and `toBlocking()` (`AsyncPipelineBridges.kt`) now hold the future, block in `get()`, and on `InterruptedException`: restore the flag, `future.cancel(true)`, throw `InterruptedIOException` with cause — the exact `RetryStep.awaitDelay` pattern v1 recommended. `ExecutionException` is unwrapped via `Futures.unwrap`. `CancellationException` still propagates raw (same as pre-fix `join()`; acceptable). Tests in `AsyncHttpClientTest`/`AsyncHttpPipelineTest` (+43/+35). 
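
As an illustration of the interrupt-honoring bridge described under H5, the following is a minimal sketch built only on `java.util.concurrent`. The helper name `awaitInterruptibly` is hypothetical, and the `ExecutionException` unwrapping is inlined here as an assumption about what `Futures.unwrap` does; the SDK's real entry points are `asBlocking()` and `toBlocking()`.

```kotlin
import java.io.InterruptedIOException
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ExecutionException

// Hypothetical helper: blocks on the future with get() (interruptible) instead
// of join() (uninterruptible).
fun <T> awaitInterruptibly(future: CompletableFuture<T>): T =
    try {
        future.get()
    } catch (e: InterruptedException) {
        // get() cleared the interrupt flag when it threw; restore it for callers.
        Thread.currentThread().interrupt()
        // Cancel the in-flight work so it does not outlive the abandoned wait.
        future.cancel(true)
        throw InterruptedIOException("interrupted while awaiting the response")
            .apply { initCause(e) }
    } catch (e: ExecutionException) {
        // Surface the underlying failure rather than the wrapper.
        throw e.cause ?: e
    }

fun main() {
    println(awaitInterruptibly(CompletableFuture.completedFuture("ok")))

    val pending = CompletableFuture<String>()
    Thread.currentThread().interrupt() // simulate an interrupt arriving during the wait
    try {
        awaitInterruptibly(pending)
    } catch (e: InterruptedIOException) {
        // Thread.interrupted() also clears the flag again for the rest of main.
        println("cancelled=${pending.isCancelled}, flagRestored=${Thread.interrupted()}")
    }
}
```

In this sketch a `CancellationException` from `get()` is not caught, so it propagates raw, matching the behavior the entry above describes as acceptable.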
+ +### M1 — `java.net.URL` equals/hashCode DNS → **FIXED, verified** +- `Request` keeps `data class` (so `copy`/`componentN`/`toString` are binary-stable) but overrides `equals`/`hashCode` to compare `url.toExternalForm()` textually (`Request.kt:52-…`). No network I/O, no virtual-host conflation; transitively de-fangs `Response`/`ResponseOutcome` equality. Two accepted consequences, both documented: textual comparison is *stricter* (`http://a` ≠ `http://a/`), and `body` still compares by identity (as it did pre-fix). 91 lines of `RequestTest` added. The Phase-2 `URL → URI` migration remains the better long-term shape but is no longer urgent. + +### M2 — `PagedIterable` leaks the in-flight page on abandonment → **FIXED, verified** +- The iterator now closes each page **immediately after** taking `next.value.iterator()` (the v1-recommended cheap fix — items are a fully-materialized list, so they survive the close). The `currentPage` tracking field is gone entirely; there is no window in which an open page waits on a consumer pull. `PagedIterableTest` +88 lines including partial-consume (`first()`/`take(n)`/short-circuit stream) scenarios. Minor note: `next.close()` failure now aborts iteration even though the items were already in hand — `closeQuietly` + log would be friendlier, but failing loud on a close error is defensible. +- `byPage()` direct callers still own page lifecycle (documented; unchanged by design). + +### M3 — unbounded `ContextStore` → **MITIGATED as designed (latent hazard, bounded backstop)** +- `drainToCap()` after every insert bounds the map at `MAX_TRACKED_CONTEXTS = 4096`, mirroring the Digest nonce-counter pattern v1 pointed at; the KDoc now states the close contract, the pin-the-graph hazard, why eviction is an arbitrary victim, and why `WeakReference` was rejected (the store is the only strong reference on the live path — a weak ref could collect an in-flight context). 
Deep-checked the drain loop for the convergence property under concurrent inserts: each insert drains until under cap, so the map cannot ratchet upward. Honest residual, stated in the KDoc itself: under heavy leak pressure a *live* call's entry can be evicted (arbitrary victim). Acceptable for a backstop. `ContextStoreTest` +28. + +### M4 — unbounded request-side logging tap → **FIXED, verified** (same mechanism as H1: `TeeSink` tap budget + `LoggableRequestBody.bounded`, cap preserved across `toReplayable` rewraps; `LoggableRequestBodyTest` +40.) + +### M5 — transport-divergent illegal header handling → **HALF-FIXED → residual V2-6** +- Header **values** are now validated at the model layer: `Headers.Builder.add/set` (all four String/typed overloads, `Headers.kt:163-249`) reject `\r`/`\n` with a clear splitting-vector message; the policy choice (CR/LF only, not OkHttp's printable-ASCII rule) is reasoned in the KDoc and is the right transport-agnostic contract. `HeadersTest` +100. `addAll(Headers)` bypasses validation but only accepts already-validated built instances — closed loop, verified. +- Header **names** remain unvalidated (→ **V2-6**), and the new transport-adapter comments overstate the guarantee. + +### M6 — predicate/delay-override throw leaks the retryable response → **FIXED, verified** (`DefaultRetryStep.decideRetryResponse` wraps predicate + `computeResponseDelay` in try/catch, `closeQuietly(response)` before rethrow; close idempotency makes the double-close-on-happy-path concern moot. Tests cover predicate-throw and `Error`-passthrough paths.) 
+ +### M7 — `ProxyOptions.challengeHandler` silently ignored → **RESOLVED as documented limitation; new doc nit V2-7** +- v1 offered "wire it or fail loudly + fix the docs"; the fix chose the latter: `ProxyOptions` KDoc now states the handler is "currently not honoured by any shipped transport", and **both** transports emit a WARNING event (`proxy.auth.challenge_handler.unsupported`) when it is set, each with an accurate per-transport rationale (OkHttp: not wired into `proxyAuthenticator`; JDK: no per-407 hook exists). `proxyChallengeHandlerIsAcceptedAndSurfacedAsUnsupported` test added. This is a legitimate resolution — except one new doc claim is suspect (→ **V2-7**). + +### L1 — OkHttp async cancel/complete race drops the adapted response → **FIXED, verified (and the v1 "needs-verification" is now resolved)** +- `onResponse` now splits adapt from complete and `closeQuietly(adapted)` when `future.complete(adapted)` returns false — exactly mirroring the JDK bridge. The new test (`asyncResponseThatLosesTheRaceIsClosed`, `OkHttpTransportTest.kt`) reproduces the race **deterministically** with an interceptor that parks the completed exchange while the test settles the future — a genuinely clever construction that also confirms v1's uncertainty about OkHttp routing cancelled-but-completed exchanges to `onFailure`: the test deliberately avoids `cancel()` and uses a decoy `complete()` instead, which is the same lost-race code path. v1's Needs-verification is closed as Confirmed-and-fixed. + +### L2 — `writeAllInto` silent truncation on `read == 0` → **FIXED, verified** (now throws `IOException` naming the contract violation, mirroring `TeeSink.writeAll`; `WriteAllIntoTest` updated. Ironically the *new* drain loop in `LoggableResponseBody` reintroduces the unguarded pattern — see V2-4.) 
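
The `read == 0` guard referenced under L2 (and missing from the drain loop flagged as V2-4) can be sketched as follows. `ChunkSource` and `sourceOf` are hypothetical types written for this example, not the SDK's `Source` abstraction, and this `writeAllInto` is a stand-in over `java.io.OutputStream`; the exception message is illustrative.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.IOException
import java.io.OutputStream

// Hypothetical minimal source: returns the number of bytes read, or -1 at end of stream.
fun interface ChunkSource {
    fun read(buffer: ByteArray, byteCount: Int): Int
}

// Copies everything from [source] into [sink] and returns the byte count.
// A source that returns 0 for a positive byteCount violates its contract;
// failing loudly avoids both silent truncation and an infinite loop.
fun writeAllInto(source: ChunkSource, sink: OutputStream, chunkSize: Int = 8192): Long {
    val buffer = ByteArray(chunkSize)
    var total = 0L
    var n = source.read(buffer, buffer.size)
    while (n != -1) {
        if (n == 0) throw IOException("source returned 0 bytes for a positive byteCount")
        sink.write(buffer, 0, n)
        total += n
        n = source.read(buffer, buffer.size)
    }
    return total
}

// Well-behaved in-memory source used only to exercise the loop.
fun sourceOf(data: ByteArray): ChunkSource {
    var position = 0
    return ChunkSource { buffer, byteCount ->
        if (position >= data.size) {
            -1
        } else {
            val n = minOf(byteCount, data.size - position)
            data.copyInto(buffer, 0, position, position + n)
            position += n
            n
        }
    }
}

fun main() {
    val out = ByteArrayOutputStream()
    val copied = writeAllInto(sourceOf("hello".toByteArray()), out, chunkSize = 2)
    println("copied=$copied, content=$out")

    // A misbehaving source that never makes progress is rejected instead of spinning.
    val stalled = runCatching { writeAllInto(ChunkSource { _, _ -> 0 }, ByteArrayOutputStream()) }
    println("stalled source error: ${stalled.exceptionOrNull()?.message}")
}
```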
+ +### L3 — `stripUserInfo` re-encodes the Location URI → **FIXED, verified** +- Rebuilt textually from `rawPath`/`rawQuery`/`rawFragment` (`DefaultRedirectStep.kt`), so `%2F`/`%26` survive byte-exact. Deep-checked the edge: `uri.host` for an IPv6 literal includes its brackets in the Java URI API, so `scheme://host:port` reassembly is correct there; `userInfo != null` implies a server-based authority, so `host` is non-null on every path that reaches this function. `RedirectStepTest` +25 including an encoded-path round-trip. + +### L4 — broken KDoc package link → **FIXED** (both `ResponseBody.kt` references now plain `[LoggableResponseBody]`). + +### L5 — redundant per-instance lock in `RetryStep` → **FIXED** (lock deleted; `resolveScheduler` reads the companion `by lazy` directly; misleading comment corrected — the class now genuinely has no mutable instance state). + +### L6 — phantom challenge from malformed continuation param → **RETRACTED in v1 review; regression test confirmed present** (`AuthChallengeParserTest.kt:398-411`). Stands retracted; the no-phantom behavior is now pinned. + +### L6′ — stray trailing token after `key= ` → **OPEN (unchanged, deliberate)** +- No main-source change to `AuthChallengeParser` in `1f233bd`; the reviewer explicitly scoped it out. Still bounded (composite handler ignores unsatisfiable challenges; nothing shipped consumes the parser — see M7). Recommend a small follow-up with its own regression test, mostly for pinning value. + +### L7 — fixtures violate SDK conventions → **FIXED, verified** (`FakeHttpClient` restores the interrupt flag and throws `InterruptedIOException`; `RequestRecorder` uses `ReentrantLock.withLock`.) + +--- + +## 3. New findings from the deep pass (all introduced or exposed by `1f233bd`) + +### MEDIUM + +#### V2-1. 
Sync instrumentation step still wraps unknown-length bodies — bounded drain stalls the caller until the preview cap fills +- **Where:** `DefaultInstrumentationStep.kt` `wrapResponseForLogging` (no `contentLength() < 0` guard) vs. `DefaultAsyncInstrumentationStep.kt:249` (guard present). +- **What's wrong:** the async step skips capture for streaming bodies precisely because the bounded drain "could block on a slow/idle producer" — but the *sync* step has the identical exposure on the caller's thread and wasn't given the skip. The bounded drain loops until **cap reached or EOF**, so for an SSE/long-poll/trickle stream under `BODY_AND_HEADERS`, the caller doesn't see the response until 8 KiB (default) of stream has accumulated. +- **Failure mode:** enable body logging against an SSE endpoint emitting ~50 B keep-alives: time-to-first-event goes from milliseconds to "however long 8 KiB takes" — minutes to never. v1's H1 hang is *bounded* now, not gone. +- **Fix:** apply the same `contentLength() < 0L → return response` guard to the sync step (and say so in `HttpInstrumentationOptions`, which currently describes the skip as an async-only difference as if the sync eager drain were safe). +- **Confidence:** Confirmed (code asymmetry is plain; both files were edited in the same commit, which is also fresh evidence for the dedup item — the two copies have now *actually* diverged in behavior, not just text). + +#### V2-2. JDK streaming path now fully buffers non-replayable bodies in memory — unbounded heap traded for resend correctness +- **Where:** `sdk-transport-jdkhttp/.../internal/BodyPublishers.kt:164-184` (`streamingPublisher` → `body.toReplayable()`), `:240-244` (`bufferToByteArray`). 
+- **What's wrong:** the per-subscription design requires re-readable bytes, so any non-replayable body — which on this path is by definition **larger than 64 KiB or of unknown length**, exactly the bodies the streaming path exists to avoid materializing — is drained whole into an in-memory `Buffer` by `toReplayable()`. A 5 GB one-shot `InputStream` upload now means ~5 GB of heap; beyond `Buffer.MAX_BYTE_ARRAY_SIZE` (~2 GiB) the related eager fallback's `snapshot()` throws `IllegalStateException`. This re-imports the H1-class hazard on the request path of one transport. +- **Failure mode:** large streaming upload (non-replayable) through `JdkHttpTransport` → heap blow-up or `IllegalStateException`, where pre-fix it streamed (albeit with the resend bug) and where `OkHttpTransport` still streams it fine (its `isOneShot()` contract lets OkHttp fail a resend cleanly without buffering). +- **Fix:** stream the **first** subscription directly from the one-shot body and make the supplier's second invocation throw `IllegalStateException("one-shot body cannot be re-sent")` — matching the consume-once discipline used everywhere else in the SDK (and OkHttp's `isOneShot` behavior). Buffering should at most apply below some bounded threshold. +- **Confidence:** Confirmed (mechanism); severity assumes consumers send large one-shot bodies through the JDK transport. + +### LOW + +#### V2-3. `streamingPublisher`'s `IOException` fallback re-drives a consumed body — masks the original error with `IllegalStateException` +- **Where:** `BodyPublishers.kt:169-182`. +- When `toReplayable()` throws `IOException` mid-buffer, the catch falls back to `bufferToByteArray(body)` — but `toReplayable()` already flipped the body's consume-once guard (e.g. `OneShotInputStreamRequestBody.consumed`), so the fallback's `writeTo` throws `IllegalStateException`, masking the real I/O failure — the precise masking pattern H2 was fixed for. 
The comment ("emitting whatever was captured") describes bytes the code does not actually have: the partial buffer was local to `toReplayable` and is lost. **Fix:** rethrow the `IOException` (wrapped with context) instead of falling back. **Confirmed.** + +#### V2-4. `LoggableResponseBody`'s new drain loop lacks the `read == 0` contract-violation guard +- **Where:** `LoggableResponseBody.kt:239-248` (`val n = capturedSource.read(buf, chunk)`; only `-1` and positive values are handled). +- A misbehaving `Source` returning `0` for a positive `byteCount` leaves `remaining` unchanged → infinite loop. The same commit *added* this exact guard to `writeAllInto` (L2 fix) and it already exists in `TeeSink.writeAll` — this is the third copy of the loop and the only unguarded one. **Fix:** `n == 0L → throw IOException(...)` like its siblings. **Confirmed** (misbehaving-source trigger only, same class as L2). + +#### V2-5. Over-cap path can double-close the delegate's source +- **Where:** `PrefixThenTailSource.close` (`LoggableResponseBody.kt:303-310`) closes the live tail (the delegate's source) but does not set `delegateClosed`; a subsequent `LoggableResponseBody.close()` (`:184-200`) then calls `delegate.close()`, which for `ResponseBody.create`-style bodies closes the same source again. +- Safe today: both transports' bodies route to `OkioBufferedSource`, whose `close()` is CAS-guarded idempotent, and `ResponseBody.close()`'s own contract demands idempotency. Fragile for exotic delegate implementations whose close has side effects beyond `source().close()` — the exact "some sockets throw on double-close" concern this class's own comments cite. **Fix:** have the one-shot source's `close()` route through the wrapper (set `delegateClosed`, call `delegate.close()`), so ownership stays single-threaded through one path. **Confirmed** (by reading; benign with shipped transports). + +#### V2-6. 
Header **names** still bypass validation — the M5 divergence survives for names, and new comments overstate the fix +- **Where:** `Headers.kt:311` (`sanitizeName` = lowercase + `trim()` — strips only *leading/trailing* whitespace; embedded `\r`/`\n` survive), vs. `validateValues` (`:325`) which covers values only. The new comment in the OkHttp `RequestAdapter` ("Header names/values are validated upstream by Headers.Builder") claims more than is true; the JDK adapter's equivalent comment is accurate (it scopes itself to restricted *names*). +- **Failure mode:** `add("X-Evil\r\nInjected", "v")` is accepted by the model layer; OkHttp then throws an unchecked `IllegalArgumentException` out of `execute()` (bypassing `catch (IOException)`) while the JDK adapter silently drops the header — the exact per-transport divergence M5 was about, now only for names. Attacker-controlled header *names* are rarer than values, hence Low. +- **Fix:** add `validateName` (reject CR/LF at minimum; arguably restrict to RFC 7230 `tchar`) beside `validateValues`, and correct the OkHttp adapter comment. **Confirmed.** + +#### V2-7. New `ProxyOptions` KDoc claims the JDK stack negotiates "Basic **or Digest**" proxy auth +- **Where:** `ProxyOptions.kt:32-33` (added by the fix). +- The JDK `java.net.http` client's `Authenticator` integration (internal `AuthenticationFilter`) supports **Basic only**; Digest via `Authenticator` is a documented JDK limitation. If that's right, the new doc sends Digest-proxy users to a transport that can't do it, while the JDK transport's *own* new KDoc ("use the OkHttp transport for Digest proxy auth") simultaneously points the other way — and OkHttp's authenticator here is also Basic-only, so that pointer is wrong too. The two new doc blocks contradict each other; at most one is right, plausibly neither. 
+- **Fix:** verify against the target JDK (one integration test with a Digest-challenging proxy), then make both KDocs consistent with reality (likely: "Basic only, on both shipped transports"). +- **Confidence:** Needs-verification (JDK behavior not testable in this sandbox); the *mutual contradiction* between the two new doc blocks is Confirmed regardless. + +#### V2-8. Observability semantics changed silently: `request.body.size` / `response.body.size` now report the preview-capped size +- **Where:** `DefaultInstrumentationStep.kt` / `DefaultAsyncInstrumentationStep.kt` `emitResponseEvent` (unchanged code, changed inputs — `snapshot(bodyPreviewMaxBytes)` over a tap/capture that now holds at most the cap). +- Pre-fix these fields reported the actual body size; post-fix they report `min(size, bodyPreviewMaxBytes)` (default 8 KiB), with nothing in the event distinguishing "body was 8 KiB" from "body was 8 GB". Dashboards or alerting keyed on body-size fields will silently flatline at the cap. **Fix:** emit `…body.size` from `contentLength()` when known and add a `…body.preview_truncated` boolean (or rename the field to `…body.preview_size`). **Confirmed** (consequence of the H1/M4 fix; arguably intentional but undocumented — the commit message and docs don't mention the field-semantics change). + +--- + +## 4. Updated improvement opportunities (ranked) + +1. **Deduplicate the two instrumentation steps — now a correctness matter, not hygiene.** V2-1 exists *because* the same fix was hand-applied to two copies and landed asymmetrically; the `TODO(omar 2026-08-01)` extraction marker is still present (`DefaultAsyncInstrumentationStep.kt:237`). Extract the shared emitter/wrapping logic before the next divergence. +2. **JDK streaming-body design follow-up** (V2-2/V2-3): first-subscription streaming + loud one-shot resubscribe failure restores streaming for one-shot bodies without re-importing the resend bug. +3. 
**Unify the three drain/pump loops** (`TeeSink.writeAll`, `writeAllInto`, `LoggableResponseBody.drainAndCache`) behind one helper with the `0`-read guard — V2-4 is the third hand-rolled copy of the same loop with the same forgotten edge. +4. **Sync/async parity for the unknown-length skip** (V2-1) — falls out of item 1. +5. **Header-name validation** (V2-6) — one small function beside `validateValues`. +6. **`URL → URI` in models** — M1's textual-equality fix removed the urgency; still the right end state (also kills `URL`'s other landmines: `toExternalForm` re-serialization cost in hot equality, stream-handler coupling). +7. **Doc truthing pass on proxy auth** (V2-7) and on the body-size log fields (V2-8). +8. **L6′** parser follow-up with a pinned regression test. +9. **Repo hygiene:** `.claude/worktrees/agent-aa2f74e153671cbbf/` stale tree (with its phantom `sdk-auth` module) still present; `QueryParam` `TODO()` stub, unwired `AuthMetadata`/`HttpTracer`/`ServerSentEventListener` surfaces unchanged from v1 item 11. +10. **`ResponseRecoveryStep` KDoc:** document that a step transforming `Success → Failure` owns closing the response it discards (H3's remaining contract gap). + +--- + +## 5. Coverage note (v2) + +**Read for this pass:** the complete `1f233bd` diff (48 files, +2241/−269) — every changed main-source file re-read in full post-fix (`LoggableResponseBody`, `TeeSink`, `LoggableRequestBody`, both instrumentation steps + options, both retry steps, `ResponsePipeline`, `PagedIterable`, `ContextStore`, `Headers`, `Request`, `DefaultRedirectStep`, `ProxyOptions`, `AsyncHttpClient`, `AsyncPipelineBridges`, `WriteAllInto`, `BodyPublishers` (full re-read), `JdkHttpTransport`, both `RequestAdapter`s, `OkHttpTransport`, `ResponseBody`, both fixtures) plus the surrounding unchanged call sites needed to validate each fix (`RequestBody.toReplayable`, `OkioBufferedSource.close`, `HttpExceptionFactory` consumers, `PagedResponse`, `Paginator`). 
New/changed tests reviewed by content for the five High fixes and by name/structure for the rest. v1's full-codebase coverage (every main-source file in all nine modules + test fixtures) carries over; nothing outside the diff was re-read line-by-line in v2 except as call-graph context. +**Not done in this environment:** executing the test suite (JDK 11-only sandbox, no Gradle/toolchain cache) — all "tests cover X" statements are from reading the test code, not from a green run; and the V2-7 JDK Digest question needs a live probe. + +**Bottom line:** `1f233bd` is a high-quality fix pass — every v1 High is genuinely closed for the mainstream paths, with deterministic tests for the racy ones. The residual risk has moved from "production-facing defects in core paths" to: one behavioral asymmetry between duplicated files (V2-1), one deliberate-but-costly trade-off in the JDK transport (V2-2), and a tail of small hardening items (V2-3 … V2-8). The single most valuable next change is the instrumentation-step deduplication, which retires V2-1's class of bug permanently. diff --git a/docs/openai-java-deep-dive.md b/docs/openai-java-deep-dive.md new file mode 100644 index 00000000..dbe3c6d6 --- /dev/null +++ b/docs/openai-java-deep-dive.md @@ -0,0 +1,1514 @@ +# openai-java — Deep-Dive Reference Study + +> A line-level study of [openai-java](https://github.com/openai/openai-java) (the Stainless-generated OpenAI SDK) to extract concrete learnings for the dexpace SDK toolkit. Companion to `refs-comparison.md`. +> +> **Method.** 17 subsystems, each read line-by-line by a dedicated agent, then every finding re-checked by a second adversarial agent against our *actual* source (claims re-verified, "we already do this better" items dropped, constraint conflicts flagged), then synthesized. 35 agents · ~3.3M tokens · 977 tool calls. 
Every claim about our code below was independently spot-checked before publication (the A1 transport bug, the dead `fromResponse`, the two pagination stacks, the `QueryParam` TODO stub, the missing serde exception — all confirmed in-tree). +> +> Findings are tagged **[sdk-core]** (toolkit), **[codegen]** (future KotlinPoet generator), or **[process]** (build/CI/docs), and categorized **COPY** / **ADOPT** / **SIMPLIFY** / **LEARN**. + +--- + +# Reference Study: openai-java (Stainless) → dexpace java-sdk + +Synthesis of 17 verified subsystem analyses. Audience: engineers who know this codebase. Every recommendation is mapped to **[sdk-core]** (toolkit), **[codegen]** (future KotlinPoet generator), or **[process]** (build/docs/CI), and checked against our zero-dep-core / Java-8 / toolkit-not-client constraints. + +--- + +## 1. Executive summary (ranked by leverage) + +1. **`JsonField` four-state value model is the single load-bearing decision for codegen — but split core-vs-adapter exactly like `Tristate` already is.** [codegen] Their entire 850K-LOC generated layer rests on one sealed type; we must build the dep-free analogue before any other model work. +2. **Stand up CI.** [process] We built strong gates (apiCheck over 9 `.api` files, Kover 80% floor wired to `check`, detekt, ktlint, explicit-API strict, allWarningsAsErrors) and they fire **nowhere** — no `.github` exists. A binary-compat validator not in CI is decorative. S-effort, highest ROI in the set. +3. **Fix the bodyless-POST crash in the OkHttp transport.** [sdk-core] A POST/PUT/PATCH with null body throws `IllegalArgumentException` (not IOException, so it bypasses retry) from okhttp 5.0.0 — confirmed in bytecode against our pinned version, and constructible by any caller. Real bug, 3-line guard. +4. 
**Wire `HttpExceptionFactory.fromResponse` and add a serde-failure exception — both are dead/missing seams that leak.** [sdk-core] `fromResponse` has zero call sites; our Jackson adapter throws raw `com.fasterxml.jackson.*` straight through the zero-dep SPI. Both undermine the abstraction boundary the whole architecture exists to protect. +5. **Adopt a per-call options channel on the context chain — never on the transport SPI.** [sdk-core] We have zero per-call override capability (timeout, validation, ad-hoc credential). Carry it on `RequestContext`; keep `HttpClient`/`AsyncHttpClient` single-method `fun interface`s. +6. **Ship async retry + async auth steps.** [sdk-core] Neither pipeline family has an async retry or async auth step despite `AsyncHttpStep` existing specifically to host them. The only genuinely novel behavior to lift is OAuth *background-refresh* (return valid-but-expiring token instantly, refresh off-thread). +7. **Resolve the two pre-codegen architectural forks now (pre-1.0).** [sdk-core] (a) resolved-`java.net.URL` vs deconstructed request; (b) unify the **two** pagination stacks (`pagination/*` and `http/paging/*`). Both block clean codegen output and are breaking changes we can only make cheaply now. +8. **Harden the Jackson adapter: strict coercion + binary-body sniff + charset-aware previews.** [sdk-core/adapter] Today `"5"`→Long coerces silently, gzip bodies log as U+FFFD mojibake, and a declared latin-1 charset is ignored. All cheap, all in `sdk-serde-jackson`/instrumentation. +9. **Extract a `build-logic` convention plugin.** [process] The publishing+signing block is duplicated across all nine modules (verified). A Sonatype/POM change is a 9-file edit. +10. **Do NOT mirror their decorator stack, god-object `ClientOptions`, `Thread.sleep` retry, deny-list redaction, or `850K`-LOC inlined codegen.** [all] We are deliberately and correctly the inverse on each; §4 enumerates. + +--- + +## 2. 
Cross-cutting themes + +### Theme A — One value-model abstraction underpins the entire generated layer +`JsonField` (Values.kt:39) is a sealed type with two arms: `KnownValue` and a seven-variant `JsonValue` tree (missing/null/bool/number/string/array/object). Every generated field is a `JsonField`, encoding four orthogonal states: present-typed, present-but-wrong-type, explicit-null, absent. This is what makes their generated models forward-compatible (an older SDK round-trips an unknown server shape losslessly via `additionalProperties` + raw `JsonValue`), and it powers `validate()`/`validity()` union disambiguation. + +**Lesson for us:** This is the right *shape* for codegen output, and it is a strict superset of our `Tristate` (three states, no escape hatch). But openai-java welds Jackson directly onto the value type — `@JsonDeserialize` on the sealed roots (Values.kt:38/270), `@JsonValue`/`@JsonCreator` on every leaf, and a private static `JSON_MAPPER = jsonMapper()` on the companion (Values.kt:319). We **cannot** copy it verbatim without dragging Jackson into `sdk-core`. The correct move is the split `Tristate` already proves: an annotation-free `JsonField` + a dep-free `RawJson` tree in `sdk-core`, with all Jackson↔RawJson conversion behind the `Serde` SPI in `sdk-serde-jackson`. The *new* cost the analyses surfaced: `JsonField` needs that whole embedded `RawJson` tree (7 variants), which `Tristate` never required — that, not the deserializer plumbing, is the bulk of the effort. + +### Theme B — Forward-compatibility is a *read-path* discipline, distinct from PATCH semantics +Three mechanisms recur: dual accessors (`completionTokens()` throws vs `_completionTokens()` returns raw, CompletionUsage.kt:64/108), `additionalProperties` pass-through (`@JsonAnySetter`/`@JsonAnyGetter`, CompletionUsage.kt:146-154), and two-enum forward-compat (`Known` closed vs `Value`+`_UNKNOWN`, ReasoningEffort.kt:70-81). 
Our default mapper **silently drops** unknown fields (verified: no `@JsonAnySetter` anywhere) — a GET-modify-PUT loses server-added fields, exactly the silent-data-loss class flagged from the Expedia SDK. + +**Lesson:** `Tristate` (request/PATCH three-state) and the `JsonField` analogue (response forward-compat four-state) are **complementary, not redundant** — neither subsumes the other, and a future contributor must not merge them. `Tristate` has no wrong-type escape hatch; `JsonField`'s `JsonMissing` is a wire-omission marker, not a PATCH-clear affordance. Document this boundary in the codegen design doc. They may even compose (`Tristate<JsonField<T>>`). + +### Theme C — Decorator `HttpClient` stack vs our Stage pipeline +They compose cross-cutting concerns by **nesting** `HttpClient` decorators: `Retrying ⊃ Logging ⊃ WorkloadIdentity ⊃ PhantomReachable ⊃ base`, with order encoded purely by construction sequence in `ClientOptions.build()` (ClientOptions.kt:678-690). No registry, no exactly-one enforcement. We use one ordered `Stage` pipeline with pillar exactly-one enforcement (a second retry install *replaces* and logs a warning, StagedSteps.kt:179) and surgical `insertAfter`/`replace`/`remove`. + +**Lesson:** We are correctly sized for a toolkit — our model structurally prevents double-retry and logging-outside-retry, which theirs allows. The **cost** we pay is real and must be documented: `HttpStep.process`'s `copy()`-to-re-drive contract (HttpStep.kt:36-41) is more error-prone than a decorator's plain `inner.execute()` loop — a step author who forgets `next.copy()` silently resumes past visited steps. Do **not** migrate concerns into pre-wrapped clients; do add a "Why stages, not decorators" note to `docs/pipelines.md`. 
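To make the `copy()`-to-re-drive hazard and the exactly-one slot concrete, here is a minimal Kotlin sketch. Every type in it (`Stage`, `Step`, `Chain`, `StagedSteps`, `runTwoAttempts`) is a hypothetical, simplified stand-in invented for illustration, not the SDK's real `HttpStep` or `StagedSteps` API. It assumes the chain is a mutable cursor over the step list, which is one plausible reading of the contract described above.

```kotlin
import java.util.EnumMap

// Hypothetical, simplified stand-ins; the SDK's real Stage/HttpStep types differ.
enum class Stage { RETRY, LOGGING }

fun interface Step {
    fun process(request: String, next: Chain): String
}

/** Mutable cursor over the remaining steps; proceed() advances it. */
class Chain(
    private val steps: List<Step>,
    private var index: Int,
    private val terminal: (String) -> String,
) {
    /** Snapshot at the current position, so a step can re-drive everything downstream. */
    fun copy(): Chain = Chain(steps, index, terminal)

    fun proceed(request: String): String =
        if (index < steps.size) steps[index++].process(request, this) else terminal(request)
}

/** One slot per stage: a second install replaces the first and records a warning. */
class StagedSteps {
    private val byStage = EnumMap<Stage, Step>(Stage::class.java)
    val warnings = mutableListOf<String>()

    fun install(stage: Stage, step: Step) {
        if (byStage.put(stage, step) != null) warnings += "replaced existing step at $stage"
    }

    // EnumMap iterates in enum declaration order, so order never depends on install order.
    fun ordered(): List<Step> = byStage.values.toList()
}

/** Returns (times the logging step ran, times the transport ran) across two attempts. */
fun runTwoAttempts(useCopy: Boolean): Pair<Int, Int> {
    var logged = 0
    var sent = 0
    val steps = StagedSteps()
    steps.install(Stage.LOGGING, Step { req, next -> logged++; next.proceed(req) })
    steps.install(Stage.RETRY, Step { req, next ->
        var last = ""
        // The retry step drives the downstream chain twice; only copy() rewinds the cursor.
        repeat(2) { last = (if (useCopy) next.copy() else next).proceed(req) }
        last
    })
    Chain(steps.ordered(), 0) { sent++; "ok" }.proceed("GET /")
    return logged to sent
}
```

Under these assumptions `runTwoAttempts(useCopy = true)` runs the logging step on both attempts, while `useCopy = false` runs it once and sends the second attempt straight to the transport, which is the silent skip the lesson warns about. A decorator's `inner.execute()` loop has no cursor to forget, which is the trade-off the proposed "Why stages, not decorators" note would need to spell out.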
+ +### Theme D — Phantom-reachable GC auto-close: a clever trick wrapped around a philosophy we reject +`PhantomReachable.kt:31-56` obtains `java.lang.ref.Cleaner` **reflectively** (`Class.forName` + cached `by lazy`), so the same Java-8 bytecode opportunistically uses it on JDK 9+ and no-ops on 8 — a genuinely good fit for our Java-8-everywhere constraint. The `observed !== closeable` self-reference guard (line 15) is the load-bearing subtlety. **But** they pair it with ownership-*transfer* ("This class takes ownership of the client and closes it when closed", ClientOptions.kt:246) — the exact opposite of our "SDK closes only SDK-managed resources, BYO never closed" contract. + +**Lesson:** Take the *detection* value, drop the *silent-auto-close* harm. The reflective-Cleaner technique is liftable into a dep-free `closeWhenPhantomReachable` util, but it should power an **opt-in leak detector** (log a WARN on phantom-reachable-before-close), never default auto-close. Our deterministic, idempotent, ownership-aware `close()` (AtomicBoolean CAS at OkHttpTransport.kt:184) is already better than theirs. Cleaner timing is unbounded — it reclaims eventually, it does not bound fd/thread lifetime. + +### Theme E — Per-status exception classes are codegen output, not a runtime feature +Their error branch is generated: 6 named 4xx classes + collapsed-5xx + `Unexpected`, each ~100 LOC with a Builder and `checkRequired` on a `Throwable`, dispatched by a `when(statusCode)` ladder (ErrorHandler.kt:48-92) reproduced in every service. We already have the better shape: `HttpExceptionFactory.fromResponse` is a pure table-driven `Response → HttpException` over a 16-subclass family whose `retryable` is derived **once** from the same `RetryUtils` predicate the policy uses (HttpException.kt:72). 
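As a rough illustration of the table-driven shape described above, the following Kotlin sketch maps a status code to an exception through a lookup table and derives `retryable` from a single shared predicate. All names (`RetryRules`, `ApiException`, `ExceptionTable`, and the three subclasses) and the status set are assumptions made up for this example; they are not the SDK's `HttpExceptionFactory`, `HttpException`, or `RetryUtils`.

```kotlin
// Hypothetical stand-ins; not the SDK's actual exception hierarchy or retry predicate.
object RetryRules {
    // Illustrative status set only; the real default set is a separate decision.
    private val retryableStatuses: Set<Int> = setOf(408, 429, 500, 502, 503, 504)

    fun isRetryable(status: Int): Boolean = status in retryableStatuses
}

open class ApiException(val status: Int, message: String) : RuntimeException(message) {
    /** Derived once, from the same predicate a retry policy would consult. */
    val retryable: Boolean = RetryRules.isRetryable(status)
}

class NotFoundException(message: String) : ApiException(404, message)
class RateLimitedException(message: String) : ApiException(429, message)
class ServerException(status: Int, message: String) : ApiException(status, message)

object ExceptionTable {
    // The table: adding a status-specific subclass is one new row, not a new branch per service.
    private val byStatus: Map<Int, (String) -> ApiException> = mapOf(
        404 to { m: String -> NotFoundException(m) },
        429 to { m: String -> RateLimitedException(m) },
    )

    fun fromStatus(status: Int, message: String): ApiException =
        byStatus[status]?.invoke(message)
            ?: if (status >= 500) ServerException(status, message) else ApiException(status, message)
}
```

Because `retryable` is computed in the base class from `RetryRules.isRetryable`, the exception and a retry policy using the same predicate cannot disagree about a given status. A generated `when(statusCode)` ladder repeated per service would have to keep that agreement by convention.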
+ +**Lesson:** Compose error-mapping + deserialization as `Response → X` handlers **at the generated-service layer**, not as HTTP-pipeline stages — the pipeline must stay transport-pure and keep returning raw `Response` (making deserialization a stage would force a typed-payload-on-`Response` hack). Crucially, our `fromResponse` is **dead code** (verified: zero call sites) — "keep ours, it's better" is weak when it isn't wired. The deliverable is to *wire* it (the generator is the intended caller) and extend it with a typed-error-body slot, not to inline a per-op ladder. + +### Theme F — Two parallel hierarchies on both sides; ours has internal duplication theirs doesn't +They duplicate every concern into sync + async (intrinsic to decorators). We have a subtler problem: **two response-carrying exception bases** (`HttpException : RuntimeException` and `HttpResponseException : IOException`, both verified present) that disagree on field name (`retryable` vs `isRetryable`) and checked-ness, plus **two pagination stacks** (`pagination/*` Paginator+Page vs `http/paging/*` PagedIterable+PagedResponse, both verified) with different close semantics — one of which (`byPage()`) leaks unless the caller closes. openai-java meets the entire pagination need in ~90 lines with one surface. + +**Lesson:** This is the rare place we are *over*-engineered relative to the reference. Collapse both before codegen so generated code targets one contract each. + +--- + +## 3. Prioritized recommendations + +### Table A — sdk-core toolkit + +Ranked by impact/effort. Effort: S < M < L. + +| # | Recommendation | Category | Effort | Impact | Notes / citation | +|---|---|---|---|---|---| +| A1 | **Fix bodyless POST/PUT/PATCH crash:** in `RequestAdapter`, pass empty okhttp body when `body==null && method∈{POST,PUT,PATCH}` | COPY (bug) | S | High | Verified in okhttp 5.0.0 bytecode (`Request$Builder.method`); throws `IllegalArgumentException`, bypasses IOException retry. 
RequestAdapter.kt:55-56; guard mirrors OkHttpClient.kt:237-239. Audit jdkhttp (likely already correct — JDK client doesn't throw). Regression test required. | +| A2 | **Add `SerdeException` to the serde contract; wrap Jackson failures in the adapter** | ADOPT | S | High | `serde/*` declares zero exceptions (verified); `JacksonDeserializer` leaks raw `com.fasterxml.jackson.*` across the SPI. Add `open class SerdeException(msg,cause): RuntimeException` in `serde/`; catch `JsonProcessingException` in adapter. Do **not** extend `HttpResponseException`. apiDump. Mirrors `OpenAIInvalidDataException` (JsonHandler.kt:18). | +| A3 | **Wire `HttpExceptionFactory.fromResponse` into the error path; extend with typed-error-body slot when codegen lands** | ADOPT | S | High | Verified dead (zero call sites). Table-driven `Response→HttpException` over 16 subclasses, `retryable` single-sourced (HttpException.kt:72). The generator is the intended caller. | +| A4 | **Strict coercion lockdown in `JacksonObjectMappers.defaultObjectMapper()`** | ADOPT | S | High | `"5"`→Long coerces silently today (only `FAIL_ON_UNKNOWN_PROPERTIES` off, JacksonObjectMappers.kt:49-63). Add `withCoercionConfig(→Fail)` + `disable(ALLOW_COERCION_OF_SCALARS)` per ObjectMappers.kt:44-110. **Do NOT** copy the `AUTO_DETECT_*` disables — they'd break our auto-detected Kotlin data-class binding (defer to codegen). Pre-1.0 behavior change; Apache-2.0 attribution if lifted. **Lives in adapter, not sdk-core.** | +| A5 | **Charset-aware + binary-sniff body previews** in both instrumentation steps | ADOPT + COPY | S | High | `utf8Preview` hardcodes `Charsets.UTF_8` (DefaultInstrumentationStep.kt:323) despite `MediaType.charset` being parsed+cached (MediaType.kt:53). Add `previewText(bytes, mediaType)` + a clean-room `isProbablyUtf8` (64-codepoint sniff, LoggingHttpClient.kt:556-577; cite OkHttp/openai for the 64/256 constants, re-derive to avoid a NOTICE). Land both in the shared emitter (see A6). 
| +| A6 | **Extract shared `InstrumentationEmitters`** from the two instrumentation steps; keep the over-cap live-tail | SIMPLIFY | M | Med | ~200 LOC of byte-identical emit/redact/preview/metrics duplicated; their own TODO at DefaultAsyncInstrumentationStep.kt:236. **Do NOT delete `PrefixThenTailSource`** — it's load-bearing for non-truncating >8 KiB body logging (tested). Natural home for A5. | +| A7 | **Per-call options on the context chain** (timeout, response-validation, ad-hoc credential), `applyDefaults` with per-field null-coalescing + `Timeout.assign`-style nested overlay | ADOPT | M | High | Zero per-call override today; `DispatchContext`/`RequestContext` carry only instrumentation + payload. refs-comparison.md:183/:345 already name `RequestContext` as the home. Keep SPIs single-method. Model fields non-null with defaults (avoid their `Boolean?`+`!!`, ModelServiceImpl.kt:94). Copy merge semantics from RequestOptions.kt:23-29. | +| A8 | **Build first-class `QueryParams` multimap; delete the `QueryParam` stub; rewrite `RequestRebuilder` to use it** | ADOPT | M | High | `QueryParam` is a literal `TODO()` (verified); `RequestRebuilder` does `&`-split URL string-surgery, single-value only. Model on `Headers` (LinkedHashMap + Builder, **derive** size, don't track it). **Blocked on A9.** | +| A9 | **Decide: resolved-`URL` vs deconstructed request** (root pre-1.0 fork) | LEARN | L | High | We store one resolved `java.net.URL` (Request.kt:139); structured URL manipulation has nowhere to live (hence A8's string-surgery). Prototype deconstructed `Request` but **keep DNS-free equality** via `toExternalForm()` (Request.kt:52-71 — a real win they lack). Record decision in `docs/architecture.md`; blocks A8. | +| A10 | **Unify the two pagination stacks** into one (`Page` + driver enriched with `PagedResponse`'s links/status/headers) | SIMPLIFY | L | High | Both verified present (`pagination/*` + `http/paging/*`). 
Eliminate the `byPage()`-leaks-unless-closed footgun (PagedIterable.kt:100-105). Preserve maxPages cap + per-page close + interrupt-aware blocking. Coordinate with A11/A14. | +| A11 | **Async retry step** at `Stage.RETRY` (AsyncHttpStep) reusing `RetryAfterParser`/`RetryUtils`/`isRetrySafe`; iterative re-arm, not `thenCompose` recursion | ADOPT | L | High | No async retry exists (verified). Mirror their inline-executor `handleAsync` (RetryingHttpClient.kt:76-133) but avoid stack-deepening recursion. **Do NOT** copy `Thread.sleep`/`java.util.Timer` (DefaultSleeper.kt) — Loom-hostile. Add async time seam via existing `dexpace-retry-scheduler`. | +| A12 | **Async bearer-auth step + background token refresh** | ADOPT | M | Med | No async auth step (verified); `AsyncHttpStep` KDoc names this as its raison d'être. Lift only *BackgroundRefresh* (valid-but-expiring → return now + off-thread swap, WorkloadIdentityAuth.kt:134-176); skip their `Condition`+`refreshing` machinery (our sync double-checked lock is simpler). Add default `BearerTokenProvider.fetchAsync`. | +| A13 | **401-eviction on `BearerTokenAuthStep`** via `authorizeRequestOnChallenge` override | ADOPT | S | Med | Today only the local margin triggers refresh; a revoked token keeps stamping stale until margin elapses. Hook already exists (AuthStep.kt:118); reuse the built-in single-retry+body-close. Prefer the in-step hook over throw-retryable (the throw path hits `isRetrySafe` and silently fails on non-idempotent POSTs). | +| A14 | **Async pagination** (`PageAsync` + `AsyncPaginator`, iterative re-arm), Flow/Flux in adapters | ADOPT | L | Med | None exists (verified). Expose only `PageAsync` in core (java.util.concurrent, Java-8); **do NOT** port their `AsyncStreamResponse`. Preserve maxPages + per-page close. refs-comparison.md:244 plans Flow/Flux adapters. 
| +| A15 | **`StreamResponse`/`SseStream : AutoCloseable, Iterable`** binding parsed-element stream to response close | ADOPT | M | Med | Our SSE surface is a bare `Sequence`; reader disclaims ownership (ServerSentEventReader.kt:23). Mirror the close-on-partial-consume invariant `PagedIterable` already enforces. Add constrain-once + is-closed `hasNext()` guards (turn our doc-warnings into enforced invariants). | +| A16 | **`ResponseHandler` + `ParsedResponse` / `Response.parseable{}`** (lazy parse-once) | ADOPT | S | Med | No handler/raw-cooked seam (verified). 3-line `fun interface` + zero-dep String/Empty handlers in core; `jsonHandler(serde, Class)` in adapter. `by lazy` memoization. Document "parse() consumes/closes body; read raw headers first". Foundation the generator emits against (refs-comparison.md:408). | +| A17 | **Collapse `HttpResponseException` into `HttpException`; hoist `Retryable` interface** | SIMPLIFY + ADOPT | M | Med | Two response-carrying bases verified (HttpException.kt:58, HttpResponseException.kt:36) disagreeing on `retryable`/`isRetryable` + checked-ness. Pick `HttpException` (RuntimeException); `NetworkException` covers the transport case. Then add `interface Retryable { val isRetryable }`. Grep pipeline/transport for throwers first; apiDump. | +| A18 | **Unify divergent retry defaults + backoff formula** | SIMPLIFY | M | Med | Status sets disagree: `RetryUtils` includes 408, `RetrySettings.DEFAULT_RETRYABLE_STATUSES` drops it (verified both exist). Two backoff formulas. Make `BackoffCalculator` (deadline + symmetric jitter) the single impl; pick one 408 stance. Keep both step families (legit different integration points). | +| A19 | **Delete `TokenPaginationStrategy`** (byte-for-byte dup of `CursorPaginationStrategy`) | SIMPLIFY | S | Low | Verified present; its own KDoc says "Logically identical." `cursorQueryParam` is already overridable. apiDump (net surface reduction). De-risks A20. 
| +| A20 | **Single-read pagination extractor** (items+cursor together) instead of two body-reading lambdas | SIMPLIFY | M | Med | `CursorPaginationStrategy` takes two `(Response)->` lambdas on a single-use body; our own test reinvents an `IdentityHashMap` cache to dodge double-drain (CursorPaginationTest.kt:47). Correctness trap in the public API. After fix, delete the test cache (proof). | +| A21 | **`Timeout` per-phase value type** (connect/read/write/request, read/write→request defaulting) | ADOPT | M | Med | No timeout model in core; per-phase I/O timeouts pushed onto each transport. `java.time.Duration` only. Translate to native in adapters. Keep independent of `RetrySettings.totalTimeout`; document the separation (Timeout.kt:47). | +| A22 | **`ServerOverrideRetryPredicate`** (`X-Should-Retry`) as a shipped-but-not-default `HttpRetryConditionPredicate` | ADOPT | S | Low | Expressible in our existing composable seam (HttpRetryOptions.kt:21). Opt-in (header name could be reused downstream). Keep 409 out of default set. | +| A23 | **Fractional `Retry-After` parsing** (`toDoubleOrNull`, not Float) | COPY | S | Low | We use `toLongOrNull` → fractional values silently fall through (RetryAfterParser.kt:193). Keep our negative/NaN/365-day guards. Modeled on RetryingHttpClient.kt:199-202. | +| A24 | **Configurable retry-count header per attempt** (default off / neutral name) | ADOPT | S | Low | We log `try_count` locally but stamp nothing (DefaultRetryStep.kt:537). Stamp on the per-attempt copy (honor immutability). **No vendor brand.** Document idempotency-key-stable-across-retries contract. | +| A25 | **Opt-in `LeakDetector`** (reflective Cleaner, no-op on 8, default off, log-only) | ADOPT | M | Low | Detection without auto-close. Reuse `closed` AtomicBoolean as the "was it closed?" signal. Gate `-Dorg.dexpace.sdk.leakDetection`. Test with GC-poll loop, **not** fixed `Thread.sleep` (their test is flaky). Apache-2.0 attribution on the lifted util. 
| +| A26 | **`HttpLogLevel.fromEnv(key, source)`** (no default key baked in core) | ADOPT | S | Low | Reuse the `Configuration` envSource seam (testable). Generated clients default to the product's own env var. | +| A27 | **Redacting `toString()` on `KeyCredential`/`NamedKeyCredential`** | LEARN | S | Low | `BearerToken` redacts (BearerToken.kt:55); these don't. Pre-emptive (no active leak). apiDump. | +| A28 | **`checkRequired(name, value)` builder helper** | COPY | S | Low | Our Request/Response builders use inconsistent `checkNotNull` messages. Real payoff at codegen scale. Trivial enough to write independently (skip attribution). | +| A29 | **Deep-array `contentEquals`/`contentHash` util** | COPY | S | Low | Preventive for sdk-core (no array-typed value field today); real requirement for codegen DTOs with `ByteArray`. Skip their bracket-stripping `contentToString` quirk (Utils.kt:90). | + +### Table B — future codegen + +| # | Recommendation | Category | Effort | Impact | Notes / citation | +|---|---|---|---|---|---| +| B1 | **`JsonField` four-state field model:** dep-free sealed type + `RawJson` tree in sdk-core; Jackson↔RawJson behind `Serde` SPI | ADOPT | L | Highest | The prerequisite for every other model finding. `Known`/`Missing`/`Null`/`Raw(RawJson)`. Map `Tristate.Present/Null/Absent` onto it. Reuse contextual-type-extraction from TristateModule.kt:108. Kotlin `sealed class` → Java-8 bytecode (no `permits`). | +| B2 | **Thin generated models over a hand-written runtime** (<100 LOC/model) | SIMPLIFY | L | Highest | Verified their scale: 1122 model files / 850,299 LOC, additionalProperties+validate/validity+dual-ctor inlined into every class. Untenable under our 80% Kover floor + apiCheck. Push invariant machinery into the B1 runtime; emit field-list + accessors only. **Decide up front:** exclude generated modules from the aggregate Kover floor + give them a separate `.api` baseline. 
| +| B3 | **`additionalProperties` pass-through** (`@JsonAnySetter`/`@JsonAnyGetter` on generated classes, immutable snapshot at `build()`) | ADOPT | M | High | Our default mapper silently drops unknown fields (verified). Forward-compat round-trip contract. Hard-depends on B1's RawJson tree. | +| B4 | **Unions: private-ctor + nullable-per-variant + `Visitor` + retained `_raw`** (NOT Kotlin sealed `permits`) | COPY | M | High | Java-8-safe by construction. `accept()` dispatches on first non-null arm else `visitor.unknown(_raw)`; default `unknown` throws (forward-compat opt-in). Retain raw node on every parsed variant. Emit Deserializer against `Serde`, not Jackson. ChatCompletionContentPart.kt:34-117. | +| B5 | **Enums: open class over `JsonField` + paired `Known`/`Value`(+`_UNKNOWN`)** | ADOPT | M | High | Realizes the already-committed refs-comparison.md:410/465 forward-compat-enum line. Deserialize never throws; `known()` throws; raw string preserved. **Lint-clean names — no `_UNKNOWN` underscore prefix.** ReasoningEffort.kt. | +| B6 | **Two-tier raw/cooked services:** cooked body = `withRawResponse().op(...).parse()`; raw returns lazy `ResponseFor` | COPY | M | High | Real benefit: header/status access skips deserialization. Builds on A16. `parse()` consumes/closes body — document. **Do NOT** pull errorprone `@MustBeClosed` into generated code (zero-dep violation); rely on `Closeable`+`use{}`+docs. ModelServiceImpl.kt:43-98. | +| B7 | **Typed Page classes that rebuild typed PARAMS, not URL strings** | LEARN | L | High | The most important pagination lesson. `nextPage()=service.list(params.toBuilder().after(cursor).build())` (FileListPage.kt:39-42) — cursor in a typed param, no URL surgery. Requires a `Params`/`prepare(params)` seam (see B8). Keeps `RequestRebuilder`+strategies as the BYO-only path. 
| +| B8 | **Minimal `OperationParams` SPI** (headers/query/path/body projections) designed to feed *our* context chain | ADOPT | S | High | Foundational precondition for thin services (B6/B7). `Params.kt:7-16` returns their Headers/QueryParams — ours must feed `DispatchContext`/`RequestContext` + our `Headers`. **Blocked on A8** (QueryParams). Public, explicit-API-clean (no `_`-prefix pseudo-privacy). | +| B9 | **Curated overload set, not the 12-per-op cross-product** | SIMPLIFY | M | Med | Verified 12 `retrieve` overloads, 1 abstract. ~48 signatures/op × raw/cooked/sync/async — a self-inflicted apiCheck + explicit-API tax forever. One canonical method + small curated set; Kotlin default args. The rare "theirs is more elaborate, do NOT follow." | +| B10 | **Lazy sub-service tree** (`by lazy` accessors, root reuses nested `WithRawResponseImpl`) | COPY | S | Med | Reusing the nested raw impl halves generated type count (OpenAIClientImpl.kt:213). Flag `by lazy` vs `Suppliers.memoize` for a future Java target. | +| B11 | **`withOptions(Consumer)`** returning a new immutable client | ADOPT | M | Med | **Precondition (decision):** introduce a single cloneable client-config with `toBuilder()` — we have none (verified: `Configuration` has no `toBuilder`, split config). If we keep split config, drop this. | +| B12 | **MultipartRequestBody** → `BufferedSink`, single shared frame-size fn for `writeTo`+`contentLength`, file parts zero-copy via `FileRequestBody`, non-file parts via `Serde` | ADOPT | L | Med | Zero multipart today. Avoid their two-independent-literal drift trap (their antipattern). No Jackson in core. Attribute if framing lifted. | +| B13 | **Per-endpoint SSE adapter** (`SseStream→Iterable`): configurable `[DONE]` sentinel, error-envelope→exception, lazy per-element decode via `Serde` | ADOPT | M | Med | Keep `[DONE]`/error-envelope **out of core** (API conventions). Lazy-decode-on-consume is the good idea. Builds on A15/A16. 
| +| B14 | **Per-operation auth descriptor + precedence ladder** generated; sdk-core `AuthStep` consumes descriptor + credentials, throws tailored "requires X" error | LEARN | M | Med | `SecurityOptions` two-booleans is an artifact of OpenAI having exactly 2 schemes; generate one flag per scheme. Keep core scheme-agnostic. Marker-sentinel trick (ClientOptions.kt:782) only if needed, strictly internal (would break apiCheck). | +| B15 | **`validate()`/`isValid()`/`validity()` triad** as a memoized, opt-in, never-auto-called model template member | LEARN | M | Low | validity-scoring is the **fallback** union strategy (discriminator-first when present — don't pay N full deserializations otherwise). It pays the double-parse cost once per candidate. Never call `validate()` in the deserialize path. | +| B16 | **Discriminator/const fields as defaulted raw values** + typed+raw dual accessors per field | LEARN | S | Med | Falls out of B1. 2 setters + 2 getters/field is a real apiCheck commitment → reinforces B2's separate-baseline decision. | +| B17 | **Strict-LLM-schema encoding** (all-required + `additionalProperties:false` + optional==nullable union) IF we ever emit structured-output schemas | LEARN | S | Low | The transferable nugget. **Schema derivation, if ever built, is a victools+Jackson *adapter* (`sdk-schema-jackson`), never sdk-core** — they put victools `implementation` + Jackson `api` *inside* core (build.gradle.kts:23-34). Hand-roll a subset validator, no json-schema lib. | +| B18 | **Fail-soft recursive validator skeleton** (one-shot guard + path-prefixed error list + `verify(...){return}`) for codegen spec/schema validators | COPY | S | Low | ~15-line idiom (JsonSchemaValidator.kt:689). Parameterize over our own tree type. **Avoid their `simpleName`-for-$defs collision bug** (their TODO:579) — use FQN/deterministic names. 
| +| B19 | **Provenance file** (generator version + input-contract hash) stamped into generated SDKs | LEARN | S | Low | Their `.stats.yml` analogue — but into *generated* output, never the hand-written toolkit (a spec hash on the toolkit would be a lie). | +| B20 | **Spring Boot starter pattern** (`@ConfigurationProperties` + `fun interface` customizer + `@ConditionalOnMissingBean`) emitted per generated API | LEARN | M | Med | Design note only. The bean is `{IoProvider + transport + HttpPipeline}` — building a toolkit POC now would prematurely anchor the codegen assembly API. Spring deps confined to generated starter module. | + +--- + +## 4. Do NOT adopt (with reason) + +1. **Jackson (or any serde) on core value/model/exception/handler types.** `JsonField`/`JsonValue` carry `@JsonDeserialize` + a static `JSON_MAPPER` (Values.kt:38/319); every generated model, every exception (`BadRequestException.kt:28` `jsonMapper().valueToTree()`), and the handlers (`JsonHandler`, `ErrorHandler`) hard-import Jackson into *core*. **Reason:** direct violation of zero-dep `sdk-core`. `Tristate` proves the correct split — annotation-free in core, Jackson in `TristateModule`. Every adopted model/handler/exception MUST follow it. + +2. **The 782-line `ClientOptions` god-object** (22 ctor params: transport + jsonMapper + 4 credential kinds + Azure routing + 3 timeouts + org/project + webhooks). **Reason:** hardcodes policy a toolkit consumer can't extend, and bakes provider-specifics into config. Our distributed per-concern Options is the correct inverse. + +3. **The frozen decorator stack baked in `build()`** (`Retrying(Logging(WorkloadIdentity(client)))`, ClientOptions.kt:678). **Reason:** unenforced ordering (nothing prevents double-retry or logging-outside-retry), and intrinsic sync+async duplication. Our `Stage` pipeline with pillar exactly-one enforcement is strictly better for a toolkit. + +4. **Ownership-transfer + GC auto-close of BYO resources** (ClientOptions.kt:246). 
**Reason:** the exact "we closed your client out from under you" hazard our `owned` gate (OkHttpTransport.kt:187 `if (!owned) return`) exists to prevent. Take the *detection* (A25), not the auto-close. + +5. **`Thread.sleep` retry backoff + single `java.util.Timer` for all async sleeps** (DefaultSleeper.kt:10-12). **Reason:** pins carrier threads under Loom, truncates sub-ms delays to 0, leaks a bare `InterruptedException` with the flag cleared. Violates our interrupt-aware/no-`Thread.sleep` rules. Our `ScheduledExecutorService` daemon is correct. + +6. **Deny-list header redaction + zero URL redaction + no body cap** (LoggingHttpClient.kt:99/195). **Reason:** fails *open* (any sensitive header outside the hardcoded 5 leaks; query-string secrets logged verbatim; unbounded log volume). Our allow-list + always-redact-userinfo `UrlRedactor` + `bodyPreviewMaxBytes` are the correct posture — do not regress. + +7. **`System.err.println` logging** (15+ sites). **Reason:** bypasses framework routing/levels/structured ingestion. Defensible for a single-app client, wrong for a library. Keep `ClientLogger`/SLF4J. + +8. **One-sided shortening jitter** `1.0 - 0.25*rand()` (RetryingHttpClient.kt:225). **Reason:** caps effective backoff *below* nominal and biases every client to retry *earlier* — the opposite of herd avoidance. Our symmetric midpoint-preserving jitter is correct. + +9. **Server `Retry-After` used verbatim, no clamp, no deadline** (RetryingHttpClient.kt:216). **Reason:** `Retry-After: 86400` sleeps a day. Keep our 365-day clamp + remaining-budget clamp. + +10. **`Retry-After` HTTP-date catch that swallows only `DateTimeParseException`** (RetryingHttpClient.kt:203). **Reason:** an out-of-range `OffsetDateTime` throws the parent `DateTimeException` (`DateTimeParseException` is only a subclass of it), which escapes the catch and breaks the retry loop. That is a latent totality bug in theirs; our parser handles both cases. Do not copy. + +11. **Method-agnostic retry eligibility** `request.body?.repeatable() ?: true` (RetryingHttpClient.kt:143). 
**Reason:** auto-retries a bodyless non-idempotent POST. Our method-aware gate (DefaultRetryStep.kt:255) is deliberately stricter. + +12. **errorprone `@MustBeClosed` in core/generated code** (ModelService.kt:5). **Reason:** drags a third-party annotation lib into the generated layer for a compile-lint our Kotlin/Java-8 consumers can't run. Use `Closeable`+`use{}`+KDoc. + +13. **Reflection in the hot request path** (`Params.modelNameOrNull()` does `declaredFunctions.find{...}.call()` per request, PrepareRequest.kt:42, swallowing all exceptions). **Reason:** per-request perf + swallow-everything hazard; refs-comparison.md:447 already bans reflection-driven serialization. Generated params expose typed projections. + +14. **`_`-prefix pseudo-privacy** (`_headers()`, `_body()` are public Java-callable). **Reason:** unidiomatic, draws ktlint/detekt fire, misleads. Use real `internal`/`@JvmSynthetic`; keep the dual-accessor *concept* with lint-clean names. + +15. **`Void?` return for "no body"** (EmptyHandler.kt:8). **Reason:** Java-interop wart (uninstantiable `Void` typed `null`). Use `Unit` or a token. + +16. **victools + Jackson as core deps for schema gen** (build.gradle.kts:23-34, Jackson is even `api` so it leaks transitively). **Reason:** cardinal zero-dep violation. Adapter-only (`sdk-schema-jackson`) if ever built. + +17. **Their binary-compat proxy** (`scripts/detect-breaking-changes` recompiles old tests). **Reason:** only catches breakage an existing test exercises; produces confusing "lint failed" errors. Our committed `.api` snapshots + validator are strictly better. + +18. **Global `-nowarn` / `-Xsuppress-warning=DEPRECATION`** (openai.kotlin.gradle.kts:23). **Reason:** they suppress because generated code references its own deprecated members. We run `allWarningsAsErrors`; generated code may need *scoped* suppression, never global. + +19. **`-Xmx8g -XX:CICompilerCount=4` JVM block** (gradle.properties). 
**Reason:** sized for 261 generated endpoints; wastes memory on our 9-module build. `-Xmx2g` suffices. + +20. **`tryDeserialize` catching `Exception` → `null`** (BaseDeserializer.kt:34). **Reason:** swallows NPEs/programming errors, hiding bugs behind silent fall-through. If we replicate the never-throw read model, catch narrowly (`JsonProcessingException`/`MismatchedInputException`). + +--- + +## 5. Where we're already ahead (specific, with file:line) + +**Dependency hygiene (the architectural through-line).** `sdk-core` has zero non-SLF4J runtime deps; `Tristate` (Tristate.kt:35) carries not a single Jackson annotation. openai-java welds Jackson onto `JsonField`/`JsonValue` (Values.kt:38/319), every model, every exception, the handlers, and `ClientOptions` — their core is unusable without Jackson + okhttp on the classpath. Verified: no `JsonValue`/`RawJson`/`additionalProperties` references anywhere in our main tree. + +**Illegal-state-unrepresentable value type.** `Tristate.Present` (Tristate.kt:63) makes `Present(null)` uncompilable. Their `KnownValue` has the same bound but the broader `JsonField` surface still allows odd combinations the visitor throws on at *runtime* (Values.kt:314). + +**Header-injection defense at the model layer.** `Headers.Builder` rejects CR/LF in values at build time (Headers.kt:325) uniformly across transports; theirs has no guard on `Headers` at all — a raw `\n` reaches the transport. + +**DNS-free URL equality.** `Request` compares/hashes by `url.toExternalForm()` with an explicit rationale that `java.net.URL.equals` does blocking DNS (Request.kt:52-71); their `HttpRequest` has no `equals`/`hashCode` override. + +**Retryability single-sourced + cycle-safe.** `HttpException.retryable = RetryUtils.isRetryable(status.code)` (HttpException.kt:72) — the baked flag can never disagree with the live policy; `RetryUtils.isRetryable(Throwable)` walks the cause chain with identity-`HashSet` cycle detection (RetryUtils.kt:58). 
They derive retryability separately in `RetryingHttpClient`, divorced from the exception, and do no cause-chain inspection. + +**Total `Status` type.** `Status.fromCode` is total, preserves vendor codes 499/520-526 with `statusName=null` (Status.kt:200-207); they pass raw `Int`s and stringly-build messages. + +**Retry-After breadth + totality.** Five header families incl. `X-RateLimit-Reset` (GitHub/Stripe/Slack/Twilio), all clamped to 365 days, catching both `DateTimeException` and `ArithmeticException` (RetryAfterParser.kt). Theirs: three families, the `DateTimeException`-escape bug, no overflow clamp. + +**Real total-timeout deadline (gax-style).** `RetryStep` refuses a retry that would overshoot `totalTimeout` and `BackoffCalculator` clamps each delay to the remaining budget (RetryStep.kt:200, BackoffCalculator.kt:82). They have no retry-budget deadline — only a per-call `Timeout` that explicitly excludes retries. + +**Loom-correct waiting + suppressed-exception trail.** `ScheduledExecutorService` + `CompletableFuture.get`, restore-flag, cancel pending, throw `InterruptedIOException` (DefaultRetryStep.kt:509); every prior attempt attached via `addSuppressed` (DefaultRetryStep.kt:227). Theirs swallows the interrupt and throws only the last failure. + +**Interrupt-aware, idempotent, ownership-aware `close()`.** OkHttp transport restores the interrupt (OkHttpTransport.kt:93), CAS-guards close (line 184), no-ops for BYO (line 187), per-step fault-isolates+logs (line 195). Theirs (OkHttpClient.kt:86) is none of these and would close a user's client and NPE on second close. + +**Zero-copy file uploads + byte-range.** `FileRequestBodyAdapter` streams via okio `FileHandle` with position/count (FileRequestBodyAdapter.kt:45); they have no file-specific path. 
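As a rough illustration of the position/count idea in the zero-copy paragraph, the sketch below copies a byte range of a file to an output stream using only the JDK. It is an assumption-laden stand-in, not the okio-based `FileRequestBodyAdapter`: the function name `writeFileRange` is hypothetical, and `FileChannel.transferTo` only avoids intermediate user-space copies when the target channel supports it.

```kotlin
import java.io.OutputStream
import java.nio.channels.Channels
import java.nio.channels.FileChannel
import java.nio.file.Path
import java.nio.file.StandardOpenOption

// Hypothetical helper: streams `count` bytes starting at `position` from `file` into `out`.
// Returns the number of bytes actually written (less than `count` if the file is shorter).
fun writeFileRange(file: Path, position: Long, count: Long, out: OutputStream): Long {
    require(position >= 0 && count >= 0) { "position and count must be non-negative" }
    return FileChannel.open(file, StandardOpenOption.READ).use { channel ->
        val target = Channels.newChannel(out)
        var transferred = 0L
        // transferTo may move fewer bytes than requested, so loop until done or end of file.
        while (transferred < count) {
            val n = channel.transferTo(position + transferred, count - transferred, target)
            if (n <= 0L) break
            transferred += n
        }
        transferred
    }
}
```

Because the range is addressed by absolute position, the same file can be re-sent on a retry without rewinding any shared stream state.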
+ +**Non-consuming error-body preview.** `HttpException.bodySnapshot` reads via `source().peek()` with a 4 KiB cap (HttpException.kt:99); they re-serialize the parsed error to a `JsonValue` on every `body()` call with no bounded preview. + +**Allow-list URL/secret redaction.** Always-redact userinfo + allow-list query values (UrlRedactor.kt:67); they log the raw URL with full query string and userinfo (LoggingHttpClient.kt:99). + +**Bounded body capture + race-safe re-readable bodies.** `bodyPreviewMaxBytes` cap with non-truncating live tail (`PrefixThenTailSource`); double-checked-locking drain hands the same bytes to multiple consumers race-safely (LoggableResponseBody.kt:88-211). Their `LoggingInputStream` is single-consumer, unlocked, no cap. + +**Efficient request tee.** One encode + two segment-level moves (TeeSink.kt:75) vs their byte-at-a-time `OutputStream` loop (LoggingHttpClient.kt:309). + +**More spec-complete SSE parser.** Typed comment lines (keep-alive detection), `data` as defensively-copied `List`, typed `Duration` retry with overflow guard, explicit UTF-8 decode + bare-`\r` handling (ServerSentEventReader.kt:79/202/244/160). Theirs drops comments, flattens `data` to a joined String, parses retry into a lossy `Int`, decodes through a default-charset `bufferedReader()` (an SSE-mandate bug), and applies zero async backpressure (our Reactor bridge has demand-driven `Flux.generate`, Reactor.kt:117). + +**`PagedIterable` close-on-partial-consume.** Closes each page the moment its items iterator is taken, so `stream().findFirst()` never strands a connection (PagedIterable.kt:171). They push this to the generated service; the page owns nothing. + +**maxPages safety cap.** Both of our pagers guard against a never-advancing cursor (Paginator.kt:193); their `AutoPager` (AutoPager.kt:16) will spin forever. 
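The maxPages guard above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in `Paginator.kt`: the `Page` type and `collectAll` function are hypothetical, and throwing once the cap is hit is an assumed policy (a real pager might stop silently or log instead).

```kotlin
// Hypothetical page shape: items plus an opaque cursor, null when there is no next page.
data class Page<T>(val items: List<T>, val nextCursor: String?)

// Fetches pages until the cursor runs out, but never more than `maxPages` pages.
// A server that keeps returning the same cursor therefore fails fast instead of looping forever.
fun <T> collectAll(maxPages: Int, fetch: (cursor: String?) -> Page<T>): List<T> {
    require(maxPages > 0) { "maxPages must be positive" }
    val out = ArrayList<T>()
    var cursor: String? = null
    repeat(maxPages) {
        val page = fetch(cursor)
        out += page.items
        // No next cursor means the listing is complete.
        cursor = page.nextCursor ?: return out
    }
    throw IllegalStateException("exceeded maxPages=$maxPages; the cursor may not be advancing")
}
```

The cap bounds the number of fetches regardless of what the server returns, so it also covers cursors that cycle through several values rather than repeating a single one.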
+ +**Testable config ingestion.** `Configuration.getDuration` parses ISO-8601 + shorthand, rejects negatives, fails safe (Configuration.kt:145); `envSource`/`propsSource` Function seams make lookups hermetically testable (ConfigurationBuilder.kt:40). Their `fromEnv` does raw-string-to-setter with inline `System.getenv` (ClientOptions.kt:553). + +**`Futures.unwrap`.** Peels both `CompletionException` and `ExecutionException` with cycle detection (Futures.kt:46) — strictly more robust than their `unwrapCompletionException` (WorkloadIdentityAuth.kt:91). + +**Caller-wins header precedence, tested.** `ClientIdentityStep` Append-mode is documented and covered (ClientIdentityStepTest.kt:37-69) — finding's "add a test" is already satisfied. + +**Build quality bar.** Aggregate 80% Kover floor wired to `check` (build.gradle.kts:81), binary-compat-validator with 9 committed `.api` files, detekt + ktlint + explicit-API strict + `allWarningsAsErrors`, reproducible JARs (timestamps stripped, entries sorted). They have **no** coverage tooling, **no** `.api` snapshots, only formatters, and global `-nowarn`. + +--- + +**One sequencing note for the senior reader:** the codegen table has a hard dependency spine — **B1 (`JsonField`) → B3/B4/B5/B16 → B2 (thin models)**, and **A8 (`QueryParams`) → A9 (URL fork) → B8 (`OperationParams`) → B6/B7 (thin services + typed-param paging)**. A10 (pagination unification), A11/A14 (async), and A17 (exception collapse) should all land *before* the first generator run so generated code targets one contract each. The §3-Table-A correctness/wiring items (A1–A7) are independent and shippable now. + +--- + +# Appendix — Per-Subsystem Deep Reads + +The cross-cutting report above is distilled from these 17 line-level analyses. 
Each section gives: what the subsystem does, how it works (with `file:line` citations into openai-java), how *our* SDK does the same thing, the surviving recommendations (post-verification), what was considered and dropped, antipatterns, and where we are already ahead. + +| # | Subsystem | Kept | Dropped | +|---|---|---|---| +| 1 | JsonValue / JsonField value model + (de)serialization | 7 | 1 | +| 2 | HTTP request/response model (HttpRequest, bodies, Headers, QueryParams) | 7 | 1 | +| 3 | HttpClient decorator stack (composition pattern) + proxy + workload http | 8 | 5 | +| 4 | Request/response logging (LoggingHttpClient) + redaction + body handling | 7 | 1 | +| 5 | Retry / backoff / sleeper / timeout | 8 | 3 | +| 6 | Response handlers (Error/Json/String/Empty) + raw-response abstraction | 7 | 7 | +| 7 | Streaming + SSE (StreamResponse, AsyncStreamResponse, SseMessage, handlers) | 7 | 1 | +| 8 | PhantomReachable resource auto-closing (distinctive GC-based lifecycle) | 4 | 1 | +| 9 | ClientOptions / RequestOptions / SecurityOptions / config assembly | 5 | 3 | +| 10 | Error/exception hierarchy | 5 | 3 | +| 11 | Pagination (Page, AutoPager, PageAsync) | 7 | 1 | +| 12 | JSON Schema generation + Structured Outputs (likely no equivalent in ours) | 6 | 2 | +| 13 | Enterprise auth / workload identity (GCP/Azure/K8s token providers, token exchange) | 6 | 2 | +| 14 | CODEGEN RECIPE: generated model classes (builders, JsonField, unions, enums) | 7 | 1 | +| 15 | CODEGEN RECIPE: services (blocking/async interface+impl split, withRawResponse) + client tree | 9 | 1 | +| 16 | okhttp transport implementation | 6 | 2 | +| 17 | Build / packaging / release / DX (gradle, codegen stats, spring starter, proguard, examples) | 8 | 4 | + +--- + +## 1. 
JsonValue / JsonField value model + (de)serialization + +**What it is** + +openai-java's ENTIRE model layer is built on `JsonField` (Values.kt:39), a sealed hierarchy with two branches: `KnownValue` (Values.kt:402 — a typed value matching the SDK's expected `T`) and `JsonValue` (Values.kt:271 — an arbitrary JSON value that bypasses the type system). `JsonValue` is itself sealed into seven variants: `JsonMissing` (omit-from-output, Values.kt:433), `JsonNull` (explicit `null`, Values.kt:459), `JsonBoolean`/`JsonNumber`/`JsonString` (Values.kt:473/498/523), and `JsonArray`/`JsonObject` (recursive JSON trees, Values.kt:548/575). So every generated field is `JsonField`, encoding FOUR states in one type: present-and-typed (`KnownValue`), present-but-wrong-type (any `JsonValue`), explicit-null (`JsonNull`), and absent (`JsonMissing`). This is a strict superset of our `Tristate` three states, plus a per-field escape hatch and a full embedded JSON-tree model. + +Generated models (CompletionUsage.kt) layer on top: each field is a private `JsonField` with a dual accessor — `completionTokens(): Long` (CompletionUsage.kt:64) which calls `getRequired("completion_tokens")` and THROWS `OpenAIInvalidDataException` if missing/null/wrong-type, versus `_completionTokens(): JsonField` (CompletionUsage.kt:108) which returns the raw field and never throws. Unknown/extra server fields are captured into `additionalProperties: MutableMap` via Jackson's `@JsonAnySetter`/`@JsonAnyGetter` (CompletionUsage.kt:146-154), giving lossless round-trips. Forward-compat enums (ReasoningEffort.kt:22) store a `JsonField` and expose `Known` (closed) vs `Value` (closed + `_UNKNOWN`, ReasoningEffort.kt:70-81) enums, so an unrecognized server value deserializes without throwing. `validate()`/`isValid()` (CompletionUsage.kt:317/330) recursively assert every field is a `KnownValue`; `validity(): Int` (CompletionUsage.kt:344) counts known fields and powers "best-match" union deserialization. 
The whole thing is Jackson-soaked: `@JsonDeserialize(using=...)` on the sealed roots (Values.kt:38/270), `@JsonValue`/`@JsonCreator` on every leaf, and `JSON_MAPPER.convertValue(...)` inside `JsonValue.from`/`convert` (Values.kt:273-360). + +**How it works (line-level)** + +JsonField.Deserializer (Values.kt:248-261) is the linchpin trick: it's a `ContextualDeserializer` whose `createContextual` (Values.kt:251) pulls the generic argument out of the property's declared type — `Deserializer(context.contextualType?.containedType(0))` — then `deserialize` (Values.kt:256) does `type?.let { tryDeserialize(node, type) }?.let { of(it) } ?: JsonValue.fromJsonNode(node)`. Translation: try to parse the node as the expected `T`; if that succeeds wrap in `KnownValue`, if it FAILS (or no type) fall back to a raw `JsonValue`. That single line is how a type mismatch on the wire becomes a recoverable `JsonValue` instead of a thrown exception. `getNullValue` (Values.kt:260) returns `JsonNull` so a present-null is distinguished from absent. + +BaseDeserializer (BaseDeserializer.kt:25-27) reads the entire node into a tree first — `parser.codec.deserialize(parser.readValueAsTree())` — then `tryDeserialize` (BaseDeserializer.kt:31-43) swallows ALL exceptions and returns null (`catch (e: Exception) { null }`). This "parse to tree, then try-typed-else-raw" is the mechanism that makes the type system bypassable per-field. Note the cost: every field is materialized to a `JsonNode` tree before typed binding — double parse. + +`ExcludeMissing` (Values.kt:603-606) is a `@JacksonAnnotationsInside` meta-annotation wrapping `@JsonInclude(CUSTOM, valueFilter = JsonField.IsMissing::class)`; `IsMissing.equals` (Values.kt:243) returns true iff `other is JsonMissing`, and Jackson's CUSTOM filter omits a property when `filter.equals(value)` is true — that's how `JsonMissing` fields vanish from output. 
`JsonMissing.Serializer` (Values.kt:445-454) additionally throws if ever asked to serialize directly (defense in depth). `JsonNull` reuses Jackson's stock `NullSerializer` (Values.kt:458). All leaves are singletons or interned (`JsonMissing`/`JsonNull` hold a private `INSTANCE`, Values.kt:439/465). + +KnownValue equality (Values.kt:407-417) uses deep-array-aware helpers `contentEquals`/`contentHash`/`contentToString` (Utils.kt:71/80/89, all `contentDeep*`) so a `KnownValue` compares by content, not identity — a subtle correctness fix over `Objects.equals`. + +ObjectMappers.kt:34-116 is a 60-coercion lockdown: for every `LogicalType` it sets `CoercionAction.Fail` on mismatched `CoercionInputShape`s (e.g. Boolean.Integer→Fail, Integer.String→Fail, ObjectMappers.kt:44-104) AND disables `ALLOW_COERCION_OF_SCALARS` (ObjectMappers.kt:110). Net effect: the wire `"5"` will NOT silently coerce into an `int` field — it falls through to a `JsonValue`. They also disable all five AUTO_DETECT_* features (ObjectMappers.kt:111-115) so ONLY annotated `@JsonProperty` fields bind. `serializationInclusion(NON_ABSENT)` (ObjectMappers.kt:105) is the global Optional/null-omit. checkJacksonVersionCompatibility (Check.kt:50-84) probes five Jackson sub-package versions at runtime and throws a detailed message if a known-bad version (2.18.1, Check.kt:88) or too-old version is on the classpath — defensive because they hard-depend on Jackson internals. + +**vs. our SDK** + +Our value model is `Tristate` (sdk-core/.../serde/Tristate.kt:35) — a clean sealed type with exactly three states: `Absent` (Tristate.kt:42), `Null` (Tristate.kt:53), `Present(value)` (Tristate.kt:63). It is zero-dep (no Jackson), covariant, with `fold` (Tristate.kt:99), `getOrNull` (Tristate.kt:83), and Java factories `absent`/`nullValue`/`present`/`ofNullable` (Tristate.kt:112-135). 
Jackson wiring is correctly quarantined in the adapter: `TristateModule` (sdk-serde-jackson/.../TristateModule.kt:63) uses a `Deserializers.Base` resolver (TristateModule.kt:85) + `ContextualDeserializer` (TristateModule.kt:108) to recover `T`, a `BeanSerializerModifier`→`TristatePropertyWriter` (TristateModule.kt:222/246) that omits Absent properties via `serializeAsField` (TristateModule.kt:249), and `isEmpty` override (TristateModule.kt:186) for `@JsonInclude(NON_ABSENT)`. This is the SAME contextual-type-extraction trick as JsonField.Deserializer, just scoped to one wrapper. + +The gaps vs openai-java, all in OUR favor on dep-hygiene but against us on capability: +1. No `KnownValue` vs `JsonValue` axis. `Tristate.Present` is ALWAYS typed; there is no per-field escape hatch for "server sent the wrong type" — our `JacksonDeserializer.deserialize` (JacksonSerde.kt:158-178) would just throw. openai-java recovers it as `JsonValue`. +2. No embedded JSON-tree value model at all. We have ZERO equivalent of `JsonValue`/`JsonObject`/`JsonArray`. Grep confirms no `JsonValue`/`JsonField`/`JsonNode`-style type anywhere in sdk-core. +3. No `additionalProperties` pass-through. Our default mapper merely DISABLES `FAIL_ON_UNKNOWN_PROPERTIES` (JacksonObjectMappers.kt:55) — unknown fields are silently DROPPED, not preserved. openai-java round-trips them losslessly. +4. No `validate()`/`isValid()`/`validity()` — we have no DTO layer yet, so nothing to validate. `isValid` exists only on InstrumentationContext (unrelated). +5. No forward-compatible enum pattern. refs-comparison.md:410 already flags this as a planned ADOPT ("Square's enable-forward-compatible-enums ... emits an UNKNOWN sentinel"). +6. We don't lock down coercions. Our mapper (JacksonObjectMappers.kt:49-63) sets only 2 flags; openai-java sets ~60 + disables scalar coercion. A wire `"5"`→`int` would silently coerce in ours. 
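To make gaps 1, 2 and 6 concrete, here is a minimal dependency-free sketch of a four-state field with a "try typed, else keep raw" decode. All names (`RawJson`, `Field`, `decodeLong`) are hypothetical and only mirror the `Known`/`Missing`/`Null`/`Raw` shape proposed in B1; the raw tree is cut down to two variants for brevity, and nothing here reflects openai-java's or our actual implementation.

```kotlin
// Hypothetical, heavily truncated raw JSON tree (a real one would also cover
// booleans, arrays and objects).
sealed class RawJson {
    data class Num(val value: Long) : RawJson()
    data class Str(val value: String) : RawJson()
}

// Four states: typed value, present but wrong type, explicit null, absent.
sealed class Field<out T : Any> {
    data class Known<out T : Any>(val value: T) : Field<T>()
    data class Raw(val json: RawJson) : Field<Nothing>()
    object Null : Field<Nothing>()
    object Missing : Field<Nothing>()
}

// Strict decode: a wire string "5" is NOT coerced to a number, it is preserved as Raw.
fun decodeLong(wire: Map<String, RawJson?>, name: String): Field<Long> {
    if (name !in wire) return Field.Missing
    return when (val node = wire[name]) {
        null -> Field.Null
        is RawJson.Num -> Field.Known(node.value)
        else -> Field.Raw(node)
    }
}

// Throwing accessor, analogous to the typed getter of a dual-accessor pair.
fun <T : Any> required(name: String, field: Field<T>): T = when (field) {
    is Field.Known -> field.value
    is Field.Raw -> throw IllegalStateException("`$name` has an unexpected type: ${field.json}")
    Field.Null -> throw IllegalStateException("`$name` is null")
    Field.Missing -> throw IllegalStateException("`$name` is missing")
}
```

The point of the fourth state is that deserialization itself never throws on a type mismatch; only the typed accessor does, and the caller can still reach the raw value.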
+ +**Recommendations (verified)** + +- **Deep-array-aware equals/hashCode/toString helpers for value types** `COPY` · `both` · effort S · confidence medium + - *Verdict:* Accurate. contentEquals delegates to contentDeepEquals via `arrayOf(this).contentDeepEquals(arrayOf(other))` (Utils.kt:71-72), contentHash to contentDeepHashCode (Utils.kt:80), contentToString to contentDeepToString with bracket-stripping (Utils.kt:88-98); used in KnownValue.equals/hashCode/toString (Values.kt:412/415/417) and MultipartField (Values.kt:706). The correctness point is real: java.util.Objects.hash/equals on a field holding a ByteArray (or Array) compares by reference, so two equal-content arrays mis-compare — contentDeep* fixes it. BUT verify the premise against OUR code before claiming a present bug: I did not find a current sdk-core value type that stores a raw array field — Headers/MediaType store String/List, not Array (grep showed no Array-typed value fields in the serde/http models I read). So for sdk-core today this is latent/preventive, not an active bug; downgrading confidence to medium and narrowing target. It is genuinely valuable for CODEGEN (generated DTOs WILL have ByteArray fields for binary payloads, exactly like openai-java's InputStream/ByteArray handling). The caveat the original raised is correct and worth heeding: contentToString strips outer brackets (Utils.kt:90-94) which is a cosmetic quirk we should NOT replicate. ~15 lines of pure Kotlin stdlib, Java-8 safe. Reimplement rather than copy — they are one-liners over contentDeep*, so Apache-2.0 attribution is avoidable. + - *Do:* Add internal contentEquals/contentHash helpers to sdk-core util (skip the bracket-stripping contentToString quirk — use a plain contentDeepToString or per-field toString). Use them in generated DTOs that hold array/ByteArray fields, and retrofit any existing value type IF it later gains an array field. 
Treat as preventive for sdk-core (no current array-typed value field found) and as a real requirement for codegen. Reimplement, do not copy. +- **Coercion lockdown in the Jackson adapter default mapper — ours is too permissive (a wire "5" silently coerces to int)** `ADOPT` · `sdk-core` · effort S · confidence medium + - *Verdict:* Accurate on both sides. openai-java sets per-LogicalType CoercionAction.Fail across ~10 logical types (ObjectMappers.kt:44-104), disables ALLOW_COERCION_OF_SCALARS (:110), and disables all five AUTO_DETECT_* (:111-115). Ours (JacksonObjectMappers.kt:49-63) sets only FAIL_ON_UNKNOWN_PROPERTIES off + WRITE_DATES_AS_TIMESTAMPS off + the two factory AUTO_CLOSE flags, leaving Jackson's permissive scalar coercion ON — so `"5"`->Long and `5`->String would silently coerce. For a toolkit whose serde adapter is the recommended default, strict-by-default is the safer posture; this is a real, cheap hardening. TWO important corrections to the original: (a) The TARGET is wrong — it says sdk-core but the change lives entirely in sdk-serde-jackson (JacksonObjectMappers). sdk-core never touches Jackson; I am keeping target=sdk-core ONLY because the schema enum lacks an 'adapter' value, and flagging loudly that the real target is the sdk-serde-jackson adapter. (b) The AUTO_DETECT_* disables are NOT applicable to us yet and are mildly DANGEROUS to copy now: openai-java can disable auto-detection because every field is explicitly @JsonProperty-annotated on generated classes. Our adapter is used today with hand-written Kotlin data classes via KotlinModule that rely on property auto-detection; disabling AUTO_DETECT_FIELDS/GETTERS/SETTERS would break binding for current users. So adopt the COERCION block now; DEFER the AUTO_DETECT_* disables to codegen (only valid once we emit annotated DTOs). Confidence medium because over-strict coercion can reject payloads from sloppy servers some users tolerate today — must be documented as a pre-1.0 behavior change. 
Also note the coercion lockdown only fully pays off WITH the four-state model (a coercible-but-wrong value then falls to JsonValue instead of throwing); without JsonField it converts silent-corruption into a thrown exception, which is still strictly better but is a different tradeoff. + - *Do:* Add the withCoercionConfig(per LogicalType -> Fail) block + disable(MapperFeature.ALLOW_COERCION_OF_SCALARS) to JacksonObjectMappers.defaultObjectMapper() in sdk-serde-jackson (NOT sdk-core). Do NOT copy the AUTO_DETECT_* disables yet — they break auto-detected Kotlin data-class binding our adapter currently supports; defer those to codegen output. Document the stricter coercion as an intentional pre-1.0 behavior change. Apache-2.0: attribute openai-java ObjectMappers.kt if the config block is lifted near-verbatim. +- **additionalProperties pass-through for lossless round-trips — real capability gap; our default mapper silently DROPS unknown fields** `ADOPT` · `codegen` · effort M · confidence high + - *Verdict:* Accurate. Every model carries `additionalProperties: MutableMap` (CompletionUsage.kt:29), captured via `@JsonAnySetter putAdditionalProperty` (CompletionUsage.kt:146-149) and re-emitted via `@JsonAnyGetter _additionalProperties()` wrapped in Collections.unmodifiableMap (CompletionUsage.kt:151-154); builder has put/putAll/remove (CompletionUsage.kt:263-280). Our side confirmed: JacksonObjectMappers.kt:55 only disables FAIL_ON_UNKNOWN_PROPERTIES and there is no @JsonAnySetter anywhere in our codebase, so unknown fields are parsed-and-discarded — a GET-modify-PUT loses server-added fields. This is exactly the silent-data-loss class of bug the user flags from prior SDKs, so the rationale lands. Correctly categorized ADOPT and correctly targeted codegen (the map lives per-DTO). Hard dependency: it needs the JsonValue tree type from the finding above to hold arbitrary values, so it cannot land before that. 
Note this is a forward-compat READ/round-trip concern, NOT a sdk-core toolkit feature — there is nothing to do in sdk-core today. One caveat the original undersold: openai-java's additionalProperties is a mutable field defended only by discipline (see the antipattern below) — when we adopt, snapshot to an immutable map at construction to honor our genuinely-immutable convention. + - *Do:* Codegen design: every emitted DTO gets an additionalProperties map with @JsonAnySetter/@JsonAnyGetter (annotations adapter-side / on generated classes, never on a sdk-core type) and builder put/remove. Store it as an immutable snapshot at build() rather than a live MutableMap. Document this as the forward-compat round-trip contract and note its hard dependency on the JsonValue tree type. Do not open this until the JsonField/JsonValue core type exists. +- **Forward-compatible enums (Known vs Value+_UNKNOWN) — already on our roadmap; openai-java's two-enum split is the ergonomic improvement to copy** `ADOPT` · `codegen` · effort M · confidence high + - *Verdict:* Accurate. ReasoningEffort wraps `JsonField` (ReasoningEffort.kt:22), exposes @JvmField constants (ReasoningEffort.kt:36-46), and TWO nested enums: Known (closed, :52-59) and Value (closed + _UNKNOWN, :70-81). value() returns _UNKNOWN for unrecognized strings (:90-99); known() throws OpenAIInvalidDataException (:109-118); the raw string is preserved via _value() (:32). The genuinely useful insight over a plain `enum class X { ...; UNKNOWN }` is the split: callers can do an exhaustive `when` over Known on the happy path while deserialization never throws and the original unknown string survives — a single Kotlin enum with a JsonCreator fallback cannot retain the unknown string. 
This is partially already-planned: refs-comparison.md:410 and :465 already commit to forward-compatible enums with an UNKNOWN sentinel (Square's setting), so the NET-NEW contribution here is narrow — adopt the two-enum (Known/Value) shape and string-preservation, not the concept. The recommendation's best point: for enums you do NOT need the full JsonField dependency — store a plain String + raw value. That is strictly simpler than openai-java (which drags JsonField in) and still forward-compatible; openai-java only uses JsonField here for serializer uniformity. Keep category ADOPT but acknowledge it is a refinement of an existing roadmap item, not a new capability. + - *Do:* In codegen, emit enums as a class wrapping a plain String (NOT JsonField) with nested Known and Value(+_UNKNOWN) enums plus known()/value()/asString(). This beats openai-java by avoiding the JsonField dependency for the enum case while preserving the unknown string and the exhaustive-when-over-Known ergonomic. Update refs-comparison.md:410 to specify the two-enum shape rather than a bare UNKNOWN sentinel. Pick lint-clean names (no _UNKNOWN underscore prefix — see antipattern). +- **Document the Tristate (write/PATCH) vs JsonField-analogue (read/forward-compat) boundary so they are never wrongly unified** `LEARN` · `docs/process` · effort S · confidence high + - *Verdict:* Accurate and a worthwhile guard-rail. Tristate is 3 states with no escape hatch and Present is `T : Any` (Tristate.kt:63), so it structurally CANNOT hold a type-mismatched value; JsonField adds exactly that fourth 'present-but-wrong-type' state via asUnknown(): Optional (Values.kt:69). The claim that neither subsumes the other is correct: Tristate has no forward-compat escape hatch; JsonField has no clean PATCH-clear ergonomic (its JsonNull means explicit null but the absent/clear distinction is JsonMissing, which is a wire-omission marker, not a PATCH-semantics affordance). 
This is real and matches the user's stated value for clean, purposeful abstractions and their memory of conflated/duplicated abstractions in prior SDKs. The risk is genuine but low-severity (a future contributor merging the two and losing PATCH ergonomics or forward-compat). Keep as a short doc note bundled with the codegen design: not worth a standalone work item, but cheap insurance. Confidence high that the boundary is correct; the only reason this is LEARN not ADOPT is that there is no code to write. + - *Do:* In the serde section of docs/architecture.md and in the codegen design doc, state explicitly: Tristate = request/PATCH three-state (absent/null/present), illegal-state-unrepresentable via Present; the JsonField-analogue = response forward-compat four-state (adds present-but-type-mismatched). They are complementary, not redundant, and may even compose (e.g. a PATCH field typed `Tristate<JsonField<T>>`). Fold this into the codegen doc rather than shipping a standalone document. +- **Lazy recursive validate()/isValid() + validity() scoring for discriminator-less union disambiguation** `LEARN` · `codegen` · effort M · confidence medium + - *Verdict:* Mechanism accurate. validate() memoizes via a `validated` flag (CompletionUsage.kt:307, 317-328), forces each typed accessor (which throws on bad data), and recurses (`completionTokensDetails().ifPresent { it.validate() }`, :325). isValid() is the non-throwing wrapper (:330-336). validity(): Int counts KnownValue fields and recurses (:343-349), and for enums returns 0 if _UNKNOWN (ReasoningEffort.kt:164). The 'best-match union deserialization' purpose is stated verbatim in their own KDoc (CompletionUsage.kt:341 'Used for best match union deserialization'), so the claim that this is how Stainless disambiguates discriminator-less oneOf/anyOf is well-supported.
DOWNGRADED from ADOPT to LEARN: there is nothing to adopt today — we have no DTO layer and no union types, so this is purely a principle to bake into the future codegen model template, not a capability to add now. It is also strictly a consequence of the never-throw model (you cannot validate eagerly if you want forward-compat), so it only makes sense bundled with the JsonField finding. Lower confidence because validity-scoring is one of several union strategies (discriminator-first with validity as fallback is more common and cheaper); committing to validity-as-primary now is premature. The original's 'genuinely clever' framing is fair but the user's rigor bar wants the caveat: it is O(deserialize-as-every-variant), i.e. it pays the double-parse cost (antipattern #2) once per candidate variant. + - *Do:* Internalize for the codegen model template: plan for a memoized validate(), an isValid(), and an internal validity() per model + enum. Document validity-based best-match as the FALLBACK union strategy (discriminator-first when present, validity-scoring only for discriminator-less oneOf/anyOf) so we do not pay N full deserializations when a discriminator exists. No action until codegen + union types exist; capture it in the design doc so the model shape accommodates it. +- **Forward-compatible four-state value model (JsonField) is the right shape for CODEGEN output, but must be split core-vs-adapter like Tristate already is** `LEARN` · `codegen` · effort L · confidence high + - *Verdict:* Verified line-by-line and accurate. JsonField is `sealed class JsonField` (Values.kt:39) with `KnownValue` (Values.kt:402) and the seven-variant `JsonValue` (Values.kt:271). The never-throw claim is correct: the Deserializer (Values.kt:256-258) does `tryDeserialize -> of(it) ?: JsonValue.fromJsonNode(node)`, so a type mismatch degrades to a raw JsonValue and the throw is deferred to the typed accessor via getRequired (Values.kt:171-177). 
ONE precision fix to the original claim: getOptional (Values.kt:180-192) returns empty only for Missing/Null and STILL THROWS on a type-mismatched JsonValue, so 'optional' fields are not fully never-throw at the accessor — only the raw `_field()` accessor (CompletionUsage.kt:108) is. The Jackson-welding is real and damning for sdk-core: `@JsonDeserialize` on both roots (Values.kt:38, 270), `@JsonValue`/`@JsonCreator` on every leaf, and a private static `JSON_MAPPER = jsonMapper()` ON the JsonValue companion (Values.kt:319) used inside `from`/`convert` (Values.kt:273-275, 360). So the category is correctly LEARN (a model-design principle), correctly targeted at codegen, and the core-vs-adapter split prescription is exactly right and already proven by Tristate (annotation-free sum type in core) + TristateModule (all Jackson in adapter). The contextual-type-extraction the recommendation wants to reuse genuinely already exists in our code (TristateModule.kt:108-121 / TristateDeserializers.kt findBeanDeserializer at :86-102), so generalizing it from one wrapper to a generic JsonField is plausible — but flag the new cost the original glossed: JsonField needs a full embedded JsonValue tree (7 variants) in core, which Tristate did NOT need. That is the real bulk of the effort, not the Deserializer plumbing. + - *Do:* In the codegen design doc, specify a JsonField-analogue with TWO layers: (1) an annotation-free sealed type in sdk-core carrying the four states AND a small embedded JSON-tree value type (the JsonValue equivalent: missing/null/bool/number/string/array/object) — zero Jackson, mirroring how Tristate lives in core; (2) the @JsonDeserialize/@JsonValue Deserializer+Serializer in sdk-serde-jackson, reusing the contextual-type-extraction already shipping in TristateModule.kt:108-121. Generated DTOs hold private JsonField with dual accessors. 
Be explicit that getOptional-style accessors still throw on type-mismatch (only the raw accessor never throws) so the 'never throws' guarantee is scoped correctly. This is the single highest-leverage model-layer decision for the future generator. + +**Considered & dropped** + +- ~~Runtime Jackson version-compatibility probe (checkJacksonVersionCompatibility)~~ — Claim is accurate (Check.kt:50-84 probes 5 sub-packages — core/databind/jdk8/jsr310/kotlin at :90-96 — enforcing MINIMUM 2.13.4 at :86 and blacklisting 2.18.1 at :88), and it does map cleanly to our sdk-serde-jackson which registers the same module families (KotlinModule/JavaTimeModule/Jdk8Module, JacksonObjectMappers.kt:51-54). But it is low-value defensive plumbing the original itself rated 'defer until a real version-skew bug surfaces' and 'may be noise.' openai-java NEEDS this because its core hard-depends on deep Jackson internals (JsonField/JsonValue weld Jackson in); our adapter uses broadly-compatible public Jackson APIs (registerModule, readValue, a BeanSerializerModifier/BeanPropertyWriter), so the skew blast-radius is far smaller. Not decision-ready and not substantive enough to keep; if a real ClassNotFound/NoSuchMethod skew bug ever appears, lift Check.kt:50-84 then. Dropped as speculative/low-priority filler per the keep-bar. + +**Do not copy** + +1) HARD-CODED JACKSON IN THE CORE VALUE MODEL. `JsonField`/`JsonValue` carry `@JsonDeserialize` (Values.kt:38/270), `@JsonValue`/`@JsonCreator` on every leaf (Values.kt:404/422/468/493...), and call `JSON_MAPPER.convertValue` directly inside the model (Values.kt:273-275, 360). A private `JSON_MAPPER = jsonMapper()` is even a static field ON the JsonValue companion (Values.kt:319). For us this is a direct violation of zero-dep sdk-core — DO NOT put any of these annotations or mapper references on a core type. Our Tristate (Tristate.kt) proves the correct split: annotation-free sum type in core, all Jackson in TristateModule. 
Any JsonField analogue MUST follow that split. + +2) DOUBLE-PARSE COST. BaseDeserializer reads the WHOLE node to a tree (`readValueAsTree()`, BaseDeserializer.kt:26) before typed binding, and `tryDeserialize` (BaseDeserializer.kt:31) re-runs `readValue(treeAsTokens(node), type)` — so every field is materialized as a JsonNode then re-parsed into T. For large payloads that's measurable overhead, the price of the never-throw model. If we adopt the four-state model, scope it to GENERATED DTOs (where forward-compat matters), not to the generic Serde fast-path. + +3) `tryDeserialize` SWALLOWS ALL EXCEPTIONS (BaseDeserializer.kt:34/41: `catch (e: Exception) { null }`) — including OOM-adjacent or programming errors, not just type mismatches. A bad custom deserializer that throws NPE becomes a silent fall-through to JsonValue, hiding bugs. If we replicate, catch narrowly (JsonProcessingException / MismatchedInputException), not `Exception`. + +4) STRINGLY-TYPED MUTABILITY IN A SO-CALLED IMMUTABLE MODEL. `additionalProperties` is a `MutableMap` stored directly (CompletionUsage.kt:29), defended only by `Collections.unmodifiableMap` at the getter (CompletionUsage.kt:154) and `.toMutableMap()` copies at build/from (CompletionUsage.kt:190/303). The field itself is mutable; correctness relies on discipline. Our convention is genuinely-immutable data — prefer an immutable map snapshot at construction. + +5) `_UNKNOWN`/`_value`/`_field` UNDERSCORE-PREFIX NAMING. openai-java prefixes raw accessors with `_` (CompletionUsage.kt:108 `_completionTokens`, ReasoningEffort.kt:32 `_value`, ReasoningEffort.kt:80 `_UNKNOWN`). It's unidiomatic Kotlin and would draw ktlint/detekt fire in our repo. Keep the dual-accessor CONCEPT but pick lint-clean names (e.g. `rawCompletionTokens()` / `completionTokensField()`). + +**Where we're ahead** + +1) DEPENDENCY HYGIENE — decisively. 
Our value-model abstraction (Tristate, Tristate.kt:35) is in zero-dep sdk-core with NOT A SINGLE Jackson annotation; openai-java welds Jackson onto its core value types (Values.kt:38, 273, 319, 360). Their model layer is unusable without Jackson on the classpath; ours can target any serde via the adapter seam (Serde.kt:18). This is the toolkit-vs-client distinction made concrete and it is a real architectural win, not a stylistic one. + +2) Tristate's ILLEGAL-STATE-UNREPRESENTABLE bound. `Present` (Tristate.kt:63) makes `Present(null)` uncompilable, so the illegal fourth state (present-but-null, which should be Null) can't exist. openai-java's `KnownValue` (Values.kt:402) has the same bound, but their broader JsonField surface still allows odd combinations the visitor's `visitDefault()` throws on at runtime (Values.kt:314) rather than at compile time. + +3) Caller-owned-stream rigor in our adapter. JacksonSerde drives per-call generators/parsers with AUTO_CLOSE disabled (JacksonSerde.kt:114, 174) so a BYO mapper is never mutated and caller streams stay open — a documented, tested contract (JacksonSerde.kt:80-88). openai-java's ObjectMappers just sets `disable(FLUSH_AFTER_WRITE_VALUE)` (ObjectMappers.kt:107) globally on the shared mapper; it has no equivalent per-call stream-ownership discipline because it owns its single mapper. Our seam is cleaner for a toolkit where users bring their own mapper. + +4) Cleaner factory ergonomics on the value type. Tristate exposes `fold` (Tristate.kt:99), `getOrNull` (Tristate.kt:83), and four named Java factories (Tristate.kt:112-135) on a 3-state type. openai-java's equivalent fold is the verbose Visitor interface pattern (Values.kt:293-315) requiring an anonymous class per use-site — heavier for Java callers than a single fold lambda. + +(Caveat: 'ahead' here is scoped to dep-hygiene and the PATCH axis. 
On raw capability — forward-compat reads, additionalProperties, validation, unions — openai-java is far ahead because they have a generated model layer and we have none yet.) + +_Verifier notes:_ Overall the analysis is unusually accurate — I confirmed every cited line in Values.kt, ObjectMappers.kt, BaseDeserializer.kt, BaseSerializer.kt, Check.kt, Utils.kt, CompletionUsage.kt, and ReasoningEffort.kt, and confirmed our side (Tristate.kt, TristateModule.kt, JacksonSerde.kt, JacksonObjectMappers.kt, Serde/Deserializer.kt, and a grep proving sdk-core has ZERO JsonValue/JsonField/JsonNode/additionalProperties/validity machinery). + +CORRECTIONS I made: (1) getOptional STILL THROWS on type-mismatch (Values.kt:188-191) — the 'never throws' guarantee applies only to the raw `_field()` accessor, not to optional typed accessors; the summary overstated this. (2) The coercion-lockdown finding's target is the sdk-serde-jackson ADAPTER, not sdk-core (kept target=sdk-core only because the schema enum has no 'adapter' value — flagged in-finding). (3) The AUTO_DETECT_* disables are NOT safe to copy now: our adapter currently binds hand-written Kotlin data classes via KotlinModule auto-detection; disabling auto-detect would break current users — defer to codegen (annotated DTOs only). I split that out of the coercion finding. (4) Downgraded validate()/validity() from ADOPT to LEARN (nothing to adopt pre-DTO-layer) and flagged validity-scoring as a fallback strategy, not primary, to avoid N-deserializations when a discriminator exists. (5) Downgraded the deep-array equals helper confidence: no current sdk-core value type holds a raw array field, so it is preventive for core and a real need only for codegen. 
+ +ANTIPATTERNS in the source analysis are all verified correct and worth carrying into the codegen design doc as 'do-not-copy' notes: (a) Jackson welded onto core value types — @JsonDeserialize (Values.kt:38/270), per-leaf @JsonValue/@JsonCreator, and a static JSON_MAPPER on the JsonValue companion (Values.kt:319) calling convertValue inside the model (Values.kt:273-275, 360) — a direct violation of our zero-dep core, and Tristate proves the correct split. (b) Double-parse cost: BaseDeserializer reads the whole node to a tree (readValueAsTree(), BaseDeserializer.kt:26) then re-parses via readValue(treeAsTokens(node), type) in tryDeserialize (:31/38) — measurable on large payloads and multiplied per-candidate by validity-based union matching; scope the four-state model to generated DTOs, not the generic Serde fast-path. (c) tryDeserialize catches bare Exception (BaseDeserializer.kt:34/41) — swallows NPEs/programming errors into a silent JsonValue fall-through; if we replicate, catch narrowly (JsonProcessingException/MismatchedInputException). (d) additionalProperties is a live MutableMap (CompletionUsage.kt:29) defended only by Collections.unmodifiableMap at the getter and toMutableMap() copies — our convention is genuinely-immutable, so snapshot at construction. (e) underscore-prefix naming (_completionTokens, _value, _UNKNOWN) is unidiomatic Kotlin and would draw ktlint/detekt fire — keep the dual-accessor concept, pick lint-clean names. 
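+
+Antipattern (d) above can be illustrated with a minimal sketch of "snapshot at construction". Everything here is hypothetical: `UsageSketch` and its builder are illustrative names, not generator output, and the value type is `String` only because the `JsonValue` tree type does not exist in our code yet.
+
+```kotlin
+import java.util.Collections
+
+// Hypothetical DTO (assumption): stands in for a future generated model.
+class UsageSketch private constructor(
+    private val additionalProperties: Map<String, String>
+) {
+    // Safe to hand out: the map is an unmodifiable private copy.
+    fun additionalProperties(): Map<String, String> = additionalProperties
+
+    class Builder {
+        private val extras = LinkedHashMap<String, String>()
+
+        fun putAdditionalProperty(key: String, value: String) = apply { extras[key] = value }
+
+        // Copy first, then wrap: later builder mutations cannot reach the built instance.
+        fun build(): UsageSketch =
+            UsageSketch(Collections.unmodifiableMap(LinkedHashMap(extras)))
+    }
+}
+
+fun main() {
+    val builder = UsageSketch.Builder().putAdditionalProperty("region", "eu")
+    val built = builder.build()
+    builder.putAdditionalProperty("late", "x") // does not affect `built`
+    println(built.additionalProperties()) // prints {region=eu}
+}
+```
+
+The stored field is itself immutable, so correctness no longer depends on every getter remembering to wrap it.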
+ +WE-ARE-AHEAD claims all verified true and correctly scoped to dependency hygiene + the PATCH axis: our value model is in zero-dep sdk-core with no Jackson annotations (Tristate.kt) vs their Jackson-welded core; Present makes the illegal fourth state uncompilable; our JacksonSerde drives per-call generators/parsers with AUTO_CLOSE disabled (JacksonSerde.kt:114, 174) honoring caller-owned streams and never mutating a BYO mapper, which openai-java has no equivalent of because it owns a single shared mapper; and fold/getOrNull/named factories (Tristate.kt:83/99/112-135) are lighter than their Visitor pattern. The honest caveat stands: on raw model-layer CAPABILITY (forward-compat reads, additionalProperties, validation, unions) openai-java is far ahead because they have a generated model layer and we have none — which is precisely why the substantive findings all target the future codegen, not sdk-core. + +--- + +## 2. HTTP request/response model (HttpRequest, bodies, Headers, QueryParams) + +**What it is** + +openai-java models an outbound exchange as a DECONSTRUCTED HttpRequest: `baseUrl: String` + `pathSegments: List` + `queryParams: QueryParams` + `headers: Headers` + `body: HttpRequestBody?` (HttpRequest.kt:7-15). The final URL is assembled LAZILY on demand by `url()` (HttpRequest.kt:17-44), percent-encoding each path segment and each query key/value with `URLEncoder.encode(...,"UTF-8")`. `Headers` and `QueryParams` are near-identical case-folded multimaps (Headers uses `TreeMap(CASE_INSENSITIVE_ORDER)`, Headers.kt:37-38; QueryParams uses a plain `mutableMapOf`, QueryParams.kt:36) that BOTH expose a `put(name, JsonValue)` overload which flattens the codegen value tree into wire form — arrays become repeated keys / `key[]`, objects become `key[nested]` (Headers.kt:41-52, QueryParams.kt:39-50). 
The body abstraction is a 5-method interface: `writeTo(OutputStream)` / `contentType(): String?` / `contentLength(): Long` / `repeatable(): Boolean` / `close()` (HttpRequestBody.kt:6-25). Note that it writes to a raw JDK `OutputStream` and carries a checked `AutoCloseable`. Concrete bodies are anonymous objects minted in HttpRequestBodies.kt: a lazy-serialized JSON body (json(), :19-33), and a hand-rolled `MultipartBody` that writes its own boundary frames and keeps `contentLength()` byte-exact by a parallel manual tally that "must remain in sync with writeTo" (:121-181). The transport SPI `HttpClient` is FOUR methods (`execute`/`executeAsync`, each overloaded with and without `RequestOptions`; HttpClient.kt:7-26) plus `AutoCloseable`. `HttpResponse` is a tiny `AutoCloseable` interface returning `statusCode()/headers()/body(): InputStream` (HttpResponse.kt:8-30), and `HttpResponseFor` adds a `parse()` whose result is memoized via `by lazy` in the `parseable` decorator (HttpResponseFor.kt:11-25). The whole subsystem hard-depends on Jackson (`JsonMapper`, `JsonNode`) inside core (HttpRequestBodies.kt:7-9), which is exactly the coupling our sdk-core forbids. OUR equivalent: an immutable `Request` data class holding a fully-resolved `java.net.URL` (Request.kt:42-47, no path/query decomposition), a richer `RequestBody` abstract class that writes to our `BufferedSink` and adds `isReplayable()`/`toReplayable()` (RequestBody.kt:38-83), a typed-plus-string `Headers` multimap (Headers.kt), and TWO single-method `fun interface` SPIs `HttpClient.execute` and `AsyncHttpClient.executeAsync` (HttpClient.kt:46, AsyncHttpClient.kt:65). We have NO query-param model at all: `QueryParam.kt` is a literal `TODO()` placeholder. + +**How it works (line-level)** + +DECONSTRUCTED URL, lazy assembly: `fun url(): String = buildString { append(baseUrl); pathSegments.forEach { segment -> if (!endsWith("/")) append("/"); append(URLEncoder.encode(segment, "UTF-8")) } ... }` (HttpRequest.kt:17-44).
VALUE-TREE FLATTENING into query keys: `is JsonArray -> value.values.forEach { put("$key[]", it) }` and `is JsonObject -> value.values.forEach { (nestedKey, value) -> put("$key[$nestedKey]", value) }` (QueryParams.kt:46-48); Headers does the dotted-path variant `put("$name.$nestedName", value)` (Headers.kt:50). MULTIMAP `put` keeps a manual `size` counter separate from the map: `map.getOrPut(name){mutableListOf()}.add(value); size++` and `remove` does `size -= map.remove(name).orEmpty().size` (Headers.kt:54-57, 85). BODY writes to raw JDK stream: `fun writeTo(outputStream: OutputStream)` (HttpRequestBody.kt:8); JSON body lazy-serializes once and reuses bytes: `private val bytes: ByteArray by lazy { jsonMapper.writeValueAsBytes(value) }; override fun repeatable() = true` (HttpRequestBodies.kt:22-30). MULTIPART byte-exact length is hand-summed and flagged fragile: `// This must remain in sync with writeTo.` then a literal tally `byteCount += DASHDASH.size + boundaryBytes.size + CRLF.size + ... + contentLength + CRLF.size`, returning `-1L` if ANY part is unknown-length (HttpRequestBodies.kt:154-181); `repeatable() = parts.all { it.body.repeatable() }` (:183). Multipart header-injection guard percent-escapes only CR/LF/quote in the disposition: `'\n' -> append("%0A"); '\r' -> append("%0D"); '"' -> append("%22")` (HttpRequestBodies.kt:230-241). PARSE-ONCE decorator: `private val parsed: T by lazy { parse() }` (HttpResponseFor.kt:14). SPI carries per-request timeout via RequestOptions, merged with client defaults by `applyDefaults` using `timeout.assign(...)` (RequestOptions.kt:24-31). The okhttp transport's body bridge is the entire payoff of the OutputStream choice — one line: `override fun writeTo(sink: okio.BufferedSink) = writeTo(sink.outputStream())` (OkHttpClient.kt:294); empty-body-required methods are back-filled with `if (body == null && requiresBody(method)) body = "".toRequestBody()` where requiresBody = POST/PUT/PATCH (OkHttpClient.kt:237-238, 264-271). 
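+
+The value-tree flattening rule described above (arrays become repeated `key[]` entries, objects become `key[nested]`) can be sketched in isolation. This is a minimal illustration under stated assumptions: `Node`, `Scalar`, `Arr`, `Obj`, and `flatten` are hypothetical stand-ins written for this note, not openai-java's `JsonValue` or its `put` overload, and no percent-encoding is applied here.
+
+```kotlin
+// Hypothetical stand-in for a JSON value tree (assumption, not a real library type).
+sealed class Node
+data class Scalar(val text: String) : Node()
+data class Arr(val items: List<Node>) : Node()
+data class Obj(val fields: Map<String, Node>) : Node()
+
+// Flattens a value tree into ordered (key, value) pairs:
+// arrays recurse under "key[]", objects recurse under "key[nested]".
+fun flatten(
+    key: String,
+    node: Node,
+    out: MutableList<Pair<String, String>> = mutableListOf()
+): List<Pair<String, String>> {
+    when (node) {
+        is Scalar -> out.add(key to node.text)
+        is Arr -> node.items.forEach { flatten("$key[]", it, out) }
+        is Obj -> node.fields.forEach { (nested, value) -> flatten("$key[$nested]", value, out) }
+    }
+    return out
+}
+
+fun main() {
+    val filter = Obj(
+        linkedMapOf(
+            "tags" to Arr(listOf(Scalar("a"), Scalar("b"))),
+            "limit" to Scalar("10")
+        )
+    )
+    // prints [(filter[tags][], a), (filter[tags][], b), (filter[limit], 10)]
+    println(flatten("filter", filter))
+}
+```
+
+Returning an ordered pair list rather than tracking a separate counter means the entry count is always derived from the data, which sidesteps the manual `size` drift noted above.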
+ +**vs. our SDK** + +URL: OURS stores a fully-resolved `java.net.URL` built in `RequestBuilder.url(String)` via `URL(url)` (Request.kt:137-140) and compares by `url.toExternalForm()` to dodge DNS-resolving `URL.equals` (Request.kt:52-71) — a genuinely good call they don't make. But we have NO `baseUrl`+`pathSegments`+`queryParams` decomposition and NO query model: `QueryParam.kt:18-20` is `internal class QueryParam { fun implementation(): Nothing = TODO() }`. Query edits are done by hand-rolled string surgery in `RequestRebuilder.setQueryParam` (RequestRebuilder.kt:74-103) which `split('&')`, `URLDecoder.decode` each key, and reassembles — re-implementing what a multimap + `url()` would give for free, and only for the single-value pagination case. BODY: OURS `RequestBody.writeTo(sink: BufferedSink)` (RequestBody.kt:52) writes to OUR io seam, NOT a JDK `OutputStream`; this forces every transport to bridge through `Io.provider.sink(os)` (SdkRequestBodyAdapter.kt:51-55) and makes a body unwritable until `Io.installProvider` has run. Our `RequestBody` is RICHER: `isReplayable()`/`toReplayable(provider)` (RequestBody.kt:62-83) is the explicit retry-buffering seam their `repeatable()` boolean only hints at; we ship 7 concrete `create(...)` factories (RequestBody.kt:88-213) incl. `mark/reset`-aware `InputStream` replay (RequestBody.kt:163-175) and form-urlencoded (RequestBody.kt:203-212), plus `FileRequestBody` with byte-range `position`/`count` and a `toByteBuffer()` mmap (FileRequestBody.kt). HEADERS: ours matches their multimap and ADDS a typed `HttpHeaderName` API alongside the `String` one (Headers.kt:49-74) and CR/LF injection rejection at build time (Headers.kt:325-335); we deep-copy to unmodifiable lists on `build()` (Headers.kt:288-298) where they only `toImmutable()`. 
SPI: OURS is two 1-method `fun interface`s — `HttpClient.execute(request): Response` (HttpClient.kt:46-51) and `AsyncHttpClient.executeAsync(request): CompletableFuture` (AsyncHttpClient.kt:65-72) — SAM-constructible, with `asAsync(executor)`/`asBlocking()` bridges (AsyncHttpClient.kt:101-140). Theirs is one interface with 4 methods carrying `RequestOptions`. RESPONSE: ours is a full immutable data class with `request`/`protocol`/`status`/`message`/`headers`/`body` (Response.kt:41-48) vs their 4-method interface; our `ResponseBody` exposes `source(): BufferedSource` not `InputStream` (ResponseBody.kt:64). + +**Recommendations (verified)** + +- **Adopt a shared checkRequired(name, value) builder-validation helper to replace inconsistent inline checkNotNull messages** `COPY` · `both` · effort S · confidence high + - *Verdict:* Accurate and Jackson-free at the function level. Check.kt:8-12 defines checkRequired(name, Boolean) and checkRequired(name, T?):T producing a uniform message `\`$name\` is required, but was not set`; used at HttpRequest.kt:172-173. The Jackson coupling in that file (checkJacksonVersionCompatibility, :49-96) lives in SEPARATE functions — the two checkRequired overloads import nothing beyond stdlib check/checkNotNull, so they are safe to mirror. Our builders are genuinely inconsistent: Request uses checkNotNull(method){"Method is required."} and checkNotNull(url){"URL is required."} (Request.kt:254-255), while Response uses checkNotNull(request){"request is required"} / "protocol is required" / "status is required" (Response.kt:255-257) — different capitalization, punctuation, and backtick style. Honest sizing: this is the smallest item in the set; for two builders it is cosmetic. Its real payoff is at codegen scale (hundreds of generated builders emitting one uniform 'X is required' message). Worth doing, but rank it last. 
+ - *Do:* Add checkRequired(name: String, value: T?): T (and a Boolean overload) to core util, mirroring Check.kt:8-12 but WITHOUT the Jackson-version siblings. Replace the inline checkNotNull literals in Request/Response builders and have codegen-emitted builders use it. Attribute openai (Apache-2.0) if copied verbatim — though the function is trivial enough that an independently-written equivalent avoids the attribution question. +- **Build a first-class QueryParams multimap to replace the TODO() placeholder and the string-surgery in RequestRebuilder** `ADOPT` · `sdk-core` · effort M · confidence high + - *Verdict:* Verified end to end. openai QueryParams.kt:15-109 is a complete multimap (keys/values/put/replace/removeAll/clear, manual size counter, immutable build, value equality); the flattening put(JsonValue) is at :39-50 (arrays -> key[], objects -> key[nested]). Our http/QueryParam.kt:18-20 is verbatim `internal class QueryParam { fun implementation(): Nothing = TODO() }`. Pagination really does hand-roll URL string surgery: RequestRebuilder.setQueryParam (RequestRebuilder.kt:74-103) splits on '&', URLDecoders each key, reassembles; decodeOrRaw (:127-138) swallows malformed percent-encoding. The string code only handles single-valued params and re-implements rebuildUrl (:140-165) by hand. One correction to the finding's framing: openai's put(String,String) does NOT percent-encode (encoding is deferred to url() at HttpRequest.kt:24,38-40), so a ported QueryParams must decide where encoding happens — that decision is bound to Finding 2 (we store a resolved URL with nowhere natural to hang a live multimap). The finding itself flags this dependency, which is the right call. Note the size-counter pattern is a documented drift hazard (their own antipattern #5); a ported version should derive size, not track it. + - *Do:* Delete QueryParam.kt. 
Add a QueryParams type modeled on our existing Headers.kt (`LinkedHashMap<String, List<String>>` + Builder, case-SENSITIVE keys, add/set/remove/values/names, build() deep-copies to unmodifiable lists, size derived not tracked). RESOLVE Finding 2 first (placement vs resolved-URL). Then rewrite RequestRebuilder to operate on QueryParams instead of raw strings, giving multi-value query support we currently lack. Zero-dep, Java-8, immutable+Builder: no constraint conflict. +- **Add a per-request options channel (per-call timeout, response-validation) carried on the context chain, NOT on the transport SPI** `ADOPT` · `sdk-core` · effort M · confidence high + - *Verdict:* Verified. openai threads RequestOptions through every SPI call (HttpClient.kt:9-22) and the okhttp transport reads it to override per-call timeouts via okHttpClient.newBuilder().connectTimeout/readTimeout/writeTimeout/callTimeout (OkHttpClient.kt:95-101). RequestOptions.applyDefaults merges request-level over client-level (RequestOptions.kt:23-29) and carries responseValidation + timeout. Our side: grep confirms RequestOptions does NOT exist anywhere in sdk-core/src/main, and both SPIs are single-method fun interfaces, HttpClient.execute(request) (HttpClient.kt:46-51) and AsyncHttpClient.executeAsync(request) (AsyncHttpClient.kt:65-72), with asAsync/asBlocking bridges. So the capability gap is real: a caller cannot set a one-off timeout for a single send. The finding's key judgment is correct and important: do NOT widen the fun interface (that kills SAM literals like `HttpClient { req -> ... }`, an ergonomics win they lack, and breaks binary-compat). Carry options on the existing CallContext->DispatchContext->RequestContext->ExchangeContext chain and let transports that support overrides read them. java.time.Duration is Java-8-safe. The one open question the finding under-specifies is whether the context chain currently has a slot to carry such options; that should be checked during design.
+ - *Do:* Define a small immutable RequestOptions (timeout as java.time.Duration; optionally a responseValidation analog) and thread it through the context promotion chain, not the SPI. Transports that support per-call overrides read it from context; others ignore it. Keep HttpClient/AsyncHttpClient single-method by design and document why. Verify the context chain can carry it (add a field if not). +- **FUTURE CODEGEN: a core MultipartRequestBody (boundary framing + byte-exact length) writing to BufferedSink, with non-file part serialization injected via Serde (no Jackson in core)** `ADOPT` · `codegen` · effort L · confidence high + - *Verdict:* Accurate. grep confirms we have ZERO multipart implementation (only MediaType.kt/CommonMediaTypes.kt name the string). openai's MultipartBody (HttpRequestBodies.kt:121-243) hand-writes boundary frames, computes a byte-exact contentLength via a parallel tally explicitly annotated 'must remain in sync with writeTo' (:126,154-181), streams InputStream parts without buffering when possible (:45-48), reports repeatable = parts.all{repeatable} (:183), and closes every part (:185-187). MultipartField (Values.kt:609-723) carries value+contentType+filename with the documented octet-stream/text-plain default at :693-700. The finding's two mandated changes are exactly right and non-negotiable for us: (1) write to our BufferedSink so file parts dispatch FileRequestBody.transferTo (FileRequestBody.kt:106-126) instead of copyTo; (2) non-file value serialization must route through the Serde seam (sdk-serde-jackson), because openai HARD-imports Jackson into core (HttpRequestBodies.kt:7-9) which our zero-dep rule forbids. The byte-exact-length technique is worth copying (lets transports set Content-Length instead of chunked), but the dual-method drift hazard (their antipattern #2) is real — drive both writeTo and contentLength from ONE shared per-part frame-size function so they cannot diverge. 
Correctly targeted at codegen (multipart is per-endpoint). Apache-2.0: attribute if framing is lifted near-verbatim. + - *Do:* Plan a core MultipartRequestBody: boundary framing + a SINGLE shared frame-size function feeding both writeTo(BufferedSink) and contentLength() (avoid their two-independent-literal drift trap); FileRequestBody parts stream zero-copy; plus a MultipartField-like part descriptor (value+contentType+filename, same smart defaults). Codegen builds the field map and supplies a Serde for non-file parts; core orchestrates bytes only, no Jackson. Attribute openai (Apache-2.0) if the framing is copied verbatim. +- **Hold the line on BufferedSink body writes; record why it is intentional (do NOT regress to writeTo(OutputStream))** `LEARN` · `docs/process` · effort S · confidence high · we partly do this + - *Verdict:* Accurate, and the evidence is stronger than the finding states. openai's HttpRequestBody.writeTo(OutputStream) (HttpRequestBody.kt:8) makes the okhttp forward bridge a one-liner (toRequestBody at OkHttpClient.kt:294: `writeTo(sink.outputStream())`), but the REVERSE adapter (:328-332) immediately re-wraps the OutputStream back into an okio sink (`outputStream.sink().buffer()`) — concrete proof the OutputStream abstraction loses information and forces a re-wrap, exactly as their okhttp adapter demonstrates. Our RequestBody.writeTo(BufferedSink) (RequestBody.kt:52) costs a per-transport wrap (SdkRequestBodyAdapter.kt:51-53 double .use through Io.provider.sink(os)) and makes bodies unusable before Io.installProvider runs — but that seam is what enables FileRequestBody.transferTo zero-copy (FileRequestBody.kt:106-126), TeeSink logging, and segment reuse, none of which the OutputStream model can express without copying. So the direction is correct and we already do the better thing (weAlreadyDoIt=true). 
RE-CATEGORIZED from SIMPLIFY to LEARN: the original 'SIMPLIFY' label is misleading — there is no simplification to perform; this is a 'do not let someone simplify it away' guardrail. Target moved to docs/process: the only action is a one-line KDoc rationale, not code. + - *Do:* Add a one-line rationale to RequestBody.writeTo KDoc: BufferedSink (not OutputStream) is deliberate to enable transferTo/TeeSink zero-copy; the per-transport sink wrap is the accepted cost; do not regress to OutputStream. No code change. Low priority. +- **FUTURE CODEGEN: generated typed responses should wrap core Response with a lazy parse-once decorator (a la HttpResponseFor.parseable)** `LEARN` · `codegen` · effort S · confidence medium + - *Verdict:* Accurate but low-novelty. openai splits the raw transport response (HttpResponse: statusCode/headers/body:InputStream/close, HttpResponse.kt:8-30) from the typed view (HttpResponseFor adds parse(), HttpResponseFor.kt:5-8), and parseable() memoizes via `private val parsed: T by lazy { parse() }` (:11-25), delegating status/headers/body/close to the wrapped response. Our Response (Response.kt:41-48) is correctly raw-only with no parse() seam — that is right, deserialization is codegen/serde territory. The SHAPE guidance (codegen emits a thin typed wrapper around core Response that lazily deserializes once via Serde, exposing raw status/headers/body + parsed value) is sound. Honest assessment: `by lazy` memoization of a parse is a standard Kotlin idiom, not a clever trick — confidence dropped to medium and value is modest. It is a useful note to bank for the codegen design doc, not a discovery. Correctly LEARN/codegen. + - *Do:* Record in docs/refs-comparison.md (or the codegen design doc) that generated typed responses wrap core Response with a lazy parse-once decorator, delegating status/headers/body/close and memoizing the deserialized value through our Serde seam (never Jackson directly). 
Design note only; revisit when the KotlinPoet generator is built. +- **Decide: resolved-java.net.URL model vs a deconstructed baseUrl/pathSegments/queryParams request (root fork for this subsystem)** `LEARN` · `sdk-core` · effort L · confidence high + - *Verdict:* Accurate. openai keeps URL in pieces (HttpRequest.kt:10-14) and assembles+encodes lazily in url() (:17-44), percent-encoding each path segment and each query pair with URLEncoder at serialize time. We store one fully-resolved java.net.URL built eagerly via URL(string) (Request.kt:139). The finding correctly identifies the one place WE are genuinely ahead: Request compares/hashes by url.toExternalForm() (Request.kt:52-71) with an explicit, correct rationale that java.net.URL.equals does blocking DNS and is wrong for virtual hosts — openai's HttpRequest has no equals/hashCode override at all (it relies on field equality of a recomputed string). The cost of our choice is real and is exactly what Findings 1 and the RequestRebuilder string surgery expose: structured URL manipulation has nowhere to live. This is correctly categorized LEARN (a pre-1.0 architectural decision, not a mechanical change) and is the blocker for Finding 1. It is the single highest-leverage item in the set. + - *Do:* Treat as a design decision to make NOW (0.0.1-alpha.1). Prototype a deconstructed Request (scheme/host/port/userInfo + pathSegments + QueryParams + headers + body) with a cached resolved-URL accessor that STILL compares by external form (preserve our DNS-free equality — do not lose it). Compare ergonomics against keeping java.net.URL + a sidecar QueryParams. Assemble with java.net.URI/URLEncoder only (stay zero-dep). Record the decision in docs/architecture.md before refactoring; Finding 1 depends on the outcome. 
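A minimal Kotlin sketch of the deconstructed-request prototype described in the recommendation above, assuming a hypothetical `RequestTarget` type (neither this class nor its `externalForm` property exists in sdk-core today). It shows one way to keep the URL in pieces, assemble and percent-encode it lazily with `java.net.URLEncoder` only, and still compare by external form so that equality never triggers a DNS lookup. Headers, body, userInfo, and IPv6 host literals are left out for brevity, and `URLEncoder` is form-oriented, so this is not a complete RFC 3986 encoder.

```kotlin
import java.net.URLEncoder

// Hypothetical sketch: RequestTarget and its members are illustrative names,
// not part of sdk-core.
class RequestTarget(
    val scheme: String,
    val host: String,
    val port: Int = -1, // -1 means "use the scheme's default port"
    segments: List<String> = emptyList(),
    query: List<Pair<String, String>> = emptyList(),
) {
    // Copies, so later changes to the caller's lists do not leak in.
    val pathSegments: List<String> = segments.toList()
    val queryParams: List<Pair<String, String>> = query.toList()

    // Assembled and percent-encoded once, on first access, then cached.
    val externalForm: String by lazy {
        val sb = StringBuilder(scheme).append("://").append(host)
        if (port >= 0) sb.append(':').append(port)
        for (segment in pathSegments) sb.append('/').append(encode(segment))
        queryParams.forEachIndexed { i, (name, value) ->
            sb.append(if (i == 0) '?' else '&')
                .append(encode(name)).append('=').append(encode(value))
        }
        sb.toString()
    }

    // Equality by external form: java.net.URL.equals is never called,
    // so comparing two targets cannot block on name resolution.
    override fun equals(other: Any?): Boolean =
        other is RequestTarget && other.externalForm == externalForm

    override fun hashCode(): Int = externalForm.hashCode()

    private fun encode(s: String): String =
        // URLEncoder emits '+' for a space; rewrite it to %20. A literal '+'
        // in the input is already encoded as %2B, so this is unambiguous.
        URLEncoder.encode(s, "UTF-8").replace("+", "%20")
}
```

With this sketch, `RequestTarget("https", "api.example.com", segments = listOf("v1", "a b"), query = listOf("tag" to "x&y", "tag" to "z")).externalForm` yields `https://api.example.com/v1/a%20b?tag=x%26y&tag=z`, and two targets built from the same parts are equal. Repeated query names are kept in insertion order, which is the multi-value behavior the QueryParams recommendation asks for.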
+ +**Considered & dropped** + +- ~~(no findings dropped as inaccurate or already-done-and-superior)~~ — All seven original findings survived verification — the source analysis was unusually rigorous and every file:line citation checked out (with only minor framing nits, e.g. QueryParams flattens arrays to key[] not 'repeated keys', and the SdkRequestBodyAdapter wrap is lines 47-55 not 51-55). I re-categorized two items rather than dropping them: Finding 3 (BufferedSink) moved SIMPLIFY->LEARN and target sdk-core->docs/process because we ALREADY do the better thing (weAlreadyDoIt=true) and there is no code change — it is a guardrail note, the lowest-value item. Finding 6 (lazy parse decorator) kept but confidence lowered to medium since `by lazy` memoization is a standard idiom, not a novel technique. The 'weAreAhead' claims in the source were spot-checked (openai Headers.Builder.put has NO CR/LF guard vs our validateValues at Headers.kt:325-335; our RequestBody consume-once AtomicBoolean CAS guards at RequestBody.kt:307,335; DNS-free URL equality at Request.kt:52-71) and all hold. + +**Do not copy** + +1) HARD JACKSON DEPENDENCY IN CORE: HttpRequestBodies.kt:7-9 imports com.fasterxml.jackson.databind.{JsonNode,JsonMapper,node.JsonNodeType} directly into the core http package, and Check.kt embeds a whole Jackson-version-compatibility checker (Check.kt:60-110) referencing five Jackson PackageVersion classes. This is exactly the coupling our sdk-core forbids (zero runtime deps beyond SLF4J). Do NOT replicate — multipart/JSON serialization must stay behind our Serde seam in adapter modules; core only orchestrates bytes through BufferedSink. 2) FRAGILE PARALLEL contentLength TALLY: MultipartBody.contentLength() (HttpRequestBodies.kt:154-181) is a hand-summed mirror of writeTo() guarded only by a comment 'This must remain in sync with writeTo' (:126,154). 
The technique (byte-exact length so Content-Length can be set) is worth copying, but the dual-maintenance hazard is real — if we adopt it, the two methods should be driven from one shared frame-size function so they cannot drift, rather than two independent literals. 3) RequestOptions ON THE TRANSPORT SPI: threading RequestOptions through execute/executeAsync (HttpClient.kt:9-22) widens the SPI to 4 methods and kills SAM-constructibility. We must carry per-request options on the context/pipeline instead and keep our fun interface single-method. 4) BODY writeTo(OutputStream) + checked-AutoCloseable on the body interface (HttpRequestBody.kt:6,24): the OutputStream surface blocks zero-copy file transfer; their own okhttp adapter immediately re-wraps it back into an okio sink (OkHttpClient.kt:294,328-332), proving the abstraction loses information. Our BufferedSink choice is better — don't regress. 5) MUTABLE size counter desync risk: Headers/QueryParams track size as a separate field mutated on every put/remove (Headers.kt:39,57,85) rather than deriving it; correct here but an easy place to introduce a bug — our Headers wisely exposes no such counter. + +**Where we're ahead** + +Several concrete places, all citable: (1) HEADER-INJECTION DEFENSE AT THE MODEL LAYER — our Headers.Builder rejects CR/LF in values at build time (Headers.kt:325-335) to stop request/header splitting uniformly across transports; theirs has no such guard on Headers at all (only the multipart disposition escapes CR/LF, HttpRequestBodies.kt:230-241), so a raw header value with \\n reaches the transport. (2) DNS-FREE URL EQUALITY — Request compares/hashes by url.toExternalForm() with an explicit rationale that java.net.URL.equals does blocking DNS and is wrong for virtual hosts (Request.kt:30-71); their HttpRequest equality is the default and url() is a recomputed string. 
(3) EXPLICIT RETRY-REPLAY SEAM — RequestBody.isReplayable()/toReplayable(provider) (RequestBody.kt:62-83) is a richer, actionable contract than their lone repeatable() boolean (HttpRequestBody.kt:21); we additionally provide mark/reset-aware InputStream replay (RequestBody.kt:163-175) and a documented partial-write hazard. (4) RACE-SAFE CONSUME-ONCE GUARDS — single-use bodies use AtomicBoolean CAS so concurrent writeTo fails loudly instead of emitting corrupt/zero bytes (RequestBody.kt:298-313, 327-340); their one-shot multipart parts have no such guard (HttpRequestBodies.kt:74-87). (5) ZERO-COPY FILE BODY WITH BYTE-RANGE — FileRequestBody.transferTo + position/count slicing + toByteBuffer() mmap (FileRequestBody.kt:106-160); they have no file-body type, only generic InputStream parts. (6) MEDIATYPE ROUND-TRIP CORRECTNESS — our MediaType quotes/escapes parameter values and splits respecting quotes so parse(toString(x))==x for boundaries containing ';' (MediaType.kt:87-113, 237-267); they have no MediaType model in core (they lean on okhttp's). (7) TYPED HEADER NAME API — HttpHeaderName overloads alongside the String API (Headers.kt:49-74,170-185), absent in their Headers. (8) DEFENSIVE-COPY IMMUTABILITY — our Headers.build() and entries() return per-list Collections.unmodifiableList copies that reject setValue even via cast (Headers.kt:96-105,288-298); theirs only toImmutable() the outer structure. + +_Verifier notes:_ Verification method: read all nine openai reference files line-by-line plus the openai okhttp transport (OkHttpClient.kt:92-105, 283-336) and Values.kt MultipartField (609-723); read all ten cited dexpace files; grepped to confirm absences (RequestOptions: zero hits in sdk-core/src/main; multipart: only MediaType/CommonMediaTypes name it; QueryParam.kt is a literal TODO stub). + +Decision-ready priority order: +1. Finding 2 (URL model fork, LEARN, sdk-core) — ROOT decision; blocks Finding 1; highest leverage; make the call now while pre-1.0. +2. 
Finding 1 (QueryParams multimap, ADOPT, sdk-core) — depends on #2; deletes dead QueryParam.kt and the RequestRebuilder string surgery; gives multi-value query support. +3. Finding 4 (per-request options on context, ADOPT, sdk-core) — real capability gap; keep SPI single-method, carry on context chain. +4. Finding 5 (multipart, ADOPT, codegen) — large, future codegen work; copy the byte-exact-length idea but drive writeTo + contentLength from ONE shared frame-size fn (avoid their drift hazard); Serde-injected, BufferedSink, zero-dep core. +5. Finding 7 (checkRequired helper, COPY, both) — small consistency win; payoff is at codegen scale. +6. Finding 6 (lazy parse decorator, LEARN, codegen) — design note for the codegen doc; modest novelty. +7. Finding 3 (hold-the-line BufferedSink, LEARN, docs/process) — one-line KDoc; we already do the better thing. + +Constraint compliance: every item respects Java-8 (Duration/UUID/byte arrays only), zero-dep sdk-core (multipart value serialization and parse routed through Serde, never Jackson-in-core), single-method fun-interface SPIs (options ride the context chain, not the SPI), and immutable+Builder. The source analysis's antipattern list (Jackson-in-core, fragile parallel contentLength tally, RequestOptions-on-SPI, OutputStream body surface, mutable size counter) is accurate and correctly warns us off the traps; I folded those warnings into the relevant recommendations. + +Two openai claims slightly imprecise in the source summary but immaterial to conclusions: (a) QueryParams arrays serialize to `key[]` (QueryParams.kt:46), Headers arrays to repeated same-name keys (Headers.kt:48), Headers objects to `name.nested` (Headers.kt:50) vs QueryParams objects to `key[nested]` (:48) — the summary said 'arrays become repeated keys / key[]' which conflates the two types; (b) the SdkRequestBodyAdapter wrap spans lines 47-55, the source cited 51-55. + +--- + +## 3. 
HttpClient decorator stack (composition pattern) + proxy + workload http + +**What it is** + +openai-java composes cross-cutting HTTP behavior by NESTING HttpClient decorators. `HttpClient` (HttpClient.kt:7) is a 2-method SPI (`execute`/`executeAsync` + `close`). At client-build time the runtime wraps it: `RetryingHttpClient` ⊃ `LoggingHttpClient` ⊃ `WorkloadIdentityHttpClient` ⊃ `PhantomReachableClosingHttpClient` ⊃ base okhttp client. Each decorator implements the same interface, delegates to its inner `httpClient`, and adds one concern. Ordering is encoded purely by nesting order — retry is outermost, so a retry re-enters logging+auth on every attempt. There is no central registry, no stage enum, no exactly-one enforcement: the order lives in whatever builder assembles the chain. `PhantomReachableClosingHttpClient` is a pure safety-net decorator that registers the inner client with a `java.lang.ref.Cleaner` (reflectively, Java-8-safe — PhantomReachable.kt:31) so a forgotten `close()` is eventually honored. `ProxyAuthenticator` is a `fun interface` (407 → Optional) with a Base64 Basic factory. `WorkloadIdentityHttpClient` is the OAuth-token decorator: stamps `Authorization: Bearer`, and on a 401 closes the response, invalidates the cached token, and throws `OpenAIRetryableException` so the outer `RetryingHttpClient` re-runs the whole stack and re-fetches the token. OUR equivalent is a single ordered `Stage` pipeline (Stage.kt) where each concern is an `HttpStep` placed by a declared stage; re-driving (retry/redirect) is done via `PipelineNext.copy()` rather than by re-entering a wrapper. + +**How it works (line-level)** + +RetryingHttpClient.kt:44-73 is the sync retry loop: `while(true)` → `httpClient.execute` → `if (++retries > maxRetries || !shouldRetry(response)) return response`; on throw, `null` and loop. 
Line 71 `response?.close()` closes the failed response BEFORE `sleeper.sleep(backoffDuration)`, the same "close-before-sleep" discipline as our DefaultRetryStep.kt:314. Eligibility is `request.body?.repeatable() ?: true` (RetryingHttpClient.kt:143). Note this is method-AGNOSTIC: a non-repeatable-body POST is not retried, but a repeatable-body POST IS auto-retried with no idempotency-key requirement. Backoff: `min(0.5 * 2.0.pow(retries-1), 8.0)` with `1.0 - 0.25*random` jitter (lines 222-225), preceded by Retry-After-Ms / Retry-After / RFC_1123 date parsing (lines 197-214). Retry-count observability: line 146 sets `X-Stainless-Retry-Count` per attempt unless the caller already set it (line 84). Async retry (lines 88-132) is a recursive `executeWithRetries` driven by `responseFuture.handleAsync(...) { it.run() }` (same-thread executor) + `.thenCompose(Function.identity())` to flatten the resulting nested `CompletableFuture`. The sleeper: DefaultSleeper.kt:12 `override fun sleep(duration) = Thread.sleep(duration.toMillis())`: carrier-pinning under Loom, truncates sub-ms delays to 0, and a raw InterruptedException escapes `RetryingHttpClient.execute` un-normalized (flag cleared, not restored, not wrapped as InterruptedIOException). Async sleep uses one `java.util.Timer` (DefaultSleeper.kt:10): a single thread that dies permanently on an uncaught task exception. LoggingHttpClient.kt:53-67 wraps execute: snapshot `before`, on throw `logFailure` then rethrow; on success `logResponse`. Because retry wraps logging, EACH attempt is logged. The clever bit is LoggingBuffer (LoggingHttpClient.kt:424-539): a streaming line-by-line body logger that, when charset is unknown, prefetches up to 256 bytes (PROBABLY_UTF8_BYTE_LIMIT) and runs `isProbablyUtf8` (line 556; rejects non-whitespace ISO control chars) and prints "(binary body omitted)" instead of garbage. 
PhantomReachableClosingHttpClient.kt:13-15 `init { closeWhenPhantomReachable(this, httpClient) }` with the guard (PhantomReachable.kt:15) that observed !== closeable (else it never becomes phantom-reachable). WorkloadIdentityHttpClient.kt:24-28: `if (response.statusCode() == 401) { response.close(); workloadIdentityAuth.invalidateToken(); throw OpenAIRetryableException(...) }`. + +**vs. our SDK** + +We do NOT use decorators; we use an ordered stage pipeline. Stage.kt:26-49 declares stages with sparse `order` (REDIRECT=100, RETRY=200, AUTH=400, LOGGING=700, SEND=1300) and an `isPillar` flag. HttpStep.kt:31 is `process(request, next): Response` + a `val stage`. Re-driving is `PipelineNext.copy()` (PipelineNext.kt:54) which forks a fresh `PipelineCallState` cursor (PipelineCallState.kt:54). Our retry is DefaultRetryStep.kt (pillar at Stage.RETRY) — an iterative loop (line 182) calling `next.copy().process()` per attempt (line 277); our logging pillar is at Stage.LOGGING (InstrumentationStep.kt:25), which sits INSIDE retry by virtue of order 700 > 200, giving us the same per-attempt-logging property openai gets from nesting. Our ordering is enforced and visible (one enum), theirs is implicit in builder nesting. We have a SECOND, parallel retry: pipeline/step/retry/RetryStep.kt — a `ResponseRecoveryStep` folding `ResponseOutcome` (ResponsePipeline.kt:76) — which is the recovery-as-data model, and it uses ScheduledExecutorService + CompletableFuture.get (RetryStep.kt:291-322), NOT Thread.sleep. Our transports own proxy: OkHttpTransport.kt:339-386 (`applyProxy`, Basic via okhttp `Authenticator`, NonProxyHostSelector); ProxyOptions.kt is the core config carrier; the JDK transport delegates Basic/Digest to `java.net.http`. We have NO phantom-reachable safety net (grep confirms zero `java.lang.ref` use in core/transports except an unrelated ContextStore). 
We have NO workload-identity / OAuth-401-eviction step shipped (AuthStep.kt:119 has the `authorizeRequestOnChallenge` HOOK for it, but no concrete impl). We do NOT emit a retry-count header. + +**Recommendations (verified)** + +- **Binary-body detection in body logging (isProbablyUtf8 sniff) — port the sniff into our preview path** `COPY` · `sdk-core` · effort S · confidence high + - *Verdict:* Reference verified: isProbablyUtf8 (LoggingHttpClient.kt:556-577) decodes with REPORT-on-malformed, walks up to 64 code points, returns false on any non-whitespace ISO control char or on CharacterCodingException; the LoggingBuffer prefetch/suppress (lines 449-490) accumulates up to PROBABLY_UTF8_BYTE_LIMIT=256 bytes before deciding, then nulls prefetchBuffer so steady-state is allocation-light. Java-8 safe (Character.codePointAt/charCount/isISOControl all 1.5+, CharsetDecoder is JDK). Our gap is real and I confirmed it: DefaultInstrumentationStep.utf8Preview (DefaultInstrumentationStep.kt:323-328) does `String(bytes, Charsets.UTF_8)`, which SILENTLY REPLACES invalid bytes with U+FFFD rather than detecting binary — a gzip/protobuf body logs as a run of replacement chars. So the foot-gun is real. TWO important corrections to the analysis: (1) the 'bloats logs' rationale is already mitigated — our previews are size-bounded to bodyPreviewMaxBytes (DefaultInstrumentationStep.kt:111,156,161), so only terminal-corruption remains as the live harm, not log bloat; (2) the liftable surface is ONLY isProbablyUtf8 + the '(binary body omitted)' decision applied to our already-captured preview ByteArray. Their prefetch/suppress streaming machinery is coupled to their line-by-line System.err model, which we do NOT share (we emit one bounded structured field), so do NOT lift LoggingBuffer wholesale — that would be over-engineering for our model. 
+ - *Do:* In DefaultInstrumentationStep.utf8Preview (and the request-side preview), when the body's Content-Type carries no charset, run a ported `isProbablyUtf8(bytes)` over the bounded snapshot; if it fails, emit the field as "(binary body omitted)" instead of a String(...) decode. Reuse the 64-code-point limit. Cite openai-java (Apache-2.0) in the file header. Skip porting LoggingBuffer's prefetch/suppress — it is for line-streaming, not our structured-field model. +- **Emit an optional, neutrally-named retry-count header per attempt** `ADOPT` · `sdk-core` · effort S · confidence medium + - *Verdict:* Reference verified: RetryingHttpClient stamps X-Stainless-Retry-Count each attempt (lines 45-47, 145-146) only if the caller hasn't set it (shouldSendRetryCount guard, lines 39-40). Our gap is real: DefaultRetryStep logs http.retry.try_count locally (DefaultRetryStep.kt:537) but stamps nothing on the wire. The capability is genuinely useful for server-side/proxy correlation of duplicate attempts. BUT I am downgrading confidence to medium and pushing back on the framing: (1) it is a thin, low-stakes nicety, not a high-value primitive — most servers don't act on it and our structured local logs + an idempotency key already give correlation; openai's own header is vendor-branded precisely because it's an OpenAI-internal telemetry convenience, not a standard. (2) The implementation must respect our documented request-immutability invariant (DefaultRetryStep.kt:147-160): stamp on the per-attempt copy, never mutate the template — the analysis flags this correctly. (3) Default OFF; the header name MUST be configurable/neutral (not X-Stainless-*, not an OpenAI brand) — a toolkit emitting a vendor header by default would be wrong. Net: worth doing because it's cheap and additive, but it is an S-effort convenience, not a headline feature. + - *Do:* Add an optional `retryCountHeader: HttpHeaderName? = null` to HttpRetryOptions (default null = off). 
In DefaultRetryStep, before each `next.copy().process()` on a retry attempt, if set and the caller hasn't supplied that header, stamp tryCount via `request.newBuilder().header(...)` on the per-attempt copy (honor the immutability invariant). Do the same in the recovery RetryStep if desired. Do not ship a branded default name. +- **Phantom-reachable Cleaner safety-net for forgotten close() — optional opt-in util + transport wiring** `ADOPT` · `both` · effort M · confidence high + - *Verdict:* Reference verified line-by-line. PhantomReachable.kt is exactly as described: reflective Cleaner via `by lazy` (line 31), no-op on JDK 8 when `Class.forName("java.lang.ref.Cleaner")` throws (line 52-54), and the `observed !== closeable` self-reference guard (line 15) is the genuine correctness subtlety — registering a client to observe itself means it never becomes phantom-reachable. The InvocationTargetException re-throw nuance (lines 43-47: rethrow RuntimeException/Error from the cause, wrap everything else) is real and worth preserving. Our gap is confirmed: zero java.lang.ref use in core/transports except an unrelated ContextStore WeakReference comment (ContextStore.kt:41). Our transports DO leak real resources if close() is skipped — OkHttpTransport owns a dispatcher executor + connection pool + cache (close() at lines 196-214); JdkHttpTransport owns an ExecutorService. The owned/BYO split is exactly where this belongs: owned=true clients (OkHttpTransport builder line 318, JdkHttpTransport builder) get the net; owned=false `create(...)` clients (lines 230 / 209-210) must NOT, since the caller owns lifecycle. Both transports already have idempotent, thread-safe close() (CAS latch, OkHttpTransport.kt:184) so a Cleaner-thread-driven close is safe — the ADOPT's own caveat holds. One correction to the framing: this is genuinely useful but it is a SAFETY NET for a programming error, not a feature; it must never become an excuse to not close. 
Also note the util itself is dep-free and Java-8-safe and belongs in sdk-core, but the WIRING is per-transport (in the adapter build() methods), so target is 'both', not 'sdk-core' alone as the analysis said. + - *Do:* Add `internal fun closeWhenPhantomReachable(observed: Any, c: AutoCloseable)` to sdk-core (e.g. core/util/Cleaners.kt) using the reflective-Cleaner pattern (Apache-2.0 — attribute openai-java in the file header), preserving the self-reference check AND the InvocationTargetException rethrow nuance. Do NOT add a `PhantomReachableClosingHttpClient` decorator to core — that contradicts our stage-pipeline model and would be a dead wrapper; instead register the net directly inside each transport's owned-path build() (`closeWhenPhantomReachable(this, underlyingClient)` keyed on a non-self sentinel) so owned=true clients are covered and BYO are not. Verify the no-op-on-JDK-8 degradation with a test on the Java-8 modules. Keep the observed object NOT equal to the closeable (register the transport facade as observed, the underlying client/executor as closeable). +- **OAuth/token 401-eviction AuthStep — ship a concrete invalidate-on-401 step (we have only the hook)** `ADOPT` · `sdk-core` · effort M · confidence high + - *Verdict:* Both sides verified. openai's WorkloadIdentityHttpClient (lines 18-31 sync, 41-54 async) does exactly what's claimed: stamp Bearer, on 401 close()+invalidateToken()+throw OpenAIRetryableException, leaning on the outer RetryingHttpClient to re-run and re-fetch. Our state confirmed: AuthStep has the `authorizeRequestOnChallenge` hook (AuthStep.kt:118-122, default returns null), BearerTokenAuthStep has the lock-guarded double-checked token cache (BearerTokenAuthStep.kt:63-95) and its KDoc explicitly invites this subclass (lines 47-51), docs/pipelines.md:267 lists 'Auth-401 eviction' as a planned recovery step — but no shipped concrete impl. 
The analysis's sharpest insight is accurate and worth preserving: because AUTH(400) runs INSIDE RETRY(200) by stage order, an AuthStep that evicts the token and throws a classified-retryable exception gets re-driven by DefaultRetryStep for free, the same free re-drive openai gets from decorator nesting. AuthStep already has BOTH wiring paths: it does its OWN single in-step retry via `next.copy().process(retryRequest)` (AuthStep.kt:94) when the hook returns a request, OR a thrown retryable maps to the outer retry. Caveat the analysis correctly raises and I confirm: the cross-origin marker (AuthStep.kt:69-76) means an evicted+re-stamped token is correctly NOT applied to a foreign host. One nuance to add: prefer the in-step hook path (return a re-stamped request) over the throw-retryable path, because the throw path depends on DefaultRetryStep classifying the exception as retryable AND the request being retry-safe. A bare POST (body-less, non-idempotent) would NOT be re-driven by DefaultRetryStep (isRetrySafe, DefaultRetryStep.kt:255-258), and neither would a request with a non-replayable body, so a throw-based eviction silently fails to retry on exactly those calls; the in-step hook has no such gate. + - *Do:* Ship a `CachedBearerTokenAuthStep` (or a refresh-on-401 flag on BearerTokenAuthStep) overriding authorizeRequestOnChallenge: on 401+WWW-Authenticate, evict the ReentrantLock-guarded cache and return a freshly-stamped request so AuthStep's built-in single retry (next.copy().process) re-drives. This avoids the retry-safety gate (isRetrySafe: method idempotency for body-less requests, body replayability otherwise) that the throw-retryable path would hit on non-idempotent requests. Keep the token provider an interface (BearerTokenProvider already is) so OAuth/workload-identity providers live in adapter modules, not core. Document the two wiring options but recommend the hook path as default. 
+- **DefaultRetryStep (our PRIMARY retry) uses Thread.sleep and pins Loom carriers — the analysis's 'we are already correct' is WRONG for the stage pipeline** `ADOPT` · `sdk-core` · effort M · confidence high · claim-qualified + - *Verdict:* This corrects the analysis's SIMPLIFY finding AND weAreAhead #2, both of which are factually wrong. The analysis claims 'our two retry impls use ScheduledExecutorService+CompletableFuture.get' and that 'DefaultRetryStep's sleepOrAbort does the same'. FALSE. Only the recovery-pipeline `pipeline.step.retry.RetryStep` uses the scheduler (RetryStep.kt:291-322, awaitDelay). Our PRIMARY stage-pipeline retry, DefaultRetryStep, sleeps via `clock.sleep(delay)` (DefaultRetryStep.kt:516) → SystemClock.sleep → `Thread.sleep(millis, nanos)` (Clock.kt:72). That IS a blocking Thread.sleep and WILL pin a virtual-thread carrier under Loom — the exact defect the analysis correctly identifies in openai's DefaultSleeper.kt:12 and that our own CLAUDE.md 'ReentrantLock over synchronized / Loom-safe' ethos warns against. So openai's antipattern #1 is only PARTIALLY something we avoid: of its three sub-defects, we avoid two (SystemClock.sleep forwards the nanos remainder so NO sub-ms truncation, Clock.kt:69-72; and it IS interrupt-correct — restores the flag and throws InterruptedIOException, DefaultRetryStep.sleepOrAbort lines 509-523 + Clock.kt:73-78), but we share the third (carrier pinning). For a virtual-thread-heavy consumer, every backoff window on the default stage pipeline parks a carrier. This is a real latent gap, not a guardrail note. Effort is genuinely S-M: the scheduler-based awaitDelay already exists in the sibling RetryStep and could be factored into Clock (e.g. an interruptible scheduler-backed sleep) or DefaultRetryStep could grow an injectable scheduler like RetrySettings.scheduler. + - *Do:* Decide deliberately whether DefaultRetryStep should keep Thread.sleep. 
If Loom-friendliness on the default pipeline matters (it should, given the SDK's stated Loom posture), give DefaultRetryStep a ScheduledExecutorService-backed wait mirroring RetryStep.awaitDelay (reuse the daemon DEFAULT_SCHEDULER pattern, RetryStep.kt:366-370), preserving the existing interrupt handling. At minimum, correct CLAUDE.md/docs and the analysis's false 'both impls use the scheduler' claim, and add a comment on DefaultRetryStep.sleepOrAbort stating the Thread.sleep is a known carrier-pinning point. Do NOT cite this as a place we are ahead of openai — on this axis the default path is at parity, not ahead. +- **ProxyAuthenticator-shaped 407 SPI — close the dead ProxyOptions.challengeHandler gap** `ADOPT` · `both` · effort M · confidence medium + - *Verdict:* Both sides verified. openai's ProxyAuthenticator (ProxyAuthenticator.kt:18-58) is a `fun interface` taking the 407 response and returning Optional, with a basic(user,pass,charset) factory defaulting to ISO-8859-1 (line 42) — the RFC-7617 charset detail is correct and worth matching. Our gap is real and worse than 'split': ProxyOptions.challengeHandler is a ChallengeHandler? slot (ProxyOptions.kt:56) documented as 'currently not honoured by any shipped transport' (lines 35-37); the OkHttp transport hard-codes Basic and logs a WARNING that challengeHandler is ignored (OkHttpTransport.kt:343-352), and the JDK transport delegates Basic/Digest to java.net.http. So a consumer configuring a Digest challenge handler silently gets Basic-or-nothing. This is a genuine correctness/usability bug (a configured field does nothing). Downgrading confidence to medium for one reason the analysis itself honestly flags: HTTP-proxy 407 for an HTTPS CONNECT tunnel is handled below our pipeline at socket setup, so a core-level pipeline step CANNOT see it for HTTPS-via-CONNECT — which is precisely why openai's interface is invoked BY the transport (it takes the raw 407), not as a pipeline step. 
So the honest fix is transport-level, not a core step; core's role is only to carry the SPI shape. Recategorizing from the analysis's LEARN to ADOPT, because there is concrete work (wire the dead field or replace it), not just an insight. Matching the ISO-8859-1 Basic default is a free correctness win regardless. + - *Do:* Either (a) honor the existing ProxyOptions.challengeHandler by adapting it to each transport's native proxy-authenticator hook (OkHttp Authenticator at OkHttpTransport.kt:369; JDK java.net.Authenticator), or (b) replace ChallengeHandler with a ProxyAuthenticator-shaped `(407 Response) -> Request?` SPI in core and wire it through both transports. Either way, fix the ISO-8859-1 Basic default and STOP shipping a config field that only logs a warning. Be explicit in docs that full custom-scheme proxy auth on HTTPS CONNECT requires transport cooperation. +- **Capture the deliberate 'stages, not decorators' trade-off in docs** `LEARN` · `docs/process` · effort S · confidence high + - *Verdict:* Trade-off analysis is accurate on both sides and the verdict is correct. openai's composition IS just nesting: each concern is `class X(inner): HttpClient`, ordering = construction order, re-drive = the wrapper's own loop calling inner.execute again (RetryingHttpClient.kt:44-73). It is trivially testable and has zero framework, but ordering is unenforced (nothing prevents logging-outside-retry or two nested RetryingHttpClients) and every concern is authored twice (sync lines 35-74 vs async 76-133). 
Our model genuinely buys what's claimed: enforced ordering via the Stage enum (Stage.kt:26-49), pillar exactly-one with replace-emits-warning (StagedSteps.installPillar line 179, HttpPipelineBuilder onPillarReplaced lines 31-38) — so a second retry install REPLACES rather than silently nesting, which structurally prevents double-retry; logging-inside-retry guaranteed by order (LOGGING=700 > RETRY=200); surgical insertAfter/replace/remove with cross-stage rejection (StagedSteps.kt:110-165). The cost the analysis names is real and the most useful part of this finding: HttpStep.process's copy()-to-re-drive contract (HttpStep.kt:36-41) is genuinely more error-prone than a decorator's plain inner.execute loop — a step author who forgets next.copy() silently resumes past already-visited steps. That asymmetry is worth documenting so step authors are warned. Verdict stands: we are correctly sized for a toolkit; this is rationale capture, not a code change. + - *Do:* Add a short 'Why stages, not decorators' section to docs/pipelines.md: state the two failure modes the stage model prevents (double-retry nesting; logging-outside-retry), the one cost it adds (the next.copy()-to-re-drive contract that decorators can't get wrong), and an explicit warning in the HttpStep author guidance that re-driving REQUIRES next.copy(). Pre-empts the 'isn't this over-engineered vs wrapping a client' review question. +- **Per-call RequestOptions threaded through the SPI — open design question for codegen** `LEARN` · `both` · effort L · confidence medium + - *Verdict:* Reference accurate: openai's HttpClient.execute takes a RequestOptions on every call (HttpClient.kt:9-22) and every decorator forwards it unchanged. 
Our gap is real: our transport SPI is `execute(request): Response` (client/HttpClient.kt:51), single-arg, and PipelineNext.process calls `state.httpClient.execute(state.request)` with no per-call options channel (PipelineNext.kt:30) — per-call data must ride on the Request (headers) or DispatchContext, and DispatchContext is NOT passed to the transport at all. For codegen-generated services, per-request timeout/header/idempotency overrides are a common feature, so this is a legitimate forward-looking question. Keeping at LEARN (not ADOPT) is correct: it is a deliberate design decision to make BEFORE codegen, not a bolt-on, and our 'everything immutable on the Request' leaning is a valid alternative to widening the SPI. The analysis is appropriately non-prescriptive. One nuance: widening the 1-arg transport SPI is the single biggest conflict with our stated 'transport only knows send one Request' minimalism (client/HttpClient.kt:18-20), so option (1) — a RequestOptions carried INSIDE Request, keeping the 1-arg SPI — is the more on-brand choice for us and should be the leading candidate, not a coin-flip. + - *Do:* Decide where per-call config lives before codegen lands; lead with option (1): an optional per-call RequestOptions/overrides object carried inside Request (or DispatchContext threaded into PipelineNext→transport as a fallback), which preserves the 1-arg transport SPI and our toolkit minimalism. Document the decision in docs/architecture.md. Do not widen the transport SPI per-service later. + +**Considered & dropped** + +- ~~Decorator-stack vs stage-pipeline simplicity lesson (the LEARN, as originally framed alongside the SIMPLIFY guardrail)~~ — NOT dropped as a finding — kept and merged into 'Capture the stages-not-decorators trade-off in docs'. 
Listing here only to note the analysis split one idea (the trade-off) across a LEARN item and a SIMPLIFY item; I consolidated the rationale-capture into the single docs LEARN and converted the SIMPLIFY into a corrected ADOPT (DefaultRetryStep Thread.sleep gap), because the SIMPLIFY's premise ('we are already correct, no change needed') was factually false for the stage pipeline. +- ~~Antipattern: single java.util.Timer for all async sleeps (DefaultSleeper.kt:10)~~ — Accurate observation about openai (one Timer thread, dies permanently on any uncaught task exception) and we are correctly ahead (RetryStep.kt:366 daemon ScheduledExecutorService), but it is purely a do-not-import note with no action for us — no decision-ready item. Folded as supporting context into the DefaultRetryStep finding's critique rather than a standalone finding. +- ~~Antipattern: System.err.println logging instead of SLF4J (LoggingHttpClient.kt:98)~~ — Accurate (openai logs to System.err line 98 et al.; we correctly use ClientLogger/SLF4J) but it is a thing we already do right with zero action. Generic 'we are ahead' filler — dropped per the keep-only-actionable bar. +- ~~Antipattern: method-agnostic retry eligibility `request.body?.repeatable() ?: true` (RetryingHttpClient.kt:143)~~ — Verified accurate, but it produces no new action for us: our gate (DefaultRetryStep.isRetrySafe lines 255-258; RetryStep.canRetry lines 352-355) already applies the SAME body-replayability check AND additionally requires method idempotency for body-less requests, which is strictly safer than openai's. The analysis's own advice is 'do not regress to theirs' — i.e. keep what we have. No change, no decision; dropped. 
+- ~~weAreAhead bundle (recovery-as-data ResponseOutcome, resource-discipline close-before-propagate, surgical pipeline editing, zero-dep core)~~ — All individually verified TRUE (ResponseOutcome fold ResponsePipeline.kt:76-85 and the Airbyte-defect fix ExecutionPipeline.kt:96-111; close-before-propagate in ResponsePipeline.kt:104, AuthStep.kt:88-92, DefaultRedirectStep.kt:139-142; surgical edits StagedSteps.kt:110-165; dep-free core vs their okhttp+Jackson-in-core). But these are self-congratulation with no action. The one genuinely useful 'ahead' point (interrupt-correct backoff) is partly FALSE for DefaultRetryStep and is surfaced as its own corrected ADOPT finding. The rest is dropped as non-actionable. + +**Do not copy** + +1) Thread.sleep for retry backoff (DefaultSleeper.kt:12) — carrier-pinning under Loom, truncates sub-ms delays to 0, and lets a raw InterruptedException escape with the flag cleared and un-wrapped (RetryingHttpClient.kt:72 has no interrupt handling). Directly violates our 'no Thread.sleep / interrupt-aware / restore-flag-and-throw-InterruptedIOException' rules. Do not import. 2) Single java.util.Timer for ALL async sleeps (DefaultSleeper.kt:10) — one thread, dies permanently on any uncaught task exception, no daemon-isolation per call. Our ScheduledExecutorService daemon (RetryStep.kt:366) is the correct shape. 3) Full sync+async duplication of every cross-cutting concern (RetryingHttpClient.kt:35-74 vs 76-133; LoggingHttpClient.kt:53-67 vs 69-91) is intrinsic to the decorator model — copying the decorator approach means copying that duplication. Our sync HttpStep + toAsync bridge (AsyncPipelineBridges) avoids re-authoring passthrough steps twice. 
4) Method-agnostic retry eligibility: `request.body?.repeatable() ?: true` (RetryingHttpClient.kt:143) will auto-retry a replayable-body POST with no idempotency-key requirement. Our gate (DefaultRetryStep.kt:255-258 / RetryStep.kt:352-355) is the same body-replayability check, which is fine, but do NOT also drop our method-idempotency requirement for body-less requests, which their check effectively does. 5) Logging to System.err.println directly (LoggingHttpClient.kt:98 et al.) instead of SLF4J: fine for a single app's client, wrong for a toolkit; we correctly use ClientLogger. 6) Reflective Cleaner is GOOD (PhantomReachable.kt) but note their catch at PhantomReachable.kt:44-47 only rethrows RuntimeException/Error from the cause and otherwise wraps; preserve that nuance if copying so a Cleaner registration failure on a weird JVM doesn't crash startup.
+
+**Where we're ahead**
+
+1) Ordering safety: our Stage enum (Stage.kt:26-49) + pillar exactly-one enforcement with a logged-warning on replace (StagedSteps.kt:179, HttpPipelineBuilder.kt:31-38) make 'exactly one retry layer' and 'logging inside retry' structurally guaranteed; their decorator nesting is unenforced convention: nothing stops a double-wrapped RetryingHttpClient or logging-outside-retry. 2) Interrupt-correct backoff: both of our retry impls restore the interrupt flag and surface InterruptedIOException (RetryStep.kt:304-322; DefaultRetryStep.kt:509-523), whereas theirs uses Thread.sleep and leaks a bare InterruptedException with the flag cleared. Only the recovery RetryStep waits via ScheduledExecutorService+CompletableFuture.get and cancels the pending future; DefaultRetryStep still blocks in Thread.sleep through Clock (Clock.kt:72), so on carrier pinning the default path is at parity with theirs, not ahead (see the corrected DefaultRetryStep ADOPT finding). 3) Recovery-as-data: ResponseOutcome (sealed Success/Failure) folded through ResponsePipeline (ResponsePipeline.kt:76-85) lets a step rescue/replace/pass-through a failure uniformly, and EXPLICITLY fixes the Airbyte defect where a pre-request throw bypassed error handling (ExecutionPipeline.kt:96-111 catches request-step AND transport throws).
Their decorators have no equivalent — an exception in one decorator just propagates; only IOExceptions get the retry treatment. 4) Resource discipline under failure: we close in-hand responses before propagating a step throwable in MULTIPLE places (ResponsePipeline.kt:101-106 response-step throw; AuthStep.kt:88-92 challenge-handler throw; DefaultRedirectStep.kt:140-143 recreate throw; DefaultRetryStep.kt:309-312 predicate/delay throw) — a systematic close-before-propagate discipline their decorators apply only in the narrow retry-close-before-sleep case (RetryingHttpClient.kt:71). 5) Surgical pipeline editing: insertAfter/insertBefore/replace/remove by step type with cross-stage rejection (StagedSteps.kt:110-165) — a toolkit-grade extension surface decorators can't offer (you'd have to re-assemble the nesting). 6) Zero hard deps in core: their core hard-depends on okhttp+Jackson; our concerns are dep-free and transports are pluggable. + +_Verifier notes:_ Reference files re-read line-by-line: HttpClient.kt, PhantomReachable.kt, PhantomReachableClosingHttpClient.kt, ProxyAuthenticator.kt, WorkloadIdentityHttpClient.kt, RetryingHttpClient.kt, LoggingHttpClient.kt, DefaultSleeper.kt. Our files: DefaultRetryStep.kt, pipeline/step/retry/RetryStep.kt, util/Clock.kt, AuthStep.kt, BearerTokenAuthStep.kt, client/HttpClient.kt, Stage.kt, HttpStep.kt, PipelineNext.kt, StagedSteps.kt, HttpPipelineBuilder.kt, DefaultInstrumentationStep.kt, ProxyOptions.kt, OkHttpTransport.kt (proxy+close+ownership), JdkHttpTransport (ownership), ExecutionPipeline.kt, ResponsePipeline.kt, DefaultRedirectStep.kt; docs/pipelines.md grepped. + +MOST IMPORTANT CORRECTION: the analysis's central 'we are already correct on backoff sleep' claim is FALSE for our primary retry. 
DefaultRetryStep (the stage-pipeline pillar at Stage.RETRY) sleeps via clock.sleep()->SystemClock.sleep->Thread.sleep(millis,nanos) (Clock.kt:72), which pins a virtual-thread carrier under Loom — the SAME defect the analysis flags in openai's DefaultSleeper. Only the secondary recovery RetryStep (pipeline.step.retry) uses ScheduledExecutorService+CompletableFuture.get. We DO avoid two of openai's three sleep sub-defects (no sub-ms truncation; interrupt-correct: flag restored + InterruptedIOException), but NOT carrier-pinning on the default path. I converted the analysis's SIMPLIFY('no change, we're ahead') into an ADOPT with a real recommendation and set claimAccurate=false for that item. + +Other accuracy notes: every openai reference claim verified true (Cleaner self-ref guard, ISO-8859-1 Basic default, 401 evict+throw-retryable, X-Stainless-Retry-Count guard, isProbablyUtf8 64-codepoint/256-byte sniff, System.err logging, body?.repeatable()?:true). Our gaps confirmed real: no phantom-reachable net (only an unrelated ContextStore WeakReference), no binary-body sniff (utf8Preview uses String(bytes,UTF_8) replacement decode), no concrete 401-eviction AuthStep (only the hook + cache), no retry-count header, dead ProxyOptions.challengeHandler (OkHttp logs a warning and falls back to Basic), 1-arg transport SPI with no per-call options. Owned/BYO + idempotent thread-safe close() confirmed on both transports, so the phantom-net wiring and the proxy-auth wiring are both feasible. + +Kept 8 actionable findings (4 ADOPT incl. the corrected sleep gap, 1 COPY, 2 LEARN, with the proxy item recategorized LEARN->ADOPT). Dropped 5 non-actionable antipattern/we-are-ahead notes; consolidated the decorator-vs-stage trade-off into one docs LEARN. The COPY item's liftable surface is narrowed to isProbablyUtf8 only (not LoggingBuffer's streaming machinery, which doesn't fit our bounded-structured-field model). 
Attribution required on lifted code (openai-java is Apache-2.0): the Cleaner util and isProbablyUtf8. + +--- + +## 4. Request/response logging (LoggingHttpClient) + redaction + body handling + +**What it is** + +openai-java implements logging as a single decorator HttpClient (`LoggingHttpClient`, 628 LOC in one file) inserted into the client chain at `ClientOptions.kt:678-690` (`RetryingHttpClient → LoggingHttpClient → WorkloadIdentityHttpClient → user client`). It is driven by a 4-value `LogLevel` enum (OFF/INFO/ERROR/DEBUG) with `LogLevel.fromEnv()` reading `OPENAI_LOG`. It logs by writing plain lines to `System.err` in OkHttp's `--> / <--` curl-tail format. Body capture is *streaming, line-by-line, byte-at-a-time*: it wraps the request body in `LoggingHttpRequestBody`/`LoggingOutputStream` and the response in `LoggingHttpResponse`/`LoggingInputStream`, both teeing each byte into a `LoggingBuffer` that decodes and prints one line at a time as bytes flow. There is no max-body-size cap; instead a clever `isProbablyUtf8` prefetch (sample first 64 code points / 256 bytes; suppress logging with "(binary body omitted)" if non-whitespace ISO control chars appear) prevents dumping binary blobs. Redaction is header-only (a `SortedSet` CASE_INSENSITIVE denylist defaulting to authorization/api-key/x-api-key/cookie/set-cookie, replaced with "██"). Crucially it does NOT redact the URL at all — `logRequest` (line 99-100) prints `request.url()` with full query string and any userinfo verbatim. + +Our equivalent is a *pipeline step* (`DefaultInstrumentationStep` + async mirror), not a decorator client, and it does far more (spans + OTel-shaped metrics + structured SLF4J events via `ClientLogger`/`LoggingEvent`). Body capture is *bulk-drain* (`LoggableResponseBody` drains a bounded prefix once behind double-checked locking; `LoggableRequestBody` tees via `TeeSink` into a Buffer), bounded by `bodyPreviewMaxBytes` (8 KiB). 
Redaction is allow-list based for both headers (`HttpHeaderName` typed set) and URL query/userinfo (`UrlRedactor`). + +**How it works (line-level)** + +REDACTION (header-only denylist): `LoggingHttpClient.kt:166-171` — `headers.names().forEach { … headers.values(name).forEach { value -> System.err.println("$name: ${if (redactedHeaders.contains(name)) "██" else value}") } }`. The set is `setOf("authorization", "api-key", "x-api-key", "cookie", "set-cookie")` (line 195-196), built into a `String.CASE_INSENSITIVE_ORDER` SortedSet at `build()` (line 252). NO URL redaction: `logRequest` line 99 `append("--> ${request.method} ${request.url()}")` prints the full URL including query params and userinfo verbatim. + +BINARY GUARD (the genuinely clever bit): `LoggingBuffer` (line 424) accumulates a `prefetchBuffer` of up to `PROBABLY_UTF8_BYTE_LIMIT = 64*4 = 256` bytes (line 548) before printing anything when charset is unknown; `isProbablyUtf8` (line 556-577) decodes with `CodingErrorAction.REPORT` and walks up to 64 code points returning false on any `Character.isISOControl(cp) && !Character.isWhitespace(cp)` — on failure sets `suppressed=true` and prints "(binary body omitted)" (line 479). If charset IS known from Content-Type, the prefetch is skipped entirely (line 449-450: `prefetchBuffer = if (charset != null) null else …`). + +STREAMING TEE: `LoggingInputStream.read(b,off,len)` (line 383-399) delegates to the wrapped stream then loops `for (i in off until off+bytesRead) buffer.write(b[i].toInt() and 0xFF)` — byte-at-a-time into the LoggingBuffer, which flushes a line on each `'\n'` (line 493-494). `markDone(closedEarly)` (line 408-414) prints `<-- END HTTP (N-byte body[, closed early])` — note it distinguishes a body that was closed before EOF. + +CHARSET PARSE: `parseCharset` (line 580-589) splits on `;`, drops the media-type token, finds `charset=`, strips quotes, `runCatching { charset(it) }.getOrNull()`. 
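
The charset parse and the binary guard described above can be sketched in a few lines of plain JDK code. This is a minimal illustration, not the openai-java source: `parseCharsetOrNull` and `looksLikeUtf8Text` are hypothetical names, and the only values taken from the text are the 64-code-point sample and the "ISO control but not whitespace" rejection rule. The sketch also assumes the caller passes in the already-captured prefix (for example the first 256 bytes).

```kotlin
import java.nio.ByteBuffer
import java.nio.CharBuffer
import java.nio.charset.Charset
import java.nio.charset.CodingErrorAction

// Hypothetical helper: pull the charset parameter out of a Content-Type value.
fun parseCharsetOrNull(contentType: String?): Charset? {
    if (contentType == null) return null
    return contentType.split(';')
        .drop(1)                                  // drop the media-type token
        .map { it.trim() }
        .firstOrNull { it.startsWith("charset=", ignoreCase = true) }
        ?.substringAfter('=')
        ?.trim()
        ?.removeSurrounding("\"")                 // strip optional quotes
        ?.let { name -> runCatching { Charset.forName(name) }.getOrNull() }
}

// Hypothetical helper: decide whether a captured prefix looks like UTF-8 text.
fun looksLikeUtf8Text(prefix: ByteArray, maxCodePoints: Int = 64): Boolean {
    val decoder = Charsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT)
    // UTF-8 never yields more chars than bytes, so this buffer cannot overflow.
    val out = CharBuffer.allocate(prefix.size)
    // endOfInput = false: a multi-byte sequence cut off at the end of the
    // prefix is reported as underflow, not as malformed input.
    val result = decoder.decode(ByteBuffer.wrap(prefix), out, false)
    if (result.isError) return false
    out.flip()
    val text = out.toString()
    var index = 0
    var seen = 0
    while (index < text.length && seen < maxCodePoints) {
        val cp = text.codePointAt(index)
        // Control characters other than whitespace suggest a binary payload.
        if (Character.isISOControl(cp) && !Character.isWhitespace(cp)) return false
        index += Character.charCount(cp)
        seen++
    }
    return true
}
```

A caller would consult `parseCharsetOrNull` first and fall back to `looksLikeUtf8Text` only when no charset is declared, which mirrors the "prefetch is skipped when the charset is known" behavior described above.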
+
+DURATION FORMAT: `Duration.format()` (line 592-628) → "1m 40s 467ms" human form via `toKotlinDuration().toComponents`.
+
+LogLevel.fromEnv (`LogLevel.kt:25-31`): reads `OPENAI_LOG` lowercased → INFO/ERROR/DEBUG else OFF. Note the odd ordinal ordering OFF […] String? = System::getenv): HttpLogLevel mapping none/headers/body (and reuse via the Configuration envSource seam so it stays testable). Do NOT ship a default env key in sdk-core. codegen: emit a generated client whose options default reads the product's own env var (e.g. ACME_LOG) through it. This keeps the zero-dep rule and the toolkit-not-client posture intact.
+- **Surface a 'closed-early / not-consumed-fully' signal on the captured response body** `ADOPT` · `sdk-core` · effort M · confidence low
+  - *Verdict:* Accurate that openai has it and we do not: LoggingInputStream.close emits '<-- END HTTP (N-byte body, closed early)' when the stream is closed before EOF (LoggingHttpClient.kt:401-414), distinguishing 'caller read the whole body' from 'caller bailed after N bytes'. We track fullyCaptured / drainError and emit response.body.drain_error (DefaultInstrumentationStep.kt:222-224) but have no 'consumer closed before EOF' signal (verified: no closedEarly/consumedFully/abandoned anywhere in sdk-core main). HOWEVER the value is much lower for us than for them, and the source finding overstates it. Their whole logging model is a streaming tee, so 'closed early' is the ONLY way they learn body size; ours eagerly drains a bounded prefix in the step regardless of whether the caller ever reads, so we already log a body preview+size for every request. 'Closed early' would only add signal on the over-cap live-tail path (caller abandons a >8 KiB streamed body mid-read), which is a narrow case. It is additive (a flag in PrefixThenTailSource.close() plus an accessor) but it is a nice-to-have diagnostic, not a correctness or safety item. Effort is M because it touches the body class, the steps, and tests for marginal payoff.
Effort is M because it touches the body class, the steps, and tests for marginal payoff. + - *Do:* Low priority. If pursued, record an 'abandoned' flag in PrefixThenTailSource.close() when prefix/tail are not exhausted, expose consumedFully: Boolean on LoggableResponseBody, and emit it alongside response.body.size only on the over-cap path. Defer behind the higher-value charset/binary/dedup work; do not block on it. +- **Do NOT collapse our logging stack to a single decorator; DO extract the duplicated sync/async emitters** `SIMPLIFY` · `sdk-core` · effort M · confidence high + - *Verdict:* Direction is correct and the headline is right: their 628-LOC single-file decorator is ~1/3 our LOC precisely because it gives up things a toolkit needs (single-consumer body assumption -> no locking; binary-guard instead of a byte cap -> no bounded capture; System.err instead of SLF4J; header-only redaction -> no URL/secret redaction; no spans/metrics). So 'keep the step model' is sound. The actionable, accurate sub-point is the duplication: DefaultInstrumentationStep and DefaultAsyncInstrumentationStep carry ~200 lines of byte-identical emit/redact/preview helpers (emitRequestEvent/emitResponseEvent/emitFailureEvent/appendHeadersFields/safeRedact/utf8Preview/recordMetrics) — confirmed verbatim and flagged by their own TODO(omar 2026-08-01) at DefaultAsyncInstrumentationStep.kt:236-237. That IS extractable into a shared InstrumentationEmitters and should be; it is also the natural home for the charset/binary preview fix. CORRECTION TO THE SOURCE FINDING: its second caveat — 'audit whether the over-cap live-tail (PrefixThenTailSource) is dead complexity; if no real caller needs it, delete ~60 lines for a single-consumer prefix contract' — is WRONG and would introduce a correctness bug. 
The over-cap path is production-reachable for ANY response larger than bodyPreviewMaxBytes (8 KiB) under BODY_AND_HEADERS: the step drains exactly the cap, LoggableResponseBody leaves the delegate open and hands the caller prefix+live-tail (LoggableResponseBody.kt:138-148, 290-311). It is exercised by unit tests (LoggableResponseBodyTest.kt:275-325, InstrumentationStepTest.kt:395-416). Deleting the live tail would silently TRUNCATE every >8 KiB response delivered to the caller whenever body logging is on — a serious regression, not a simplification. Keep the live tail. + - *Do:* Do NOT collapse to a decorator and do NOT remove the over-cap live-tail. DO extract the shared emit/redact/preview/metrics helpers into an internal InstrumentationEmitters consumed by both steps (kills the documented duplication and gives the charset+binary preview fix a single home). Treat the source finding's 'delete PrefixThenTailSource' suggestion as rejected — annotate it as load-bearing for non-truncating body logging. +- **Per-request overhead at HttpLogLevel.NONE: span start + URL redaction run unconditionally** `LEARN` · `both` · effort S · confidence medium + - *Verdict:* The composition comparison (decorator-nesting vs Stage-ordered pillar steps) is accurate but generic — neither model is better in the abstract, and 'keep the step model' is the right call for a toolkit, so I demote this from any ADOPT framing to LEARN with one concrete, verified nugget. In openai's decorator the non-DEBUG path returns the request untouched (LoggingHttpClient.kt:120-124 only wraps the body at DEBUG), so a disabled logger adds literally zero body-wrap allocation. Our pillar step always executes: at HttpLogLevel.NONE we still call options.tracer.startSpan (DefaultInstrumentationStep.kt:99-104) and safeRedact -> UrlRedactor.redact (:98) BEFORE the NONE guard, plus clock.monotonic() and a spanAttributes map allocation, every request. 
With the default NoopTracer/NoopMeter those are cheap no-ops, but UrlRedactor.redact still rebuilds the URL for any request that has a query/userinfo/fragment (UrlRedactor.kt:81-138) and the map is still allocated — real, if small, per-request cost on a fully-disabled instrumentation step. shouldCaptureBody already correctly gates the body-wrap allocation, so only the span/redact/clock work is wasted. + - *Do:* Keep the Stage-step model. Two cheap wins: (1) skip safeRedact/spanAttributes/startSpan when logLevel==NONE AND tracer is NoopTracer AND meter is NoopMeter — compute redactedUrl lazily only when something will consume it. (2) For codegen-emitted default pipelines, OMIT the instrumentation step entirely (rather than insert-and-no-op) when the product ships with logging off and no real tracer/meter, so there is zero per-request cost. Document that, today, NONE still starts a span and redacts the URL. + +**Considered & dropped** + +- ~~Default redacted-header names list is broader than ours implies — audit our allow-list (X-Forwarded-For PII)~~ — Downgraded and folded rather than kept as a standalone decision-ready item. The facts check out — their denylist names cookie/set-cookie (LoggingHttpClient.kt:195-196) and our DEFAULT_ALLOWED_HEADERS correctly omits cookie/set-cookie/authorization/api-key (HttpInstrumentationOptions.kt:65-90), and X_FORWARDED_FOR IS in our allow-list (line 88) — but the actionable content is a one-line audit nit, not a finding. The 'direction check validates our allow-list' conclusion is already captured in weAreAhead (and is correct: allow-list fails safe, their denylist fails open). The only residual is the X-Forwarded-For question, which is debatable: on the REQUEST side it is a value the client itself sets (not a server secret), and it rarely appears on responses; treating it as PII is regime-specific. Net: a documentation footnote, captured in notes, not worth a separate verified finding. + +**Do not copy** + +1. 
URL logged verbatim with NO redaction (LoggingHttpClient.kt:99 `append("--> ${request.method} ${request.url()}")`). `request.url()` (HttpRequest.kt:17-47) includes the full query string and any userinfo. For an API where secrets travel in query params (pre-signed URLs, api_key=… query auth, SAS tokens), this writes the secret to System.err in plaintext. They redact headers but not the URL, which is an inconsistent threat model. DO NOT copy this; our UrlRedactor (always-redact userinfo, allow-list query values) is the correct posture and must stay.
+
+2. Header redaction by DENY-list (LoggingHttpClient.kt:195-196). Any sensitive header not in the hardcoded five (e.g. proxy-authorization, x-amz-security-token, x-vault-token, a customer's x-session-id) is logged in full. For a multi-API toolkit this is a guaranteed leak. DO NOT adopt the denylist model; keep our allow-list.
+
+3. No max-body-size cap on logging (LoggingHttpClient.kt: LoggingBuffer has writeCount but never stops printing). A large text response is fully decoded and printed line-by-line to System.err, which means unbounded log volume and added latency on the read thread. The binary guard only stops non-text. DO NOT drop our bodyPreviewMaxBytes cap; theirs is a liability we already avoid.
+
+4. Hardcoded System.err.println at 15+ sites (LoggingHttpClient.kt:98,115,116,132,154,277,278,…). Bypasses every log framework, log level routing, structured ingestion, and async appender. Defensible for a single-app vendor client that refuses a logging dep; an antipattern for a library. Our SLF4J/ClientLogger routing is correct; do not regress.
+
+5. Byte-at-a-time response/request tee (LoggingInputStream.read line 395-398, LoggingOutputStream.write line 309-314: `for (i in off until off+len) buffer.write(b[i].toInt() and 0xFF)`). This is one `write` call per byte, O(n) calls per body, and defeats segment-level copying. Our TeeSink's one-encode-two-segment-moves (TeeSink.kt:75-101) is strictly better; do not copy their loop.
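
The allow-list posture that items 1 and 2 argue for can be sketched as follows. This is an illustration under stated assumptions, not the SDK's `UrlRedactor` or `HttpInstrumentationOptions` API: `redactHeaders`, `redactQuery`, and the `REDACTED` marker are hypothetical names. The point it shows is that an unknown header or query parameter is hidden by default, where a denylist would print it.

```kotlin
// Hypothetical marker; the real SDK may use a different placeholder.
const val REDACTED = "REDACTED"

// Allow-list header redaction: only names in `allowed` keep their values.
fun redactHeaders(
    headers: Map<String, List<String>>,
    allowed: Set<String>,
): Map<String, List<String>> {
    val allowedLower = allowed.map { it.lowercase() }.toSet()
    return headers.mapValues { (name, values) ->
        if (name.lowercase() in allowedLower) values else values.map { REDACTED }
    }
}

// Allow-list query redaction: values of unknown parameters are replaced,
// and a bare token without '=' is hidden entirely because it may be a secret.
fun redactQuery(rawQuery: String, allowedParams: Set<String>): String =
    rawQuery.split('&')
        .filter { it.isNotEmpty() }
        .joinToString("&") { pair ->
            val name = pair.substringBefore('=')
            when {
                name in allowedParams -> pair
                '=' in pair -> "$name=$REDACTED"
                else -> REDACTED
            }
        }
```

With this shape, a header such as `x-vault-token` is redacted without anyone having to remember to list it, which is the fail-safe property the denylist lacks.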
+ +**Where we're ahead** + +URL/secret redaction: we always redact userinfo (`***:***@`) and allow-list query values via UrlRedactor (UrlRedactor.kt:67-138); they log the raw URL with zero redaction (LoggingHttpClient.kt:99). Clear safety win. + +Header redaction model: our allow-list (HttpInstrumentationOptions.kt:65-90) fails safe on unknown headers; their denylist (LoggingHttpClient.kt:195) fails open. Win. + +Bounded body capture: we cap at bodyPreviewMaxBytes and still stream the remainder (LoggableResponseBody PrefixThenTailSource, kt:290-311); they have no cap. Win on memory/log-volume safety. + +Repeatable response bodies under concurrency: our double-checked-locking drain with ReentrantLock + @Volatile (LoggableResponseBody.kt:88-211) hands the same bytes to multiple downstream consumers race-safely; their LoggingInputStream is single-consumer with no locking (LoggingHttpClient.kt:331-415) and would be unsafe if read twice or concurrently. We catch a race they don't — though only because a toolkit needs re-readable bodies and a single-app client doesn't. + +Efficient request tee: one encode + two segment-level moves (TeeSink.kt:30-43, 75-101) vs their byte-at-a-time OutputStream loop (LoggingHttpClient.kt:309-314). + +Structured, framework-routed, zero-alloc-when-disabled logging: ClientLogger/LoggingEvent emit SLF4J key-values with an allocation-free NOOP path (LoggingEvent.kt:292-303) and MDC trace.id/span.id folding; they println unstructured lines to System.err. + +Drain-failure determinism: on a mid-read network error we cache the exception, re-throw it from source() every time, and still return partial bytes from snapshot() (LoggableResponseBody.kt:138-175, 258-273); their LoggingInputStream just propagates the read exception with no partial-capture-for-logging guarantee. 
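
The drain-once behavior credited above (double-checked locking with a `ReentrantLock` and a `@Volatile` flag, a cached drain failure, and partial bytes still available for logging) can be sketched as below. This is a simplified illustration, not `LoggableResponseBody`: the class name `DrainOncePrefix` and its methods are hypothetical, and the live tail that the real class hands to the caller for over-cap bodies is deliberately left out.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.IOException
import java.io.InputStream
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Hypothetical class: drains at most maxBytes from the source exactly once.
class DrainOncePrefix(private val source: InputStream, private val maxBytes: Int) {
    private val lock = ReentrantLock()
    @Volatile private var drained = false
    private var bytes = ByteArray(0)
    private var failure: IOException? = null

    private fun ensureDrained() {
        if (drained) return                 // fast path without taking the lock
        lock.withLock {
            if (drained) return             // another thread finished the drain
            val out = ByteArrayOutputStream()
            val chunk = ByteArray(1024)
            try {
                while (out.size() < maxBytes) {
                    val want = minOf(chunk.size, maxBytes - out.size())
                    val n = source.read(chunk, 0, want)
                    if (n < 0) break
                    out.write(chunk, 0, n)
                }
            } catch (e: IOException) {
                failure = e                 // cache the failure for later callers
            }
            bytes = out.toByteArray()
            drained = true                  // volatile write publishes bytes and failure
        }
    }

    // Partial bytes for logging, available even when the drain failed.
    fun snapshot(): ByteArray {
        ensureDrained()
        return bytes.copyOf()
    }

    // Consumer view: rethrows the cached failure on every call.
    fun prefix(): ByteArray {
        ensureDrained()
        failure?.let { throw it }
        return bytes.copyOf()
    }
}
```

Because the flag is written last and read first, a thread that observes `drained == true` also observes the captured bytes and the cached exception, so repeated or concurrent readers get the same outcome.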
+ +_Verifier notes:_ Overall the source analysis is unusually accurate — every openai-java line citation I re-checked (LoggingHttpClient.kt:99 raw URL, :195-196 5-name denylist, :309-314 byte-at-a-time OutputStream tee, :395-398 byte-at-a-time InputStream tee, :401-414 closed-early, :424-539 LoggingBuffer, :556-577 isProbablyUtf8; LogLevel.kt:25 fromEnv; ClientOptions.kt:205 default + :678-690 RetryingHttpClient->LoggingHttpClient->WorkloadIdentity chain; HttpRequest.kt:17-43 url() with full query) holds. The antipatterns list and weAreAhead section are correct and I did not water them down: we ARE ahead on URL/userinfo redaction (UrlRedactor always redacts userinfo, allow-lists query values), header allow-list (fails safe vs their fail-open denylist), bounded body capture with a non-truncating live tail vs their uncapped streaming print, repeatable bodies under a ReentrantLock double-checked drain vs their single-consumer no-lock InputStream, SLF4J structured/zero-alloc-when-disabled logging (LoggingEvent.NOOP, LoggingEvent.kt:292-303) vs 15+ System.err.println sites, and an efficient one-encode-two-segment-move TeeSink vs their per-byte loop. None of those should be regressed. + +Two corrections to the source agent's reasoning: +1. Finding #7's caveat to 'audit/possibly delete the over-cap live-tail (PrefixThenTailSource)' is wrong and would be a correctness bug. The path is production-reachable for any response > bodyPreviewMaxBytes (8 KiB) under BODY_AND_HEADERS and is unit-tested (LoggableResponseBodyTest.kt:275-325, InstrumentationStepTest.kt:395-416). Removing it would silently truncate large response bodies delivered to the caller whenever body logging is on. I kept the finding but flipped that sub-point to 'keep it'. +2. 
The genuinely high-value, low-risk cluster is the body-preview decode path: (a) decode with MediaType.charset instead of hardcoded UTF-8 — we already parse+cache the charset (MediaType.kt:53-61) and throw it away — and (b) sniff for binary before printing. Both are pure-JDK-8, zero-dep, sdk-core-appropriate, and should land together with the shared-emitter extraction (their own acknowledged TODO at DefaultAsyncInstrumentationStep.kt:236-237) so the logic lives in one place across the sync/async steps. + +Robustness note in OUR favor that the source captured correctly: the async step skips body capture for unknown-length streaming bodies (contentLength() < 0, DefaultAsyncInstrumentationStep.kt:249) to avoid blocking the completion thread — a hazard openai's uncapped design does not address. Not a finding, but reinforces 'don't follow them on body handling'. + +Net kept: 7 findings (2 high-value ADOPT on the preview decode/binary path, 1 contingent COPY with a clean-room alternative to avoid a NOTICE obligation, 1 ADOPT for a parameterized env-var hook, 1 SIMPLIFY = extract the dup'd emitters but DON'T collapse and DON'T delete the live tail, 1 LEARN on NONE-path per-request overhead, 1 low-confidence ADOPT for a closed-early signal). 1 dropped/folded (header allow-list audit). + +--- + +## 5. Retry / backoff / sleeper / timeout + +**What it is** + +openai-java centralizes all retry in ONE class, `RetryingHttpClient`, a decorator that wraps an inner `HttpClient` and implements both sync `execute` and async `executeAsync` retry loops (RetryingHttpClient.kt:35-133). 
It depends on an injected `Sleeper` (sync `sleep` + async `sleepAsync(): CompletableFuture`, Sleeper.kt:11-21), `DefaultSleeper` backing the async path with a daemon `java.util.Timer` (DefaultSleeper.kt:8-28), and `PhantomReachableSleeper`, a delegating wrapper that auto-closes the sleeper via `closeWhenPhantomReachable(this, sleeper)` so a forgotten close still releases the Timer thread (PhantomReachableSleeper.kt:11-23). `Timeout` (Timeout.kt) is an orthogonal per-phase budget (connect/read/write/request) with `read`/`write` defaulting to `request()` and `request` to 10 min — it is NOT consulted by the retry loop ("not including retries", Timeout.kt:47). The retry policy is hardcoded (maxRetries default 2, Builder.kt:240): status predicate honors a non-standard `X-Should-Retry: true/false` server override first, then 408/409/429/>=500 (shouldRetry, :162-182); exception predicate retries `IOException || OpenAIIoException || OpenAIRetryableException` (:184-188). Backoff is `min(0.5 * 2^(retries-1), 8.0)` seconds with asymmetric jitter `1.0 - 0.25*rand()` (:222-227). Each attempt stamps `X-Stainless-Retry-Count` unless the caller set it (:39-47); an optional idempotency header gets a `stainless-java-retry-{UUID}` value (:148-160). Body replayability gates retry via `request.body?.repeatable() ?: true` (:140-143). OUR SDK splits the same surface across two parallel sync-only step families (`pipeline.step.retry.RetryStep` recovery-aware + `http.pipeline.steps.DefaultRetryStep` stage-based) sharing one `RetryAfterParser` and `RetryUtils` classifier; we have a richer parser and a Loom-safe scheduler but NO async retry and NO `Sleeper`/`Timeout` abstraction. + +**How it works (line-level)** + +Async retry loop (the thing we entirely lack), RetryingHttpClient.kt:100-129: `responseFuture.handleAsync({ response, throwable -> ... sleeper.sleepAsync(backoffDuration).thenCompose { executeWithRetries(...) } }, { it.run() }).thenCompose(Function.identity())`. 
The `{ it.run() }` executor runs the handler inline (same thread that completed the future), and `thenCompose(Function.identity())` flattens the `CompletableFuture<CompletableFuture<HttpResponse>>` — a clean recursive non-blocking retry with no thread blocked during backoff. +DefaultSleeper.kt:14-25 async sleep: `timer.schedule(object : TimerTask() { override fun run() { future.complete(null) } }, duration.toMillis())` — one shared daemon `Timer("DefaultSleeper", true)` drives all async waits. +Backoff+jitter, RetryingHttpClient.kt:222-227: `val backoffSeconds = min(0.5 * 2.0.pow(retries - 1), 8.0); val jitter = 1.0 - 0.25 * ThreadLocalRandom.current().nextDouble(); return Duration.ofNanos((TimeUnit.SECONDS.toNanos(1) * backoffSeconds * jitter).toLong())`. Note jitter ∈ (0.75, 1.0], because `nextDouble()` is in [0, 1): it can shorten the delay but never lengthen it, a deliberate anti-thundering-herd choice that puts the top tier in (6s, 8s], so the effective delay never exceeds the nominal 8s cap. +Retry-After, RetryingHttpClient.kt:196-214: tries `Retry-After-Ms` as `toFloatOrNull()?.times(MILLISECONDS.toNanos(1))`, else `Retry-After` as `toFloatOrNull()?.times(SECONDS.toNanos(1))` (note: FLOAT seconds, so "1.5" works), else `ChronoUnit.NANOS.between(OffsetDateTime.now(clock), OffsetDateTime.parse(retryAfter, RFC_1123_DATE_TIME))` in a try/catch returning null on `DateTimeParseException`. The server hint is used verbatim with NO deadline clamp and NO max-cap. +Idempotency key, RetryingHttpClient.kt:148: `private fun idempotencyKey(): String = "stainless-java-retry-${UUID.randomUUID()}"` — only added if `idempotencyHeader != null` and not already present. +Replayability gate, :140-143: `request.body?.repeatable() ?: true` — null body ⇒ retryable; the method is NOT consulted (no GET/POST distinction). +Response hygiene, :69-72 and :118-120: `response?.close()` before `sleeper.sleep(backoffDuration)` so a failed response's socket is released before the wait. + +**vs.
our SDK** + +OUR async gap: the staged async pipeline (`AsyncHttpStep`/`AsyncPipelineNext`) has the plumbing for async retry — `AsyncPipelineNext.copy()` exists explicitly "before re-driving more than once (async retry / redirect)" (AsyncPipelineNext.kt:18,67) — but ZERO async retry step ships. `DefaultRetryStep` only overrides sync `process` (DefaultRetryStep.kt:163), and the recovery-aware `pipeline.step.retry.RetryStep` is sync-only too (grep for executeAsync/sleepAsync/processAsync → none). Even our "non-blocking" wait is a blocking `CompletableFuture.get()` on a scheduled task (RetryStep.kt:291-322) — Loom-safe (carrier unmounts) but still occupies the calling virtual/platform thread, unlike their truly fire-and-continue `thenCompose`. +Status predicate: ours retries 408/429/5xx-except-501/505 (RetryUtils.kt:43-49). Theirs adds 409 (lock timeout) and the `X-Should-Retry` server override; ours has NEITHER. Conversely the recovery-aware path's default set is {429,500,502,503,504} and deliberately drops 408 (RetrySettings.kt:229-244) — so our two families disagree with each other AND with theirs. +Retry-After parser: ours (RetryAfterParser.kt) is materially RICHER — handles `retry-after-ms`, `x-ms-retry-after-ms`, and `X-RateLimit-Reset` (jittered Unix epoch, :268-285), accepts both DateTimeRfc1123 (tolerant weekday) and JDK RFC_1123 grammars (:238-253), is total (clamps to 365d, never throws), and clamps the hint to the deadline (BackoffCalculator.kt:82). Theirs misses `x-ms-retry-after-ms` and `X-RateLimit-Reset`, only one date grammar, and does NOT clamp. The ONE thing theirs parses that ours does not: FRACTIONAL seconds in `Retry-After`/`Retry-After-Ms` (`toFloatOrNull`); ours uses `toLongOrNull` (RetryAfterParser.kt:193,202) so "1.5" → null. 
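
The fractional-seconds gap described above could be closed roughly as follows. This is a minimal sketch, not the actual `RetryAfterParser` code: `parseNumericDelay` and `MAX_DELAY` are hypothetical names, and the 365-day ceiling is assumed from the clamp mentioned above. It tries the integer form first, then falls back to a `Double` while rejecting NaN, infinite, and negative values.

```kotlin
import java.time.Duration
import java.util.concurrent.TimeUnit

// Hypothetical ceiling, assumed to mirror the 365-day clamp described in the text.
private val MAX_DELAY: Duration = Duration.ofDays(365)

/** Parses a numeric delay expressed in [unit]; returns null when the value is unusable. */
fun parseNumericDelay(raw: String, unit: TimeUnit): Duration? {
    val text = raw.trim()

    // Fast path: plain integers, the only form the HTTP spec defines for Retry-After.
    text.toLongOrNull()?.let { whole ->
        if (whole < 0) return null
        // Largest whole number of [unit] that still fits under the ceiling.
        val maxUnits = unit.convert(MAX_DELAY.toNanos(), TimeUnit.NANOSECONDS)
        return if (whole >= maxUnits) MAX_DELAY else Duration.ofNanos(unit.toNanos(whole))
    }

    // Lenient path: fractional values such as "1.5" or "250.5".
    val value = text.toDoubleOrNull() ?: return null
    if (value.isNaN() || value.isInfinite() || value < 0.0) return null

    // Compare in floating point before converting, so the Long conversion cannot overflow.
    val nanos = value * unit.toNanos(1)
    return if (nanos >= MAX_DELAY.toNanos().toDouble()) MAX_DELAY else Duration.ofNanos(nanos.toLong())
}
```

With this shape, `parseNumericDelay("1.5", TimeUnit.SECONDS)` yields 1.5 seconds instead of falling through to exponential backoff, and oversized hints saturate at the ceiling rather than throwing.
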
+Backoff formula: ours `initial*multiplier^(n-1)` capped at maxDelay with SYMMETRIC jitter `[delay*(1-j/2), delay*(1+j/2)]` (BackoffCalculator.kt:131-149) and a true total-timeout deadline (RetryStep.kt:200-205, gax-style); theirs is fixed `0.5*2^(n-1)` capped 8s with one-sided shrinking jitter and NO deadline. The other staged path (DefaultRetryStep.kt:443-462) uses yet a third formula `baseDelay*(1L shl tryCount)` with ±5% jitter — three formulas in our codebase vs their one. +Sleeper/Timeout: we have neither type. Our `util.Clock` (Clock.kt:22-57) covers sync `sleep` only; no `sleepAsync`, no injectable retry-clock seam on `DefaultRetryStep` beyond `Clock`. No `Timeout` per-phase model anywhere in sdk-core. +Retry-count header: we have `IdempotencyKeyStep` (UUID, configurable header, POST/PUT/PATCH) but NO `X-Stainless-Retry-Count` analog emitting the attempt number, and our idempotency step lives in the recovery pipeline, decoupled from the retry loop (it does not refresh the key per attempt). + +**Recommendations (verified)** + +- **Parse fractional Retry-After seconds (and fractional -ms)** `COPY` · `sdk-core` · effort S · confidence high + - *Verdict:* Verified exactly. Theirs uses toFloatOrNull().times(unitNanos) for both Retry-After-Ms (RetryingHttpClient.kt:199-200) and Retry-After seconds (:202), so 'Retry-After: 1.5' and 'Retry-After-Ms: 250.5' are honored. Ours uses toLongOrNull for both numeric-seconds (RetryAfterParser.kt:193) and millis (:202), so any fractional value returns null and silently falls through to exponential backoff — ignoring an explicit server pacing hint. RFC 7231 specifies integer delta-seconds so fractional Retry-After is off-spec, but proxies/servers emit it and lenient parsing is strictly more robust. Correction to the finding's suggested impl: prefer toDoubleOrNull over Float — Float has ~7 significant digits and loses precision for large millisecond values (their toFloatOrNull is itself slightly buggy for big -ms numbers). 
Keep our existing negative/NaN/Infinite guards and the 365-day clamp so totality is preserved. This is the single parsing case where theirs beats our otherwise-richer parser. + - *Do:* In RetryAfterParser.parseNumericSeconds and parseMillis, try toLongOrNull first, then fall back to toDoubleOrNull (reject NaN/Infinite/negative) and convert via Duration.ofNanos((value * unitNanos).toLong()) before the existing clamp. Add a unit test for '1.5' and '250.5'. Apache-2.0 attribution is unnecessary for a one-line numeric-parse change, but note the behavior was modeled on RetryingHttpClient.kt:199-202. +- **Add an async time seam (sleepAsync) when adding async retry; keep Clock for sync** `ADOPT` · `sdk-core` · effort S · confidence medium · we partly do this + - *Verdict:* The Sleeper interface (sync sleep + async sleepAsync(): CompletableFuture, Sleeper.kt:11-21) is accurately described. But the finding over-states our gap. We ALREADY have a clean, injectable, deterministic SYNC time seam: util.Clock (Clock.kt:22-57) is interface-based with a FixedClock test fixture, DefaultRetryStep takes a Clock and routes every sleep through clock.sleep (sleepOrAbort, DefaultRetryStep.kt:509-523), and the recovery RetryStep also takes a Clock plus an injectable RetrySettings.scheduler (RetrySettings.kt:80). So 'introduce an injectable Sleeper for deterministic tests' is largely already-done for sync — downgrade. The ONE real, narrow gap is an ASYNC time seam (sleepAsync), which only matters once async retry lands and is therefore a sub-part of the previous finding, not an independent win. The 'two time seams (Clock vs raw ScheduledExecutorService) can't be driven by one fake' point is real but minor (the recovery RetryStep already accepts a fake scheduler; a FixedClock drives DefaultRetryStep). Do NOT introduce their exact Sleeper shape wholesale and retrofit it onto the sync path — that would churn a working Clock-based seam. 
Note their sync DefaultSleeper.sleep is a bare Thread.sleep (DefaultSleeper.kt:12) that pins Loom carriers and drops the interrupt flag — strictly worse than our Clock/scheduler waits; do not copy that impl. + - *Do:* When implementing async retry, add ONE async time primitive (e.g. extend Clock with sleepAsync(Duration): CompletableFuture default-backed by the existing dexpace-retry-scheduler daemon, OR a small AsyncSleeper) so the async path is testable with the same fake that already drives the sync path. Keep util.Clock as the sync seam; do not replace it. Reuse RetryStep.DEFAULT_SCHEDULER as the backing executor (Java-8, zero-dep). No standalone work item beyond the async-retry effort. +- **Ship a composable server-override retry predicate (X-Should-Retry); document opt-in 409** `ADOPT` · `both` · effort S · confidence high + - *Verdict:* Verified. shouldRetry(response) reads X-Should-Retry via values(...).getOrNull(0) and short-circuits BEFORE the status table: 'true' forces retry, 'false' forces no-retry (RetryingHttpClient.kt:164,167-170); 408 AND 409 are both retryable (:173-175). Our classifiers are status-only (RetryUtils, RetrySettings.retryableStatuses, defaultShouldRetryResponse) with no server-driven override, and neither default set includes 409. The capability is genuinely useful for a toolkit (lets an API author signal per-response retryability without the client guessing) and — importantly — it is trivially expressible in our EXISTING HttpRetryConditionPredicate seam (HttpRetryOptions.kt:21-26, a fun interface that is composable). So this is 'ship a default/example predicate', not new architecture. Their hardcoding of this into a private method is the antipattern; our seam is better. 409 is correctly flagged opt-in (a real conflict is usually not retryable; only lock-style 409s are). Keep X-Should-Retry as opt-in, not a silent default, since a downstream backend could reuse that header name. 
For future codegen, a Stainless-style target would wire it on by default. + - *Do:* Provide a ServerOverrideRetryPredicate implementing HttpRetryConditionPredicate (reads X-Should-Retry true/false, else delegates to a wrapped predicate) shipped but NOT default in sdk-core; document composing it via HttpRetryOptions.shouldRetryCondition. Future codegen enables it when the target API documents the header. Leave 409 out of the default set; document how to add it. This is also expressible for the recovery RetryStep via a custom isClassifiedRetryable path — note both integration points. +- **Emit a configurable per-attempt retry-count header; co-locate with idempotency-key minting** `ADOPT` · `both` · effort S · confidence medium + - *Verdict:* Verified. They stamp X-Stainless-Retry-Count:{n} on every attempt, skipped when the caller already set it (shouldSendRetryCount, RetryingHttpClient.kt:39-40,45-47; setRetryCountHeader replaceHeaders :145-146) and mint a per-retry idempotency value stainless-java-retry-{UUID} once at the top, NOT per attempt (maybeAddIdempotencyHeader runs before the loop, :36,150-160; idempotencyKey() :148). Our IdempotencyKeyStep (IdempotencyKeyStep.kt:72-82) sets a UUID key but lives in the recovery RequestPipeline, decoupled from the retry loop — it runs once pre-dispatch and emits no attempt counter, so a server cannot distinguish attempt 1 from attempt 3. Two genuine takeaways: (1) a retry-count header is cheap server/observability and useful for debugging 'is the client retrying?'; (2) the more important insight is the SEMANTIC: the idempotency key should be minted once per logical request and held CONSTANT across that request's retries — which is exactly what their pre-loop minting guarantees and what our split design leaves ambiguous (if IdempotencyKeyStep ever ran per attempt it would re-mint and break dedup). 
De-hype: emitting the counter is trivial; the design value is documenting/guaranteeing the key-stability contract across the retry boundary. Header name is vendor-specific (X-Stainless-*), so make it configurable / off-by-default in sdk-core; name per target API in codegen. Respect a caller-set value as they do. + - *Do:* Have both the sync and (future) async retry steps optionally stamp a configurable retry-count header per attempt (default: disabled or a neutral name like Retry-Count), skipping when the caller already set one. Document the contract that the idempotency key is minted once per logical request and held constant across that request's retries; ensure IdempotencyKeyStep runs upstream of the retry boundary so the key is not re-minted per attempt. For codegen, name both headers per the target API convention. +- **Add a transport-agnostic per-phase Timeout value type** `ADOPT` · `sdk-core` · effort M · confidence medium + - *Verdict:* Verified. Timeout.kt is a value type with connect/read/write/request, where read()/write() default to request() (Timeout.kt:35,44), request() defaults to 10min (:56), connect() to 1min (:26), Duration.ZERO means unbounded, and it is explicitly orthogonal to retry ('not including retries', :47). We have NO timeout model in sdk-core; RetrySettings.totalTimeout (RetrySettings.kt:72) is a retry-budget deadline only, and per-phase I/O timeouts are pushed entirely onto each transport's native client. As a TOOLKIT this is a legitimate gap: a portable Timeout contract that okhttp/jdkhttp adapters translate to native settings gives downstream SDK authors one knob. De-hype: the value itself is thin — a 4-field immutable + a trivial defaulting chain; the worth is the PORTABLE CONTRACT and the documented retry/IO separation, not the code. Constraint check passes: java.time.Duration only (zero-dep), and the Optional overloads (kotlin.jvm.optionals.getOrNull, Timeout.kt:94,106,118,133) are Java-8-safe. 
Translation to native timeouts MUST live in the transport adapters, never sdk-core, to avoid pulling okhttp/jdk-http types into core. Medium confidence because priority is lower than the retry items and it only pays off once at least one transport wires it. + - *Do:* Add an immutable Timeout (private ctor + Builder + newBuilder()) in sdk-core with connect/read/write/request and the read/write->request defaulting chain; Duration.ZERO = unbounded. Map it onto native builders in sdk-transport-okhttp and sdk-transport-jdkhttp. Keep it independent of RetrySettings.totalTimeout and document the separation (per-call I/O budget vs retry budget) exactly as Timeout.kt:47 does. +- **No async retry step exists in either pipeline family** `ADOPT` · `sdk-core` · effort L · confidence high + - *Verdict:* Verified accurately. RetryingHttpClient.kt:76-133 is a genuinely non-blocking async retry: handleAsync (run inline via it.run(), :125-128) decides retry, then sleeper.sleepAsync(backoff).thenCompose { executeWithRetries(...) } recurses (:121-123) and the whole handler is flattened with .thenCompose(Function.identity()) (:129). On OUR side I confirmed there is NO async retry step: ls of http/pipeline/steps shows only the sync RetryStep.kt/DefaultRetryStep.kt; grep for AsyncRetry/processAsync-with-retry across sdk-core hits only AsyncPipelineNext.kt and AsyncHttpStep.kt KDoc (which explicitly name async retry as the reason copy() exists, AsyncPipelineNext.kt:18,67 / AsyncHttpStep.kt:20-21,38). The recovery RetryStep is sync-only and even its wait is a blocking CompletableFuture.get() on a scheduled task (RetryStep.kt:291-322) — Loom-safe (carrier unmounts) but it occupies the calling thread, unlike their fire-and-continue thenCompose. 
Two minor de-hypes: (a) thenCompose(Function.identity()) is a standard CompletableFuture flatten idiom, not a novel 'trick' — no attribution needed to use it; (b) for a TOOLKIT, the right shape is an AsyncHttpStep at Stage.RETRY plus an async variant of the recovery primitive, reusing our existing classifier/parser, so this is wiring a gap we pre-built for, not new architecture. + - *Do:* Add an async retry at Stage.RETRY (AsyncHttpStep) that mirrors DefaultRetryStep's classification + delay logic but schedules the backoff on a ScheduledExecutorService returning CompletableFuture and recurses via next.copy().processAsync().thenCompose { ... }. Reuse RetryAfterParser, RetryUtils, and the existing isRetrySafe/canRetry gate verbatim. Mirror their inline-executor handleAsync so the decision runs on the completing thread, and flatten with thenCompose(Function.identity()). Decide deliberately whether to also honor a total-timeout deadline on the async path (theirs has none — see weAreAhead). +- **Unify the divergent DEFAULT retry policy (status set + backoff formula) across our families** `SIMPLIFY` · `sdk-core` · effort M · confidence high + - *Verdict:* The divergence is real and verified: (1) status sets disagree — RetryUtils.isRetryable = 408 + 429 + 5xx-except-501/505 (RetryUtils.kt:43-49), used by DefaultRetryStep's default predicate; RetrySettings.DEFAULT_RETRYABLE_STATUSES = {429,500,502,503,504}, deliberately DROPPING 408 (RetrySettings.kt:234-244). So the stage-based step retries 408 by default and the recovery step does not. (2) Backoff formulas differ — DefaultRetryStep.exponentialBackoff uses baseDelay * (1L shl tryCount) with +/-5% jitter (DefaultRetryStep.kt:443-462); BackoffCalculator uses initialDelay * multiplier^(n-1) with SYMMETRIC jitter and a true total-timeout deadline (BackoffCalculator.kt:113-165). Two formulas, two default status sets. 
However, the finding's framing ('their one class proves two families are over-engineering') is wrong for a toolkit: a recovery-pipeline retry (ResponseOutcome-based) and a stage-pipeline retry (HttpStep-based) are legitimately different integration points — openai-java only needs one because it is a single-API client with one pipeline. The actual defect is that the DEFAULTS and the formula drift, not that two steps exist. Re-scope from 'collapse to one step' to 'one shared default policy + one shared formula, two thin step adapters'. Correctly, do NOT adopt their fixed 0.5*2^n/8s + one-sided jitter; keep BackoffCalculator (deadline + symmetric jitter) as the single formula. + - *Do:* Make BackoffCalculator the single backoff implementation: have DefaultRetryStep.exponentialBackoff delegate to it (deleting the duplicate 1L shl tryCount path) so the deadline+symmetric-jitter behavior is shared. Resolve the 408 disagreement once (pick include-or-exclude with a documented reason) and have RetryUtils, RetrySettings.DEFAULT_RETRYABLE_STATUSES, and HttpRetryOptions all reference that single source. Keep both step families; only the policy/formula is unified. Note HttpRetryOptions.baseDelay default is 800ms vs RetrySettings.initialDelay 200ms — fold into the same reconciliation. +- **Process-wide daemon scheduler vs per-instance Timer+Cleaner (validates our design)** `LEARN` · `docs/process` · effort S · confidence high · we partly do this + - *Verdict:* Verified and actually STRONGER than the finding states. PhantomReachableSleeper wraps the Sleeper and calls closeWhenPhantomReachable(this, sleeper) so a forgotten close still reclaims the per-instance daemon Timer (PhantomReachableSleeper.kt:13-15; DefaultSleeper.kt:10 daemon Timer, :27 close=timer.cancel()). 
Crucially, I read PhantomReachable.kt: closeWhenPhantomReachable is a REFLECTIVE wrapper around java.lang.ref.Cleaner that NO-OPS on Java 8 (PhantomReachable.kt:33 Class.forName, :53 'We are running Java 8, which has no Cleaner'). So on Java 8 their safety net silently does nothing and the Timer thread leaks if the client is not closed. Our RetryStep.DEFAULT_SCHEDULER (RetryStep.kt:366-370) is a single process-wide single-thread DAEMON scheduler that is intentionally never closed (daemon => does not block JVM shutdown) and never per-instance, so there is no leak to clean up and no Cleaner needed. This is the correct call for a zero-dep Java-8 core. The finding's caveat (do NOT introduce Cleaner into the Java-8 core; the Java-8 fallback would be fiddly PhantomReference/ReferenceQueue) is accurate. No code change — this is a design-rationale note that confirms we are ahead. + - *Do:* Document in docs/architecture.md (or pipelines.md) the deliberate choice of a process-wide daemon ScheduledExecutorService over per-instance background threads + phantom-reachable cleanup, citing openai-java's Cleaner-that-no-ops-on-Java-8 as the rejected alternative. Revisit ONLY if per-client scheduler isolation (e.g. bounded pools per ClientOptions) becomes a requirement, at which point the Java-8 PhantomReference idiom would be needed. + +**Considered & dropped** + +- ~~Copy their thenCompose(Function.identity()) flatten as a distinct technique~~ — Folded into the async-retry finding. It is a standard CompletableFuture flatten idiom (handleAsync returns a CompletableFuture<CompletableFuture<HttpResponse>>, flattened with thenCompose(identity), RetryingHttpClient.kt:101-129), not a liftable proprietary snippet — calling it out separately would be filler. No attribution needed for an idiom. +- ~~Lower their maxRetries default (2) toward / note vs our 3~~ — Trivia, not decision-ready.
Verified: their default is 2 (Builder.kt:240) with pre-increment ++retries > maxRetries (:56,62) so 2 retries = 3 total attempts, matching our maxAttempts=3 total. Same total-attempt semantics, only the counting convention differs. Nothing to adopt; at most a one-line doc note already captured in the antipatterns commentary. +- ~~Adopt their hardcoded single shouldRetry classifier for simplicity~~ — Inapplicable and a regression for a toolkit. Their status set + formula + jitter are baked into private methods with no predicate/strategy seam (RetryingHttpClient.kt:162-182,222-227); our HttpRetryConditionPredicate / HttpRetryDelayProvider seams (HttpRetryOptions.kt:21-40) are strictly better. The legitimate part (unify our DEFAULTS) is already captured as the SIMPLIFY finding; 'adopt their hardcoding' is an antipattern, not a finding. + +**Do not copy** + +1) Hardcoded, non-configurable retry policy. RetryingHttpClient bakes the status set (408/409/429/5xx), the formula (0.5*2^n, cap 8s), and jitter directly into private methods (:162-182, :222-227) with no predicate/strategy seam — to change which statuses retry you must subclass or fork. Our HttpRetryConditionPredicate / HttpRetryDelayProvider seams (HttpRetryOptions.kt:21-40) are strictly better for a toolkit; do NOT regress toward their hardcoding. +2) One-sided jitter that only ever SHORTENS the delay: `jitter = 1.0 - 0.25*rand()` ∈ [0.75,1.0) (:225). This caps the effective backoff strictly below the nominal value (real max ≈ 6s not 8s) and biases every client to retry slightly EARLIER than scheduled — the opposite of what jitter should do for herd avoidance on the high side. Our symmetric jitter (BackoffCalculator.kt:131-149) keeps the midpoint at the nominal delay; keep ours. +3) Server Retry-After hint used verbatim with NO upper clamp and NO deadline integration (:216-219 returns immediately). A hostile/buggy server sending 'Retry-After: 86400' makes the client sleep a day. 
Our parser clamps to 365d (RetryAfterParser MAX_DELAY) and BackoffCalculator clamps the hint to the remaining deadline (BackoffCalculator.kt:82) — do NOT drop those guards to mimic them. +4) Retry-After HTTP-date parsed inside getRetryBackoffDuration via a broad try/catch swallowing only DateTimeParseException (:203-213) — an out-of-range OffsetDateTime can throw DateTimeException (a sibling, not subclass, of DateTimeParseException) and escape the catch, breaking the retry loop. Our parseRfc1123Instant catches BOTH (RetryAfterParser.kt:241-252); their narrower catch is a latent totality bug — do not copy it. +5) maxRetries default of 2 (Builder.kt:240) is lower than our 3 (RetrySettings/HttpRetryOptions); not wrong, just note the difference if aligning defaults. Their retries counter is pre-incremented and compared `++retries > maxRetries` (:56,62) meaning 2 retries = 3 total attempts, same total-attempt semantics as our maxAttempts=3 — the names differ (retries vs attempts) so do not conflate the numbers when documenting. + +**Where we're ahead** + +1) Retry-After coverage: our RetryAfterParser handles five header families (Retry-After seconds, Retry-After HTTP-date with TWO grammars, retry-after-ms, x-ms-retry-after-ms, and X-RateLimit-Reset as a jittered Unix epoch — RetryAfterParser.kt:268-285) vs their three (Retry-After-Ms, Retry-After seconds, Retry-After date — RetryingHttpClient.kt:196-214). X-RateLimit-Reset support alone covers GitHub/Stripe/Slack/Twilio, which their parser silently ignores. +2) Totality and overflow safety: our parser and BackoffCalculator are provably total — every numeric/date path clamps to a 365-day ceiling before toNanos() and catches both DateTimeException and ArithmeticException (RetryAfterParser.kt:54-61, BackoffCalculator.kt:174-179 toNanosSaturating). Theirs has the DateTimeException-escape bug (antipattern #4) and no overflow clamp on the hint. 
+3) Real total-timeout deadline (gax-style): RetryStep checks elapsed vs RetrySettings.totalTimeout and refuses a retry that would overshoot the budget (RetryStep.kt:200-205, fitsInDeadline :258-265), and BackoffCalculator clamps each delay to the remaining budget (:82,156-165). openai-java has NO retry-budget deadline at all — only a per-call Timeout that explicitly excludes retries — so a server feeding 429s with large Retry-After values can keep their client retrying unboundedly in wall-clock terms. +4) Loom-safe waiting with correct interrupt handling: our awaitDelay uses a ScheduledExecutorService + CompletableFuture.get and on interrupt restores the flag, cancels the scheduled task, and throws InterruptedIOException (RetryStep.kt:290-322); DefaultRetryStep's sleepOrAbort does the same and attaches accumulated suppressed exceptions (DefaultRetryStep.kt:509-523). Their sync DefaultSleeper.sleep is a bare Thread.sleep(duration.toMillis()) (DefaultSleeper.kt:12) that pins a carrier thread under Loom and does not restore the interrupt flag. +5) Suppressed-exception trail: DefaultRetryStep attaches every prior attempt's exception to the terminal failure via addSuppressed (DefaultRetryStep.kt:227,341) so callers see the full retry history; openai-java throws only the last throwable (:62-64), discarding earlier failures. +6) Symmetric, midpoint-preserving jitter (BackoffCalculator.kt:131-149) vs their one-sided shortening jitter (antipattern #2). +7) Body-replayability gate is method-aware: ours distinguishes body-less idempotent methods (retry on method) from body-bearing requests (retry only if body.isReplayable()) — RetryStep.kt:352-355, DefaultRetryStep.kt:255-258. Theirs collapses to `request.body?.repeatable() ?: true` (:140-143), which retries a body-less POST (non-idempotent) unconditionally — a correctness gap ours closes deliberately. 
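
The difference in item 7 can be made concrete with a small sketch. The types below are simplified, hypothetical stand-ins rather than the SDK's real `Request` or body classes, and the contents of `retryableMethods` are an assumed default (the real set lives in `RetrySettings.retryableMethods`). `isRetrySafe` follows the per-axis rule described above; `isRetrySafeBodyOnly` reproduces the `request.body?.repeatable() ?: true` shape for comparison.

```kotlin
// Hypothetical, simplified stand-ins for the SDK's request types.
class Body(val isReplayable: Boolean)
class Request(val method: String, val body: Body?)

// Assumed default set of idempotent methods, for illustration only.
val retryableMethods: Set<String> = setOf("GET", "HEAD", "OPTIONS", "PUT", "DELETE")

/**
 * Method-aware gate: a request with a body keys off body replayability,
 * a request without a body keys off method idempotency.
 */
fun isRetrySafe(request: Request): Boolean {
    val body = request.body
    return if (body != null) body.isReplayable else request.method in retryableMethods
}

/** Body-only gate: a missing body is treated as retryable regardless of method. */
fun isRetrySafeBodyOnly(request: Request): Boolean =
    request.body?.isReplayable ?: true
```

For a bare `POST` (no body), `isRetrySafe` returns `false` while `isRetrySafeBodyOnly` returns `true`; a body-less `PUT` is retry-safe under both.
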
+ +_Verifier notes:_ Re-read all five reference files (RetryingHttpClient.kt, Sleeper.kt, DefaultSleeper.kt, PhantomReachableSleeper.kt, Timeout.kt — note: the three Sleeper files live in com/openai/core/, NOT com/openai/core/http/ as cited) and all OUR comparison files line by line. Overall the analysis is high-quality and substantively accurate; I kept all 8 findings but recalibrated 3 (Sleeper -> weAlreadyDoIt for sync, folds into async-retry; SIMPLIFY re-scoped from 'collapse two step families' to 'unify defaults+formula, keep both families'; Timeout/retry-count -> medium confidence as thin-value/observability items). Corrections to the analysis's antipattern/weAreAhead claims, all verified against source: (1) ANTIPATTERN #2 jitter range is mis-stated. jitter = 1.0 - 0.25*rand() with rand() in [0,1) yields jitter in (0.75, 1.0], not [0.75, 1.0) — the endpoints are inverted. More importantly the claim 'real max ~= 6s not 8s' is WRONG: with backoffSeconds=8.0 and jitter<=1.0 the effective delay is in (6.0, 8.0], so the MAX is still 8s; 6s is the FLOOR of the jittered top tier. The substantive point stands and is correct: the jitter only ever SHORTENS the nominal delay and biases every client to fire slightly EARLY (the wrong direction for herd avoidance) — our symmetric midpoint-preserving jitter (BackoffCalculator.kt:131-149) is better. (2) ANTIPATTERN #4 mechanism is partly mis-attributed. OffsetDateTime.parse(text, RFC_1123_DATE_TIME) throws DateTimeParseException, which IS caught (:211). The genuinely-uncaught throw on a syntactically-valid-but-extreme date is ArithmeticException from ChronoUnit.NANOS.between(...) (:204) on a far-future OffsetDateTime (and OffsetDateTime.from inside parse can throw a bare DateTimeException for resolved-but-invalid values). 
So the CONCLUSION (a valid far-future Retry-After date can crash their retry loop, and ours is total) is correct, but the cited 'DateTimeException sibling of DateTimeParseException' is not the primary escaping exception — ArithmeticException is. Our parser catches DateTimeException AND ArithmeticException and clamps to 365d before toNanos (RetryAfterParser.kt:54-61,241-252,288-305), so totality holds. (3) weAreAhead #7 (body-replayability gate) verified: their gate is request.body?.repeatable() ?: true (RetryingHttpClient.kt:140-143) — a body-less request returns true (retryable) and method-awareness never enters (shouldRetry is status-only), so a body-less POST that 503s WOULD be retried. Ours splits body-less (retry on idempotent method, RetryStep.kt:352-355 / DefaultRetryStep.kt:255-258) from body-bearing (retry only if isReplayable). Caveat for fairness: openai-java is a generated single-API client (OpenAI is POST-heavy with server idempotency keys), so this is a deliberate tradeoff for them, not negligence; the toolkit-level point that method-awareness is the correct default still holds. Constraint compliance of all kept findings confirmed: every recommendation stays Java-8 + zero-dep in sdk-core (Duration, ScheduledExecutorService, CompletableFuture, ThreadLocalRandom only) or pushes transport-specific translation to adapters (Timeout). No recommendation drags Jackson/okhttp/coroutines into core. Pre-1.0 so the SIMPLIFY reconciliation is safe to make breaking. + +--- + +## 6. Response handlers (Error/Json/String/Empty) + raw-response abstraction + +**What it is** + +openai-java models response decoding as a tiny one-method seam: `interface HttpResponse.Handler<T> { fun handle(response: HttpResponse): T }` (HttpResponse.kt:26-29).
Four trivial implementations cover every decode need — `jsonHandler` (JsonHandler.kt:12-20, Jackson `readValue` + reified `jacksonTypeRef`), `stringHandler` (StringHandler.kt:10-13, `readBytes().toString(UTF_8)`), `emptyHandler` (EmptyHandler.kt:10-12, returns `null` for 204), and the error pair `errorBodyHandler`/`errorHandler` (ErrorHandler.kt). The raw-vs-parsed seam is `HttpResponseFor<T> : HttpResponse { fun parse(): T }` (HttpResponseFor.kt:5-8) plus the `HttpResponse.parseable { ... }` extension (HttpResponseFor.kt:11-25) that wraps an already-received raw response and parses lazily (`private val parsed: T by lazy { parse() }`, line 14), delegating statusCode/headers/body/close to the underlying response. Each generated service exposes two faces: `withRawResponse()` returns `HttpResponseFor<T>` (parse-on-demand, keep raw access), and the default methods are just `withRawResponse().op(...).parse()` (ModelServiceImpl.kt:45,49,53; async mirror `thenApply { it.parse() }` in ModelServiceAsyncImpl.kt:49). The error check is composed onto the parse: `errorHandler.handle(response).parseable { response.use { opHandler.handle(it) }.also { if (responseValidation) it.validate() } }` (ModelServiceImpl.kt:90-98). The architecture cleanly splits three concerns we currently fuse or omit: (a) status→exception mapping, (b) body→typed-value decoding, (c) raw-vs-parsed access with deferred parsing and explicit resource ownership. + +**How it works (line-level)** + +Handler seam is one method, nested in HttpResponse so every handler is `HttpResponse.Handler<T>` (HttpResponse.kt:26-29: `interface Handler<T> { fun handle(response: HttpResponse): T }`).
+ +jsonHandler is reified-generic and funnels ALL parse failures to one exception type (JsonHandler.kt:12-19): `inline fun jsonHandler(...): Handler = object : Handler { override fun handle(response) = try { jsonMapper.readValue(response.body(), jacksonTypeRef()) } catch (e: Exception) { throw OpenAIInvalidDataException("Error reading response", e) } }`. The `jacksonTypeRef()` call (no explicit type arg) recovers the full generic type from the reified `T`, so `jsonHandler` decodes parametric types correctly — this is their answer to the `List` erasure problem. + +errorHandler is two composed handlers (ErrorHandler.kt). `errorBodyHandler` (lines 26-40) parses the envelope defensively: parse to `JsonNode`, pull `.get("error")` and re-read it as `JsonField`, else fall back to the whole node, and on ANY exception return `JsonMissing.of()` (line 37) — a malformed error body never masks the real HTTP failure. `errorHandler` (lines 43-93) is a pure status→exception switch: `200..299 -> response` (pass-through, body untouched), then one `throw XException.builder().headers(...).error(errorBodyHandler.handle(response)).build()` per code (400/401/403/404/422/429), `500..599 -> InternalServerException`, else `UnexpectedStatusCodeException`. Crucially it returns the SAME `HttpResponse` on success so the body is still readable downstream. + +parseable is the lazy raw-keeping wrapper (HttpResponseFor.kt:11-25): `internal fun HttpResponse.parseable(parse: () -> T): HttpResponseFor = object : HttpResponseFor { private val parsed: T by lazy { parse() }; override fun parse() = parsed; override fun statusCode() = this@parseable.statusCode(); override fun headers() = ...; override fun body() = ...; override fun close() = ... }`. Parsing is deferred and memoized; raw access (status/headers/body) stays available; close is delegated. 
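The lazy, memoized wrapper described above can be sketched as follows. All names here (`RawResponse`, `ParsedResponse`, the `parser` parameter) are hypothetical stand-ins chosen for illustration; the sketch only shows the deferred-parse, parse-once, delegate-close shape, not the actual `HttpResponseFor` code.

```kotlin
import java.io.Closeable

// Hypothetical stand-in for a raw response; not a type from either SDK.
class RawResponse(val statusCode: Int, val bodyText: String) : Closeable {
    var closed = false
        private set

    override fun close() {
        closed = true
    }
}

// Raw access (status) plus an on-demand typed view of the body.
interface ParsedResponse<T> : Closeable {
    val statusCode: Int
    fun parse(): T
}

fun <T> RawResponse.parseable(parser: () -> T): ParsedResponse<T> {
    val raw = this
    return object : ParsedResponse<T> {
        // `by lazy` defers the parse until first use and caches the result.
        private val parsed: T by lazy { parser() }

        override val statusCode: Int get() = raw.statusCode
        override fun parse(): T = parsed

        // Closing the wrapper closes the wrapped response.
        override fun close() = raw.close()
    }
}

fun main() {
    var parseCalls = 0
    val raw = RawResponse(200, "42")
    val response = raw.parseable { parseCalls++; raw.bodyText.toInt() }

    println(response.statusCode)                 // raw access, no parse yet
    println(response.parse() + response.parse()) // second call reuses the cached value
    println(parseCalls)
    response.close()
}
```

A caller that only inspects the status never triggers the parser, which is also why such a caller still owns the close: nothing in the wrapper releases the response on its behalf.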
+ +Wiring in service impl (ModelServiceImpl.kt:89-98): `val response = clientOptions.httpClient.execute(request, requestOptions); return errorHandler.handle(response).parseable { response.use { retrieveHandler.handle(it) }.also { if (requestOptions.responseValidation!!) it.validate() } }`. Order matters: error-check first on the open response, then lazy typed-parse inside `use {}` (auto-close after read). The default-method face collapses to one line: `override fun retrieve(...) = withRawResponse().retrieve(...).parse()` (ModelServiceImpl.kt:45). + +StreamResponse (StreamResponse.kt:5-19) is the streaming analog: `interface StreamResponse : AutoCloseable { fun stream(): Stream }` with an internal `map(transform)` that lazily transforms each element and forwards close — same raw-keeping + lazy-transform philosophy for SSE. + +**vs. our SDK** + +We have NO equivalent of any of this in sdk-core. Concretely: + +(1) Our HTTP pipeline is typed `Request -> Response` end-to-end with no typed-value layer: HttpStep.process returns `Response` (HttpStep.kt:39-42), and the terminal is `httpClient.execute(request)` returning `Response` (HttpPipeline.kt:42). There is no `Handler`, no `parse()`, no `withRawResponse`. + +(2) Our SERDE pillar stage is a declared-but-empty slot: `SERDE(1000, true), // pillar: body-to-bytes (reserved; currently unused)` (Stage.kt:45). So the pipeline has a reserved place for request-side serialization but nothing for response-side deserialization at all. + +(3) We have the serde abstraction (`Serde`/`Serializer`/`Deserializer`, Serde.kt:18-24, Deserializer.kt:27-48) but it is never connected to `Response`/`ResponseBody`. Nothing maps `ResponseBody.source()`/`bytes()`/`string()` (ResponseBody.kt:54-64) through a `Deserializer` to a typed value. 
The reified `Deserializer.deserialize` extensions (Deserializer.kt:51-61) are the moral equivalent of `jsonHandler`'s `jacksonTypeRef()` trick, but there is no `Handler` wrapping them and no raw-vs-parsed seam over them. + +(4) Status->exception mapping exists and is actually cleaner than theirs in one respect: `HttpExceptionFactory.fromResponse` (HttpExceptionFactory.kt:76-102) is a pure function `Response -> HttpException`, decoupled from any serde, with retryability derived from one source (`HttpException.retryable = RetryUtils.isRetryable(status.code)`, HttpException.kt:72) rather than baked per-subclass. But it does NOT deserialize a typed error body — `HttpException.body` is the raw lazy `ResponseBody` (HttpException.kt:63) with only a `bodySnapshot(maxBytes)` peek helper (HttpException.kt:94-115). The codegen-hook comment at HttpExceptions.kt:16-19 explicitly defers typed-error-payload stamping to a future generator. + +(5) No streaming raw-vs-parsed analog. We have SSE (`ServerSentEventReader`/`ServerSentEventListener`) but no `StreamResponse` that maps a raw event stream to a typed `Stream` with lazy transform + forwarded close. + +**Recommendations (verified)** + +- **Add a one-method ResponseHandler response-decode seam (interface in core, JSON binding in the adapter)** `ADOPT` · `both` · effort S · confidence medium + - *Verdict:* Re-read confirms the mechanism exactly: HttpResponse.kt:26-29 is a bare `interface Handler { fun handle(response): T }`; the four impls are trivial (JsonHandler.kt:12-20 `jsonMapper.readValue(body(), jacksonTypeRef())`, StringHandler.kt:11-12 `readBytes().toString(UTF_8)`, EmptyHandler.kt:10-11 returns null, errorHandler/errorBodyHandler in ErrorHandler.kt). We verified sdk-core has NO Handler equivalent (grep: only ChallengeHandler, unrelated). The claim is accurate and we lack it. 
Two corrections to the original framing, though: (a) the value is overwhelmingly at the CODEGEN layer, not sdk-core — in our toolkit a hand-author already has `ResponseBody.string()`/`.source()` + `Deserializer.deserialize(InputStream)` (Deserializer.kt:44-47,60), so the 'collapses boilerplate' argument is thin for sdk-core itself and strong only for generated services; (b) the analysis correctly flags but under-weights that putting jacksonTypeRef/JsonMapper into a core handler (the JsonHandler.kt:5-6 / ErrorHandler.kt:7-9 shape) is a hard zero-dep violation. The honest scope: a 3-line `fun interface ResponseHandler` + zero-dep String/Empty handlers in core is cheap and harmless; the JSON handler MUST be `jsonHandler(serde, Class)` in sdk-serde-jackson built on the existing Deserializer. Do NOT mirror Kotlin reified `jacksonTypeRef` into core's contract — core stays `Class`/type-token based. + - *Do:* Add `fun interface ResponseHandler { fun handle(response: Response): T }` to org.dexpace.sdk.core.http.response. Ship dependency-free `StringResponseHandler` and an `EmptyResponseHandler` modeled as Unit/a token (NOT `Void?` — see notes). Ship `jsonHandler(serde: Serde, type: Class)` in sdk-serde-jackson over `Deserializer.deserialize(InputStream, Class)`, plus a reified convenience in the adapter only. Treat this primarily as a building block the future generator emits against, not as a sdk-core ergonomics win. +- **Add an HttpResponseFor-style raw-vs-parsed seam (lazy parse + retained headers/status/body)** `ADOPT` · `both` · effort S · confidence medium + - *Verdict:* Verified verbatim: HttpResponseFor.kt:5-8 is `interface HttpResponseFor : HttpResponse { fun parse(): T }`; the `parseable` extension (11-25) memoizes via `private val parsed: T by lazy { parse() }` (line 14) and delegates statusCode/headers/body/close to the wrapped response. 
The service default really is `withRawResponse().retrieve(...).parse()` (ModelServiceImpl.kt:48-49) and the async mirror is `thenApply { it.parse() }` — confirmed by reading the impl. We have no equivalent (grep for parseable/withRawResponse/ParsedResponse: none in core). Two important caveats the original analysis got right but should be stated louder: (1) This is ALREADY a recorded plan item — docs/refs-comparison.md:408 'Raw/Cooked client split... Single withRawResponse() accessor' — so this is not a net-new discovery; it confirms our own roadmap with a concrete reference implementation. (2) The raw/cooked split is fundamentally a GENERATED-CLIENT concept; the sdk-core deliverable is only the tiny `Response.parseable {}`/`ParsedResponse` primitive — most of the payoff is at codegen. The `@MustBeClosed` enforcement (ModelService.kt:5,126+, from com.google.errorprone) is real and genuinely unavailable to us; their lazy-parse means a raw caller who never calls parse() leaks the body unless they `use {}`. Our Response is already Closeable (Response.kt:48,64) so delegation is clean, but we cannot enforce close at compile time — document the ownership rule and consider a leak detector (cf. their PhantomReachableClosingStreamResponse.kt, which I confirmed exists and uses closeWhenPhantomReachable). + - *Do:* Add `ParsedResponse` + `Response.parseable { }` (memoized `by lazy`, delegates close to the wrapped Response) to org.dexpace.sdk.core.http.response, as the primitive the generator's withRawResponse()/cooked split (already planned, refs-comparison:408) builds on. KDoc the ownership contract precisely: parsed-path closes after parse; raw-path caller must `use {}`. Defer @MustBeClosed-style enforcement to docs; optionally add a phantom-reachable leak detector later. +- **Define one core deserialization-failure exception; apply throw-on-success-parse vs swallow-on-error-parse policy** `ADOPT` · `both` · effort S · confidence medium + - *Verdict:* Accurate. 
JsonHandler.kt:14-19 wraps every parse exception in `OpenAIInvalidDataException("Error reading response", e)` (cause preserved); errorBodyHandler (ErrorHandler.kt:30-39) does the opposite on the error path — `catch (e: Exception) { JsonMissing.of() }` (line 36-37) so a malformed error envelope degrades instead of masking the HTTP status. The two-policy observation is the real, non-obvious insight and it is correct. But 'COPY' is the wrong category: the code is a 3-line try/catch idiom, not a liftable artifact — no meaningful provenance question, so I recategorize to ADOPT (define the core type) with the dual policy captured as the design rule. Also: this is already on our roadmap — docs/refs-comparison.md:350 'Tolerant error-body parsing. Decode typed/structured error payloads without throwing inside an exception constructor' is precisely the swallow-on-error-path half. The applicability constraint is right: the exception type must live in core (so Jackson's exception type never escapes the adapter boundary — today a hand-author gets a raw JacksonException, an abstraction leak), and be thrown BY the adapter handler; do not name it after the format. + - *Do:* Add a core `ResponseDeserializationException(message, cause)` (dependency-free). In the sdk-serde-jackson handler, wrap parse failures in it (JsonHandler.kt:14-19 shape) so the Jackson type stays inside the adapter. When typed error-body decoding lands (planned, refs-comparison:350), apply the ErrorHandler.kt:36-37 swallow-and-degrade policy: a malformed error body must never replace the HTTP exception. Capture both policies as the design rule, not as copied code. 
+- **Add a StreamResponse-style closeable, lazily-mapped typed stream as the seam our async adapters wrap** `ADOPT` · `both` · effort M · confidence low + - *Verdict:* Mechanism verified: StreamResponse.kt:5-11 is `interface StreamResponse : AutoCloseable { fun stream(): Stream }`; the internal `map` (13-19) wraps `this.stream().map(transform)` and forwards close() (line 18) — the load-bearing detail is correct (transform preserves the single close that releases the connection). They also wrap it in PhantomReachableClosingStreamResponse.kt for leak safety (confirmed). We have SSE plumbing (ServerSentEventReader/Listener/Extensions) yielding raw events but no typed closeable lazily-mapped stream handle (grep: no StreamResponse). java.util.stream.Stream is Java 8 — no toolchain conflict, claim correct. Why I set confidence LOW despite the gap being real: (1) the design question is unresolved — a pull-based `java.util.stream.Stream` is the wrong primitive for our toolkit when reactor (Flux) and coroutines (Flow) adapters already own streaming; a core `Stream` could become a competitor rather than a seam, and the analysis itself flags this without resolving it. An Iterator/Source-based core handle that the Flux/Flow adapters wrap is more in keeping with our layering. (2) This is squarely a codegen need (a return type for `chat.completions.stream()`-style ops), so it should be designed alongside the generator, not landed speculatively now. (3) Must integrate with our interrupt-aware close contract. This is a legitimate ADOPT but it needs a brainstorming/design pass first, hence low confidence on the specific shape. + - *Do:* Design (don't rush) a dependency-free closeable typed-stream seam in org.dexpace.sdk.core.http.sse or .response whose close() releases the connection and whose map forwards close. Prefer an Iterator/Source-based handle over java.util.stream.Stream so the reactor/coroutines adapters wrap it rather than compete. 
Build per-event deserialization on the ResponseHandler/Serde seam (finding 1). Sequence it with codegen design; consider a phantom-reachable leak guard like PhantomReachableClosingStreamResponse.kt. +- **Consolidate the two overlapping exception hierarchies and use the already-present typed error-payload slot instead of adding another** `ADOPT` · `both` · effort M · confidence medium · we partly do this, claim-qualified + - *Verdict:* The original finding ('add an optional typed error payload slot to HttpException; ours has only raw body') is INACCURATE because it missed that the slot already exists on a SECOND hierarchy. There are two parallel error types: (a) HttpException (RuntimeException, exception/ subpackage) with status/headers/body/retryable/bodySnapshot and 18 open subclasses + factory, body is raw ResponseBody only (HttpException.kt:63); and (b) HttpResponseException (IOException, top-level response/, HttpResponseException.kt:36-43) which ALREADY carries `public val value: Any?` documented as 'Optional deserialized payload describing the error (e.g. a JSON error body)' (line 33,41) plus its own isRetryable. The api snapshot confirms both (sdk-core.api:1103-1110 HttpResponseException with getValue()Ljava/lang/Object; + isRetryable; :1329-1334 HttpException with bodySnapshot + getRetryable). And HttpResponseException is NEVER constructed in main (grep: only its unit test) — it is dead/aspirational. Meanwhile HttpException's factory is also never called. So the accurate finding is NOT 'add a slot' — it is 'we have two competing, both-unwired error hierarchies, one of which already has the typed-payload slot the analysis wanted to add.' Also note openai-java is actually AHEAD here: OpenAIServiceException exposes code()/param()/type()/body():JsonValue (OpenAIServiceException.kt:12-22), real typed-error access — so 'we lack what they have' is the correct direction for the payload itself. 
The defensive-swallow requirement (don't throw while decoding the error body) is right and overlaps finding 4. This is on our roadmap (refs-comparison:350). + - *Do:* Before adding any slot: decide between the two hierarchies. Recommend keeping the RuntimeException-based HttpException family (richer: factory + per-code subclasses + retryable + bodySnapshot) and either deleting HttpResponseException or folding its `value: Any?` typed-payload idea into HttpException as an optional decoded-error accessor. Then (codegen phase) have the generated client decode the error body via the Handler/Serde seam, defensively swallowing parse failures (finding 4), buffering the small error body before throw so raw + decoded coexist. Track as the refs-comparison:350 item. +- **Compose error-mapping and deserialization at the generated-service layer as Response->X handlers, NOT as HTTP-pipeline stages** `LEARN` · `docs/process` · effort S · confidence high + - *Verdict:* This is the strongest item and my own dig makes it stronger. Verified: their error mapping is a `Handler` (ErrorHandler.kt:43-93) that returns the untouched response for 200..299 (line 49) or throws, composed at the call site immediately before parse: `errorHandler.handle(response).parseable { response.use { handler.handle(it) }.also { ...validate() } }` (ModelServiceImpl.kt:90-98). errorHandler is built once per service (ModelServiceImpl.kt:58-59). So status->exception and bytes->T are both Response->X functions composed by application, not pipeline stages. 
Crucial reinforcement from reading OUR code: `HttpExceptionFactory.fromResponse` is a pure `Response -> HttpException` (HttpExceptionFactory.kt:76-102) but it is NEVER CALLED anywhere in main (grep confirms zero call sites outside its own KDoc), and there is no response-deserialization stage — the SERDE pillar is explicitly `reserved; currently unused` (Stage.kt:45) and HttpStep.process is fixed to return Response (HttpStep.kt:38-41), terminal `httpClient.execute(request)` returns Response (HttpPipeline.kt:42). So we don't 'fuse' these concerns as the comparison says — we currently OMIT the wiring entirely and have a fixed-typed pipeline that structurally cannot return T. That is exactly why this principle matters: when codegen lands, compose fromResponse-throw + handler-parse at the generated service face; do NOT try to make deserialization a pipeline stage (the Response return type would force a typed-payload-on-Response hack). The SERDE pillar at Stage.kt:45 should only ever be REQUEST-body serialization, if used at all. + - *Do:* Record in docs/refs-comparison.md (Error Model §, already present at :213) and docs/pipelines.md: response error-mapping (HttpExceptionFactory.fromResponse) and body deserialization compose at the generated-service layer as Response->X handlers; the HTTP pipeline stays transport-pure and keeps returning raw Response; the SERDE pillar (Stage.kt:45) is for request-body serialization only. Note that fromResponse is currently unwired — the generator is the intended caller. 
+- **Our status->exception mapping is table-driven and single-source for retryability (vs their inline per-status throw ladder)** `LEARN` · `sdk-core` · effort S · confidence medium · we partly do this + - *Verdict:* The structural comparison is accurate: ErrorHandler.kt:48-92 is a 45-line when with a 5-7 line `throw XException.builder().headers(...).error(...).build()` per code, while HttpExceptionFactory.kt:77-102 is a compact when over named consts and retryability is one derived val (`RetryUtils.isRetryable(status.code)`, HttpException.kt:72) shared by all subclasses including fallbacks. BUT the original 'we are simpler AND more correct, keep ours' framing overstates and partly misreads: (1) Their ladder is more verbose because it does MORE — each branch stamps a typed error payload via `.error(errorBodyHandler.handle(response))`, which ours does not; per LOC theirs buys typed-error access (finding 7). (2) Critically, OUR factory is dead code — never called in main (grep confirms), whereas their handler is the live error path. 'Keep what we have' is weak when 'what we have' isn't wired up. (3) Their retryability isn't 'scattered' so much as deliberately located in RetryingHttpClient.shouldRetry(response/throwable) (lines 162-185) operating on the raw response, with header overrides (X-Should-Retry, lines 164-179) we don't model. The accurate, narrower true claim: our exception OBJECT carries a trustworthy retryable flag and theirs doesn't (their OpenAIServiceException exposes statusCode/headers/body/code/param/type but NOT retryable). I downgrade ADOPT-adjacent 'keep' to LEARN and strip the hype: ours is a good shape for codegen (zero per-op error code) and shouldn't be replaced by a ladder, but it must first be WIRED and extended with a payload slot (finding 7), and we should consider modeling per-response retry overrides like their X-Should-Retry. 
+ - *Do:* Keep HttpExceptionFactory table-driven and wire it into the error path the generator/pipeline-adjacent code will call (it is currently unused). When typed error bodies land, extend the factory/exception with an optional decoded payload (finding 7) rather than inlining a per-op throw ladder. Separately evaluate adopting a per-response retry-override header (their X-Should-Retry, RetryingHttpClient.kt:164-179) in our retry step. + +**Considered & dropped** + +- ~~Antipattern: Jackson hard-wired into core handlers (do not copy JsonMapper/jacksonTypeRef into sdk-core)~~ — Accurate (JsonHandler.kt:5-6, ErrorHandler.kt:7-9 import com.fasterxml.jackson into openai-java core; jsonHandler takes a JsonMapper, JsonHandler.kt:12) and important, but it is not a standalone decision-ready finding — it is a constraint already folded into findings 1 and 4 (JSON handler lives in sdk-serde-jackson; core stays Class/Serde-based). Captured there to avoid a filler item. +- ~~Antipattern: errorBodyHandler dual body-consumption contract (consumed on error branch, untouched on 2xx)~~ — Accurate observation (ErrorHandler.kt:32 reads body on error branches that all throw; line 49 returns 2xx response unread) and worth heeding, but it is an implementation caution for finding 3's compose-then-parse pattern, not a separate recommendation. Merged conceptually into finding 3 (the success branch must not touch the body before parse). +- ~~Antipattern: reliance on @MustBeClosed for raw-response resource safety~~ — Accurate and verified (ModelService.kt:5,126+ uses com.google.errorprone.annotations.MustBeClosed; we have no equivalent; PhantomReachableClosingStreamResponse.kt is their runtime fallback). Already incorporated into finding 2's risk/recommendation (document ownership, consider a leak detector). Not a distinct actionable item. +- ~~Antipattern: EmptyHandler returns Void? (Java-interop wart)~~ — Accurate (EmptyHandler.kt:8,11 return Void? = null) but trivial. 
Folded into finding 1's recommendation (model 'no body' as Unit/a token, not Void?). Not worth a standalone finding. +- ~~weAreAhead: Status is a total type vs their bare Int statusCode~~ — Accurate (Status.kt:27-38,207 total + fromCodeOrNull; HttpResponse.kt:10 statusCode():Int; their only structure for odd codes is UnexpectedStatusCodeException, ErrorHandler.kt:86-91). But it is a 'we're ahead' note about a DIFFERENT subsystem (status modeling) not the handler/raw-response subsystem under review, and drives no change. Recorded in notes, not as a finding. +- ~~weAreAhead: non-consuming bodySnapshot peek vs their consuming error-body read~~ — Accurate (HttpException.bodySnapshot uses source().peek(), HttpException.kt:99-115, capped; openai-java has no peek). Genuine advantage but it is a 'keep' note with no action, and overlaps the finding-7 discussion of error-body access. Noted, not a separate decision-ready item. +- ~~weAreAhead: Response/ResponseBody are first-class immutable models with Builder vs their bare 4-method HttpResponse interface~~ — Accurate but expected and inert: their HttpResponse is intentionally a thin single-API-client SPI (HttpResponse.kt:8-30); ours is a toolkit model. No change implied; it is context, not a finding. Captured in notes. + +**Do not copy** + +1. Jackson hard-wired into the core handlers. ErrorHandler.kt:7-8 and JsonHandler.kt:5-6 import `com.fasterxml.jackson.*` directly into openai-java's CORE module, and `jsonHandler` takes a `JsonMapper` param (JsonHandler.kt:12). For us this is a direct violation of zero-dep-core — any Handler we add to sdk-core must be either dependency-free (String/Empty) or generic over our `Serde`/`Class` with the Jackson binding living ONLY in sdk-serde-jackson. Do not copy the `JsonMapper`/`jacksonTypeRef` coupling into core. + +2. 
errorBodyHandler runs jsonHandler INSIDE the error path, re-reading the body per branch (ErrorHandler.kt:32 `handler.handle(response)`), but the success branch (line 49) returns the response with body unread. This dual contract (body sometimes consumed, sometimes not, depending on status) is subtle and easy to get wrong — it works only because the error branches all throw (so no double-read) and 2xx never touches the body. If we adopt the compose-error-then-parse shape, we must NOT generalize errorBodyHandler to also run on 2xx, or we will consume the success body before parse. + +3. Reliance on `@MustBeClosed` (ModelService.kt:167) to enforce raw-response resource safety. This is a compile-time lint that does not exist for our Kotlin/Java-8 consumers; copying the `withRawResponse()`-returns-unclosed-handle pattern WITHOUT an equivalent enforcement risks body/connection leaks. Our `ParsedResponse` must be `Closeable` with the ownership rule documented and ideally a finalizer/leak-detector (cf. their PhantomReachableClosingStreamResponse.kt) rather than trusting an annotation we cannot enforce. + +4. `EmptyHandler` returns `Void?` (EmptyHandler.kt:8,11) — a Java-interop wart (`Void?` is `null` typed as the uninstantiable `Void`). In Kotlin we should model 'no body' as `Unit` or a dedicated `Empty` token, not `Void?`; do not copy the `Void?` return type. + +**Where we're ahead** + +1. Status->exception mapping is a pure, table-driven function (`HttpExceptionFactory.fromResponse`, HttpExceptionFactory.kt:76-102) decoupled from serde and from the transport, vs their 43-line inline throw-ladder embedded in a Jackson-dependent handler (ErrorHandler.kt:48-92). Ours generates zero per-operation error code; theirs reproduces the ladder shape in every service's WithRawResponseImpl. + +2. 
Single-source retryability: `HttpException.retryable = RetryUtils.isRetryable(status.code)` (HttpException.kt:72) guarantees the baked flag can never disagree with the live retry policy, including for the 4xx/5xx fallback subclasses (HttpExceptions.kt:325-364). openai-java derives retryability separately in RetryingHttpClient, divorced from the exception object, so the exception itself carries no trustworthy retryable signal. + +3. `Status` is a TOTAL type (Status.kt:27-38,207): every integer maps to a `Status`, vendor codes (499, 520-526) preserved, never throws. openai-java carries status as a bare `Int` (HttpResponse.kt:9 `statusCode(): Int`) with the catch-all `UnexpectedStatusCodeException` (ErrorHandler.kt:86-91) as the only structure for non-standard codes — they lose the named-vs-unknown distinction we keep via `fromCodeOrNull` (Status.kt:215). + +4. Non-consuming error-body preview: `HttpException.bodySnapshot(maxBytes)` uses `source().peek()` to log error bodies WITHOUT consuming the primary read path (HttpException.kt:99-115), with an explicit cap against rogue megabyte payloads. openai-java has no peek/snapshot equivalent — reading their error body via `errorBodyHandler` consumes it. + +5. Our `Response`/`ResponseBody` are first-class immutable models with Builder + newBuilder and an explicit single-use streaming contract (Response.kt:40-48, ResponseBody.kt:38-64). Their `HttpResponse` is a bare 4-method interface over an `InputStream` (HttpResponse.kt:8-30) with no builder, no media-type/content-length on the body surface, and no peek — appropriate for a single-API client but thinner than a toolkit needs. 
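Items 1 and 2 in the list above (a pure status-to-exception function, with retryability derived from one place) can be illustrated with a small sketch. Everything here is hypothetical: the class names, the `fromStatus` function, and in particular the set of retryable codes are assumptions for illustration and are not taken from `HttpExceptionFactory` or `RetryUtils`.

```kotlin
// Hypothetical names throughout; this is not the SDK's HttpExceptionFactory.

// Assumed retryable set, for illustration only.
val RETRYABLE_CODES = setOf(408, 429, 500, 502, 503, 504)

// The single source of truth for "may this status be retried?".
fun isRetryable(code: Int): Boolean = code in RETRYABLE_CODES

open class HttpError(val code: Int, message: String) : RuntimeException(message) {
    // Derived from the shared policy function, never set per subclass.
    val retryable: Boolean = isRetryable(code)
}

class NotFoundError : HttpError(404, "Not Found")
class TooManyRequestsError : HttpError(429, "Too Many Requests")
class ServerError(code: Int) : HttpError(code, "Server error $code")
class OtherError(code: Int) : HttpError(code, "Unexpected status $code")

// Pure mapping: status code in, exception (or null for success) out.
// No body decoding and no transport types are involved.
fun fromStatus(code: Int): HttpError? = when (code) {
    in 200..299 -> null
    404 -> NotFoundError()
    429 -> TooManyRequestsError()
    in 500..599 -> ServerError(code)
    else -> OtherError(code)
}

fun main() {
    for (code in listOf(200, 404, 429, 503)) {
        val error = fromStatus(code)
        println("$code -> ${error?.javaClass?.simpleName} retryable=${error?.retryable}")
    }
}
```

Since every subclass inherits `retryable` from the base class, changing the policy in `isRetryable` changes the flag for all exception types at once, including the fallback ones.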
+ +_Verifier notes:_ VERDICT: the analysis is unusually accurate on the openai-java side — every cited file/line checked out (Handler HttpResponse.kt:26-29; the four trivial handlers; HttpResponseFor.parseable `by lazy` HttpResponseFor.kt:14; the compose-at-call-site `errorHandler.handle(response).parseable{...}` ModelServiceImpl.kt:90-98; StreamResponse.map forwarding close StreamResponse.kt:18; @MustBeClosed ModelService.kt; PhantomReachableClosingStreamResponse.kt). The subsystem gap is real: sdk-core has NO Handler/parse/withRawResponse/StreamResponse and the Deserializer (Deserializer.kt) is never wired to Response/ResponseBody (grep confirms). + +THREE THINGS THE ORIGINAL ANALYSIS GOT WRONG OR MISSED — the reason for the heavy edits: + +(1) DEAD/DUPLICATE ERROR HIERARCHIES. There are TWO error types, both unwired. HttpException (RuntimeException, exception/ pkg, rich, factory + 18 subclasses) and HttpResponseException (IOException, response/ pkg) which ALREADY has `public val value: Any?` = the exact 'typed error payload' slot finding 7 proposed adding (HttpResponseException.kt:33,41; api:1109). HttpResponseException is never constructed in main (only its test). Finding 7 had to be rewritten from 'add a slot' to 'consolidate two hierarchies; the slot exists'. claimAccurate=false for that finding. + +(2) THE ERROR PATH IS UNWIRED, NOT FUSED. HttpExceptionFactory.fromResponse is a pure function called from NOWHERE in main (grep: zero call sites). The pipeline returns raw Response end-to-end (HttpStep.kt:38-41, HttpPipeline.kt:42) and the SERDE pillar is explicitly 'reserved; currently unused' (Stage.kt:45). The comparison's 'we fuse or omit' is really 'we omit the wiring entirely'. This strengthens finding 3 (compose at the codegen layer) and weakens finding 6's 'keep ours' (a dead factory isn't a win until wired). + +(3) THEIR EXCEPTIONS ARE NOT THIN. 
OpenAIServiceException exposes statusCode/headers/body:JsonValue/code/param/type (OpenAIServiceException.kt:12-22) — real typed-error access we LACK. The only accurate narrow 'we're ahead' claim is the retryable FLAG: ours lives on the exception object (HttpException.retryable, HttpException.kt:72; NetworkException.retryable=true) while theirs lives in RetryingHttpClient.shouldRetry (RetryingHttpClient.kt:162-185, incl. an X-Should-Retry header override at :164-179 we don't model and could adopt). I corrected the over-broad weAreAhead framing. + +ROADMAP OVERLAP (downgrades several items from 'discovery' to 'confirmation'): docs/refs-comparison.md already plans most of this — :408 'Raw/Cooked client split... withRawResponse()' (=finding 2), :350 'Tolerant error-body parsing / typed error payloads without throwing in the ctor' (=findings 4 + 7), with a dedicated Error Model § at :213. So findings 2/4/7 are best read as 'here is a concrete reference implementation for an already-planned item', not net-new asks. + +OVERALL SHAPE OF THE KEEPERS: nearly everything actionable lands at the FUTURE CODEGEN layer (handlers, raw/cooked split, typed streams, typed error payloads); the sdk-core deltas are small primitives (a 3-line ResponseHandler fun interface, a ~25-line ParsedResponse/parseable, a ResponseDeserializationException, a closeable typed-stream seam) plus the cleanup of consolidating the two exception hierarchies and wiring the error path. Best single insight (high confidence): finding 3 — error-mapping and deserialization compose as Response->X handlers at the generated-service face, NOT as HTTP-pipeline stages; keep the pipeline transport-pure and the SERDE pillar request-side only. + +--- + +## 7. Streaming + SSE (StreamResponse, AsyncStreamResponse, SseMessage, handlers) + +**What it is** + +openai-java splits streaming into three layers. 
(1) A generic resource type: `StreamResponse<T> : AutoCloseable` exposing `stream(): Stream<T>` (StreamResponse.kt:5-11), and `AsyncStreamResponse<T>` with a push/callback `Handler{onNext,onComplete(Optional)}` plus `onCompleteFuture(): CompletableFuture` and a one-shot `subscribe` (AsyncStreamResponse.kt:14-63). (2) Two HttpResponse.Handler factories: `streamHandler` turns `response.body().bufferedReader()` into a lazy `Sequence` of lines via `reader.useLines{...}` inside a coroutine `sequence{}` builder, wrapped so reads after close are suppressed and IOExceptions are re-typed (StreamHandler.kt:14-53); `sseHandler` is a `streamHandler` whose block runs a WHATWG line-interpretation state machine (`SseState.decode`) and yields `SseMessage`s, special-casing the `[DONE]` sentinel and surfacing in-band `{"error":...}` JSON as `SseException` (SseHandler.kt:21-63). (3) `SseMessage` is an immutable Jackson-aware value holding raw `event/data/id/retry` plus an `inline fun json()` that lazily parses `data` into a model (SseMessage.kt:10-83). The defining architectural move is the GC safety net: both stream types are wrapped in `PhantomReachableClosing*` decorators that register the underlying resource with a `java.lang.ref.Cleaner` (reflectively, so Java 8 degrades to no-op) so a forgotten `close()` still releases the socket eventually (PhantomReachable.kt:31-56). The async shape is a hand-rolled subscriber driven by `CompletableFuture<StreamResponse<T>>.toAsync(executor)`, which on `subscribe` does `whenCompleteAsync{ streamResponse.stream().forEach(handler::onNext) }` on a configured `streamHandlerExecutor` with a 3-state (NEW/SUBSCRIBED/CLOSED) AtomicReference guard (AsyncStreamResponse.kt:65-157). Everything below the handlers hard-depends on Jackson (`JsonMapper`, `jacksonTypeRef`) and on okhttp's response body, which are exactly the dependencies our sdk-core forbids. 
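The line-interpretation layer mentioned in (2) can be sketched as a small state machine. This is an illustrative sketch under assumptions, not either SDK's code: `SseEvent` and `SseLineDecoder` are hypothetical names, input is assumed to be already split into decoded lines, and the sketch follows the WHATWG rule of dispatching only when at least one `data` line was seen.

```kotlin
// Hypothetical names; a sketch of the SSE field rules, not either SDK's code.
data class SseEvent(val event: String?, val data: String, val id: String?, val retry: Int?)

class SseLineDecoder {
    private var event: String? = null
    private val data = mutableListOf<String>()
    private var lastId: String? = null
    private var retry: Int? = null

    // Feed one already-split line; returns an event on a blank line, else null.
    fun decode(line: String): SseEvent? {
        if (line.isEmpty()) return flush()
        if (line.startsWith(":")) return null // comment line, ignored

        val colon = line.indexOf(':')
        val field = if (colon == -1) line else line.substring(0, colon)
        var value = if (colon == -1) "" else line.substring(colon + 1)
        // Exactly one leading space after the colon is stripped.
        if (value.startsWith(" ")) value = value.substring(1)

        when (field) {
            "event" -> event = value
            "data" -> data.add(value)
            // An id containing NUL is ignored.
            "id" -> if (!value.contains('\u0000')) lastId = value
            // A non-numeric retry is ignored.
            "retry" -> value.toIntOrNull()?.let { retry = it }
        }
        return null
    }

    private fun flush(): SseEvent? {
        // Assumption of this sketch: no data lines means nothing to dispatch.
        if (data.isEmpty()) {
            event = null
            retry = null
            return null
        }
        val message = SseEvent(event, data.joinToString("\n"), lastId, retry)
        // Reset per-event state, but keep lastId across events.
        event = null
        data.clear()
        retry = null
        return message
    }
}

fun main() {
    val decoder = SseLineDecoder()
    val lines = listOf("event: delta", "data: hello", "data: world", "id: 1", "", "data: next", "")
    for (line in lines) decoder.decode(line)?.let(::println)
}
```

In the usage above the second event carries no `id` line of its own yet still reports id `1`, because the last seen id persists across dispatches.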
+ +**How it works (line-level)** + +SSE state machine (SseHandler.kt:73-110): `decode(line)` returns null until a blank line, then `flush()`. Field parse: `colonIndex = line.indexOf(':')`; no colon -> whole line is field name, value `""`; a single leading space after the colon is stripped (`if (value.startsWith(' ')) value = value.substring(1)`). `id` is rejected if it contains a NUL character (U+0000); `retry` uses `value.toIntOrNull()`; `data` accumulates into a `MutableList`. `flush()` (SseHandler.kt:112-132) joins data with `"\n"`, and per spec resets `event`/`data`/`retry` but NOT `lastId` ("NOTE: Per the SSE spec, do not reset lastId"). Note their `decode` operates on already-decoded `String` lines from `BufferedReader.readLine()`, so UTF-8 decoding and line-splitting (\n, \r, \r\n via `readLine`) are delegated to the JDK reader. `[DONE]` handling (SseHandler.kt:33-38): on `message.data.startsWith("[DONE]")` it sets `done=true` and `continue`s; it deliberately does NOT break, so the loop keeps draining `lines` to EOF. This is the clever bit: it ensures the socket reaches end-of-stream so the underlying connection can be pooled/reused rather than abandoned mid-body. The async driver (AsyncStreamResponse.kt:84-134): `check(state.compareAndSet(NEW, SUBSCRIBED))` enforces single-subscribe with a tailored message; on the executor it re-checks `state.get() == CLOSED` to bail if close raced ahead; wraps `stream().forEach(onNext)` in try/catch to capture `streamError`, then nested try/finally guarantees `onComplete` -> complete/completeExceptionally the future -> `close()` all run even if a handler throws. `close()` (AsyncStreamResponse.kt:138-149) does `state.getAndSet(CLOSED)`, idempotent, and closes the StreamResponse once the source future resolves.
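The field-parsing and flush rules described above can be sketched as a small String-level decoder. This is a hedged illustration rather than either SDK's code: `SseEvent` and `SseLineDecoder` are hypothetical names, and the "nothing accumulated" rule in `flush()` is an assumption of this sketch, not a statement about their `isEmpty()`.

```kotlin
// Illustrative value type; not SseMessage and not our ServerSentEvent.
data class SseEvent(val event: String?, val data: String, val id: String?, val retry: Int?)

class SseLineDecoder {
    private var event: String? = null
    private val data = mutableListOf<String>()
    private var lastId: String? = null
    private var retry: Int? = null

    /** Feeds one already-decoded line; returns an event on a blank line, else null. */
    fun decode(line: String): SseEvent? {
        if (line.isEmpty()) return flush()
        if (line.startsWith(":")) return null // comment line, dropped in this sketch
        val colon = line.indexOf(':')
        val field = if (colon == -1) line else line.substring(0, colon)
        var value = if (colon == -1) "" else line.substring(colon + 1)
        if (value.startsWith(' ')) value = value.substring(1) // strip one leading space
        when (field) {
            "event" -> event = value
            "data" -> data.add(value)
            "id" -> if (!value.contains('\u0000')) lastId = value // NUL rejects the id
            "retry" -> value.toIntOrNull()?.let { retry = it } // non-numeric is ignored
        }
        return null
    }

    private fun flush(): SseEvent? {
        // Assumption: dispatch only if something accumulated since the last event.
        if (event == null && data.isEmpty() && retry == null) return null
        val message = SseEvent(event, data.joinToString("\n"), lastId, retry)
        // Reset per-event fields, but keep lastId across events.
        event = null
        data.clear()
        retry = null
        return message
    }
}
```

Feeding `event: greeting`, `data: hello`, `data: world`, `id: 42`, and then an empty line yields one event whose `data` is the two lines joined with a newline; the `id` carries over to later events because `lastId` is never reset.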
`streamHandler` resource safety (StreamHandler.kt:24-52): the sequence is `.constrainOnce()` (single iteration) and wrapped in `CloseableSequence` whose `hasNext() = !isClosed && iterator.hasNext()` so post-close iteration cannot trigger a read on a closed reader; `close()` cascades `sequence.close(); reader.close(); response.close()`. `Cleaner` wiring (PhantomReachable.kt:31-56) is entirely reflective (`Class.forName("java.lang.ref.Cleaner")`) and lazily memoized; `closeWhenPhantomReachable(observed, closeable)` asserts `observed !== closeable` (else it'd never become phantom-reachable). `TrackedHandler` (PhantomReachableClosingAsyncStreamResponse.kt:49-56) is a no-op delegating wrapper whose only job is to hold a strong ref to a separate `reachabilityTracker` Object so the Cleaner fires on the wrapper's lifetime, not the underlying response's. + +**vs. our SDK** + +Our SSE reader is hand-written at the BYTE level, not the String level: `ServerSentEventReader.next()` (sdk-core/.../http/sse/ServerSentEventReader.kt:48-134) reads bytes via `BufferedSource`, does its own `readLine()` (ServerSentEventReader.kt:168-191) handling \n/\r/\r\n by peeking, and decodes each line with `String(bytes,...,Charsets.UTF_8)` in a custom `ByteArrayBuilder` (ServerSentEventReader.kt:235-264). Field interpretation matches theirs nearly arm-for-arm: leading-space strip (lines 94-101), `id` NUL rejection (lines 105-115), non-numeric `retry` ignored (lines 125-130), blank-line dispatch (lines 72-75). 
Differences in OUR favor on spec coverage: (a) we expose `comment` lines (`:` prefix) as a first-class field (ServerSentEventReader.kt:79-83, ServerSentEvent.kt:49) — theirs drops comments entirely (SseHandler.kt:78-80 returns null); (b) we expose `retry` as a typed `Duration` with explicit overflow guard (`parseRetryMillis`, ServerSentEventReader.kt:202-213) vs their lossy `Int` via `toIntOrNull`; (c) we emit `id`-only / `retry`-only / comment-only events (the permissive "any field set" rule, ServerSentEventReader.kt:38-44), theirs flushes only when `!isEmpty()` (SseHandler.kt:134-135). Our `ServerSentEvent` (ServerSentEvent.kt:43-114) keeps `data` as an unmodifiable defensive-copied `List` with full value semantics; theirs collapses `data` to a single joined `String` at flush. We have NO equivalent of: `StreamResponse` (AutoCloseable Stream resource), `AsyncStreamResponse` (subscriber API), the `[DONE]` sentinel, in-band error-as-exception, the lazy `json()` typed projection, or the phantom-reachable cleanup. Our async SSE is reactor-only: `ServerSentEventReader.toFlux()` / `BufferedSource.readServerSentEventsAsFlux()` (sdk-async-reactor/.../Reactor.kt:116-145) via `Flux.generate` (one poll per demand) and `Flux.using`. Sync exposure is `BufferedSource.readServerSentEvents(): Sequence` / `...AsIterable()` (sdk-core/.../http/sse/ServerSentEventExtensions.kt:33-55). The coroutines adapter has NO SSE→Flow (sdk-async-coroutines/.../Coroutines.kt has none). + +**Recommendations (verified)** + +- **Port the NEW/SUBSCRIBED/CLOSED state machine + guaranteed-terminal nested-finally for the async driver** `COPY` · `sdk-core` · effort M · confidence medium + - *Verdict:* Verified line-by-line against AsyncStreamResponse.kt:84-149. 
All four cited details are real and correct: (a) `check(state.compareAndSet(NEW, SUBSCRIBED)){ if SUBSCRIBED 'Cannot subscribe more than once' else 'Cannot subscribe after the response is closed' }` (lines 89-92); (b) early-return when `state.get()==CLOSED` after the future completes (lines 96-100); (c) the nested try/finally guaranteeing onComplete -> complete/completeExceptionally future -> close() even if the handler throws (lines 115-130); (d) the init hook completing the future exceptionally if the SOURCE future errors before subscribe (lines 73-79); plus idempotent `getAndSet(CLOSED)` close (lines 138-149). These are exactly the races a hand-rolled async stream botches first try. openai-java is Apache-2.0, so near-verbatim lifting requires an attribution note in the file header. The `TODO(JDK): compareAndExchange once targeting JDK 9` (line 88) is irrelevant to us (Java 8 -> compareAndSet is correct). IMPORTANT caveat: this is strictly contingent on accepting the AsyncSseStream finding (the async push contract recommendation later in this list); it is the *implementation* of that contract, not an independent item, so do not schedule it separately. Their code has zero interrupt handling (forEach runs on a worker thread); our port MUST add interrupt-awareness per our convention (catch InterruptedException in the driver loop, restore flag, surface InterruptedIOException through onComplete's Optional). + - *Do:* When implementing AsyncSseStream, port the AtomicReference guard and the nested-finally terminal sequence verbatim in structure, re-typed to our explicit-API + ReentrantLock-free atomics + interrupt-aware conventions. Add an Apache-2.0 attribution comment in the file header for the lifted control flow. Cover with tests for: double-subscribe, close-before-future-completes, handler throwing in onComplete, source-error-before-subscribe. +- **Single-pass (constrain-once) + read-after-close suppression for any element stream we expose** `ADOPT` · `sdk-core` · effort S · confidence high + - *Verdict:* Verified.
StreamHandler.kt:37 `.constrainOnce()` makes a second iteration throw; CloseableSequence (StreamHandler.kt:85-102) makes `hasNext()` short-circuit to false after close (`!isClosed && iterator.hasNext()`) so no read is attempted on a closed reader; IOExceptionWrappingSequence (StreamHandler.kt:56-77) wraps ONLY the reader's lines, deliberately not the user's block (explicit comment StreamHandler.kt:30-34) so user errors aren't masked. Our `readServerSentEvents()` returns a plain `generateSequence` (ServerSentEventExtensions.kt:33-36) whose KDoc only *warns* 'Do not call this twice' (lines 24-25) and enforces nothing: a second forEach silently restarts mid-stream (BOM flag already consumed), and a read after the source is closed throws a raw IOException. Their wrappers turn our doc-warnings into enforced invariants — that is exactly toolkit-grade rigor we're missing. Re-categorized from LEARN to ADOPT: this is a concrete behavior change (add the guards), not just a principle. The 'wrap only the library read, not the consumer lambda' boundary for any IO re-typing is a precise distinction worth mirroring IF we ever re-type SSE IO errors (we currently don't, and our zero-dep core has no domain IO-exception to map to, so that sub-point is lower priority). + - *Do:* Bake a constrain-once guard and an is-closed check on hasNext() into SseStream's iteration so a closed/already-iterated stream reports exhaustion instead of throwing or silently restarting. This pairs directly with the StreamResponse finding (same primitive). Skip the IOException re-typing wrapper for now (no core domain exception to map to); if added later, scope the catch to the reader pull only, never the consumer's lambda. +- **Add a transport-neutral StreamResponse = AutoCloseable + lazy element source, with close cascading into the body** `ADOPT` · `sdk-core` · effort M · confidence high + - *Verdict:* Verified. 
StreamResponse.kt:5-11 is exactly `interface StreamResponse : AutoCloseable { fun stream(): Stream; override fun close() }`; the internal `map` (StreamResponse.kt:13-19) preserves close while transforming elements; StreamHandler.kt:45-49 cascades sequence.close() -> reader.close() -> response.close(). We genuinely lack any equivalent: our SSE surface is a bare `Sequence`/`Iterable` (ServerSentEventExtensions.kt:33-55) plus a reactor `Flux`, and ServerSentEventReader explicitly disclaims ownership ('does not own the underlying source and does not close it', ServerSentEventReader.kt:23-24). So element-stream lifetime and socket lifetime are fully decoupled and the burden is on the caller. ONE correction to the analysis's framing: the value is not 'try-with-resources for Java' per se (a raw Sequence can be drained in a Kotlin `use{}` over the Response already); it is binding the *parsed element stream* to the *response close* in a single handle so a partial consume cannot abandon a pooled connection. That is the same invariant we already enforce in PagedIterable (PagedIterable.kt:170-176 closes the page eagerly after taking the items iterator 'so a partial consume never abandons an open response body or pooled connection') — so the pattern is established in our codebase for paging and missing for SSE; this finding closes that gap. Their `Stream` return is Java-centric; we'd want `Iterable`/`asSequence()` as the primary with an optional `stream()` to match how PagedIterable already exposes both. + - *Do:* Add `org.dexpace.sdk.core.http.sse.SseStream : AutoCloseable, Iterable` (plus `asSequence()`, optional Java `stream()`) wrapping a ServerSentEventReader and the owning Response/BufferedSource; `close()` closes the response (transport decides drain-vs-discard). Mirror PagedIterable's existing close-on-partial-consume contract so the two streaming primitives behave identically. Keep the raw `Sequence` extension for callers who own the source. 
Defer the generic `map`-carrying-close combinator to the serde/codegen layer (it's only useful once you decode elements). +- **Cleaner-based last-resort close for SDK-managed closeables via reflective Java-8-safe Cleaner** `ADOPT` · `sdk-core` · effort M · confidence medium + - *Verdict:* Verified and genuinely clever. PhantomReachable.kt:31-56 obtains `java.lang.ref.Cleaner` entirely by reflection (`Class.forName`, `getMethod("create")`, `getMethod("register", Any, Runnable)`) memoized in a `by lazy`, returning null on Java 8 (ReflectiveOperationException -> no-op). So the SAME Java-8 bytecode opportunistically uses the Cleaner on JDK 9+ with no multi-release jar — exactly the trick that fits our Java-8-everywhere constraint. The `check(observed !== closeable)` guard (PhantomReachable.kt:15-17) is a real correctness requirement (a Cleaner action capturing a strong ref to the observed object prevents collection). The async variant's separate `reachabilityTracker` Object + `TrackedHandler` (PhantomReachableClosingAsyncStreamResponse.kt:22-29,49-56) is a non-obvious necessity: you can't observe the response directly because the handler keeps it alive, so you observe a proxy and pin it to the handler's lifetime. Reflection in sdk-core is allowed (no new runtime dep), so applicable. WHY ONLY MEDIUM, and a sharper critique than the analysis gave: this is a safety NET, not a correctness fix, and it is not free — it spins up a Cleaner daemon thread and adds a finalization-adjacent code path. We already ship Response/ResponseBody/PagedResponse/PagedIterable as Closeables with NO such net and have lived fine; adding it only for SSE would be inconsistent. The right scope is: build the util generically and decide deliberately whether to retrofit it to all SDK-managed closeables, or skip it entirely and rely on docs + close-on-partial-consume (which we already do for paging). Treat as opt-in polish, not a must-have. 
+ - *Do:* Add an internal `closeWhenPhantomReachable(observed, AutoCloseable)` util in sdk-core (reflective Cleaner, lazy-memoized, Java-8 no-op), keeping the `observed !== closeable` assertion. Lift PhantomReachable.kt structure with an Apache-2.0 attribution note. Decide consciously whether to wrap SseStream/AsyncSseStream (and the separate-tracker pattern for the async case) — and whether to retrofit existing closeables — rather than bolting it onto SSE alone. Document loudly that it is a net, never a substitute for close(); GC timing is not guaranteed. +- **Codegen (not core) should emit a per-endpoint decode/sentinel/error-envelope adapter over the raw SSE stream, with lazy per-element deserialization** `ADOPT` · `codegen` · effort M · confidence medium + - *Verdict:* Verified. openai-java bakes three API-specific behaviors into its SSE layer that we correctly keep OUT of core: the `[DONE]` terminator (SseHandler.kt:33-38, an OpenAI/SSE convention, not WHATWG), surfacing a mid-stream `{\"error\":...}` as a thrown SseException carrying status+headers (SseHandler.kt:46-60), and a lazy typed projection `SseMessage.json()` via `mapJson` (SseHandler.kt:138-145, SseMessage.kt:44-58) that Jackson-parses `data` only when the element is consumed. The lazy-decode-on-consume posture is the genuinely good idea: don't deserialize chunks an abandoned consumer never pulls. This maps cleanly onto our planned KotlinPoet codegen emitting a thin `SseStream -> Iterable` adapter per streaming endpoint. HARD CONSTRAINT (analysis got this right): deserialization must go through our `Serde` abstraction in a serde adapter / codegen output, NEVER sdk-core — SseMessage hard-deps Jackson (SseMessage.kt:5,12) which our ServerSentEvent correctly avoids by staying a raw value holder. MEDIUM confidence because it's a forward-looking design note for a generator that doesn't exist yet; the deliverable is a spec line in docs/refs-comparison.md, not code. 
The analysis target 'codegen' is correct. + - *Do:* In docs/refs-comparison.md (codegen design section), specify: the generator emits a per-streaming-endpoint adapter wrapping the toolkit's SseStream, with (a) a configurable sentinel string for early-done, (b) an error-envelope -> typed-exception mapper, (c) lazy per-element deserialization via the injected `Serde` (joining `ServerSentEvent.data` with '\n' to reconstruct the payload). Keep all of it out of sdk-core. +- **Define a dependency-free async push contract (subscribe/onNext/onComplete(Optional)/onCompleteFuture) as the neutral seam every async adapter wraps** `ADOPT` · `sdk-core` · effort L · confidence medium + - *Verdict:* Verified. AsyncStreamResponse.kt:14-63: `fun interface Handler{ onNext; onComplete(error: Optional) }`, `subscribe(handler[, executor])` once, `onCompleteFuture(): CompletableFuture`, and a deliberately-NOT-AutoCloseable `close()` (AsyncStreamResponse.kt:40-46, with the explicit 'should not be synchronously closed via try-with-resources' rationale). Built on CompletableFuture + Executor + AtomicReference, zero reactive-streams dep. We confirmed our only async SSE is reactor-only (Reactor.kt:116-145) and the coroutines adapter has NO SSE->Flow (Coroutines.kt has only execute/asAsync bridges) — so a non-Reactor user gets nothing today. The exactly-once terminal `onComplete(Optional)` is a cleaner low-level contract than reactive onError/onComplete duality. Two caveats the analysis got right and I confirm: (1) zero backpressure — `stream().forEach(handler::onNext)` (AsyncStreamResponse.kt:110) is unbounded push; a toolkit seam must document this as fire-hose and let the Reactor adapter (our Flux.generate already is demand-driven, Reactor.kt:117-127) / coroutines reintroduce demand; (2) keep `Executor` caller-supplied — do NOT bake a default executor that drags a dependency. 
Why MEDIUM not high: this is a real gap but it is a NEW public-contract surface whose main consumer is future codegen + adapters that don't exist yet; scope it carefully and don't over-build before there's a caller. The analysis target 'both' is wrong — the contract lives in sdk-core; codegen merely *returns* it. I set target=sdk-core. + - *Do:* Define `org.dexpace.sdk.core.http.sse.AsyncSseStream` (or a generic `AsyncStream`) in sdk-core: `subscribe(handler, Executor)`, `onCompleteFuture()`, non-AutoCloseable idempotent `close()`, single-subscribe guard. Implement the SSE driver once on top of SseStream; have sdk-async-reactor expose it as Flux (with real backpressure), coroutines as Flow, virtual-threads drive it on a Loom executor. Document the unbounded-push semantics loudly. Make the driver interrupt-aware (their forEach has none — see COPY finding). +- **Keep our byte-level reader as the parse core when building SseStream — but the 'their bufferedReader is a charset bug' rationale is WRONG** `SIMPLIFY` · `sdk-core` · effort S · confidence high · we partly do this, claim-qualified + - *Verdict:* The DIRECTION is right (don't regress to a String-level shortcut when wrapping the reader) but the central technical claim is FALSE and must not propagate. The analysis says theirs 'uses the JVM default charset' via `bufferedReader()` (SseHandler.kt:21). I checked: HttpResponse.body() returns a `java.io.InputStream` (HttpResponse.kt: `fun body(): InputStream`), so `response.body().bufferedReader()` is Kotlin's `InputStream.bufferedReader(charset: Charset = Charsets.UTF_8)` — it DEFAULTS TO UTF-8, not the platform charset. Both SDKs decode UTF-8. The 'outright bug-risk against the SSE UTF-8 mandate' (antipattern #3, weAreAhead point d) is incorrect. 
Second correction: the analysis (and our own ServerSentEventReader.kt:163-165 doc) implies theirs can't handle bare `\r`; but theirs goes through `java.io.BufferedReader.readLine()`/`useLines`, and java.io.BufferedReader treats `\r`, `\n`, and `\r\n` all as terminators — so theirs DOES handle bare `\r`. Our byte-level readLine (ServerSentEventReader.kt:168-191) exists because OKIO's `readUtf8Line` doesn't handle bare `\r`, not because java.io can't. So 'we handle bare \r and they don't' (weAreAhead point e) is also wrong. Net: there is no charset/line-terminator advantage. The genuine, defensible advantages of our parser are narrower (see notes). The actionable residue: when we wrap the reader in SseStream, keep ServerSentEventReader as the parse core (we already own it; weAlreadyDoIt=true) and don't introduce a JDK-reader shortcut — but for code-reuse/maintenance reasons, not because theirs is buggy. + - *Do:* When SseStream lands, build it ON ServerSentEventReader; do not add a `bufferedReader().useLines` path. Drop the charset/bare-\r justifications entirely — they are factually wrong. If non-UTF-8 SSE ever matters, add an explicit `charset` param. Also fix the misleading framing in our own ServerSentEventReader.kt:163-165 doc if it's being read as 'java.io can't do bare \r' (it's specifically about okio). + +**Considered & dropped** + +- ~~Drain-to-EOF on early stop (the [DONE]/done-flag pattern) keeps connections reusable~~ — Rationale is conflated and the unique insight is thin; folded into the StreamResponse finding instead. VERIFIED FACTS: SseHandler.kt:25-38 does set `done=true; continue` (not break) with the comment 'we don't break because we still want to iterate through the full stream'. BUT this drain only triggers when the SERVER emits `[DONE]` — it does NOT address a CONSUMER who breaks out of forEach early (the analysis's stated scenario). 
Early consumer-stop is handled entirely by StreamResponse.close() -> response.close() (StreamHandler.kt:45-49), where the TRANSPORT decides drain-vs-discard. So 'drain-to-EOF keeps the connection reusable on early stop' misattributes the mechanism. Moreover we ALREADY enforce the real lesson (don't abandon a pooled connection on partial consume) in PagedIterable.kt:170-176 via eager close, and our ResponseBody.close() contract (ResponseBody.kt:66-78) already mandates connection release even when the body was never read. The genuinely-new residue (a one-line doc note about the SSE early-stop/close hazard) is now covered by the SseStream finding's close contract, so a standalone docs/process item is redundant filler. + +**Do not copy** + +(1) sdk-core MUST NOT copy SseMessage's Jackson coupling: SseMessage.kt holds a `JsonMapper` field and `inline fun json()` calls `jsonMapper.readerFor(jacksonTypeRef())` (SseMessage.kt:13,44-58). For us, deserialization belongs in sdk-serde-jackson or codegen output via the `Serde` seam; our `ServerSentEvent` correctly stays a raw value holder. (2) Do NOT bake the `[DONE]` sentinel or `{"error":...}` envelope into the core SSE reader (SseHandler.kt:33-60): both are API conventions, and they belong in codegen-emitted adapters, not the toolkit. (3) Their `bufferedReader()` (SseHandler.kt:21) is NOT a charset hazard and needs no special avoidance: `HttpResponse.body()` returns a `java.io.InputStream`, so the call resolves to Kotlin's `InputStream.bufferedReader(charset: Charset = Charsets.UTF_8)`, which defaults to UTF-8 rather than the platform charset. (4) Their async stream has zero backpressure: `streamResponse.stream().forEach(handler::onNext)` (AsyncStreamResponse.kt:110) pushes as fast as it parses with no demand signal. That is fine for their one API, but a transport-neutral toolkit seam should either expose demand or be explicitly documented as fire-hose so the Reactor/coroutine adapter can reintroduce backpressure (our existing Flux.generate already does, Reactor.kt:117).
(5) Their `SseState.flush()` collapses multi-line `data` into one joined String (SseHandler.kt:121), discarding line structure; our `List` is strictly more faithful, so do not regress. (6) Heavy reflection for Cleaner (PhantomReachable.kt:32-51) is acceptable as a one-time lazy memo, but per-stream reflective dispatch would be a hot-path antipattern; keep it memoized as they do. + +**Where we're ahead** + +Our SSE PARSER is more spec-complete and faithful than theirs in concrete, citable ways: (a) we expose comment lines as a typed field (ServerSentEventReader.kt:79-83, ServerSentEvent.kt:49) enabling keep-alive detection, whereas theirs silently drops every `:` line (SseHandler.kt:78-80); (b) we keep `data` as an unmodifiable, defensively-copied `List` with full value semantics (ServerSentEvent.kt:52,89-106), whereas theirs flattens to a single joined String at flush (SseHandler.kt:121); (c) our `retry` is a typed `Duration` with an explicit pre-multiply overflow guard (parseRetryMillis, ServerSentEventReader.kt:202-213), whereas theirs is a lossy `Int` via `toIntOrNull` (SseHandler.kt:106); (d) we decode bytes explicitly as UTF-8 (ServerSentEventReader.kt:244) honoring the SSE charset mandate, which is parity rather than a lead, because their `bufferedReader()` (SseHandler.kt:21) is Kotlin's `InputStream.bufferedReader()` and also defaults to UTF-8; (e) we handle bare `\r` line terminators explicitly at the byte level with a documented rationale (ServerSentEventReader.kt:160-191), again parity, because their `java.io.BufferedReader.readLine()` treats `\r`, `\n`, and `\r\n` all as terminators; (f) our reactor SSE bridge already implements true demand-driven backpressure (`Flux.generate`, one poll per request, Reactor.kt:117-127), whereas their async path has none. Net: keep our reader as the parse core; their advantage is purely in the higher-level resource/lifecycle wrappers (StreamResponse, AsyncStreamResponse, phantom-reachable close), which we lack entirely and should add ON TOP of our reader. + +_Verifier notes:_ RIGOR CORRECTIONS to the analysis (load-bearing): + +1. CHARSET CLAIM IS FALSE.
The analysis repeatedly asserts openai-java decodes SSE with the JVM default charset (`bufferedReader()`, SseHandler.kt:21) and calls it 'an outright bug-risk against the SSE UTF-8 mandate' (Finding 6, antipattern #3, weAreAhead point d). HttpResponse.body() returns `java.io.InputStream` (HttpResponse.kt), so `.bufferedReader()` is Kotlin's `InputStream.bufferedReader(charset = Charsets.UTF_8)`: it DEFAULTS TO UTF-8. Both SDKs decode UTF-8. Drop this 'advantage' and the antipattern. + +2. BARE-\r CLAIM IS FALSE. weAreAhead point (e) claims theirs can't handle bare `\r`. Theirs uses `java.io.BufferedReader.readLine()`/useLines, which treats `\r`, `\n`, `\r\n` all as terminators. Both handle bare `\r`. Our byte-level readLine (ServerSentEventReader.kt:168-191) exists only because OKIO's readUtf8Line doesn't, not because java.io can't. Our own doc at ServerSentEventReader.kt:163-165 is correct (it's about okio) but the analysis over-generalized it. + +3. GENUINE parser advantages over theirs (the defensible weAreAhead set), all re-verified: (a) we expose comment lines as a typed field (ServerSentEventReader.kt:79-83, ServerSentEvent.kt:49,63-64), while theirs drops every `:` line (SseHandler.kt:78-80); (b) we keep `data` as an unmodifiable defensively-copied List (ServerSentEvent.kt:52), while theirs joins to one String at flush (SseHandler.kt:121), though note their `\n` join is itself spec-conformant for the common case; (c) `retry` as typed Duration with explicit overflow guard (parseRetryMillis, ServerSentEventReader.kt:202-213) vs their lossy Int `toIntOrNull` (SseHandler.kt:106), genuine but low-stakes (Int-ms overflows at ~24.8 days); (d) we emit id-only/retry-only/comment-only events (ServerSentEventReader.kt:38-44,56), while theirs flushes only when !isEmpty() (SseHandler.kt:113,134-135); (f) our reactor bridge is demand-driven (Flux.generate one-poll-per-request, Reactor.kt:117-127), while their async path is unbounded push (AsyncStreamResponse.kt:110).
Charset (d-orig) and bare-\r (e-orig) are removed. + +4. CITATION DRIFT (minor, substance intact): analysis cites 'SseHandler.kt:134-135' for the flush-when-not-empty gate; the actual gate is `if (isEmpty()) return null` at SseHandler.kt:113 (134-135 is the isEmpty() definition). 'PhantomReachable.kt:31-56' / ':15-17' line refs are accurate. + +5. WHERE WE'RE BEHIND (the real, accurate gaps, all confirmed missing via grep, with zero hits for StreamResponse/AsyncStreamResponse/SseStream/Cleaner/phantom in our non-test source): a resource-owning closeable stream type, a dependency-free async push contract, single-pass/closed-aware iteration guards, and (optionally) a Cleaner safety net. These are the keepers and they belong ON TOP of our existing reader. + +6. DEPENDENCY/SCOPING: ServerSentEventListener (ServerSentEventListener.kt) exists but is driven NOWHERE in production; no driver loop is wired. So our effective SSE surface is reader + Sequence/Iterable + reactor Flux. The COPY (state machine) finding is NOT independent: it is the implementation of the AsyncSseStream ADOPT finding, so schedule them together. The Cleaner finding is opt-in polish (adds a daemon thread + reflection path), not a correctness fix; we already survive without nets on Response/PagedResponse. Recommended sequence: SseStream (closeable, single-pass guards) first, then AsyncSseStream (+ ported state machine, interrupt-aware), then decide on Cleaner, with codegen decode-adapter as a docs spec. + +--- + +## 8. PhantomReachable resource auto-closing (distinctive GC-based lifecycle) + +**What it is** + +openai-java layers a GC-triggered "safety net" on top of explicit close(): if a user forgets to close an HttpClient, stream, ExecutorService, or Sleeper, the resource is still released when the JVM determines the wrapper has become only phantom-reachable.
The mechanism is a single shared java.lang.ref.Cleaner, obtained REFLECTIVELY so the code still compiles/runs on Java 8 (where Cleaner does not exist and the whole feature silently degrades to a no-op). The architecture is a set of thin delegating decorators — PhantomReachableClosingHttpClient, PhantomReachableExecutorService, PhantomReachableSleeper, PhantomReachableClosingStreamResponse, PhantomReachableClosingAsyncStreamResponse — each of which (a) forwards every interface method to a wrapped delegate and (b) in its init block registers a cleanup action via closeWhenPhantomReachable(...). These wrappers are applied at ClientOptions build time (httpClient/streamHandlerExecutor/sleeper) and at stream-creation time (StreamHandler.streamHandler, CompletableFuture.toAsync). Crucially, openai-java pairs this with an OWNERSHIP-TRANSFER model: ClientOptions.Builder.httpClient docs literally say "This class takes ownership of the client and closes it when closed" (ClientOptions.kt:246) — the opposite of our SDK's "SDK closes only SDK-managed resources, never BYO clients" rule. + +**How it works (line-level)** + +CORE REFLECTION SHIM — PhantomReachable.kt:31-56. A lazy private val computes the cleanup function once: `val cleanerClass = Class.forName("java.lang.ref.Cleaner")` (line 33), `cleanerCreate = cleanerClass.getMethod("create")` (34), `cleanerRegister = cleanerClass.getMethod("register", Any::class.java, Runnable::class.java)` (35-36), `cleanerObject = cleanerCreate.invoke(null)` (37). It then returns a closure `{ observed, close -> cleanerRegister.invoke(cleanerObject, observed, Runnable { close() }) }` (39-41). JAVA 8 BEHAVIOUR is the load-bearing detail: on JDK 8 `Class.forName("java.lang.ref.Cleaner")` throws ClassNotFoundException (a ReflectiveOperationException), caught at line 52-55 returning `null`, with the comment "We're running Java 8, which has no Cleaner." 
The public entry points `closeWhenPhantomReachable(observed, closeable)` (14) and `closeWhenPhantomReachable(observed, close)` (27) null-check: `closeWhenPhantomReachable?.let { it(observed, close) }` (28) — so on Java 8 EVERY wrapper's registration is a SILENT no-op and the safety net simply does not exist. + +SELF-REFERENCE GUARD — PhantomReachable.kt:15-17: `check(observed !== closeable) { "observed cannot be the same object as closeable because it would never become phantom reachable" }`. This catches the classic Cleaner footgun: the cleanup Runnable must not strongly reference the observed object, or the object is kept alive forever and never cleaned. The simple wrappers (executor/sleeper/httpclient/sync-stream) sidestep this by registering `closeWhenPhantomReachable(this, delegate)` (e.g. PhantomReachableExecutorService.kt:17 `closeWhenPhantomReachable(this) { executorService.shutdown() }`) — `this` is observed, the *delegate's* method is the action, so the action holds the delegate, not `this`. + +THE CLEVER ONE — PhantomReachableClosingAsyncStreamResponse.kt:22,24-26,49-56. An async stream has no single owning reference (the user may only hold the Handler), so the wrapper allocates a dedicated `private val reachabilityTracker = Object()` (22), registers `closeWhenPhantomReachable(reachabilityTracker, asyncStreamResponse::close)` (25), and on subscribe wraps the user handler in `TrackedHandler(handler, reachabilityTracker)` (29,34). TrackedHandler (49-56) holds the tracker as a field, so the stream stays open exactly as long as the *handler* is reachable — liveness is decoupled from the wrapper object itself. Comment at 19-21: "An object used for keeping asyncStreamResponse open while the object is still reachable." + +WIRING — ClientOptions.kt:249 `this.httpClient = PhantomReachableClosingHttpClient(httpClient)`; :281 wraps a user ExecutorService; :294 wraps the Sleeper; :614 wraps the DEFAULT cached thread pool; :630 wraps the default sleeper. 
StreamHandler.kt:40 wraps the sync StreamResponse; AsyncStreamResponse.kt:67 wraps the async one. The wrappers are pure delegation — e.g. PhantomReachableClosingHttpClient.kt:17-25 forwards execute/executeAsync/close verbatim. + +**vs. our SDK** + +OUR SDK has NO GC-based safety net and deliberately inverts the ownership model. Grep of sdk-core for Cleaner/PhantomReference/ReferenceQueue returns zero production hits (only ContextStore.kt:41 mentions WeakReference in a KDoc, unrelated). We use pure explicit AutoCloseable everywhere: +- HttpClient.kt:46,59 and AsyncHttpClient.kt:65,80 — fun interface ... : AutoCloseable with a default no-op close() so SAM literals stay valid. +- OkHttpTransport.kt:183-216 — close() gated by `closed.compareAndSet(false, true)` (184) AND `if (!owned) return` (187); only the SDK builder sets owned=true. BYO clients via create() are NEVER closed. +- JdkHttpTransport.kt:166-199 — same closed+owned latch; note the Java-11-safe `client as? AutoCloseable` probe (178) because java.net.http.HttpClient only became AutoCloseable in JDK 21 (JEP 461) — they solve "feature exists only on newer JDK" with `as?`, openai-java solves the analogous problem with Class.forName reflection. +- VirtualThreads.kt:71-92 — VirtualThreadAsyncHttpClient with AtomicBoolean closed latch; close() calls executor.close() (blocking; waits for tasks). +- Response.kt:48,64 / ResponseBody.kt:46,78 / PagedResponse.kt:58,74 — Closeable response graph; caller owns close (AsyncHttpClient.kt:39-41 KDoc spells this out). +The contract is codified in docs/architecture.md:596-625 (Lifecycle): Idempotent (AtomicBoolean+CAS, "no synchronized which would pin a carrier thread under Loom"), Ownership-aware (`internal val owned`), Interrupt-safe, Best-effort-non-throwing. docs/implementation-plan.md:235 even notes "JDK's HttpClient is GC'd" — i.e. we already rely on plain GC reclamation for the BYO JDK client rather than an explicit Cleaner. 
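The idempotent, ownership-aware close contract listed above can be sketched as a small latch. This is a minimal sketch under stated assumptions: `OwnedResource` and its `owned` constructor flag are hypothetical names for illustration, not the actual OkHttpTransport or JdkHttpTransport code, and interrupt handling is left out.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical wrapper; it only illustrates the closed + owned latch.
class OwnedResource(
    private val delegate: AutoCloseable,
    private val owned: Boolean
) : AutoCloseable {
    private val closed = AtomicBoolean(false)

    val isClosed: Boolean get() = closed.get()

    override fun close() {
        // CAS keeps close() idempotent without `synchronized`.
        if (!closed.compareAndSet(false, true)) return
        // Bring-your-own delegates are never closed by the wrapper.
        if (!owned) return
        try {
            delegate.close()
        } catch (e: Exception) {
            // Best-effort and non-throwing; a real implementation would log here.
        }
    }
}
```

With `owned = true` the delegate is closed exactly once no matter how often `close()` is called; with `owned = false` the wrapper flips its own latch and leaves the caller's resource untouched.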
+ +**Recommendations (verified)** + +- **Optional opt-in leak DETECTOR (log on phantom-reachable-before-close), not silent auto-close** `ADOPT` · `sdk-core` · effort M · confidence medium + - *Verdict:* This is the one finding that proposes adding a real capability we lack, and it is correctly reasoned: take the detection value of phantom-reachability (catch forgotten close()) while dropping the two harms (silent lifetime mutation; ownership-transfer requirement). The contrast with openai-java is accurate — their net is silent (PhantomReachable.kt:14-19 just closes, no diagnostic) and wraps even the default executor (ClientOptions.kt:612-614). Caveats are honestly stated: Cleaner is 9+, so sdk-core must use the same reflective+lazy+no-op-on-8 probe; allocation-stack capture is expensive and MUST be flag-gated default-OFF to preserve zero-overhead-by-default and not perturb the 80% coverage floor; interface-only, no new runtime dep. Skeptical caveats I'd add: (a) effort is closer to M/L than the optimistic M once you account for the reflective Cleaner seam + a default-off flag + capturing allocation sites cheaply + tests that force GC deterministically (their own test, PhantomReachableTest.kt, relies on System.gc()+Thread.sleep(100), which is inherently flaky — our coverage gate would need a tolerant/loop-until test, not a fixed sleep). (b) On Java 8 (our primary target) the detector is inert, so its value lands only on 9+ runtimes — acceptable for a debug aid but worth stating. (c) This is net-new surface for a pre-1.0 toolkit; it should be explicitly OPTIONAL and clearly an internal/diagnostic seam, not part of the stable transport contract. Given those, it's a worthwhile but non-urgent ADOPT, not a must-do. 
+ - *Do:* Prototype (do not rush to stable API) an internal, opt-in `LeakDetector` seam in sdk-core: `LeakDetector.track(resource, allocationSite)` that registers via the reflective Cleaner (reuse the probe shape from PhantomReachable.kt:31-56) and, when cleanup fires before close(), logs a ClientLogger WARN with the captured allocation site. Reuse the existing `closed` AtomicBoolean (OkHttpTransport.kt:184) as the 'was it closed?' signal. Gate behind `-Dorg.dexpace.sdk.leakDetection=paranoid|simple|off`, default off; allocation-stack capture only in paranoid. Keep it interface-only in sdk-core (no new dep). Write the test as a GC-poll loop with a bounded timeout, not a fixed Thread.sleep, to avoid a flaky coverage gate. Crucially: detection only — never auto-close (that is the whole point of choosing this over their design). +- **Reflective capability-probe for newer-JDK APIs (Class.forName + getMethod cached in a lazy val)** `LEARN` · `docs/process` · effort S · confidence high + - *Verdict:* Mechanism claim is accurate. PhantomReachable.kt:31-56 resolves java.lang.ref.Cleaner via Class.forName, binds create/register via getMethod, builds the bound closure once inside `by lazy`, and returns null on ReflectiveOperationException (the Java-8 path, lines 52-54). One correction to the original: the file lives in com/openai/core/, NOT com/openai/core/http/ as the analysis cited — line numbers are right, package path is wrong. The pattern is real and clean. BUT the original finding overstates novelty for us: we already solve our ONLY current instance (JdkHttpTransport.kt:178 `client as? AutoCloseable`) and we have no live need for a method-on-existing-class probe today — CLAUDE.md bans the usual culprits (Thread.threadId, InputStream.transferTo) rather than reflecting around them. So this is a recipe-for-later, not a gap. Also note their impl carries OpenAI-specific baggage (rethrow via OpenAIException at line 49) that we would not copy. 
Value is the cached-lazy + null-degrade shape, not the code. + - *Do:* Keep as a LEARN, scoped to docs only. Add a short 'Opportunistic newer-JDK APIs from Java-8 modules' note (docs/architecture.md, near the JDK-8 compatibility section) recording two sanctioned techniques: (1) `as? NewerInterface` for interface-shaped additions (cite JdkHttpTransport.kt:178), (2) cached Class.forName+getMethod inside `by lazy` returning a null sentinel on ReflectiveOperationException for method-shaped additions (cite com/openai/core/PhantomReachable.kt:31-56). Do NOT add any Cleaner shim. This is preemptive documentation; do not spend effort beyond the note. +- **Tie resource liveness to the user-held handle, not an internal wrapper (async stream tracker)** `LEARN` · `both` · effort S · confidence medium + - *Verdict:* Claim is accurate and this is the one genuinely non-obvious idea in the subsystem. PhantomReachableClosingAsyncStreamResponse.kt allocates a dedicated `reachabilityTracker = Object()` (line 22), registers cleanup against IT not `this` (line 25, via `asyncStreamResponse::close`), and injects it into TrackedHandler (lines 29, 34, 49-52) so the stream stays alive exactly as long as the user-retained Handler. Without it, registering against the wrapper would close the stream as soon as the user drops the wrapper but keeps the handler. HOWEVER the original finding's concrete action item is partly WRONG: it says 'verify sdk-async-reactor SSE->Flux closes the Response on cancel/terminal; if not, file a bug.' I checked — Reactor.kt:116-127 (`toFlux`) and Reactor.kt:129-138 deliberately do NOT close the source on any terminal signal, and the KDoc at 132-138 states 'the caller owns the source... chain .doFinally { this.close() } if close-on-cancel is desired.' That is a documented ownership decision consistent with our toolkit philosophy, NOT a bug. So the principle is worth recording, but the 'go find the bug' framing should be dropped. 
For us the analog is 'dispose on Subscription cancel', and Reactor already exposes doOnCancel hooks (Reactor.kt:58,94) — we just deliberately leave closing to the caller. + - *Do:* Keep as LEARN; downgrade confidence and strike the bug-hunt. Capture one design note for FUTURE codegen streaming helpers and the reactive adapters: when a generated helper OWNS a Response/reader it created, liveness and disposal must follow the user-held handle (Subscriber/Handler/Flux subscription), releasing the owned resource on cancel and terminal completion. Explicitly contrast with the current toolkit primitives (toFlux/SSE), which by design transfer source ownership to the caller and must NOT auto-close. No code change to existing adapters. +- **Decision: NO GC safety-net auto-close in sdk-core (keep explicit, owned-gated close)** `LEARN` · `docs/process` · effort S · confidence high · we partly do this + - *Verdict:* Recategorized from ADOPT to LEARN: the original finding itself admits 'the recommended action is NOT to add the mechanism' — labeling a 'do not build X' decision as ADOPT is a category error. All four supporting facts check out: (1) Java-8 no-op is real (PhantomReachable.kt:52-54 returns null). (2) Ownership conflict is real and sharp — ClientOptions.kt:246/276/292 all say 'takes ownership... closes it when closed' and wrap even DEFAULT resources (executor 612-614, sleeper 630), the exact inverse of our owned-gate (OkHttpTransport.kt:187 `if (!owned) return`; JdkHttpTransport.kt:170; documented architecture.md:609-615). (3) Unbounded GC-latency cleanup on a daemon thread is a fair criticism for a toolkit that prizes deterministic release. (4) Toolkit-philosophy point stands. We ALREADY embody this decision in code and docs — so the only deliverable is writing the rationale down so a future contributor doesn't 'helpfully' add a Cleaner. That is worth doing but it is documentation of an existing stance, not adoption of anything new. 
Minor: docs/implementation-plan.md:235 ('JDK's HttpClient is GC'd') confirms we already, knowingly, lean on plain GC for the BYO JDK client rather than a Cleaner. + - *Do:* Add a 'Resource lifecycle: why no GC safety net' subsection to docs/architecture.md after the Lifecycle contract (~line 625): state the four reasons (Java-8 no-op placebo; conflict with BYO-never-closed; non-deterministic unbounded cleanup latency vs Loom-friendly determinism; toolkit hands ownership to the embedder), and contrast openai-java's ownership-transfer model (cite ClientOptions.kt:246). If a backstop is ever wanted, point to the leak-DETECTOR recommendation above rather than auto-close. This is the central question the subsystem poses and the honest answer is a documented 'no.' + +**Considered & dropped** + +- ~~Self-reference guard for any GC-cleanup registration (COPY check(observed !== closeable))~~: Folded into the leak-detector ADOPT rather than standing alone. The claim is accurate (PhantomReachable.kt:15-17 guards `observed !== closeable`, and the cleanup correctly captures the delegate's `::close`, e.g. PhantomReachableExecutorService.kt:17, not `this`). But as a standalone COPY it is not decision-ready: it is only relevant CONDITIONALLY on us building a Cleaner seam, which is exactly the leak-detector item. Lifting two lines verbatim adds Apache-2.0 attribution overhead for a guard we'd get for free anyway by shaping the API as `track(observed, delegate: AutoCloseable)` and registering `delegate::close`. The substantive guidance ('make misuse impossible by only accepting observed + delegate') belongs inside, and is captured by, the leak-detector recommendation. No independent action remains, so it is not a separate keep. + +**Do not copy** + +1) DO NOT pull java.lang.ref.Cleaner usage into sdk-core as an always-on dependency-of-behaviour. It is Java 9+ (PhantomReachable.kt:52-55 proves they themselves treat Java 8 as no-op), so on our baseline JVM it would be a silent placebo. 
Any Cleaner use in sdk-core MUST be reflective + lazy + no-op-on-8, and gated behind an opt-in flag. +2) DO NOT copy the ownership-transfer model (ClientOptions.kt:246 'This class takes ownership of the client and closes it when closed', :249, :281). It is the direct antithesis of our documented BYO-never-closed contract (architecture.md:609-615, OkHttpTransport.kt:187 `if (!owned) return`). Auto-closing a user-supplied client/executor on GC is exactly the 'we closed your client out from under you' hazard our owned-gate exists to prevent. +3) DO NOT auto-close resources on a GC-scheduled daemon thread as DEFAULT behaviour. Cleaner cleanup latency is unbounded and GC-dependent; a connection pool / socket / executor could remain open long past last use. For a Loom-friendly toolkit that prizes deterministic resource release, silent non-deterministic close is a regression, not a feature. +4) Minor: their wrappers (PhantomReachableExecutorService.kt:14-58) re-implement the ENTIRE ExecutorService surface by hand purely to interpose one init-block registration. That is a lot of delegate boilerplate; if we ever wrap, prefer a single generic decorator/registration helper over per-type hand-written forwarders. Not a correctness issue, just over-verbose. + +**Where we're ahead** + +Yes, materially, for a toolkit. (1) Determinism: our close path is explicit, idempotent via AtomicBoolean+compareAndSet (OkHttpTransport.kt:184, JdkHttpTransport.kt:167, VirtualThreads.kt:85) and releases resources at a caller-controlled moment, not whenever the GC happens to notice — openai-java's net releases at an unbounded, GC-determined time (PhantomReachable.kt:8-12). (2) Ownership safety: our `owned` gate (OkHttpTransport.kt:187, JdkHttpTransport.kt:170) guarantees we never touch BYO clients/executors; openai-java's ownership-transfer + auto-close (ClientOptions.kt:246) can reclaim resources the user still intends to use. 
(3) Loom-correctness is explicit in our contract (architecture.md:606-608 'no synchronized which would pin a carrier thread'); their wrappers don't address carrier-thread pinning at all. (4) Honest Java-8 story: our newer-JDK probe (`client as? AutoCloseable`, JdkHttpTransport.kt:178) actually does something on JDK 21 and is correctly inert on 11; their Cleaner net is a SILENT no-op on Java 8 (PhantomReachable.kt:53-54), so the 'safety' they advertise evaporates on the platform we most target. (5) Best-effort non-throwing close with structured WARN logging (OkHttpTransport.kt:197-210); they offer no such diagnostic, cleanup just runs or doesn't. The one thing they have that we lack is a forgotten-close backstop; we should answer that with an opt-in leak DETECTOR (the ADOPT recommendation above), not with their silent auto-close. + +_Verifier notes:_ Verification summary. All mechanism-level claims in the analysis are ACCURATE after line-by-line re-read; the only factual defects are citation drift (PhantomReachable.kt is in com/openai/core/, not com/openai/core/http/; line numbers are correct) and one wrong action item (Finding 3 told us to hunt a Response-not-closed-on-cancel bug in sdk-async-reactor; there is none, because Reactor.kt:116-138 deliberately transfers source ownership to the caller and documents it, consistent with our toolkit stance). + +'We are ahead' verdict CONFIRMED for a toolkit: explicit owned-gated CAS close (OkHttpTransport.kt:183-221, JdkHttpTransport.kt:166-199, VirtualThreads.kt:84-91), no synchronized (architecture.md:606-608), BYO never closed (owned gate + architecture.md:609-615), best-effort non-throwing WARN logging, versus their silent, Java-8-no-op, ownership-transfer net. The one capability they have and we lack is a forgotten-close backstop, and the right answer for us is an opt-in DETECTOR, not their silent auto-close. 
+ +Net of 5 original findings: kept 4 (3 recategorized to LEARN, 1 real ADOPT = the leak detector), dropped 1 (the COPY guard, folded into the detector). Two category corrections were necessary: the original 'ADOPT: decide NOT to add the mechanism' is a LEARN/docs item by definition, and the original 'COPY' guard is conditional scaffolding for the detector, not an independent liftable technique. + +Antipatterns section of the original analysis is sound and I concur with all four: (1) no always-on Cleaner in sdk-core; (2) do not copy ownership-transfer (ClientOptions.kt:246); (3) no GC-scheduled auto-close as default; (4) their per-type hand-written ExecutorService forwarder (PhantomReachableExecutorService.kt:20-57, ~14 methods) is over-verbose boilerplate — if we ever wrap, prefer one generic registration helper. These should inform the docs note in the no-GC-safety-net LEARN item. + +Realism caveat surfaced during verification: their own correctness test (PhantomReachableTest.kt) proves the feature with System.gc()+Thread.sleep(100), which is inherently nondeterministic. Any detector we build must test via a bounded GC-poll loop to avoid a flaky 80% coverage gate — flagged in the ADOPT recommendation. + +--- + +## 9. ClientOptions / RequestOptions / SecurityOptions / config assembly + +**What it is** + +openai-java centralizes ALL client configuration in one immutable `ClientOptions` god-object (782 lines, ~22 constructor params) built by a long Builder. 
The Builder.build() (ClientOptions.kt:610-716) is the real engine: it (1) injects 8 `X-Stainless-*` telemetry headers then `replaceAll`s user headers on top so users can override (642-643); (2) resolves the effective `Credential` via a precedence ladder (effectiveCredential, 503-533); (3) DECORATES the bare user `HttpClient` into a fixed stack `Retrying( Logging( WorkloadIdentity( client ) ) )` (678-690) and stores both the original and the decorated client; (4) does Azure URL categorization to inject `api-version` query (645-661). `RequestOptions` (56 lines) is a tiny per-call overlay carrying only `responseValidation` + `timeout`, merged via `applyDefaults` (RequestOptions.kt:23-29) using null-coalescing, with `Timeout.assign` doing field-level merge (Timeout.kt:144-153). `SecurityOptions` (74 lines) is a generated 2-boolean toggle (bearerAuth/adminApiKeyAuth) consumed by `ClientOptions.securityHeaders` (ClientOptions.kt:736-779) to emit the right auth header per endpoint. `Properties.kt` is plain env/system-property readers for the telemetry headers. `Utils.kt` is internal stdlib helpers (toImmutable, contentHash, withLockAsync). The architecture is "one config object, decorated client baked in at build time, thin per-call overlay." + +**How it works (line-level)** + +DECORATION BAKED INTO CONFIG (ClientOptions.kt:678-690): `RetryingHttpClient.builder().httpClient(LoggingHttpClient.builder().httpClient(workloadIdentityHttpClient).clock(clock).level(logLevel).build()).sleeper(sleeper).clock(clock).maxRetries(maxRetries).build()` — the cross-cutting concerns (retry, logging, auth-refresh) are pre-composed HttpClient wrappers, not pipeline steps. Both `originalHttpClient` and decorated `httpClient` are kept (34-44); `toBuilder().from()` reuses `originalHttpClient` (218) so re-building doesn't double-wrap. 
+PHANTOM-REACHABLE OWNERSHIP: every owned resource is wrapped to self-close on GC — `httpClient(...)` wraps in `PhantomReachableClosingHttpClient` (249), `streamHandlerExecutor` in `PhantomReachableExecutorService` (281), `sleeper` in `PhantomReachableSleeper` (294). `close()` (729-733) closes httpClient, shuts down executor, closes sleeper. +HEADER PRECEDENCE TRICK (632-643): default `X-Stainless-*` headers `put` first, then `headers.replaceAll(this.headers.build())` — user headers win because replaceAll runs last. Telemetry sourced from Properties.kt (getOsArch normalizes amd64/x86_64→"x64" etc., 7-21). +CREDENTIAL LADDER (effectiveCredential, 503-533): rejects (credential AND workloadIdentity) both set (513), else picks workloadIdentity > explicit credential > adminApiKey sentinel `AdminApiKeyOnlyCredential` (a private object, 782) > throw. The sentinel is a marker object meaning "only admin key present" so `from()` can strip it (233 `takeUnless { it === AdminApiKeyOnlyCredential }`). +PER-CALL OVERLAY (RequestOptions.kt:23-29): `applyDefaults` is null-coalescing — `responseValidation ?: options.responseValidation`; timeout uses field-merge `timeout.assign(options.timeout)` when both present. `Timeout.assign` (Timeout.kt:144-153) copies target then overlays only non-null source fields. `RequestOptions.from(clientOptions)` (14-18) seeds the per-call baseline from client config. `none()` is a cached singleton (9-11). +SECURITY-PER-ENDPOINT (securityHeaders, 736-779): generated `SecurityOptions{bearerAuth, adminApiKeyAuth}` per operation drives which header is emitted; dispatches on credential subtype (BearerTokenCredential→`Authorization: Bearer`, AzureApiKeyCredential→`api-key`), throws a tailored "This request requires apiKey or workloadIdentity" message if nothing satisfied (767-776). 
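The per-call overlay merge described in the PER-CALL OVERLAY paragraph above (scalar null-coalescing plus a field-level merge for the nested timeout) can be sketched as below. All names here (`SketchTimeout`, `SketchCallOptions`, `overlay`) are hypothetical and chosen for illustration; this is not openai-java's `Timeout` or `RequestOptions`, and data classes are used only for brevity.

```kotlin
import java.time.Duration

// Hypothetical nested config; every field is optional so "unset" is representable.
data class SketchTimeout(
    val connect: Duration? = null,
    val read: Duration? = null,
    val request: Duration? = null,
) {
    // Field-level overlay: start from `this` (the base) and take only the
    // non-null fields from `override`, so unset fields inherit the base value.
    fun overlay(override: SketchTimeout): SketchTimeout = SketchTimeout(
        connect = override.connect ?: connect,
        read = override.read ?: read,
        request = override.request ?: request,
    )
}

// Hypothetical per-call options: a scalar plus a nested structure.
data class SketchCallOptions(
    val validate: Boolean? = null,
    val timeout: SketchTimeout? = null,
) {
    fun applyDefaults(defaults: SketchCallOptions): SketchCallOptions = SketchCallOptions(
        // Scalars: plain null-coalescing.
        validate = validate ?: defaults.validate,
        // Nested: merge field by field when both sides are set, else take whichever exists.
        timeout = when {
            timeout != null && defaults.timeout != null -> defaults.timeout.overlay(timeout)
            else -> timeout ?: defaults.timeout
        },
    )
}
```

The design point is that a per-call timeout setting only `request` keeps the client-level `connect` and `read` values instead of resetting them to null, which is what a whole-object replace would do.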
+JACKSON VERSION GUARD (Check.kt:50-84, called from ClientOptions.kt:142-146 init): reflectively reads 5 `PackageVersion.VERSION` constants (89-96) and compares against MINIMUM 2.13.4, with a hardcoded BAD_JACKSON_VERSIONS map pinning 2.18.1 to a specific databind bug (88). + +**vs. our SDK** + +We have NO ClientOptions analogue — grep for ClientOptions/RequestOptions/SecurityOptions across sdk-core returns zero hits. This is deliberate: we are a toolkit, not a client. Config is DISTRIBUTED: +- Layered env/sysprop/default lookup → Configuration.kt (org/dexpace/sdk/core/config/Configuration.kt:31-174) + ConfigurationBuilder.kt. Their Properties.kt + the fromEnv() block (ClientOptions.kt:551-595) is our Configuration, but ours is a generic typed key-value store (getInt/getBoolean/getDuration with parse-fail-to-default, Configuration.kt:66-95) rather than a fixed setter list. +- Per-concern config objects instead of one blob: HttpRetryOptions (http/pipeline/steps/HttpRetryOptions.kt:60-111), RetrySettings (pipeline/step/retry/RetrySettings.kt:65-272), HttpRedirectOptions, HttpInstrumentationOptions, ProxyOptions (util/ProxyOptions.kt:48-357). Each owns one concern's knobs + its own Builder. +- Their baked-in client DECORATION (ClientOptions.kt:678-690) is our PIPELINE — RetryStep, LoggingStep, AuthStep are ordered HttpStep pillars (http/pipeline), composed explicitly, not pre-wrapped HttpClient layers. This is the central architectural divergence. +- Telemetry headers: their X-Stainless-* in build() ↔ our SdkInfo.kt (util/SdkInfo.kt:34-74) feeding ClientIdentityStep. Same data, ours is a pipeline step. +- Their RequestOptions per-call overlay: WE HAVE NOTHING. Our context chain (CallContext→DispatchContext→RequestContext→ExchangeContext, http/context/) carries instrumentation + the Request/Response payloads forward, NOT per-call config overrides. docs/refs-comparison.md:183 already records "Per-call auth override is not yet exposed" as a known gap. 
+- Their PhantomReachableClosingHttpClient ownership ↔ our `Io.installProvider` + transport close() contract (BYO clients never closed). Different mechanism, same intent. +- checkJacksonVersionCompatibility (Check.kt) has NO analogue and SHOULD NOT — Jackson lives only in sdk-serde-jackson, never sdk-core. + +**Recommendations (verified)** + +- **Init-time runtime-dependency version fail-fast (Jackson PackageVersion check) — adopt the pattern in adapters, never sdk-core** `ADOPT` · `both` · effort S · confidence medium + - *Verdict:* Accurate. Check.kt:50-84 reads five Jackson PackageVersion.VERSION constants (json-core, databind, jdk8, jsr310, kotlin module; lines 89-96), compares against MINIMUM_JACKSON_VERSION 2.13.4 (:86) plus a BAD_JACKSON_VERSIONS pin of 2.18.1 → databind#4639 (:88), and fails with a multi-line actionable message. It is wired into ClientOptions.init at :142-146 and opt-out-able via the checkJacksonVersionCompatibility flag (:52). The dependency-coupling caveat is correctly flagged and verified: Check.kt:5-6 imports com.fasterxml.jackson.core.Version/VersionUtil, so an equivalent in sdk-core would import the very dependency it validates — a hard violation of our zero-dep rule. I confirmed our adapters have no such guard (grep for version/NoSuchMethodError checks in sdk-io-okio3 and sdk-serde-jackson is empty). The value is real and squarely in the class of bug our CLAUDE.md already warns about (jvmTarget mismatch → NoSuchMethodError far from the cause). Confidence medium not high only because the payoff depends on whether Okio/Jackson actually break across the versions we support; scope it to deps with a known hard minimum. + - *Do:* In adapter modules with a hard minimum runtime dep, add an opt-out-able init-time version check that fails with an actionable message modeled on Check.kt:50-84 — e.g. a Jackson check in sdk-serde-jackson (it already imports Jackson) and, if Okio has a real floor, an OkioIoProvider-init check in sdk-io-okio3. 
Keep the third-party import strictly in the adapter. Do NOT add any version check that imports a third-party type to sdk-core; at most sdk-core could expose a dependency-free helper interface, but the concrete check belongs to the adapter. For the future generator: emit the check into the generated/adapter layer, not the core. +- **Per-call RequestOptions overlay with null-coalescing applyDefaults + field-level Timeout.assign merge — we have no per-call override channel and should add one** `ADOPT` · `both` · effort M · confidence high + - *Verdict:* Re-read confirms every cited mechanism. RequestOptions.kt:23-29 applyDefaults uses scalar null-coalescing (responseValidation ?: options.responseValidation) and for the nested Timeout calls timeout.assign(options.timeout) only when both sides are non-null, else timeout ?: options.timeout. Timeout.assign (Timeout.kt:144-153) is the subtle bit the finding correctly highlights: it takes the BASE (target) builder and overlays only the non-null source fields (connect?.let(this::connect) ...), so a per-call Timeout that sets only `request` inherits connect/read/write from the client default rather than nulling them. NONE singleton cached at RequestOptions.kt:9-11. Our side verified: DispatchContext (http/context/DispatchContext.kt:31-34) carries only instrumentationContext + callKey; RequestContext adds only the Request payload. There is no per-call config slot anywhere in the chain. refs-comparison.md:183 explicitly records 'Per-call auth override is not yet exposed — auth is wired per pipeline ... a per-call override remains worth adding,' and :345 lists 'per-call auth override via RequestContext' as planned. This is the one genuinely actionable, not-yet-done item in this subsystem and it is decoupled from their god-object, so it ports cleanly onto our distributed model. The finding's categorization (ADOPT), target (both), and the warning about field-merge-vs-whole-replace are all correct. 
+ - *Do:* Add an immutable per-call overrides type (nullable fields only — NO Java records; Builder + newBuilder() per our conventions) carrying at least timeout, maxAttempts, and an optional Credential. Give it an applyDefaults(base) that does per-field null-coalescing for scalars and a field-level overlay for nested structures (copy the Timeout.assign pattern: start from the base, overlay only non-null source fields — do NOT replace the whole nested object). Thread it as an optional slot on the existing context chain (RequestContext is the natural home given refs-comparison.md:345 already names it) rather than inventing a parallel mechanism, and have RetryStep/AuthStep read it at execute() time. This closes the refs-comparison.md:183/:345 gap. Note: their overlay rides on a single ClientOptions; ours must read defaults from the per-concern Options objects, so applyDefaults' 'base' is assembled from those, not from one blob. +- **ClientOptions is a config-explosion / baked-in-decorator antipattern for a toolkit; our distributed per-concern Options + explicit pipeline composition is the correct inverse — record the decision** `LEARN` · `docs/process` · effort S · confidence high · we partly do this + - *Verdict:* Citations accurate. ClientOptions.kt is 783 lines; the primary constructor (lines 36-139) carries ~22 params spanning transport, headers, query, timeout, maxRetries, logLevel, jsonMapper, clock, sleeper, streamHandlerExecutor, four credential kinds (apiKey/adminApiKey/credential/workloadIdentity), Azure routing, org/project tenancy, and webhookSecret. The load-bearing antipattern is real: build() at :678-690 hardcodes the decorator order RetryingHttpClient(LoggingHttpClient(WorkloadIdentityHttpClient(client))) and stores both original and wrapped client (:692-694) — a consumer cannot reorder, omit, or insert a layer. securityHeaders (:736-779) further fuses auth-scheme resolution into the config object. 
We verified we have zero ClientOptions/RequestOptions/SecurityOptions (grep across sdk-core is empty) and that config is distributed: Configuration (generic typed lookup) + HttpRetryOptions/RetrySettings/HttpRedirectOptions/ProxyOptions (per-concern, each with its own Builder) + cross-cutting behavior as ordered HttpStep pillars. So weAlreadyDoIt=true; this is validation-by-scrutiny, not a change. The only deliverable is documenting the rejected alternative so we don't drift toward a single Options blob as knobs accumulate. Worth keeping precisely because it is a load-bearing design bet for a toolkit, but it is a docs note, not engineering work. + - *Do:* Add a short decision note to docs/architecture.md (or extend docs/refs-comparison.md): 'Config is distributed per-concern (Configuration + per-concern *Options) and cross-cutting behavior is composed via pipeline-step ordering. We reject the single-ClientOptions model because (a) it hardcodes the decorator stack order — see ClientOptions.kt:678-690 — and (b) a god-object cannot be extended by a future codegen consumer without forking.' Cite the two concrete lines. No code change. +- **Telemetry/identity header precedence (defaults first, caller overrides last) — we already implement and test the caller-wins contract in ClientIdentityStep** `LEARN` · `sdk-core` · effort S · confidence medium · we partly do this + - *Verdict:* The openai-java mechanism claim is accurate: build() puts eight X-Stainless-* headers then headers.replaceAll(this.headers.build()) at :632-643, with the comment at :642 stating the intent that end-users can overwrite. Properties.kt normalizes os.arch (amd64/x86_64→x64, aarch64→arm64, :13-18) and detects Android via java.vendor.url (:29). BUT the finding's recommendation ('confirm our step honors override order and ADD a test') is substantially already satisfied, so I downgrade it and set weAlreadyDoIt=true. 
ClientIdentityStep defaults to Mode.Append, which preserves a caller-set value and appends the SDK token line after it (apply(), lines 111-128: existing.isNullOrEmpty() → tokenLine; else → "$existing $tokenLine"). The precedence is documented in the class KDoc (Modes section) AND tested: ClientIdentityStepTest 'append mode prepends caller's User-Agent to the SDK tokens' (:37-49 asserts 'MyApp/1.0 dexpace-sdk/... jvm/...'), plus a Replace-mode test (:53-69) and a custom-header test (:85+). So caller-wins-by-default is an established, tested contract for the identity header. The one residual nuance worth keeping as a LEARN: their single replaceAll makes the override contract global across ALL default headers in one place; ours is per-step, so if we later add OTHER default-header emitters (e.g. an Accept/Content-Type defaults step), each must independently decide and document its precedence — there is no single SDK-wide 'caller headers win' guarantee. That is a small consistency note, not the new test the finding asked for. + - *Do:* No new test needed for ClientIdentityStep — the caller-wins (Append) and override (Replace) behaviors are already documented and covered (ClientIdentityStepTest:37-69). If/when we add additional default-header-emitting steps, mirror the explicit-intent discipline at ClientOptions.kt:642: document each step's precedence-vs-caller-headers and add the analogous test. Optionally add one sentence to docs stating 'default-header steps default to caller-wins (Append); use Replace only when the SDK must own the header.' 
+- **Per-operation auth requirement (SecurityOptions) + credential-resolution-with-tailored-error is a codegen output and an sdk-core seam, not a fixed toolkit primitive** `LEARN` · `codegen` · effort M · confidence medium + - *Verdict:* Accurate, and I am merging findings 4 and 5 of the original analysis here because they are the same lesson seen twice (auth-scheme selection is generated per-API policy, with one clever encoding trick). SecurityOptions.kt is 74 lines, two booleans bearerAuth/adminApiKeyAuth (:11-13), generated per operation. ClientOptions.securityHeaders (:736-779) consumes it: it emits Authorization: Bearer / api-key depending on the resolved Credential subtype and throws a precisely-tailored error when no configured credential satisfies the endpoint (:767-776, e.g. 'This request requires adminApiKey'). effectiveCredential (:503-533) enforces a precedence ladder (workloadIdentity > explicit credential > adminApiKey) with a mutual-exclusion guard at :513. The genuinely clever sub-trick: AdminApiKeyOnlyCredential is a private object marker (:782) used as a non-null Credential meaning 'no real credential, admin key only', detected by reference identity and stripped on round-trip via takeUnless { it === AdminApiKeyOnlyCredential } (:233) so the sentinel never ossifies into a real credential. All API-specific policy — correctly target=codegen, NOT sdk-core. We verified our auth is per-pipeline (sealed Credential family + ChallengeHandler + AuthStep pillars per refs-comparison.md:175-183), not per-operation. The toolkit should expose the seam (an AuthStep that takes a required-schemes descriptor + available credentials and either sets the header or throws a precise error); the per-operation scheme-set and the precedence ladder are generated. The two-boolean shape is an artifact of OpenAI having exactly two schemes; a generator emits one flag per scheme. 
+ - *Do:* When designing codegen's auth model: (1) emit a per-operation 'required schemes' descriptor (do NOT bake a fixed scheme enum into sdk-core — the toolkit must stay scheme-agnostic); (2) generate the precedence ladder into per-API code; (3) provide an sdk-core AuthStep seam that consumes the descriptor + available credentials and either sets the header or throws a tailored 'this request requires X' error, copying the message-tailoring style at ClientOptions.kt:767-776. If a generated multi-scheme credential slot needs a tri-state ('explicit' / 'none' / 'scheme-only'), consider the private-object marker-sentinel technique (ClientOptions.kt:782, :233) — but keep any such sentinel strictly internal so it never crosses the public API boundary (it would break apiCheck and confuse callers). + +**Considered & dropped** + +- ~~Their RequestOptions is dramatically simpler than our retry/config surface; the two retry-config classes (HttpRetryOptions vs RetrySettings) are duplication to collapse~~ — The SIMPLIFY framing rests on a claim I verified to be INACCURATE. HttpRetryOptions (http/pipeline/steps/HttpRetryOptions.kt:60-73) and RetrySettings (pipeline/step/retry/RetrySettings.kt:65-81) are NOT 'near-identical fields under different names.' HttpRetryOptions carries 8 fields oriented to the synchronous http.pipeline stage runtime and tuned to Azure Core: maxRetries, baseDelay, maxDelay, fixedDelay, a list of Retry-After header names (retryAfterHeaders), and three pluggable predicates/providers (shouldRetryCondition, shouldRetryException, delayFromCondition). It has no total-timeout budget, no jitter, no per-method gating, no scheduler, and a plain @JvmOverloads constructor with zero validation. 
RetrySettings carries 9 different fields oriented to the async recovery-pipeline primitive and tuned to Square+gax: totalTimeout (a gax-style cumulative deadline budget), initialDelay, delayMultiplier, maxDelay, maxAttempts, jitter, retryableStatuses, retryableMethods, and a ScheduledExecutorService — behind a validating builder with bounds checks (delayMultiplier>=1.0, jitter in 0..1, nano-representable-delay cap). The actual field overlap is only maxRetries/maxAttempts and maxDelay. They model two genuinely distinct layers that CLAUDE.md and docs/pipelines.md document as separate (stage-based http.pipeline vs recovery-aware pipeline) and they cite different reference designs. So the 'collapse the duplication' SIMPLIFY direction is wrong, and the 'theirs is simpler, don't follow' half is generic (of course a single-API client exposes fewer knobs than a toolkit). No decision-ready action survives. If anything remains, it is a one-line doc note that two retry vocabularies exist because two pipeline layers exist — too thin to keep as a finding. +- ~~Configuration.getDuration shorthand parser is richer and more defensive than openai-java's env handling — we are ahead~~ — Accurate but non-actionable filler. I verified Configuration.getDuration (config/Configuration.kt:92-95, parseDuration :145-172) supports ISO-8601 + shorthand (ms/s/m/h/d, bare=millis), rejects negatives on both paths, and fails safe to the default; getBoolean is strict (:75-85). And their fromEnv (ClientOptions.kt:551-595) is indeed raw-string setters with no typed coercion (maxRetries isn't even env-configurable). But the recommendation is explicitly 'No change' — it is a self-congratulatory note. The only residual value is preempting a mistaken 'simplify toward their thinner model,' which is already covered by the antipatterns/weAreAhead narrative the orchestrator receives. Not decision-ready; drop to avoid filler. 
+- ~~Utils.kt / Properties.kt internal helpers (toImmutable, contentHash, withLockAsync, os-arch normalization)~~ — Surveyed for completeness; nothing decision-ready. Utils.kt (toImmutable variants :17-34, contentHash/contentDeepHashCode :80, withLockAsync :109-121) are ordinary stdlib conveniences — we already have equivalents (defensive Collections.unmodifiable* copies throughout, e.g. RetrySettings.build() :198-199) and our Loom rule already mandates ReentrantLock.withLock. Properties.kt os.arch/os.name normalization (:7-35) is identity-telemetry detail our SdkInfo deliberately keeps minimal (sdkVersion + java.version, util/SdkInfo.kt:67-71); enriching identity tokens with normalized arch/os is a cosmetic codegen-era nicety, not a config-subsystem finding. withLockAsync is a reasonable async-lock idiom but belongs (if anywhere) to an async adapter, not this subsystem, and only if a concrete need appears. + +**Do not copy** + +1. THE 782-LINE GOD-OBJECT ITSELF (ClientOptions.kt:34-780). 22 constructor params spanning transport, headers, query, timeout, retry, logging, Jackson, clock, sleeper, 4 credential kinds, Azure routing, tenancy, webhooks. Do NOT introduce a single dexpace ClientOptions. For a toolkit it hardcodes policy a consumer can't extend. Our distributed per-concern Options is the correct inverse — keep it. +2. DECORATOR STACK HARDCODED IN build() (ClientOptions.kt:678-690): the wrap order Retrying(Logging(WorkloadIdentity(client))) is frozen in config-build code. A user cannot reorder, omit, or insert a layer. This is exactly what our explicit pipeline composition exists to avoid; do not migrate cross-cutting concerns from pipeline steps into pre-wrapped HttpClient layers. +3. HARD JACKSON DEPENDENCY IN CORE CONFIG (ClientOptions.kt:5 imports JsonMapper; :59 jsonMapper field; Check.kt:5-6 imports jackson Version). openai-java's core cannot be used without Jackson on the classpath. This is the precise coupling our zero-dep sdk-core rule forbids. 
Any port of the Jackson-version-check or mapper-config MUST live in sdk-serde-jackson, never sdk-core. +4. okhttp + executor + sleeper OWNERSHIP wrapped via PhantomReachable* in the config object (ClientOptions.kt:249, 281, 294). Clever (GC-driven close) but it bakes resource ownership into config; combined with #1 it means constructing options spins up a cached thread pool (build():612-629) as a side effect. Our 'BYO clients are never closed by the SDK' + explicit Io.installProvider seam is cleaner; don't adopt side-effecting resource allocation in a config constructor. +5. MUTABLE STATIC-ENV READS SCATTERED IN fromEnv() (ClientOptions.kt:551-595): direct System.getenv/getProperty calls inline, not behind a test seam — they even special-case AZURE_OPENAI_KEY vs OPENAI_API_KEY mutual exclusion inline (:571-585). Our Configuration with envSource/propsSource Function seams (ConfigurationBuilder.kt:40-43) is more testable; do not inline raw System.getenv in config-assembly code. + +**Where we're ahead** + +1. CONFIG INGESTION ROBUSTNESS: our Configuration.getDuration (Configuration.kt:92-95, :145-172) parses ISO-8601 + shorthand (ms/s/m/h/d), rejects negatives, and fails safe to the default; getBoolean is strict (Configuration.kt:75-85). openai-java's fromEnv (ClientOptions.kt:551-595) does none of this — raw strings to setters, no typed coercion, maxRetries not even env-configurable. Ours is a real typed config layer; theirs is a thin per-API reader. +2. TESTABILITY OF CONFIG SOURCES: our envSource/propsSource Function seams (Configuration.kt:33-34, ConfigurationBuilder.kt:40-43) make env/property lookup hermetically testable. Theirs hardcodes System.getenv/getProperty inline in fromEnv (ClientOptions.kt:553-587), forcing env-var manipulation in tests. +3. SEPARATION OF CONCERNS: cross-cutting behavior is composable pipeline steps (RetryStep/LoggingStep/AuthStep ordered as HttpStep pillars) vs their frozen decorator stack (ClientOptions.kt:678-690). 
A consumer of our toolkit can reorder/omit/insert; a consumer of theirs cannot. +4. ZERO-DEP CORE: sdk-core has no Jackson/okhttp coupling; theirs imports JsonMapper and okhttp into core (ClientOptions.kt:5, and the okhttp transport is a hard runtime path). Our adapter isolation is strictly better for a library others embed. +5. RESOURCE OWNERSHIP CLARITY: our 'SDK closes only SDK-managed resources, BYO clients never closed' + single Io.installProvider seam is more predictable than baking PhantomReachable* GC-close wrappers and a side-effecting cached-thread-pool allocation into the config builder (ClientOptions.kt:249/281/294/612-629). +NOT AHEAD ON: per-call overrides (we have zero; their RequestOptions overlay at RequestOptions.kt:23-29 is a clean pattern we should adopt — see findings) and runtime dependency-version fail-fast (their Check.kt:50-84 has no analogue in our adapters). + +_Verifier notes:_ Verification method: read all five cited openai-java files line-by-line (ClientOptions.kt 783 lines, RequestOptions.kt, SecurityOptions.kt, Properties.kt, Utils.kt) plus Timeout.kt (for the assign field-merge) and Check.kt (for the Jackson guard); read our Configuration.kt, ConfigurationBuilder seams, ProxyOptions.kt, SdkInfo.kt, ClientIdentityStep.kt (+ its test), DispatchContext/RequestContext/CallContext, and both retry-config classes; grepped sdk-core for ClientOptions/RequestOptions/SecurityOptions (zero hits, confirming we have no analogue by design). + +Net: of the original 8 findings I kept 5 (one is a merge of the original auth findings 4+5) and dropped 3. + +Strongest KEEP: the per-call RequestOptions overlay (ADOPT, high) — the only not-yet-done, cleanly-portable gap; refs-comparison.md:183 and :345 already name it, and Timeout.assign:144-153 is the field-merge detail to copy (overlay non-null onto base; never whole-replace a nested structure). 
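
The field-merge rule named here (overlay non-null onto base; never whole-replace a nested structure) can be sketched in Kotlin. This is a minimal illustration under assumptions: `TimeoutSketch`, `CallOptionsSketch`, and `overlaidWith` are hypothetical names invented for the example, not the openai-java `Timeout.assign` code and not an existing sdk-core type.

```kotlin
import java.time.Duration

// Hypothetical nested settings object; every field is nullable so "unset" is representable.
data class TimeoutSketch(
    val connect: Duration? = null,
    val read: Duration? = null,
    val request: Duration? = null
) {
    // Field-by-field merge: a non-null overlay field wins, a null one keeps the base value.
    fun overlaidWith(overlay: TimeoutSketch?): TimeoutSketch {
        if (overlay == null) return this
        return TimeoutSketch(
            connect = overlay.connect ?: connect,
            read = overlay.read ?: read,
            request = overlay.request ?: request
        )
    }
}

// Hypothetical per-call options holding the nested timeout plus one scalar knob.
data class CallOptionsSketch(
    val timeout: TimeoutSketch? = null,
    val maxRetries: Int? = null
) {
    fun overlaidWith(overlay: CallOptionsSketch?): CallOptionsSketch {
        if (overlay == null) return this
        // Recurse into the nested structure instead of replacing it wholesale.
        val mergedTimeout = timeout?.overlaidWith(overlay.timeout) ?: overlay.timeout
        return CallOptionsSketch(
            timeout = mergedTimeout,
            maxRetries = overlay.maxRetries ?: maxRetries
        )
    }
}
```

With a base of connect=5s and read=30s, a per-call overlay that sets only read=90s keeps connect=5s in the merged result; replacing the nested timeout wholesale would have dropped it.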
+ +Key CORRECTION: the SIMPLIFY 'two retry classes are duplication' finding was dropped because the duplication claim is false on inspection — HttpRetryOptions (Azure-Core-tuned, predicate-based, no timeout-budget/jitter/method-gating, no validation) and RetrySettings (Square+gax-tuned, totalTimeout budget + jitter + method-gating + validating builder + scheduler) overlap on only ~2 fields and serve the two documented pipeline layers. + +Key DOWNGRADE: the telemetry-header-precedence finding — caller-wins is already the documented, tested default (ClientIdentityStep Mode.Append; ClientIdentityStepTest:37-69), so its 'add a test' recommendation is already satisfied; I kept only a thin LEARN about future per-step precedence consistency. + +The two auth findings (credential precedence + marker sentinel; per-operation SecurityOptions) were merged: both are 'auth selection is generated per-API policy + an sdk-core seam,' correctly target=codegen, with the private-object === sentinel as the one liftable technique (keep it internal). + +Cross-cutting reminders honored: every recommendation respects Java-8 (no records/sealed permits), zero-dep sdk-core (the Jackson check must live in the adapter that already imports Jackson — Check.kt:5-6 proves the coupling), and Builder+newBuilder()+nullable-fields conventions. + +--- + +## 10. Error/exception hierarchy + +**What it is** + +openai-java's error tree is a single Jackson-coupled hierarchy rooted at OpenAIException (RuntimeException; OpenAIException.kt:3-5). Below it sit three zero-member MARKER subtypes — OpenAIIoException (transport failure), OpenAIInvalidDataException (deserialization failure), OpenAIRetryableException (explicit transient) — each 5 LOC and existing only to be `is`-checked. The protocol-error branch is abstract OpenAIServiceException (OpenAIServiceException.kt:9-23), which declares SIX accessors: statusCode(), headers(): Headers, body(): JsonValue, and code()/param()/type(): Optional. 
The last three expose OpenAI's structured error envelope (ErrorObject{code,message,param,type}; ErrorObject.kt:20-60). Then come 8 concrete per-status classes (BadRequest/Unauthorized/PermissionDenied/NotFound/UnprocessableEntity/RateLimit/InternalServer/Unexpected, plus Sse), each ~100 LOC, each holding `Headers` + `JsonField` + a full Builder with checkRequired validation. Crucially their per-status coverage is NOT 1:1 with codes: only 6 specific 4xx get named classes; ALL 5xx collapse into InternalServerException (carrying a dynamic statusCode field), and everything else falls to UnexpectedStatusCodeException. Central dispatch is errorHandler() (ErrorHandler.kt:42-93), a `when(statusCode)` that builds and throws — the precise analog of our HttpExceptionFactory.fromResponse. The whole branch is generated ("// File generated from our OpenAPI spec by Stainless") and hard-depends on Jackson (jsonMapper(), JsonValue, JsonField) inside core. + +**How it works (line-level)** + +Marker-only subtypes drive retry: RetryingHttpClient.kt:184-188 — `private fun shouldRetry(throwable: Throwable): Boolean = throwable is IOException || throwable is OpenAIIoException || throwable is OpenAIRetryableException`. That is the SOLE consumer of the Io/Retryable split — the type IS the signal; there is no `retryable` field. Their response-side retry (RetryingHttpClient.kt:162-182) is a separate hardcoded `when`: 408->true, 409->true (lock timeouts), 429->true, >=500->true, plus an X-Should-Retry header override (lines 169-170) that wins over everything. Note this CONTRADICTS their own class taxonomy — 409 retries here but 409 has no named class, and 501/505 retry here (>=500) whereas a well-behaved policy excludes them. Message construction is inlined into each super-call, e.g. 
BadRequestException.kt:22 — `OpenAIServiceException("400: ${error.asKnown().getOrNull()?._message()?.asKnown()?.getOrNull() ?: (if (error.isMissing()) \"Unknown\" else jsonMapper().writeValueAsString(error))}", cause)`. The `asKnown().getOrNull()` is deliberate: asKnown() throws OpenAIInvalidDataException on a malformed field (Values.kt:111), so the getOrNull() guard prevents an exception-from-an-exception-constructor. body() lazily re-serializes: `if (error.isMissing()) JsonMissing.of() else JsonValue.fromJsonNode(jsonMapper().valueToTree(error))` (BadRequestException.kt:28-30). Each class ships a Builder whose build() runs `checkRequired("headers", headers)` (Check.kt:11-12 -> checkNotNull) — i.e. a thrown object validates its own required fields. errorBodyHandler (ErrorHandler.kt:26-40) tolerantly digs `node.get("error")` and falls back to JsonMissing.of() inside a catch-all, so a non-JSON 500 body never breaks dispatch. InvalidData is the deserialization-failure currency: thrown from JsonHandler.kt:18 ("Error reading response"), Values.kt:174-176 (missing/null/invalid field), Check.kt-adjacent Utils.kt:14, SseMessage.kt:56/64. Transport wraps IO at OkHttpClient.kt:51/71 — `throw OpenAIIoException("Request failed", e)`. + +**vs. our SDK** + +Our hierarchy is cleaner and already ahead on the toolkit axis. HttpException (HttpException.kt:58-128) is abstract, extends RuntimeException, and carries status: Status, headers: Headers, body: ResponseBody? (LAZY, not eagerly buffered) — plus a non-consuming bodySnapshot(maxBytes) that reads from body.source().peek() (HttpException.kt:104-114) so the primary read path is undisturbed and rogue multi-MB error bodies don't OOM (cap DEFAULT_SNAPSHOT_BYTES=4096). retryable is a derived `val` (HttpException.kt:72) = RetryUtils.isRetryable(status.code), NOT a hardcoded per-subclass constant and NOT encoded via the exception's TYPE — so the baked flag can never drift from the live policy. 
We have 16 named per-status subclasses + 2 fallbacks (ClientErrorException/ServerErrorException) in HttpExceptions.kt — far more granular than their 6+collapsed. Each of OUR subclasses is ~18 LOC (just the constructor forwarding response.status/headers/body) vs their ~100 — because we carry NO typed body and NO Builder. Dispatch is HttpExceptionFactory.fromResponse (HttpExceptionFactory.kt:74-102), a `when` over named consts that throws IllegalArgumentException for non-4xx/5xx (a guard they lack — their `else` branch happily wraps a 200 as UnexpectedStatusCodeException). Transport split: NetworkException extends IOException, always retryable=true (NetworkException.kt:34-46) == their OpenAIIoException. HttpResponseException (HttpResponseException.kt:36-60) is a second, parallel IOException-based error carrying response + an Any? `value` (deserialized payload) + isRetryable computed over BOTH status code AND cause-chain via RetryUtils.isRetryable(Throwable) (RetryUtils.kt:58-68, with cycle-safe HashSet identity tracking). GAPS: (1) we have NO deserialization-failure type anywhere — serde/ package (Deserializer/Serializer/Serde/Tristate) declares zero exceptions, so there is no analog of OpenAIInvalidDataException. (2) HttpException exposes no structured-error accessors (confirmed via .api dump lines 1322-1336: only getStatus/getHeaders/getBody/getRetryable/bodySnapshot) — code()/param()/type() do not exist. (3) We have TWO overlapping base error types (HttpException : RuntimeException AND HttpResponseException : IOException) that both carry a response and both compute retryability — a confusing duplication. + +**Recommendations (verified)** + +- **Add an SDK-level deserialization-failure exception to the serde contract — real gap vs OpenAIInvalidDataException; our adapter currently leaks raw Jackson exceptions across the zero-dep seam** `ADOPT` · `both` · effort S · confidence high + - *Verdict:* Strongest finding in the set; verified on both sides. 
openai-java has a dedicated catchable OpenAIInvalidDataException (5-LOC marker) thrown at the decode boundary: jsonHandler wraps every parse failure as `throw OpenAIInvalidDataException("Error reading response", e)` (JsonHandler.kt:18-19), and ErrorObject's typed accessors document `@throws OpenAIInvalidDataException` (ErrorObject.kt:39,42,48). OUR serde package declares ZERO exceptions (confirmed: Deserializer.kt/Serializer.kt/Serde.kt name none in their contracts), and JacksonDeserializer just calls `mapper.readValue(input, type)` (JacksonSerde.kt:161,166,176) — so a malformed payload surfaces as a raw com.fasterxml.jackson.* exception (JsonProcessingException/MismatchedInputException) straight through the SPI. For a TOOLKIT this is a genuine abstraction leak: a consumer building on our Deserializer must `catch (com.fasterxml...)` to handle a bad payload, which defeats the adapter boundary that the whole zero-dep design exists to protect. Correcting the prior agent's category split: the exception TYPE belongs in sdk-core (target both → it is really sdk-core for the type + adapter for the wrapping); it must be a plain RuntimeException (Java-8 safe, zero deps). Note the type should NOT extend HttpResponseException — decode failure is orthogonal to having an HTTP response, and HttpResponseException is itself slated for removal (see SIMPLIFY finding). + - *Do:* Add `public open class SerdeException(message: String, cause: Throwable? = null) : RuntimeException(message, cause)` in org.dexpace.sdk.core.serde, document it in the Deserializer/Serializer KDoc as the type adapters MUST throw on decode/encode failure, and update JacksonDeserializer (and the deserializeAs family) to catch JsonProcessingException and rethrow as SerdeException with the original as cause. Public-API addition → apiDump. Keep zero deps in core. 
+- **Hoist retryability to a one-method interface so NetworkException, HttpException (and codegen subclasses) are queried uniformly** `ADOPT` · `sdk-core` · effort S · confidence medium + - *Verdict:* Facts verified: three types expose a retryability boolean with no common supertype and two different names — HttpException.retryable (HttpException.kt:72), NetworkException.retryable (NetworkException.kt:45), HttpResponseException.isRetryable (HttpResponseException.kt:48). A retry step must special-case each. The interface idea is sound and is the field-based discipline taken to its polymorphic conclusion (correctly avoiding openai-java's type-sniffing). Two important downgrades from the prior agent's framing: (1) confidence medium not high — this partly OVERLAPS the SIMPLIFY finding: if HttpResponseException is deleted, two of the three divergent declarations collapse to one (HttpException) plus NetworkException, shrinking the problem to a 2-type naming mismatch that a tiny interface tidies but does not urgently require. Sequence this AFTER the collapse so we don't design an interface around a type we're about to remove. (2) The naming fix (retryable vs isRetryable) is the actually-valuable half; the interface is a nice-to-have seam. Java-8 safe (plain interface). Keep zero deps. + - *Do:* After collapsing HttpResponseException, introduce `public interface Retryable { public val isRetryable: Boolean }` (standardize on ONE name) and have HttpException + NetworkException implement it; retry steps query `(t as? Retryable)?.isRetryable == true`, never concrete types. Public-API addition → apiDump. +- **Collapse the two parallel response-carrying base errors (HttpException : RuntimeException and HttpResponseException : IOException) into one model** `SIMPLIFY` · `sdk-core` · effort M · confidence high + - *Verdict:* Verified and substantive — this is real internal duplication, not an openai-java import. 
HttpException (HttpException.kt:58-72) is a RuntimeException carrying status+headers+lazy body with derived `retryable`. HttpResponseException (HttpResponseException.kt:36-60) is a SEPARATE IOException carrying response + `value: Any?` + `isRetryable`, computed via computeIsRetryable that defers to RetryUtils.isRetryable(status.code) for the response branch and RetryUtils.isRetryable(Throwable) for the cause branch. So both bases (a) carry a Response, (b) compute retryability from the same RetryUtils, (c) disagree on field name (`retryable` vs `isRetryable`) AND on checked-ness (RuntimeException vs IOException). openai-java has exactly ONE protocol base (OpenAIServiceException.kt:9), which is the instructive contrast. The same logical failure (non-2xx) can surface under two different supertypes depending on which path threw — a real footgun. The prior agent's observation that `value: Any?` is an untyped half-implementation of the typed-error-body feature is fair. Two cautions on the recommendation: (1) deleting/folding HttpResponseException is a public-API break (apiDump) — fine pre-1.0 but call it out; (2) before deleting, grep the pipeline/transport modules for actual throwers/catchers of HttpResponseException so the migration target is known. I verified HttpException is the better base per its own KDoc rationale (HttpException.kt:44-49). Effort M is right. + - *Do:* Pick HttpException (RuntimeException) as the single protocol base. Remove HttpResponseException; the pre-response/transport case is already covered by NetworkException. If a deserialized-error slot is wanted before codegen lands, add `errorValue: Any?` to HttpException rather than keeping a second hierarchy — but prefer letting codegen stamp a typed field. First grep pipeline/transport/pagination for HttpResponseException usages to scope the migration; then apiDump. 
+- **openai-java's ~100-LOC-per-class error duplication is codegen output, not a hand-written pattern; keep our thin Response-forwarding subclasses and reserve typed-body subclasses for the generator** `LEARN` · `both` · effort S · confidence high · we partly do this + - *Verdict:* Claim verified line-by-line. BadRequestException.kt, RateLimitException.kt (both :15-100) and InternalServerException.kt:15-112 are byte-identical except the status literal (override statusCode()=400/429 or a dynamic statusCode field for the collapsed cases) and all carry JsonField + a full Builder with checkRequired; every file is tagged '// File generated from our OpenAPI spec by Stainless'. Our subclasses (HttpExceptions.kt) are ~18 LOC, forward response.status/headers/body, carry no typed body, no Builder. The .api dump (sdk-core.api:1273-1336) confirms zero extra accessors. This is a sound LEARN that AFFIRMS current design and correctly assigns the typed-body work to codegen. The one caveat: it is confirmatory, not actionable for sdk-core today; its only deliverable is a docs note, and docs/refs-comparison.md:227 ALREADY states 'Per-operation per-status subclasses carrying typed bodies (Expedia pattern) are still codegen's job', so even the doc action is largely redundant. Keep as a low-effort LEARN that hardens the codegen design record; do not inflate its importance. + - *Do:* Do NOT add typed bodies or Builders to sdk-core's HttpExceptions; keep the ~18-LOC Response-forwarding constructors. The doc rule already exists at docs/refs-comparison.md:227 — only extend it with the concrete codegen shape (generator emits e.g. `class FooNotFound(resp, val error: FooError) : NotFoundException(resp)` deriving from our open subclasses, parsing the typed body lazily). No sdk-core code change. 
+- **Tolerant, never-throwing error-body handling is the key correctness lesson for codegen's typed error bodies — lift the PATTERN (guarded accessors + catch-to-empty), not the Jackson code** `LEARN` · `codegen` · effort S · confidence high · we partly do this + - *Verdict:* Pattern verified precisely. errorBodyHandler wraps the whole parse in `catch (e: Exception) { JsonMissing.of() }` (ErrorHandler.kt:31-38) so a malformed error body degrades to missing rather than throwing; and the per-class message interpolation guards every field via `error.asKnown().getOrNull()?._message()?.asKnown()?.getOrNull() ?: ...` (BadRequestException.kt:22) precisely because asKnown()/getRequired can throw OpenAIInvalidDataException. Net effect: a garbage 5xx body still yields a clean InternalServerException, never a secondary parse crash inside the exception constructor. This is the single most important rule for our FUTURE codegen and is correctly targeted at codegen, not sdk-core (our toolkit half already nails it: lazy body + peek snapshot, HttpException.kt:94-115). The rule is ALREADY recorded at docs/refs-comparison.md:228 ('never throw inside an exception constructor; pass through the raw body on parse failure'), so weAlreadyDoIt=true at the doc/principle level — the only delta is adding the concrete how (guarded accessors + catch-to-empty) and the file:line citation. Keep as a high-value LEARN for the codegen design doc; no sdk-core change. + - *Do:* In the codegen design doc, sharpen the existing rule (docs/refs-comparison.md:228) with the concrete mechanism: generated typed-error subclasses parse the body LAZILY and tolerantly (catch decode failure → expose null, keep raw bytes reachable via our existing bodySnapshot), and NEVER parse in the constructor. Cite ErrorHandler.kt:31-38 + BadRequestException.kt:22 as the reference. No sdk-core code change. 
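
Two of the recommendations above can be sketched together: adapter-side wrapping into `SerdeException`, and a generated-style error type that parses its body lazily and tolerantly. This is a minimal Kotlin sketch under assumptions: `DeserializerSketch`, `intDeserializer`, and `TypedErrorSketch` are hypothetical names, the SPI shape is simplified, and `NumberFormatException` stands in for a Jackson `JsonProcessingException` so the example needs no third-party dependency.

```kotlin
// Declaration as proposed in the recommendation; a plain RuntimeException keeps core dependency-free.
open class SerdeException(message: String, cause: Throwable? = null) : RuntimeException(message, cause)

// Hypothetical, simplified stand-in for the Deserializer SPI (the real signature may differ).
fun interface DeserializerSketch<T> {
    fun deserialize(input: String): T
}

// Adapter-side wrapping: the library-specific failure becomes the cause, never the surfaced type.
// NumberFormatException plays the role of the adapter library's own parse exception.
val intDeserializer = DeserializerSketch<Int> { input ->
    try {
        input.trim().toInt()
    } catch (e: NumberFormatException) {
        throw SerdeException("Error reading response", e)
    }
}

// Hypothetical codegen-shaped error: nothing is parsed in the constructor. The typed view is
// computed on first access, and a decode failure degrades to null while rawBody stays reachable.
class TypedErrorSketch(
    val rawBody: String,
    private val decoder: DeserializerSketch<Int>
) : RuntimeException("request failed") {
    val errorCode: Int? by lazy {
        try {
            decoder.deserialize(rawBody)
        } catch (e: SerdeException) {
            null
        }
    }
}
```

A caller catches `SerdeException` without importing the adapter's library, and constructing `TypedErrorSketch` around a non-parseable body cannot raise a secondary exception at the throw site.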
+ +**Considered & dropped** + +- ~~Their Io/InvalidData/Retryable split encodes the retry signal in the exception TYPE; ours encodes it in a derived retryable val — do NOT follow theirs~~ — Claim is accurate (RetryingHttpClient.kt:184-188 type-sniffs `is IOException || is OpenAIIoException || is OpenAIRetryableException`; the markers are 5-LOC; our HttpException.kt:72 uses a derived val) but the finding is confirmatory filler that produces no decision. Its category is SIMPLIFY yet it changes nothing and even admits 'affirms current design'. Its only forward action ('hoist retryable to an interface') is already the standalone ADOPT finding I kept. Folded the type-vs-field insight into the antipattern note and the Retryable-interface finding; standalone it is generic praise that fails the rigor bar. +- ~~The Builder + checkRequired on every exception is over-engineering for a thrown object — do not adopt~~ — Accurate (BadRequestException.kt:57-99 ships a Builder + toBuilder + Optional overloads + checkRequired, repeated verbatim across all 8 classes; Check.checkRequired→checkNotNull) but it is a non-action: it tells us to keep doing exactly what we do (Response-forwarding ctors, no Builder) and to not do something we never proposed. No decision, no change to sdk-core or codegen. The substance (Builder-on-a-throwable is ceremony) is already captured as antipattern #3 in the prior summary and folded into the kept codegen LEARN. Dropped as affirm-only filler. +- ~~Their per-status coverage is sparse and their retry table contradicts it; our Status-driven factory + RetryUtils single-source is more coherent~~ — All sub-claims verified (ErrorHandler.kt:48-92 names only 400/401/403/404/422/429, collapses 5xx→InternalServerException, else→Unexpected; RetryingHttpClient.kt:167-181 retries 408/409/429/>=500 so 408/409 have no named class and 501/505 ARE retried; our RetryUtils.kt:43-49 excludes 501/505; our 409 ConflictException is correctly non-retryable). 
But it is purely confirmatory — it affirms our single-source RetryUtils design and proposes no change beyond 'keep doing this' plus a maybe-reconsider-409 aside. The valuable nugget (their two retry tables disagree; the >=500 rule retries 501/505) is already documented in our own code comments (HttpExceptions.kt:22-25, RetryUtils.kt:23-25) and is folded into the antipattern note. Dropped: no decision-ready action, and we already encode the lesson. + +**Do not copy** + +1) JACKSON IN CORE: every openai-java exception imports com.openai.core.jsonMapper, JsonValue, JsonField and re-serializes the body via jsonMapper().valueToTree() (BadRequestException.kt:28-30). This hard-couples the error hierarchy to Jackson INSIDE core — exactly the dependency we forbid in sdk-core. Adopting their typed-body classes verbatim would drag Jackson into core; the typed body must live in a codegen-emitted subclass that depends on the chosen serde adapter, never in sdk-core. 2) RETRY-SIGNAL-AS-TYPE: making retryability a function of the exception's runtime type (RetryingHttpClient.kt:184-188) forces every step author to enumerate marker classes and can't express per-code nuance — we correctly use a derived field instead; do not regress. 3) BUILDER + checkRequired ON A THROWABLE (BadRequestException.kt:57-99): ceremony with no payoff at a throw site; inflates each class 5x. 4) NO 200-GUARD IN DISPATCH: their errorHandler `else` branch (ErrorHandler.kt:86-92) will wrap ANY status — including a 2xx that slipped through — as UnexpectedStatusCodeException; our HttpExceptionFactory throws IllegalArgumentException for non-4xx/5xx (HttpExceptionFactory.kt:96-100), which is safer. 5) TWO RETRY TABLES that disagree (taxonomy names 6 codes; retry table covers 408/409/>=500) — a maintenance hazard our single RetryUtils predicate avoids. 
6) DYNAMIC statusCode FIELD on a 'typed-per-status' class (InternalServerException.kt:17,39) — half-defeats the point of a per-status type; we either name the code or fall back, not both. + +**Where we're ahead** + +Several concrete places. (a) LAZY body + non-consuming snapshot: our HttpException.body is a lazy ResponseBody? and bodySnapshot() reads from body.source().peek() with a 4 KiB cap (HttpException.kt:96-114) — a rogue multi-MB 5xx body neither OOMs nor consumes the primary read path. openai-java instead re-serializes the parsed ErrorObject into a JsonValue on every body() call (BadRequestException.kt:28-30) and has no bounded preview. (b) RETRYABILITY AS DERIVED SINGLE-SOURCE: our retryable = RetryUtils.isRetryable(status.code) (HttpException.kt:72) is computed once from the same predicate the policy uses, and our RetryUtils correctly excludes 501/505 (RetryUtils.kt:47-48); openai-java keeps a separate, coarser, contradictory >=500 retry rule (RetryingHttpClient.kt:179) and signals retry via exception type. (c) GRANULARITY WITH ZERO DUPLICATION: we expose 16 named per-status classes + 2 fallbacks at ~18 LOC each, vs their 6 named + collapsed-5xx at ~100 LOC each. (d) TOTAL Status TYPE: Status.fromCode is total and preserves vendor codes (499/520-526) with statusName=null (Status.kt:200-207); openai-java has no status value type — it passes raw Ints and stringly-builds messages. (e) DISPATCH GUARD: HttpExceptionFactory rejects non-4xx/5xx (HttpExceptionFactory.kt:96-100) instead of silently wrapping. (f) CYCLE-SAFE CAUSE-CHAIN classification: RetryUtils.isRetryable(Throwable) walks the cause chain with identity-based HashSet cycle detection (RetryUtils.kt:58-68); openai-java does no cause-chain inspection at all (only top-level `is` checks). The ONE thing they have and we lack is the structured-error accessors (code/param/type) and a deserialization-failure type — both addressed in findings as codegen + a small sdk-core addition respectively. 
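
Point (f) can be illustrated with a short Kotlin sketch of a cause-chain walk that terminates on cyclic chains. It is an illustration under assumptions, not the `RetryUtils` source: `hasRetryableCause` is a hypothetical name, the identity set is built from `IdentityHashMap`, and "any `IOException` in the chain" is a placeholder classification rule.

```kotlin
import java.io.IOException
import java.util.Collections
import java.util.IdentityHashMap

// Hypothetical helper: walks the cause chain and reports whether any link is an IOException.
fun hasRetryableCause(root: Throwable?): Boolean {
    // Identity-based set: two distinct throwables that happen to be equal() are still
    // tracked separately, and a revisited instance ends the walk.
    val seen: MutableSet<Throwable> =
        Collections.newSetFromMap(IdentityHashMap<Throwable, Boolean>())
    var current = root
    // seen.add returns false when the instance was already visited, which breaks cycles.
    while (current != null && seen.add(current)) {
        if (current is IOException) return true
        current = current.cause
    }
    return false
}
```

A top-level `is` check alone would miss an `IOException` wrapped inside another exception; the walk finds it, and the identity set guarantees termination even if two throwables name each other as cause.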
+ +_Verifier notes:_ VERDICT: 5 of 8 findings kept (2 ADOPT, 1 SIMPLIFY, 2 LEARN), 3 dropped as affirm-only/no-decision filler. Only 2 findings drive real sdk-core work: (A) add a SerdeException at the decode boundary (genuine gap — JacksonSerde.kt:161/166/176 leaks raw com.fasterxml.jackson.* through the zero-dep SPI, serde package declares no exceptions), and (B) collapse the two parallel bases HttpException : RuntimeException vs HttpResponseException : IOException (real internal duplication, not an openai-java import). The Retryable-interface ADOPT is real but should be sequenced AFTER (B) since deleting HttpResponseException removes 2 of the 3 divergent retryability declarations. The two LEARNs are codegen-doc hardening of rules already present in docs/refs-comparison.md:227-228. + +ACCURACY CORRECTIONS to the analysis I verified: +- The antipatterns claim 'NO 200-GUARD IN DISPATCH ... else branch will wrap ANY status including a 2xx' is WRONG. ErrorHandler.kt:49 has `in 200..299 -> response`, so 2xx IS guarded (passed through). The `else` branch wraps 1xx/3xx (and anything else) as UnexpectedStatusCodeException. So the comparative point survives — openai-java would wrap a 1xx/3xx where our HttpExceptionFactory throws IllegalArgumentException (HttpExceptionFactory.kt:96-100) — but the specific '200' framing is inaccurate. Restate as '1xx/3xx-guard', not '200-guard'. +- 'weAreAhead' item (a) says openai-java 're-serializes the parsed ErrorObject into a JsonValue on every body() call' — verified accurate (BadRequestException.kt:28-30 `JsonValue.fromJsonNode(jsonMapper().valueToTree(error))` with no memoization), and our lazy-body + 4 KiB peek snapshot (HttpException.kt:94-123) is genuinely better. Item (f) cycle-safe cause-chain (RetryUtils.kt:58-68, identity HashSet) verified; openai-java does only top-level `is` checks. These 'we are ahead' points are real but are not findings (no action) — correctly left out of the findings list. 
+ +The structured-error accessors gap (code/param/type) is real (.api dump 1322-1336 shows only getStatus/getHeaders/getBody/getRetryable/bodySnapshot) but is correctly NOT an sdk-core action: those accessors are API-specific (OpenAI's ErrorObject envelope) and belong to codegen, exactly as the kept LEARN #1 states. No separate finding needed. + +--- + +## 11. Pagination (Page, AutoPager, PageAsync) + +**What it is** + +openai-java's pagination is deliberately minimal and lives in two layers. (1) A 3-method per-page contract: `Page` (`hasNextPage()`, `nextPage(): Page`, `items(): List`; Page.kt:21-32) and its async twin `PageAsync`, where `nextPage()` returns a `CompletableFuture` of the next `PageAsync` (PageAsync.kt:31). (2) Two thin drivers that wrap a first page: `AutoPager : Iterable`, whose entire engine is one line, `generateSequence(firstPage) { if (it.hasNextPage()) it.nextPage() else null }.flatMap { it.items() }.iterator()` (AutoPager.kt:16-18), plus a `stream()` via `StreamSupport` (AutoPager.kt:20); and `AutoPagerAsync : AsyncStreamResponse`, which drives async iteration by recursively chaining `nextPage().thenCompose { it.handle() }` with an `AtomicReference` (NEW/SUBSCRIBED/CLOSED) lifecycle (AutoPagerAsync.kt:38-67). + +The architecture's load-bearing decision is that there are NO pagination "strategies" in the runtime at all. Each concrete page is GENERATED per-endpoint (e.g. FileListPage.kt) and bakes the wire convention directly into typed code: it holds `service` + typed `params` + typed `response`, computes `hasNextPage()` from response data (`items().isNotEmpty()`, FileListPage.kt:37), builds `nextPageParams()` by mutating the TYPED params builder (`params.toBuilder().after(items().last()._id()...).build()`, FileListPage.kt:39-41), and implements `nextPage()` by simply re-invoking the typed service: `service.list(nextPageParams())` (FileListPage.kt:42). The cursor lives in a typed param, never in a URL string.
`PrepareRequest.kt` + the `Params` interface (`_headers()`, `_queryParams()`, Params.kt:7-16) are how a params object is folded into an `HttpRequest` (PrepareRequest.kt:13-29) — pagination reuses that same request-build path rather than rewriting URLs. So the toolkit owns ~90 lines (Page/PageAsync/AutoPager/AutoPagerAsync); codegen owns everything endpoint-specific. + +**How it works (line-level)** + +AutoPager engine, in full (AutoPager.kt:15-20): `override fun iterator(): Iterator = generateSequence(firstPage) { if (it.hasNextPage()) it.nextPage() else null }.flatMap { it.items() }.iterator()` and `fun stream(): Stream = StreamSupport.stream(spliterator(), false)`. The whole sync pager is a lazy `generateSequence` of pages flat-mapped to items — no explicit iterator class, no index bookkeeping, no maxPages cap. + +Async driver core (AutoPagerAsync.kt:38-46): `fun PageAsync.handle(): CompletableFuture { if (state.get() == State.CLOSED) { return CompletableFuture.completedFuture(null) }; items().forEach { handler.onNext(it) }; return if (hasNextPage()) nextPage().thenCompose { it.handle() } else CompletableFuture.completedFuture(null) }`. Iteration is push-based via `AsyncStreamResponse.Handler` (`onNext`/`onComplete`), the SAME interface used for SSE streaming (AsyncStreamResponse.kt:14-63) — so a caller iterating pages and a caller consuming an SSE stream use one surface. Note the unwrap at AutoPagerAsync.kt:50-51: `val actualError = if (error is CompletionException && error.cause != null) error.cause else error` — strips the `CompletionException` wrapper before handing the error to `onComplete`. Single-subscription is enforced by `state.compareAndSet(State.NEW, State.SUBSCRIBED)` with a message that distinguishes "already subscribed" from "closed" (AutoPagerAsync.kt:33-36). 
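The sync engine quoted above is small enough to reproduce against an in-memory page. The sketch below is illustrative only: `Page<T>` restates the three-method contract with a type parameter, while `InMemoryPage` and `autoPage` are hypothetical stand-ins, not openai-java classes.

```kotlin
// Three-method per-page contract, as described in the text.
interface Page<T> {
    fun hasNextPage(): Boolean
    fun nextPage(): Page<T>
    fun items(): List<T>
}

// Hypothetical in-memory page: each chunk plays the role of one HTTP response.
class InMemoryPage<T>(
    private val chunks: List<List<T>>,
    private val index: Int = 0
) : Page<T> {
    override fun hasNextPage(): Boolean = index < chunks.lastIndex
    override fun nextPage(): Page<T> = InMemoryPage(chunks, index + 1)
    override fun items(): List<T> = chunks[index]
}

// Lazy sequence of pages flat-mapped to items: a page is only requested
// once the items of the previous page have been consumed.
fun <T> autoPage(firstPage: Page<T>): Sequence<T> =
    generateSequence(firstPage) { if (it.hasNextPage()) it.nextPage() else null }
        .flatMap { it.items().asSequence() }

fun main() {
    val first = InMemoryPage(listOf(listOf(1, 2), listOf(3), listOf(4, 5)))
    println(autoPage(first).toList())
}
```

Because the page sequence is lazy, a page cap could be layered on by calling `take(maxPages)` on the sequence of pages before `flatMap`; the reference AutoPager does not do this.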
+ +Generated page, the real mechanism (FileListPage.kt:35-42): `override fun items(): List = data()`; `override fun hasNextPage(): Boolean = items().isNotEmpty()`; `fun nextPageParams(): FileListParams = params.toBuilder().after(items().last()._id().getOptional("id")).build()`; `override fun nextPage(): FileListPage = service.list(nextPageParams())`. The "no next page" case is generated as a hard throw: VectorStoreSearchPage.kt:34-37 emits `override fun hasNextPage(): Boolean = false` and `fun nextPageParams(): VectorStoreSearchParams = throw IllegalStateException("Cannot construct next page params")` — search endpoints are single-page, and codegen encodes that statically rather than via a runtime flag. Cursor extraction is uniform across all 57 list endpoints: `params.toBuilder().after(items().last()._id()...)` (AssistantListPage.kt:40-41 is identical to FileListPage). The service wires the page by builder (FileServiceImpl.kt:175-181): `.let { FileListPage.builder().service(FileServiceImpl(clientOptions)).params(params).response(it).build() }` — and crucially the response body is drained inside `response.use { listHandler.handle(it) }` (FileServiceImpl.kt:169) BEFORE the page is built, so the page holds a fully-deserialized typed `response`, not an open stream. + +**vs. our SDK** + +Our design splits the same job very differently and is heavier in the toolkit. Our per-page contract is `Page` with `items: List`, `hasNext: Boolean`, and `nextPageRequest(): Request?` (Page.kt:33-54) — note ours returns a raw `Request` (URL+headers+body), not a typed next-page object; the next request is built by string/URL surgery, not by mutating typed params. Our driver is `Paginator` (Paginator.kt:86-234): a hand-written 70-line `PaginatorIterator` (Paginator.kt:136-233) with explicit fields `currentPage`/`currentItemIndex`/`nextRequest`/`started`/`done`/`pagesFetched`, an `advance()` method, and a `maxPages` safety cap (Paginator.kt:92-96, 193-199). 
It owns response lifecycle: `strategy.parse(response, initialRequest)` in a `try { } finally { response.close() }` (Paginator.kt:219-224). + +The wire convention lives in runtime `PaginationStrategy` (fun interface, PaginationStrategy.kt:35-55) with FOUR shipped impls: CursorPaginationStrategy.kt, TokenPaginationStrategy.kt (which the file itself admits is "logically identical to CursorPaginationStrategy" — TokenPaginationStrategy.kt:14-15), PageNumberPaginationStrategy.kt, and the 238-line LinkHeaderPaginationStrategy.kt (a full RFC 5988/8288 hand-rolled parser: `splitLinkValues`, `parseLinkValue`, `extractRelTokens`, `resolveNextUrl`). All of cursor/token/page-number drive `RequestRebuilder` (RequestRebuilder.kt:37-166) — a from-scratch URL query rewriter (`setQueryParam`/`getQueryParam`/`rebuildUrl`) using `URLEncoder`/`URLDecoder` and manual `StringBuilder` URL reassembly, including a hand-coded fix for RFC 3986 query-only references (LinkHeaderPaginationStrategy.kt:90-96). + +Separately we ALSO ship a second, overlapping pager: `http.paging.PagedIterable` (PagedIterable.kt:89-193) with its own `FirstPageFetcher`/`NextPageFetcher` SAM pair, its own `PagedResponse` holder (PagedResponse.kt:48-77, exposes nextLink/previousLink/firstLink/lastLink/continuationToken), its own `PagingOptions` (offset/pageSize/pageIndex/continuationToken; PagingOptions.kt:40-47), its own `byPage()` sequence, its own `maxPages` cap, AND its own eager-close logic in an `AbstractIterator` (PagedIterable.kt:149-179). So we have TWO independent pagination stacks in sdk-core (`pagination/*` and `http/paging/*`) covering the same need with different vocabularies. openai-java has one ~90-line stack. + +No async pagination exists on our side: refs-comparison.md:244 confirms "Async variants for sdk-async-coroutines (Flow) and sdk-async-reactor (Flux) are not yet built." Their `AutoPagerAsync` + `PageAsync` ship today. 
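With no async pager on our side yet, it may help to see what a non-recursive driver could look like on plain `java.util.concurrent`. The following is a sketch under assumptions, not a proposed final API: `PageAsync`, `forEachItemAsync`, `unwrapCompletion`, and `CountingPage` are hypothetical names. Instead of chaining `thenCompose` once per page the way AutoPagerAsync does, it loops in place while the next-page future is already complete and registers a callback only when a fetch is genuinely pending, so a transport that completes futures synchronously does not add a stack frame per page.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.CompletionException

// Hypothetical async page contract; names are illustrative, not an existing sdk-core API.
interface PageAsync<T> {
    val items: List<T>
    val hasNext: Boolean
    fun nextPageAsync(): CompletableFuture<PageAsync<T>>
}

// Strip a single CompletionException wrapper so callers see the underlying failure.
fun unwrapCompletion(e: Throwable): Throwable =
    if (e is CompletionException) e.cause ?: e else e

// Pushes every item to onItem; the returned future completes with the number of pages visited.
fun <T> forEachItemAsync(
    firstPage: PageAsync<T>,
    maxPages: Int,
    onItem: (T) -> Unit
): CompletableFuture<Int> {
    require(maxPages > 0) { "maxPages must be positive" }
    val result = CompletableFuture<Int>()

    fun drive(start: PageAsync<T>, startCount: Int) {
        var page = start
        var pagesSeen = startCount
        try {
            while (true) {
                for (item in page.items) onItem(item)
                if (!page.hasNext || pagesSeen >= maxPages) {
                    result.complete(pagesSeen)
                    return
                }
                val next = page.nextPageAsync()
                if (!next.isDone) {
                    // Fetch still pending: resume the loop from the completion callback.
                    val resumeCount = pagesSeen + 1
                    next.whenComplete { nextPage, error ->
                        if (error != null) {
                            result.completeExceptionally(unwrapCompletion(error))
                        } else {
                            drive(nextPage, resumeCount)
                        }
                    }
                    return
                }
                // Already complete: consume in place, no extra stack frame.
                page = next.join()
                pagesSeen++
            }
        } catch (e: Throwable) {
            result.completeExceptionally(unwrapCompletion(e))
        }
    }

    drive(firstPage, 1)
    return result
}

// In-memory stand-in whose futures are always already complete.
class CountingPage(private val n: Int, private val last: Int) : PageAsync<Int> {
    override val items: List<Int> = listOf(n)
    override val hasNext: Boolean get() = n < last
    override fun nextPageAsync(): CompletableFuture<PageAsync<Int>> =
        CompletableFuture.completedFuture<PageAsync<Int>>(CountingPage(n + 1, last))
}

fun main() {
    var sum = 0L
    val pages = forEachItemAsync(CountingPage(1, 100_000), maxPages = 1_000_000) { sum += it }.join()
    println("pages=$pages sum=$sum")
}
```

If a future completes between the `isDone` check and the `whenComplete` registration, the callback runs inline and adds one nested `drive` call; that does not affect correctness. Per-page response close is left out of the sketch and would sit next to the item loop in a real driver.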
+ +**Recommendations (verified)** + +- **Ship async pagination (we have none); model the driver on AutoPagerAsync but loop instead of recursing, and layer Flow/Flux in the adapters** `ADOPT` · `both` · effort L · confidence high + - *Verdict:* Gap confirmed three ways: refs-comparison.md:244 states 'Async variants for sdk-async-coroutines (Flow) and sdk-async-reactor (Flux) are not yet built'; a grep for AutoPager/PageAsync/paginate across all four async adapter modules returned zero hits; and our AsyncHttpClient SPI (AsyncHttpClient.kt) has no pagination surface. openai-java ships 57 PageAsync impls plus AutoPagerAsync. Their driver is verified at AutoPagerAsync.kt:38-67: items().forEach(onNext) then if(hasNextPage()) nextPage().thenCompose{it.handle()} (lines 43-45). Two real cautions the analysis got right: (1) the recursion at line 44 builds a dependent-future chain whose depth grows with page count; on a transport whose futures complete synchronously this deepens the stack and can StackOverflow on long paginations. Their own TODO(JDK) at line 32 flags JDK-version compromises in this class. Use an iterative loop that re-arms itself (one result CompletableFuture completed once at the end; each pending next-page fetch resumes the loop from a whenComplete callback instead of nesting via thenCompose, and already-completed futures are consumed in place) to stay O(1) in stack depth. (2) Their AutoPagerAsync implements AsyncStreamResponse (their bespoke push interface, AsyncStreamResponse.kt:14-63), which we deliberately do NOT have; we bridge everything through CompletableFuture and expose Flux/Flow in adapters. So in sdk-core, expose only PageAsync (hasNext/items/nextPageAsync():CompletableFuture<PageAsync<T>>, Java-8 safe, java.util.concurrent only) and an AsyncPaginator returning CompletableFuture-driven iteration; do NOT port AsyncStreamResponse. The richer Flow/Flux surfaces belong in sdk-async-coroutines/-reactor where those deps are allowed (refs-comparison plans exactly this at lines 244/349).
Fold in the one genuinely-useful idiom from the now-dropped COPY finding: a single-subscription/idempotent-close guard via AtomicReference{NEW,SUBSCRIBED,CLOSED} (AutoPagerAsync.kt:33-36,71-81) IF the async pager is a subscribe-once push type; if it is instead a pull-based CompletableFuture chain or a cold Flux/Flow, that guard is unnecessary because Reactor/coroutines already enforce single-consumption. Respect the interrupt-aware blocking rule only at any sync bridge (asBlocking already does this, AsyncHttpClient.kt:124-131). + - *Do:* In sdk-core add PageAsync mirroring Page (val items, val hasNext, fun nextPageAsync():CompletableFuture<PageAsync<T>>) and an AsyncPaginator that drives iteration with an iterative re-arm loop (not thenCompose recursion) and preserves the maxPages cap + per-page response close. Add Flow in sdk-async-coroutines and Flux in sdk-async-reactor on top. Decide the async page CONTRACT before codegen lands so generated async list endpoints target it (ties to Finding 4/7). +- **Drop TokenPaginationStrategy: it is a line-for-line duplicate of CursorPaginationStrategy, distinguished only by field names and a default param name** `SIMPLIFY` · `sdk-core` · effort S · confidence high + - *Verdict:* Verified by direct diff. TokenPaginationStrategy.parse() (TokenPaginationStrategy.kt:43-57) and CursorPaginationStrategy.parse() (CursorPaginationStrategy.kt:44-58) are identical line-for-line: read items, read next opaque string, hasNext = !nextX.isNullOrEmpty(), RequestRebuilder.withQueryParam(...). The only differences are the field/param names (tokenExtractor/tokenQueryParam vs cursorExtractor/cursorQueryParam) and the default query-param string ('page_token' vs 'cursor'). The file's own KDoc concedes it: 'Logically identical to [CursorPaginationStrategy]' (TokenPaginationStrategy.kt:14).
cursorQueryParam is already constructor-overridable (CursorPaginationStrategy.kt:42, @JvmOverloads), so token-style APIs are fully served by CursorPaginationStrategy(items, cursor, cursorQueryParam="page_token"). Keeping a whole public class whose sole distinction is a default string inflates the public API surface, the apiCheck snapshot, the test matrix, and the docs for zero functional value. openai-java covers cursor/token/page-number with ZERO strategy classes. This is correct as SIMPLIFY; note it also slightly de-risks Finding 1 (one fewer two-lambda strategy to migrate). + - *Do:* Delete TokenPaginationStrategy. Document in CursorPaginationStrategy KDoc that token-style APIs pass cursorQueryParam="page_token". Regenerate api/*.api (intentional pre-1.0 breaking removal; net reduction in surface). Keep cursor, page-number, and link-header as the three genuinely distinct strategies. Fold the token wire-shape examples into CursorPaginationStrategy's KDoc so the guidance isn't lost. +- **Collapse the two-lambda strategy into a single 'parse response -> typed page' seam; the split forces callers to hand-cache the single-use body** `SIMPLIFY` · `sdk-core` · effort M · confidence high + - *Verdict:* Verified and the strongest finding in the set. CursorPaginationStrategy takes TWO separate lambdas itemsExtractor:(Response)->List and cursorExtractor:(Response)->String? (CursorPaginationStrategy.kt:40-42) and calls both inside parse() (lines 48-49). A Response body is single-use, so calling each lambda independently would try to read the body twice, and the second read finds it already drained. The smoking gun is our OWN test: CursorPaginationTest.kt:50-59 builds buildCachedExtractors() with an IdentityHashMap precisely so 'the single-use body is read exactly once per page even though the strategy's contract splits items + cursor into two calls' (verbatim KDoc, lines 47-49). Every one of the three test cases (lines 74, 98, 125) routes through this cache.
LinkHeaderPaginationStrategy and PageNumberPaginationStrategy take a single body-reading lambda and so read the body once; only Cursor and Token take two body-reading lambdas, and the trap is real for both of those. openai-java sidesteps it structurally: the body is deserialized exactly once (FileServiceImpl.kt:169 response.use{listHandler.handle(it)}) into a typed FileListPageResponse, and the page reads items() and the next cursor from that SAME already-materialized object (FileListPage.kt:35-40), so no second read is possible. This is a correctness trap in the public API, not a style nit: any external caller of Cursor/TokenPaginationStrategy must reinvent the IdentityHashMap or silently get an empty second read on a network body. The abstraction boundary is in the wrong place. + - *Do:* Pre-1.0, change the strategy contract so the body is read once. Replace the two body-reading lambdas with a single extractor returning items+cursor together, e.g. itemsAndCursor:(Response)->ParsedPage where ParsedPage is a tiny internal holder (items:List, cursor:String? / nextLink:String?). Keep PaginationStrategy.parse() as the single seam that touches the body. After the fix, delete buildCachedExtractors() from CursorPaginationTest; the test getting simpler is the proof the trap is gone. Zero new deps; pure Kotlin. Apply the same single-read shape to the Link-header strategy's extractor (LinkHeaderPaginationStrategy.kt:49) for consistency even though it only reads items today. +- **Re-express PaginatorIterator as a guarded generateSequence; the 70-line hand-rolled iterator re-implements lazy fetch-on-pull that the stdlib gives for free** `SIMPLIFY` · `sdk-core` · effort M · confidence medium + - *Verdict:* Both halves verified. AutoPager.kt:15-18 is literally generateSequence(firstPage){if(it.hasNextPage()) it.nextPage() else null}.flatMap{it.items()}.iterator() plus a StreamSupport stream() at line 20, ~6 lines total.
Our PaginatorIterator (Paginator.kt:136-233) is ~97 lines with six mutable fields (currentPage/currentItemIndex/nextRequest/started/done/pagesFetched, lines 143-161), a while-loop hasNext() (163-176), and an advance() that interleaves cap-check + fetch + close + next-request-precompute (192-232). Same observable contract: page-lazy, one HTTP exchange per page. The started/done/nextRequest interplay (lines 200-231) genuinely is the kind of subtle state worth deleting. BUT the critique must temper the 'one-liner' framing: their AutoPager has neither of the two behaviors we deliberately keep — (a) the maxPages cap (we require maxPages>0 at Paginator.kt:95 and stop before the over-cap fetch at 193-199), and (b) deterministic response.close() per page (try{strategy.parse}finally{response.close()} at 219-224). A faithful generateSequence rewrite must host BOTH inside the producer lambda (a captured counter + the try/finally), so the win is 'fewer states and less bespoke index math', NOT 'collapse to one line'. Downgrade to medium confidence: this is a real readability/surface-area win but it is a rewrite of working, tested code with no behavior change, so it competes for priority against findings that fix correctness (Finding 1) or add capability (Finding 3). + - *Do:* Replace PaginatorIterator with iterateAll() = Iterable { generateSequence(seed){...}.flatMap{it.items}.iterator() } where the producer lambda owns the maxPages counter and the try/finally response.close(). Keep streamAll() via StreamSupport.spliteratorUnknownSize exactly as today (Paginator.kt:119-126). Treat as a lower-priority cleanup; do it alongside Finding 1's contract change so the iterator and the strategy are reworked in one pass rather than twice. 
+- **Unify the two overlapping pagination stacks (pagination/* and http/paging/*) into one before codegen lands** `SIMPLIFY` · `sdk-core` · effort L · confidence high + - *Verdict:* Verified: sdk-core genuinely carries two independent pagination subsystems. Stack A = pagination/: Paginator (Paginator.kt:86) + Page(items/hasNext/nextPageRequest, Page.kt:33-54) + PaginationStrategy + 4 strategies + RequestRebuilder + SimplePage; response-lifecycle-owning (closes each page at Paginator.kt:219-224). Stack B = http/paging/: PagedIterable (PagedIterable.kt:89) + FirstPageFetcher/NextPageFetcher SAMs (lines 21-42) + PagedResponse (PagedResponse.kt:48-77, exposes nextLink/previousLink/firstLink/lastLink/continuationToken + statusCode/headers/request) + PagingOptions (PagingOptions.kt:40-47, offset/pageSize/pageIndex/continuationToken); with its OWN byPage() sequence, OWN maxPages cap, and OWN eager-close inside an AbstractIterator (PagedIterable.kt:149-179). They solve the same problem with different vocabularies. Crucially the close semantics differ and one half is a documented footgun: Stack A always closes per page in the driver; Stack B's item-iterator closes eagerly (PagedIterable.kt:171-176) BUT byPage() hands out unclosed PagedResponses and the KDoc admits callers MUST close them or leak (PagedIterable.kt:100-105). Two iterators to keep lazy, two close-models to keep correct, two test suites, and a 'which do I use?' for both hand-writers and future codegen. openai-java meets the entire need with one ~90-line surface. This is real over-engineering relative to the reference. The analysis is right that Stack B's PagedResponse carries useful page-level metadata (links/status/headers) that Stack A's bare Page lacks — fold that in, don't discard it. The decision must be made BEFORE codegen so generated pages target one contract. 
+ - *Do:* Pick ONE survivor (recommend Page + driver, enriched with PagedResponse's metadata accessors — status/headers/links/continuationToken — so nothing is lost), delete the other stack, and route both the manual path and the future codegen page-shape through it. Preserve the maxPages cap, deterministic per-page close, and interrupt-aware blocking on the survivor. Eliminate the byPage()-leaks-unless-you-close footgun by making the survivor's page-level view close-managed or AutoCloseable-by-default. Coordinate with Finding 3 (async contract) and Finding 4/7 (codegen page shape) so all three target the same unified model. +- **For codegen: bake 'no next page' statically per endpoint (hasNextPage()=false literal + throwing nextPage) instead of a runtime flag** `LEARN` · `codegen` · effort S · confidence medium + - *Verdict:* Verified: VectorStoreSearchPage (a non-paginating endpoint) emits override fun hasNextPage(): Boolean = false (VectorStoreSearchPage.kt:34) and nextPageParams() = throw IllegalStateException("Cannot construct next page params") (lines 36-37); autoPager() still exists and yields one page's items (line 41). The lesson is sound: when the SDK is generated, each endpoint's pagination shape is known at generation time, so encoding it as constant code (vs threading a runtime boolean like our SimplePage.hasNext) is self-documenting and lets unsupported nextPage() fail loudly with an endpoint-specific message rather than silently re-returning the same page. Correctly scoped as a codegen-template guideline, NOT a runtime change — our runtime SimplePage flag (SimplePage.kt:19-25) is the right design for the hand-written/dynamic case where the shape isn't known until response time. Confidence medium only because this is a minor, almost-obvious codegen ergonomics note rather than a load-bearing decision; it is worth recording but should not consume design debate. Pairs naturally with Finding 4. 
+ - *Do:* In the codegen design doc, note that endpoints the spec flags non-paginated emit hasNextPage()=false + a throwing nextPage() whose message names the endpoint (mirroring VectorStoreSearchPage.kt:34-37). Keep the runtime SimplePage.hasNext flag for the manual/BYO path. Bundle this into the Finding 4 codegen section; not worth a standalone work item. +- **For FUTURE codegen: emit per-endpoint typed Page classes that rebuild typed PARAMS, not URL strings — keep RequestRebuilder out of the generated path** `LEARN` · `codegen` · effort L · confidence high + - *Verdict:* Fully verified and the most important architectural lesson. FileListPage.nextPageParams() = params.toBuilder().after(items().last()._id().getOptional("id")).build() (FileListPage.kt:39-40) then nextPage() = service.list(nextPageParams()) (line 42). The cursor lives in a typed param; no URL string is ever parsed or rewritten. The seam that makes this work is PrepareRequest.kt:23-24 (putAllQueryParams(clientOptions.queryParams).replaceAllQueryParams(params._queryParams())) folding a Params object (Params.kt:7-16, _headers()/_queryParams()) into an HttpRequest — verified verbatim. Contrast our RequestRebuilder.kt:37-166: a from-scratch URL rewriter (setQueryParam/getQueryParam/rebuildUrl) using URLEncoder/URLDecoder and manual StringBuilder reassembly, plus a hand-coded RFC-3986 query-only-reference fix (LinkHeaderPaginationStrategy.kt:90-96) that exists ONLY because URL(base,ref) drops the base's last path segment. ALL of that machinery is needed solely because we manipulate URLs as strings. If codegen knows a param is named 'after' of type String, it emits .after(cursor) and the request-build path encodes once, correctly, with no bespoke parser to maintain. Strict improvement for the generated layer, no constraint conflict (pure Kotlin output). 
Caveat the analysis nails: this requires sdk-core to expose a 'fold typed params into a Request' seam analogous to Params+prepare; we have Request.Builder but no Params SPI yet. The runtime strategies + RequestRebuilder stay for hand-written/BYO-API callers; codegen simply shouldn't depend on them. + - *Do:* In the KotlinPoet generator design (docs/refs-comparison.md), specify: list endpoints emit a typed XxxListPage holding service+params+response with nextPage()=service.list(nextPageParams()) and nextPageParams() mutating the typed params builder. Add a lightweight Params SPI to sdk-core (headers()/queryParams()) plus a Request.prepare(params)-style folder so generated params become a Request without URL string surgery. Document RequestRebuilder + the 4 strategies as the manual/BYO path only. Sequence this with Finding 7 (pick the single page contract first). + +**Considered & dropped** + +- ~~COPY the CompletionException-unwrap and single-subscribe-state idiom for any async surface (pagination or SSE)~~ — DROPPED (mostly already-done-by-us + partly inapplicable). HALF the finding is redundant: we ALREADY centralize CompletionException unwrapping in Futures.unwrap() (Futures.kt:46-55), and ours is strictly BETTER than openai-java's AutoPagerAsync.kt:50-51 — ours loops through the cause chain, handles BOTH CompletionException AND ExecutionException, and is cycle-safe via an identity HashSet (FuturesTest.kt:69 tests 20-deep nesting). openai-java's version is a single-level, CompletionException-only unwrap. Futures.unwrap is already consumed across the codebase: asBlocking (AsyncHttpClient.kt:138), Netty (Netty.kt:61), DefaultAsyncInstrumentationStep (line 201). Copying their inferior version would be a regression. 
The OTHER half, the single-subscribe AtomicReference{NEW,SUBSCRIBED,CLOSED} guard (AutoPagerAsync.kt:33-36,71-81), is specific to openai-java's hand-rolled AsyncStreamResponse push interface (AsyncStreamResponse.kt:14-63), which we deliberately do NOT have. The finding's claim that this idiom applies to 'our existing Reactor SSE surface' is INACCURATE: our SSE surface is Flux.generate/Flux.using (Reactor.kt:116-145), and Reactor's Flux already enforces single-subscription and cancellation natively; we would never reimplement an AtomicReference state machine on a Flux. The single-subscribe guard is only relevant IF we build an openai-style subscribe-once push pager, which the async-pagination finding explicitly recommends AGAINST (use a pull-based CompletableFuture loop / cold Flux/Flow instead). The one residual nugget (a subscribe-once guard if we ever build a push type) has been folded as a conditional sub-note into the async-pagination ADOPT finding, so nothing actionable is lost. + +**Do not copy** + +1) Do NOT copy their per-endpoint code-generated page wholesale into sdk-core. Their FileListPage hard-depends on Jackson via response._data().getOptional("data") (FileListPage.kt:26) and on a concrete typed service (service.list(...), FileListPage.kt:42). That is correct for a GENERATED client but would drag Jackson and a service abstraction into our zero-dep sdk-core. The page-as-generated-code idea belongs in CODEGEN output, not the toolkit. + +2) Do NOT adopt AutoPagerAsync's recursive nextPage().thenCompose { it.handle() } chain verbatim (AutoPagerAsync.kt:44). On a transport whose futures can complete synchronously, the dependent-future chain deepens the call stack per page and can StackOverflow on long paginations. Their own code carries a TODO(JDK) at :32 acknowledging JDK-version compromises in this class. Use an iterative re-arm loop instead. + +3) Do NOT treat their AutoPager's lack of a maxPages cap as a feature to emulate.
AutoPager.kt:16 will loop forever if a buggy server keeps returning hasNextPage()=true. Our maxPages cap (Paginator.kt:92-96, 193-199; PagedIterable.kt:120-123) is a genuine safety improvement over the reference — keep it when we simplify the engine. + +4) Their pagination has zero response-lifecycle management at the page level because the body is drained-and-closed in the service before the page exists (FileServiceImpl.kt:169 response.use{...}). Do not read that as 'pages don't need close()'. In a toolkit where the strategy may hand back a Page wrapping a live Response, dropping our try/finally close (Paginator.kt:219-224) would leak connections. + +5) Avoid mirroring their split between Page and a separate XxxListPageResponse purely for the toolkit — that two-type split exists to separate Stainless's JsonField-based response model from the page wrapper. In sdk-core (no Jackson, no JsonField), it would be ceremony with no payoff. + +**Where we're ahead** + +Several concrete places, all real: + +1) maxPages safety cap. Both our pagers guard against a server that never advances its cursor: Paginator requires maxPages>0 (Paginator.kt:95) and stops before the over-cap fetch (Paginator.kt:193-199); PagedIterable does the same (PagedIterable.kt:120-123). openai-java's AutoPager.kt:16 has no such guard and will spin forever on a misbehaving hasNextPage(). This is a correctness/robustness win for a general toolkit that talks to arbitrary servers. + +2) Deterministic per-page response close inside the driver. Paginator closes each Response in a try/finally around strategy.parse (Paginator.kt:219-224); PagedIterable closes each page eagerly the moment its (materialized) items iterator is taken, so a partial consume like stream().findFirst() never strands an open body/pooled connection (PagedIterable.kt:171-176, documented :138-147). 
openai-java pushes this responsibility entirely up to the generated service (FileServiceImpl.kt:169) and the page itself owns nothing — fine for them, but our toolkit-level guarantee is stronger for hand-written transports. + +3) Breadth of wire conventions in the runtime without codegen. We support RFC 5988/8288 Link-header pagination including relative-reference resolution and the RFC-3986 query-only-reference correctness fix (LinkHeaderPaginationStrategy.kt:83-104) — openai-java has NO link-header support at all (all 57 pages are cursor-via-typed-param; zero use Link headers, zero backward/prev cursors). For a toolkit meant to talk to APIs the author doesn't control (not just OpenAI), that breadth is justified — with the caveat that page-number and especially the duplicate token strategy are the weak parts of that breadth (see SIMPLIFY findings). + +4) Richer page-level metadata. Our PagedResponse exposes statusCode/headers/request plus nextLink/previousLink/firstLink/lastLink/continuationToken (PagedResponse.kt:48-66). openai-java's Page exposes only items/hasNextPage/nextPage (Page.kt:21-32); per-page HTTP metadata is reachable only via the generated response() accessor, not the Page contract. + +_Verifier notes:_ VERIFICATION SUMMARY: 7 of 8 findings KEPT (all claimAccurate=true after line-by-line re-read); 1 DROPPED (the COPY idiom — we already do the unwrap half better via Futures.unwrap, and the state-machine half is inapplicable to our Flux-based SSE). Every openai-java citation in the analysis checked out: Page.kt:21-32, PageAsync.kt:31, AutoPager.kt:15-20, AutoPagerAsync.kt:33-67/71-81, PrepareRequest.kt:23-24, Params.kt:7-16, FileListPage.kt:35-42, VectorStoreSearchPage.kt:34-37, FileServiceImpl.kt:169-181. Every OUR-SDK citation checked out: Page.kt:33-54, Paginator.kt:86-234, CursorPaginationStrategy.kt:40-58, TokenPaginationStrategy.kt:14-57, PageNumberPaginationStrategy.kt, LinkHeaderPaginationStrategy.kt:83-104 (incl. 
the RFC-3986 query-only-reference fix), RequestRebuilder.kt:37-166, PagedIterable.kt:89-193, PagedResponse.kt:48-77, PagingOptions.kt:40-47, and the decisive CursorPaginationTest.kt:50-59 IdentityHashMap workaround. + +ANTIPATTERNS section of the analysis is ACCURATE and worth keeping verbatim as guidance: (1) do not copy generated pages into sdk-core (FileListPage hard-depends on Jackson via response._data().getOptional, FileListPage.kt:26, verified; belongs in codegen output not the toolkit); (2) do not copy AutoPagerAsync's recursive thenCompose chain (stack-depth growth, verified at AutoPagerAsync.kt:44, with their own TODO(JDK) at line 32); (3) do not emulate AutoPager's lack of a maxPages cap (AutoPager.kt:16 loops forever on a buggy server, verified); (4) their pages own no response lifecycle only because FileServiceImpl drains+closes the body first (response.use at FileServiceImpl.kt:169, verified), so do not read that as 'pages don't need close()'; (5) don't mirror the Page / XxxListPageResponse two-type split in the toolkit (it exists to separate Stainless's JsonField response model, pointless without Jackson). All five are correct. + +weAreAhead claims VERIFIED and genuine: (1) maxPages cap: both pagers guard (Paginator.kt:95/193-199; PagedIterable.kt:120-123); openai-java has none. (2) deterministic per-page close in the driver (Paginator.kt:219-224; PagedIterable item-iterator eager close at 171-176). Caveat: PagedIterable.byPage() does NOT close (documented leak, PagedIterable.kt:100-105), so our 'ahead' is true only for Stack A's driver and Stack B's item path, which is itself an argument for the unify finding. (3) RFC 5988/8288 Link-header breadth incl. query-only-reference fix (LinkHeaderPaginationStrategy.kt:83-104): confirmed openai-java has zero Link-header support (grep found none; all 57 sync + 57 async pages use typed-param cursors).
(4) richer PagedResponse metadata (PagedResponse.kt:48-66) vs their bare Page (Page.kt:21-32): confirmed. NOTE the tension the analysis itself flags: the breadth in (3) is justified for a toolkit talking to arbitrary APIs, but TokenPaginationStrategy and the two-lambda extractor design are the weak parts of that breadth (Findings 5 and 1). + +PRIORITY ORDERING for the team: Finding 1 (correctness trap, high/M) and Finding 5/drop-Token (high/S, trivial) first; Finding 6 (unify stacks) + Finding 4/7 (codegen page contract) must be decided together and BEFORE codegen lands; Finding 3 (async pagination) is the biggest net-new capability (high/L) and should target whatever unified contract Finding 6 picks; Finding 2 (generateSequence rewrite) is a lower-priority cleanup best folded into Finding 1's pass. Findings 4, 6, 7 all converge on a single prerequisite: pick ONE page contract and add a Params/Request-fold seam to sdk-core, then point both the manual path and codegen at it. + +--- + +## 12. JSON Schema generation + Structured Outputs (likely no equivalent in ours) + +**What it is** + +openai-java's Structured Outputs subsystem turns an arbitrary user Java/Kotlin class into (a) a JSON Schema sent to the API to constrain model output, and (b) a deserializer that parses the model's JSON back into that class. It has three layers. (1) Derivation: `extractSchema(type: Class<*>)` in StructuredOutputs.kt:195 delegates ALL schema generation to the third-party `victools/jsonschema-generator` library (DRAFT_2020_12, OptionPreset.PLAIN_JSON), layering `JacksonModule` (honors @JsonProperty/@JsonIgnore/@JsonClassDescription) and `Swagger2Module` (honors OpenAPI @Schema for pattern/min/max).
Two OpenAI-specific overrides are bolted on: `Option.FORBIDDEN_ADDITIONAL_PROPERTIES_BY_DEFAULT` (StructuredOutputs.kt:202) forces additionalProperties:false everywhere, and `.withRequiredCheck { true }` (StructuredOutputs.kt:216) forces EVERY field into `required` regardless of nullability — because OpenAI strict mode mandates all-required-all-closed. (2) Local validation — JsonSchemaValidator.kt is a hand-written, single-use, recursive validator (NOT a general JSON-Schema validator) that pre-checks the generated schema against OpenAI's documented restrictions (allowed-keyword whitelists per type, additionalProperties:false required, all properties listed in required, max nesting depth 10, max 100 properties, max 5000/15000 string budget, enum-count/length budgets) so users get a precise local error instead of an opaque API 400. (3) Round-trip — `responseTypeFromJson(json, Class)` (StructuredOutputs.kt:229) parses the response with a PRIVATE JsonMapper (StructuredOutputs.kt:29) deliberately NOT the SDK's strict mapper, because the strict one demands @JsonProperty on every field. Derivation feeds five public entry points: response-format, text-config, and function-tool builders for both Chat and Responses APIs. Architecturally it is a self-contained reflection-to-schema adapter that hard-depends on Jackson + victools inside core. + +**How it works (line-level)** + +DERIVATION — the whole generator is configured in ~20 lines: `SchemaGeneratorConfigBuilder(SchemaVersion.DRAFT_2020_12, OptionPreset.PLAIN_JSON).with(Option.FORBIDDEN_ADDITIONAL_PROPERTIES_BY_DEFAULT).with(JacksonModule()).with(Swagger2Module())` then `configBuilder.forFields().withRequiredCheck { true }` then `SchemaGenerator(configBuilder.build()).generateSchema(type)` (StructuredOutputs.kt:196-218). 
The `withRequiredCheck { true }` lambda is the key trick — it OVERRIDES Jackson's own interpretation of `required=false` on @JsonProperty so optionality is expressed ONLY via nullable type (`["string","null"]`), never via absence-from-required (comment StructuredOutputs.kt:212-216). NAME/DESCRIPTION extraction sidesteps generator gaps: `extractFunctionInfo` (StructuredOutputs.kt:118) notes "The JSON schema generator ignores the @JsonTypeName annotation" so it reads the annotation itself — `parametersType.getAnnotation(JsonTypeName::class.java)?.value ?: parametersType.simpleName` (StructuredOutputs.kt:129-130) — and it MUTATES the generated node to relocate the class description: `val descriptionNode = schema.remove("description")` (StructuredOutputs.kt:134) so the description lands on the FunctionDefinition, not inside the schema. The dual-mapper decision is explicit: comment StructuredOutputs.kt:26-28 "The SDK ObjectMappers.jsonMapper() requires that all fields ... be marked with @JsonProperty, which is not desirable in this context"; hence private MAPPER with kotlinModule()+Jdk8Module()+JavaTimeModule(), dates-as-timestamps disabled (StructuredOutputs.kt:29-35). Result nodes cross into the SDK value system via `JsonValue.fromJsonNode(...)` (StructuredOutputs.kt:51). +VALIDATOR — it is a one-shot object: `check(!isValidationComplete){"Validation already complete."}; isValidationComplete = true` (JsonSchemaValidator.kt:230-231), with create()/errors()/isValid() and an internal error-accumulator (no throwing). The dispatch core is `(anyOf != null).xor(type != null).xor(ref != null)` requiring EXACTLY ONE of the three (JsonSchemaValidator.kt:299-306), then routes to validateAnyOf/validateType/validateRef. 
Keyword whitelisting is per-context: separate frozen sets ALLOWED_KEYWORDS_OBJECT/ARRAY/STRING/NUMBER/SIMPLE/REF/ANY_OF (JsonSchemaValidator.kt:70-126) and `validateKeywords` (JsonSchemaValidator.kt:640) rejects anything not whitelisted, with root-only keywords ($schema,$id,$defs) allowed solely at depth 0. THE CLEVER RECURSION/CYCLE HANDLING: depth increases ONLY through inline nesting (properties JsonSchemaValidator.kt:628, items :486, anyOf :339, defs :587), and `$ref` is NEVER followed — validateRefSchema only checks membership `ref.asText() in validReferences` (JsonSchemaValidator.kt:361) where validReferences is pre-populated from $defs names in a FIRST pass (JsonSchemaValidator.kt:573-582) before sub-schemas are validated in a SECOND pass (JsonSchemaValidator.kt:585-588). So self-referential types (trees/linked-lists) terminate naturally — no visited-set needed — while MAX_NESTING_DEPTH=10 (JsonSchemaValidator.kt:272) bounds inline depth. Optionality decoding: `getTypeNameFromTypeArray` (JsonSchemaValidator.kt:668) accepts exactly two textual entries, one being "null", order-independent, mapping `["string","null"]` → "string" (this pairs with the all-required derivation rule). Budget accounting is accumulated across the whole tree (totalStringLength/totalEnumValues/totalObjectProperties) and asserted ONCE at root with root-path errors because "no one element is the cause" (JsonSchemaValidator.kt:236-248). Enum special case: strings beyond UNRESTRICTED_ENUM_VALUES_LIMIT=250 trigger a 7500-char cap (JsonSchemaValidator.kt:536-544). The verify() helper has a two-arg overload taking an `onFalse: () -> Unit` continuation so call sites can `return` to abort deeper validation after logging (JsonSchemaValidator.kt:693-703) — e.g. `verify(schema.isObject, path, {"..."}) { return }`. + +**vs. 
our SDK** + +We have NOTHING in this subsystem — confirmed by grep: the only "schema" hits in sdk-core are instrumentation/HttpTracer.kt and HttpTracerFactory.kt (OpenTelemetry "schema URL"), unrelated. No JsonSchema, no StructuredOutput, no reflection-to-schema. That is CORRECT for us today — we are a toolkit, not a client for an API with structured-outputs. Our serde seam is the relevant comparison point: sdk-core/src/main/kotlin/org/dexpace/sdk/core/serde/Serde.kt:18 defines `interface Serde { val serializer; val deserializer }` and its own KDoc states "Concrete implementations live outside sdk-core since sdk-core deliberately ships no embedded serializer" (Serde.kt:15-16). Jackson lives only in sdk-serde-jackson. This is the architectural fault line: openai-java HARD-DEPENDS on Jackson AND victools INSIDE core (openai-java-core/build.gradle.kts:32-34 `implementation("com.github.victools:jsonschema-generator:4.38.0")` + two victools modules), and even runs TWO Jackson mappers in core (the strict SDK one plus the private permissive one at StructuredOutputs.kt:29). For us, any schema-derivation capability is a NEW ADAPTER module (e.g. sdk-schema-jackson) depending on sdk-serde-jackson — never sdk-core. The validator (JsonSchemaValidator.kt) is the one piece that is PURE (only Jackson JsonNode for tree-walking; no victools) and is the most reusable artifact. The codegen relevance is in docs/refs-comparison.md:380,402 — our planned generator already plans "$ref resolution, allOf flattening, inline-schema naming" for the INBOUND (OpenAPI→models) direction; this subsystem is the OUTBOUND (class→JSON-Schema) direction, a distinct future capability we have not scoped. + +**Recommendations (verified)** + +- **Fail-soft recursive validator: one-shot guard + path-prefixed error list + verify(value,path,msg){ return } non-local-return idiom** `COPY` · `codegen` · effort S · confidence high + - *Verdict:* VERIFIED accurate. 
The skeleton is real and tidy: one-shot guard check(!isValidationComplete) (line 230), private MutableList errors (197), isValid()=complete&&empty (219), errors() returns an unmodifiable copy (211), and the two-overload verify(): the 3-arg form (689) delegates to the 4-arg form (693-703) with empty onFalse, while call sites pass {return} as onFalse to both log an error AND short-circuit deeper traversal (e.g. 276-279, 304-306). Zero exceptions thrown during the walk; errors inspected afterward — and StructuredOutputs.kt:188-193 confirms the no-throw design exists explicitly to ease unit testing, which matches our conventions. Honest caveat: this is a ~15-line pattern, so 'COPY' is borderline LEARN+template; the verify/error helper block (689-707) is liftable near-verbatim but the surrounding OpenAI keyword logic is not. It depends only on Jackson JsonNode for tree access, which we'd swap for our own tree type, making it a re-implementation, not a literal copy. + - *Do:* Use this as the canonical shape for our codegen spec/schema validators (refs-comparison.md:383 golden-file tests, :402 preprocessor): one-shot instance + accumulate-don't-throw + path-prefixed messages + the verify(...){return} continuation. Lift the verify/error helper structure (Apache-2.0 -> add attribution if copied verbatim). Do NOT copy the OpenAI keyword whitelists. Parameterize over our own tree abstraction so it never drags Jackson toward core. +- **Hand-roll a narrow subset validator instead of pulling a general JSON-Schema-validator dependency** `SIMPLIFY` · `codegen` · effort S · confidence medium + - *Verdict:* VERIFIED: the class doc explicitly disclaims being a general-purpose validator (JsonSchemaValidator.kt:7-18); it checks only the emitted subset via small frozen per-type keyword Sets (70-126) enforced by validateKeywords (640-655). The direction is right for our zero-dep posture (favor a hand-rolled subset checker over networknt/everit). 
Two honest corrections: (1) 'SIMPLIFY' implies WE are over-engineered, but we have NOTHING here — there is nothing to simplify; per the task's rare-branch definition this is 'theirs is simpler and we should follow,' i.e. a forward principle, not a fix to existing code. (2) It substantially overlaps findings 1 and 2 (same validator, same 'narrow hand-rolled' theme); it earns a separate slot only because the specific DECISION — do not add a json-schema-validator library — is distinct. Keep, but as a forward principle for codegen, not a present-tense simplification. + - *Do:* If/when codegen validates generated schemas, hand-roll a subset checker over our own tree type with data-driven frozen keyword sets; do NOT add networknt/everit/any json-schema-validator lib (it is a heavy transitive dep and overkill when we own the emitter). Cover with golden tests to catch spec drift (refs-comparison.md:383). +- **Cycle-safe schema validation by never following $ref (not by a depth cap)** `LEARN` · `codegen` · effort S · confidence high · claim-qualified + - *Verdict:* Half-right, and the wrong half is load-bearing. VERIFIED: there is no visited-set; $ref is membership-checked only (JsonSchemaValidator.kt:361) and never resolved; recursion happens solely through inline anyOf/items/properties (lines 339, 486, 628), each depth+1. So a CYCLE is impossible ONLY because victools always emits repeated types as a $defs entry + $ref, and this validator never follows the $ref. BUT the analysis claims 'you only need a depth cap' for safety — that is FALSE here. I read line 272: the depth check uses the single-arg verify() overload (line 689) whose onFalse is empty {}, so on depth>10 it records an error and KEEPS RECURSING. The depth cap is a VALIDATION RULE (reject schemas nested >10), not a recursion guard. A genuinely cyclic INLINE schema would stack-overflow this code. 
Crucially, this lesson does NOT transfer to our planned inbound OpenAPI->models $ref RESOLUTION, which by definition FOLLOWS refs and therefore DOES need a visited-set/seen guard. Conflating the two would import a real bug. + - *Do:* Record the actual invariant in docs/refs-comparison.md near the codegen $ref discussion: a validator that only CHECKS refs (membership) and never follows them is automatically cycle-safe and needs no visited-set; a resolver that FOLLOWS refs (our inbound OpenAPI->models step, refs-comparison.md:380) is the opposite case and MUST carry cycle detection. Do not adopt 'depth cap instead of visited-set' as a general rule — it is specific to non-resolving validation, and even here the depth check does not terminate recursion. +- **Schema derivation, if ever built, is a victools+Jackson capability that must live in an adapter, never sdk-core** `LEARN` · `docs/process` · effort S · confidence high + - *Verdict:* Dependency-placement contrast is accurate and actually UNDERSTATED. VERIFIED build.gradle.kts: victools generator + jackson + swagger-2 modules are implementation deps INSIDE openai-java-core (lines 32-34), and Jackson core/databind are even worse — `api` deps (lines 23-24), so Jackson leaks transitively to every consumer. The whole derivation mechanism is ~25 lines (StructuredOutputs.kt:195-218). Our Serde.kt:15-16 codifies the opposite stance. BUT the original finding is mis-shelved: it is tagged ADOPT/effort=L/target=sdk-core while its own recommendation says 'Do NOT build now' and 'sdk-core would expose at most a tiny interface if anything at all.' An ADOPT that concludes don't-adopt is a LEARN; target sdk-core directly contradicts the (correct) point that this must NOT touch sdk-core. Re-cast as LEARN/docs-process. Java-8 feasibility of victools 4.38 is plausible but unverified and irrelevant unless demand appears. 
+ - *Do:* Add a one-paragraph candidate-future-module note to docs/refs-comparison.md: an OPTIONAL sdk-schema-jackson adapter depending on sdk-serde-jackson could offer SchemaDeriver.derive(Class<*>): String via victools; sdk-core exposes at most a tiny interface, never the implementation. Explicitly fence it to an adapter and gate on a real toolkit consumer. Do not schedule work. +- **Strict-LLM-schema encoding convention: all-required + additionalProperties:false + optionality-as-nullable-type-union** `LEARN` · `codegen` · effort S · confidence medium + - *Verdict:* VERIFIED mechanism: one Option flag closes every object (FORBIDDEN_ADDITIONAL_PROPERTIES_BY_DEFAULT, StructuredOutputs.kt:202), a one-line predicate forces every field required (.withRequiredCheck { true }, :216), and only genuine generator gaps are patched by node mutation (schema.remove("description") :134; @JsonTypeName fallback :129-130). Optionality is expressed purely as nullable type unions ['string','null'] — confirmed by the validator's getTypeNameFromTypeArray handling (JsonSchemaValidator.kt:668-687). The CONCRETE, transferable nugget is this encoding convention — it is what every 'strict' LLM endpoint demands industry-wide, so our future codegen should know it. The 'prefer configuring emission over downstream AST-rewrite' altitude lesson, however, is near-tautological for a generator we author ourselves (KotlinPoet): we don't configure a third-party reflection generator, we write the emitter, so 'config vs rewrite' collapses to 'build invariants into emission vs post-process the IR' — generic. Keep the encoding convention; discount the altitude platitude. + - *Do:* In codegen design docs, record the strict-LLM schema shape (all-required, additionalProperties:false, optional==nullable union) as the target encoding if/when we emit function-calling or structured-output schemas. 
Treat 'invariants belong in emission, node-mutation only for genuine generator gaps' as a minor secondary note, not a headline lesson. +- **Strict generated-model mapper and lenient user-POJO mapper are distinct serde concerns — do not share one configuration** `LEARN` · `docs/process` · effort S · confidence medium + - *Verdict:* VERIFIED: a second, permissive private MAPPER (kotlinModule+Jdk8+JavaTime, no strictness) exists at StructuredOutputs.kt:29-35 with the rationale comment at :26-28 ('the SDK jsonMapper requires @JsonProperty on every field ... not desirable here'); responseTypeFromJson uses it (:229-231) and wraps failures as OpenAIInvalidDataException carrying the raw JSON with an explicit do-not-log-it security caveat (:233-236). The lesson is real and is the only item here that touches an EXISTING module of ours (sdk-serde-jackson). Important conditionality the original finding glossed: openai-java needs two mappers because THEIR generated-model mapper self-imposes mandatory-@JsonProperty for round-trip fidelity; OUR sdk-serde-jackson today is already lenient (FAIL_ON_UNKNOWN_PROPERTIES disabled, per refs-comparison.md) and may never adopt that strict regime. So the lesson is conditional, not immediately actionable. confidence=medium for that reason. + - *Do:* Note in sdk-serde-jackson docs: IF our future codegen emits models that require a strict mapper (mandatory @JsonProperty for round-trip fidelity), do NOT reuse that mapper for arbitrary user POJOs — provision a separate lenient instance. Also adopt their error pattern: wrap parse failures with the payload attached but flag the do-not-log security caveat. Today this is a design note, not code. 
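
The validator shape recommended above (one-shot instance, accumulate-don't-throw, path-prefixed messages, refs checked by membership and never followed) can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not openai-java's code and not an existing API of ours: `SubsetSchemaValidator`, `MAX_DEPTH`, and the `#/$defs/` reference prefix are hypothetical names, and a plain `Map` tree stands in for whatever tree abstraction we would actually use. Java has no non-local return from a lambda, so `verify` returns a boolean and the caller returns. Unlike the depth check critiqued above, the depth check here does abort the recursion.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: one-shot, fail-soft subset validator over a plain Map tree. */
final class SubsetSchemaValidator {
    private static final int MAX_DEPTH = 10;               // assumed cap, mirroring the value cited in the text
    private static final String REF_PREFIX = "#/$defs/";   // assumed reference format

    private final List<String> errors = new ArrayList<>();
    private final Set<String> validRefs = new HashSet<>();
    private boolean complete;

    /** Unmodifiable copy, so callers cannot mutate the accumulator. */
    List<String> errors() {
        return Collections.unmodifiableList(new ArrayList<>(errors));
    }

    boolean isValid() {
        return complete && errors.isEmpty();
    }

    SubsetSchemaValidator validate(Map<String, Object> root) {
        // One-shot guard: an instance validates exactly one schema.
        if (complete) throw new IllegalStateException("Validation already complete.");
        complete = true;

        Object defs = root.get("$defs");
        if (defs instanceof Map) {
            Map<?, ?> defMap = (Map<?, ?>) defs;
            // Pass 1: register every definition name before validating anything.
            for (Object name : defMap.keySet()) validRefs.add(REF_PREFIX + name);
            // Pass 2: validate each definition body.
            for (Map.Entry<?, ?> e : defMap.entrySet()) {
                validateNode(e.getValue(), REF_PREFIX + e.getKey(), 1);
            }
        }
        validateNode(root, "#", 0);
        return this;
    }

    private void validateNode(Object node, String path, int depth) {
        // Record the error AND stop descending, so the cap also bounds recursion.
        if (!verify(depth <= MAX_DEPTH, path, "nesting deeper than " + MAX_DEPTH)) return;
        if (!verify(node instanceof Map, path, "schema must be an object")) return;
        Map<?, ?> schema = (Map<?, ?>) node;

        Object ref = schema.get("$ref");
        if (ref != null) {
            // Membership check only: the reference is never followed,
            // so self-referential definitions terminate without a visited-set.
            verify(validRefs.contains(ref), path, "unknown $ref " + ref);
            return;
        }
        Object props = schema.get("properties");
        if (props instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) props).entrySet()) {
                validateNode(e.getValue(), path + "/properties/" + e.getKey(), depth + 1);
            }
        }
        Object items = schema.get("items");
        if (items != null) validateNode(items, path + "/items", depth + 1);
    }

    /** Accumulates a path-prefixed error instead of throwing; returns the condition. */
    private boolean verify(boolean ok, String path, String message) {
        if (!ok) errors.add(path + ": " + message);
        return ok;
    }
}
```

A caller would construct one instance per schema, call `validate(root)`, then inspect `isValid()` and `errors()`. A resolver that follows refs (our inbound OpenAPI->models step) is the opposite case and would still need its own cycle detection; nothing in this sketch covers that.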
+ +**Considered & dropped** + +- ~~Copy the *FromClass top-level builder helpers (responseFormatFromClass, functionToolFromClass, etc.)~~: Correctly identified by the analysis itself as non-transferable; these @JvmSynthetic helpers (StructuredOutputs.kt:42-185) are welded to OpenAI's generated model builders (ResponseFormatJsonSchema, ChatCompletionTool, FunctionTool). Only the derivation/validation TECHNIQUE transfers, which is already captured in the kept findings. No standalone decision-ready item. +- ~~Run two full Jackson ObjectMappers in core~~: Antipattern, not a finding. The relevant signal (strict-vs-lenient mapper split as a serde lesson) is preserved in the kept finding #6; 'two mappers in core' itself is just an instance of the dependency-hygiene point already covered by kept finding #3. Folding into notes, not a separate verified finding. + +**Do not copy** + +(1) DO NOT replicate their core-level dependency placement: victools is an `implementation` dep (openai-java-core/build.gradle.kts:32-34) and Jackson core/databind are `api` deps (lines 23-24) INSIDE openai-java-core, so Jackson also leaks transitively to every consumer. For us this is the cardinal sin: it would obliterate the zero-dep-core guarantee that our own Serde.kt:15-16 advertises. Any equivalent must be an adapter on top of sdk-serde-jackson. (2) Running TWO full Jackson ObjectMappers (StructuredOutputs.kt:29 private MAPPER + the SDK's strict one) is reasonable for a single-API client but is dependency-heavy machinery we should keep entirely out of core. (3) The `@JvmSynthetic internal fun ...FromClass` top-level helpers (StructuredOutputs.kt:42-185) are tightly coupled to OpenAI's generated model builders (ResponseFormatJsonSchema, FunctionTool, ChatCompletionTool), so copying the FUNCTIONS is meaningless for us; only the derivation/validation TECHNIQUE transfers. (4) TODO at JsonSchemaValidator.kt:579 ("How should duplicate names be handled?
Will the generator use longer names?") flags a real latent bug: two classes with the same simpleName from different packages collide in $defs — a warning sign for our own inline-schema-naming step (docs/refs-comparison.md:380 already calls for DETERMINISTIC naming; do better than simpleName). (5) Naming-by-`type.simpleName` for schema/$defs names (StructuredOutputs.kt:49,95,130) is fragile for exactly that collision reason — do not imitate; use fully-qualified or deterministically-disambiguated names. + +**Where we're ahead** + +On dependency hygiene, decisively. openai-java embeds victools + Jackson (and two Jackson mappers) directly in its core (openai-java-core/build.gradle.kts:32-34, StructuredOutputs.kt:29); our architecture forbids exactly this — sdk-core ships zero runtime deps beyond SLF4J and our Serde.kt:15-16 codifies "sdk-core deliberately ships no embedded serializer," with Jackson quarantined in sdk-serde-jackson. So if we ever add schema derivation, it lands as a clean opt-in adapter while theirs is permanently welded to the core. We are also ahead on SCOPE-FIT: this subsystem is API-client domain logic (function-calling/structured-outputs for ONE API). A toolkit correctly has none of it; our absence here is a deliberate correct boundary, not a gap. Otherwise, since we have no comparable feature, there is no subsystem-level technique where we currently out-implement them — the validator and the generator-config tricks are genuinely worth learning from for our future codegen. + +_Verifier notes:_ CONFIRMED ABSENCE: grep of our entire main sources (sdk-core + sdk-serde-jackson) for jsonschema/structuredoutput/extractschema/schemagenerator/victools returns nothing. The only 'schema' hits anywhere are instrumentation/HttpTracer.kt and HttpTracerFactory.kt (OpenTelemetry 'schema URL'), unrelated. So weAlreadyDoIt=false for every finding. 
This is a deliberate, correct boundary: the subsystem is single-API client domain logic (function-calling/structured-outputs); a toolkit correctly has none of it. Our absence is not a gap. + +DEPENDENCY-HYGIENE VERDICT (we are decisively ahead): openai-java-core/build.gradle.kts welds Jackson in as `api` (lines 23-24, leaks transitively to all consumers) and victools+modules as `implementation` (32-34), plus runs TWO Jackson mappers in core (StructuredOutputs.kt:29 private permissive + the SDK strict one). Our Serde.kt:15-16 forbids exactly this. Any future schema-derivation must be an adapter on sdk-serde-jackson. This is the single strongest takeaway and underpins kept findings #3 and #6. + +LINE-NUMBER AUDIT: spot-checked every cited line. StructuredOutputs.kt:29/134/195/202/216/229 all match. JsonSchemaValidator.kt:70/126/211/219/229/272/276/361/573/579/640/655/689/707 all match. refs-comparison.md:380 (inline-schema naming, deterministic), :383 (golden-file tests), :402 (spec preprocessor) all match. Citations are reliable; only the *interpretation* of the depth check in finding #1 was wrong (see that critique). + +LATENT-BUG WARNING worth carrying into codegen design (the analysis raised it in antipatterns; I verified it): JsonSchemaValidator.kt:579 has a live TODO 'How should duplicate names be handled? Will the generator use longer names?' — two classes with the same simpleName from different packages collide in $defs, and schema/$defs names are derived from type.simpleName (StructuredOutputs.kt:49,95,130). The validator even has a guard for it (line 580 'Duplicate definition'). Our refs-comparison.md:380 already calls for DETERMINISTIC inline-schema naming — this is concrete evidence to use fully-qualified or disambiguated names, not simpleName, in our generator. + +CATEGORY RECALIBRATIONS made: F1 claimAccurate->false (depth cap does not terminate recursion; lesson does not transfer to ref-FOLLOWING resolution). 
F3 ADOPT->LEARN, target sdk-core->docs/process (an ADOPT that says 'do not build' is a LEARN; target cannot be sdk-core when the whole point is it must not touch sdk-core). F2/F4/F5 retargeted both->codegen (sdk-core has no validation/derivation need today; these are codegen-emitter/validator lessons). F6 target both->docs/process (it is guidance for the sdk-serde-jackson adapter; the target enum has no 'adapter' value, docs/process best fits 'document this convention'). F4/F5/F6 confidence set to medium (F4 altitude half is generic; F5 has no present over-engineering to simplify; F6 is conditional on a strict regime we may never adopt). + +OVERLAP NOTE: findings #1, #2, #5 all derive from the same single file (JsonSchemaValidator.kt) and share the 'narrow, hand-rolled, fail-soft validator' theme. Kept as three because each carries a distinct DECISION (cycle-safety reasoning; the liftable verify/error template; the no-validator-lib choice), but they should be written up as one cohesive 'validator pattern' section in codegen docs rather than three scattered notes. + +NET: zero items survive as immediate sdk-core code changes. Everything actionable is forward-looking codegen guidance or a docs note, plus the one conditional serde-adapter convention (#6). Reference files (all under /Users/omar/IdeaProjects/openai-java-ref/openai-java-core/src/main/kotlin/com/openai/core/): StructuredOutputs.kt, JsonSchemaValidator.kt, JsonSchemaLocalValidation.kt, build.gradle.kts. Our anchors: /Users/omar/IdeaProjects/dexpace/java-sdk/sdk-core/src/main/kotlin/org/dexpace/sdk/core/serde/Serde.kt and /Users/omar/IdeaProjects/dexpace/java-sdk/docs/refs-comparison.md (lines 380, 383, 402). + +--- + +## 13. 
Enterprise auth / workload identity (GCP/Azure/K8s token providers, token exchange) + +**What it is** + +openai-java implements OAuth 2.0 token-exchange (RFC 8693) workload-identity federation: a pluggable `SubjectTokenProvider` SPI (SubjectTokenProvider.kt:18) obtains a short-lived local credential (GCP/Azure metadata server, K8s mounted file), which `WorkloadIdentityAuth` (WorkloadIdentityAuth.kt:30) exchanges at `https://auth.openai.com/oauth/token` for an OpenAI access token, caches it with a refresh buffer, and injects `Authorization: Bearer <token>` per request via `WorkloadIdentityHttpClient` (an HttpClient *decorator*, not a pipeline step). Architecture: three layers. (1) `SubjectTokenProvider` (tokenType + getToken/getTokenAsync taking the shared HttpClient+JsonMapper) with three concrete impls: GcpIdTokenProvider (OIDC ID token, SubjectTokenType.ID), AzureManagedIdentityTokenProvider (IMDS access_token, JWT), K8sServiceAccountTokenProvider (reads /var/run/secrets/.../token off a dedicated single-thread executor). (2) `WorkloadIdentity` config (WorkloadIdentity.kt:6) = clientId/identityProviderId/serviceAccountId/provider/refreshBufferSeconds(default 1200s), Builder + checkRequired. (3) `WorkloadIdentityAuth` does the exchange + a hand-rolled refresh-ahead concurrency state machine (sync via ReentrantLock+Condition, async via a sealed `TokenAction` planner). The whole thing is wired in ClientOptions.build() as a *credential-selects-decorator* model: a static API key becomes `BearerTokenCredential` and is stamped into static `securityHeaders` (ClientOptions.kt:742), while workload identity becomes a `WorkloadIdentityCredential` that swaps in the `WorkloadIdentityHttpClient` decorator (ClientOptions.kt:672) sitting *below* LoggingHttpClient and RetryingHttpClient.
CRUCIAL for us: our SDK *already has* the equivalent toolkit seam — `BearerTokenProvider` SPI + `BearerToken` (with expiry/refresh-margin) + `BearerTokenAuthStep` (volatile cache + double-checked locking). So the headline "should we ADOPT a token-provider SPI?" is already answered yes-and-shipped; the real deltas are (a) we have NO async refresh path, (b) no token-exchange helper, (c) the per-cloud providers (which correctly belong in adapter modules, never sdk-core). + +**How it works (line-level)** + +REFRESH-AHEAD STATE MACHINE (the genuinely clever part). Sync `getToken()` (WorkloadIdentityAuth.kt:98) uses `ReentrantLock` + a `Condition`: it loops `while (refreshing && unexpiredCachedTokenUnsafe() == null) condition.await()` so threads that arrive with NO usable cached token block until the in-flight refresh finishes, but a thread that finds a still-valid (even if expiring-soon) token returns it without waiting (lines 104-107). Exactly one thread sets `refreshing=true`, performs the network refresh OUTSIDE the lock (line 121 `performRefresh()`), then re-acquires to read the result and `condition.signalAll()` in finally (lines 126-131). ASYNC `getTokenAsync()` (line 134) is a non-blocking planner: under the lock it computes a sealed `TokenAction` — `ReturnCached` (valid, not expiring), `BackgroundRefresh` (valid but expiring-soon AND no refresh in flight → kick off refresh but return the *current* token immediately, lines 157-160 — true refresh-ahead with zero added latency), `WaitForRefresh` (no token, refresh already running → chain onto the shared future, lines 161-167), or `ForegroundRefresh` (no token, none running → start one, lines 168-174). `refreshInFlight: CompletableFuture` is the dedupe key; `finishRefresh` (line 184) only completes/clears it if `refreshInFlight == future` (compare-and-clear guards against a stale completion clobbering a newer refresh). `unwrapCompletionException` (line 91) peels CompletionException/ExecutionException. 
SECURITY/CORRECTNESS details worth quoting: token-exchange request body is built with `repeatable()=true` (WorkloadIdentityAuth.kt:273) so it survives retries; `processTokenExchangeResponse` validates `accessToken.isBlank()` AND `expiresIn <= 0` and throws OpenAIInvalidDataException (lines 292-304); missing `expires_in` defaults to 3600s (DEFAULT_TOKEN_EXPIRY_SECONDS). The custom `errorHandler` (line 35) re-shapes the OAuth `{error, error_description}` envelope into the SDK's standard ErrorObject so token-endpoint failures surface through the same error path as API failures. 401 HANDLING is a decorator trick: `WorkloadIdentityHttpClient.execute` (WorkloadIdentityHttpClient.kt:24) on a 401 closes the response, calls `workloadIdentityAuth.invalidateToken()`, and throws `OpenAIRetryableException("OAuth token is expired")` — which the *outer* RetryingHttpClient catches and retries, forcing a fresh token. K8S PROVIDER detail: reads the file on a named single-thread executor (`openai-k8s-token-reader-N`, K8sServiceAccountTokenProvider.kt:27-39) so blocking file I/O never lands on a caller thread in the async path; `close()` shuts it down, and WorkloadIdentityAuth.close() forwards to `(config.provider as? AutoCloseable)?.close()` (line 317). EXCEPTION wrapping: every provider wraps failures in `SubjectTokenProviderException(provider="gcp-metadata"|"azure-imds"|"kubernetes", ...)` with a re-throw-if-already-our-type guard (GcpIdTokenProvider.kt:58-66) so the provider name is always attributable. + +**vs. our SDK** + +OUR SDK already has the token-provider SPI the prompt asks about — this is the central finding. 
`BearerTokenProvider` (sdk-core/.../http/auth/BearerTokenProvider.kt:32) is a `fun interface` with `fetch(scopes, params): BearerToken`; `BearerToken` (BearerToken.kt:28) is a `data class(token, expiresAt: Instant?)` implementing our sealed `Credential`, with `isExpiredAt(now, marginBefore)` (line 44) and a redacting `toString()` that prints `token=***` (line 55) — a security nicety openai-java's `BearerTokenCredential` lacks entirely. `BearerTokenAuthStep` (BearerTokenAuthStep.kt:55) caches in a `@Volatile var cachedToken` with a lock-free fast path (line 77) and double-checked locking under `ReentrantLock` (line 79), re-sampling `clock.now()` inside the lock (line 83) and refusing an already-expired fresh token (line 89) — functionally equivalent to their sync `getToken()`, and arguably cleaner (no Condition, no `refreshing` flag, because it never needs to make waiters block on a shared in-flight fetch — each thread that misses the cache just races into the lock). Our `AuthStep` base (AuthStep.kt:52) is a *pipeline step* at `Stage.AUTH`, not an HttpClient decorator: it enforces HTTPS-only before stamping (line 61 — openai-java does NOT check scheme before injecting the bearer token, a real gap on their side), skips re-stamping on cross-origin redirects via `CrossOriginRedirectMarker` (line 70), and has an `authorizeRequestOnChallenge` hook (line 119) that does an *inline* single retry on 401+WWW-Authenticate. KEY DELTAS vs openai-java: (1) NO ASYNC REFRESH — our `AsyncHttpStep` KDoc literally cites "a bearer-token refresh" as its motivating example (AsyncHttpStep.kt:19) but there is NO async AuthStep/BearerTokenAuthStep; the async pipeline cannot refresh-ahead. `BearerTokenProvider.fetch` is blocking-only and runs on the request thread (its own KDoc admits this, BearerTokenProvider.kt:20-21). 
(2) NO TOKEN-EXCHANGE / refresh-ahead-without-latency primitive: our step refreshes *synchronously on the miss* (a request that hits the refresh margin pays the full fetch latency), whereas their `BackgroundRefresh` returns the still-valid token instantly and refreshes off-thread. (3) Our `Credential` is a `sealed interface` (Credential.kt:25), so cloud providers added as new Credential variants would have to live in sdk-core, but a `SubjectTokenProvider`-style SPI implemented as `BearerTokenProvider` lambdas does NOT touch the sealed hierarchy, so cloud adapters fit cleanly. (4) We do RFC 7235/7616 server challenges (DigestChallengeHandler.kt: full RFC 7616 with per-nonce nc counters, SHA-256, bounded eviction); they do zero server-challenge auth. Different axis, no overlap. + +**Recommendations (verified)** + +- **Force a token-cache invalidation + single retry on a server 401, not just on the local refresh margin** `ADOPT` · `sdk-core` · effort S · confidence high + - *Verdict:* Verified and a real correctness gap, but the effort and framing need correcting. The agent is right that today BearerTokenAuthStep only refreshes when its own margin trips (BearerTokenAuthStep.kt:75-95): a server 401 caused by a revoked/rotated-early token does NOT invalidate cachedToken, so every request keeps stamping the stale token until the margin elapses. openai-java's decoupled model is real: WorkloadIdentityHttpClient.execute (WorkloadIdentityHttpClient.kt:24-28) closes the 401, calls invalidateToken(), and throws OpenAIRetryableException, which RetryingHttpClient.shouldRetry catches (RetryingHttpClient.kt:184-188) to re-drive with a freshly-fetched token. HOWEVER the agent under-credited what we already have: AuthStep.authorizeRequestOnChallenge (AuthStep.kt:118-122) is exactly the hook for this, and AuthStep.process already does the close-401-then-single-retry plumbing (lines 86-94), including closing the body if the hook throws.
So for the SYNC case this is NOT an M-effort new subsystem — it is overriding one method on BearerTokenAuthStep. The cross-pillar marker-exception channel the agent describes (signal RETRY to own the re-drive) is the heavier, optional variant and is genuinely unnecessary given our inline-retry hook; recommend NOT building it. Note the loop-bound concern is already handled: authorizeRequestOnChallenge fires exactly once per process() call (one retry, no recursion). + - *Do:* Give BearerTokenAuthStep a protected open onUnauthorized() that clears the @Volatile cachedToken (under the existing lock), and override authorizeRequestOnChallenge to: call onUnauthorized(), then return the request re-stamped via authorizeRequest(request) (which now forces a fresh fetch). That yields one retry with a guaranteed-fresh token, reusing AuthStep's existing single-retry + body-close machinery. Skip the marker-exception-into-RETRY-pillar variant unless a concrete need appears — it duplicates the inline hook. Mirror onUnauthorized into AsyncBearerTokenAuthStep when finding 1 lands. Test: 401-then-200 drives exactly one refetch and one retry; a persistent 401 surfaces after exactly one retry. +- **Add an async token-refresh path: an AsyncHttpStep-based bearer auth step with a refresh-ahead state machine** `ADOPT` · `sdk-core` · effort M · confidence high + - *Verdict:* Verified and correct. There is NO async auth step in our tree: the async pipeline package (sdk-core/.../http/pipeline/) ships AsyncHttpStep, AsyncHttpPipelineBuilder, AsyncPipelineBridges, and only DefaultAsyncInstrumentationStep as a concrete async step — AuthStep/BearerTokenAuthStep are sync-only HttpStep subclasses. AsyncHttpStep.kt:19-21 literally motivates itself with 'a bearer-token refresh that calls an OAuth endpoint via an AsyncHttpClient', so the gap is self-documented. 
BearerTokenProvider.fetch is blocking-only and its own KDoc (BearerTokenProvider.kt:19-21) admits it runs on the request thread — directly at odds with our Loom-safety posture if used from the async pipeline. The openai-java mechanism is real: WorkloadIdentityAuth.getTokenAsync (WorkloadIdentityAuth.kt:134-176) plans via a sealed TokenAction (ReturnCached / BackgroundRefresh / ForegroundRefresh / WaitForRefresh) and finishRefresh (lines 184-208) does a compare-and-clear on refreshInFlight. ONE correction to the agent's framing: do NOT 'port their planner' wholesale. Their planner exists because their sync path ALSO blocks waiters on a shared in-flight refresh (Condition + refreshing flag, lines 98-132) — machinery our sync step deliberately avoids. The only genuinely novel, worth-lifting behavior is BackgroundRefresh: when a still-valid-but-expiring token exists, return it instantly and kick an off-thread refresh (zero added latency). WaitForRefresh/ForegroundRefresh are just single-flight dedupe for the cold-cache case. + - *Do:* Introduce AsyncAuthStep (async mirror of AuthStep returning CompletableFuture, honoring the same HTTPS check + CrossOriginRedirectMarker strip) and AsyncBearerTokenAuthStep. Implement the refresh-ahead behavior directly rather than transliterating their class: (a) valid-and-not-expiring -> completedFuture(token); (b) valid-but-within-margin -> return current token immediately AND start a background fetchAsync that swaps the @Volatile cache (BackgroundRefresh); (c) cold/expired -> dedupe concurrent callers onto one in-flight CompletableFuture guarded by a refreshInFlight field with finishRefresh-style compare-and-clear. Add a default BearerTokenProvider.fetchAsync(scopes,params): CompletableFuture = Futures-backed supplyAsync over the blocking fetch (default method on the fun interface is allowed; CompletableFuture is Java 8; needs apiDump). Keep the blocking fetch interrupt-aware per our contract. 
Wire AsyncBearerTokenAuthStep as the AUTH pillar in AsyncHttpPipelineBuilder. Cover BackgroundRefresh (caller gets old token, cache later holds new) and cold-cache dedupe (two callers, one fetch) with concurrency tests. +- **Keep OAuth token-exchange specifics (error-envelope reshaping, endpoint URL) out of sdk-core** `SIMPLIFY` · `docs/process` · effort S · confidence medium · we partly do this + - *Verdict:* Accurate as a guardrail, but it is essentially the same boundary principle as the per-cloud-providers finding, aimed at a different concrete temptation (a generic 'TokenExchangeStep'). Their token-exchange handler hand-maps the OAuth {error,error_description} envelope into OpenAI's ErrorObject via jsonMapper tree APIs (WorkloadIdentityAuth.kt:35-63) and bakes the endpoint as a private const TOKEN_EXCHANGE_URL='https://auth.openai.com/oauth/token' (line 28) — both API-specific and both Jackson-dependent. The agent's conclusion is right: the only reusable part (refresh-ahead) is finding 1; the rest is per-API glue that must not enter core. I downgraded confidence to medium and retargeted from 'both' to docs/process because there is no code action — it is advice ('if you ever build a token-exchange helper, make it an adapter parameterized by endpoint URL + a Serde'). weAlreadyDoIt=true in the sense that our BearerTokenProvider SPI already keeps exchange specifics out of the SPI surface (it returns a BearerToken and takes no serde/URL). + - *Do:* Fold this into the same docs note as the per-cloud-providers item: any token-exchange helper is an opt-in adapter returning a BearerTokenProvider, parameterized by endpoint URL + a Serde, living outside sdk-core. No core code. Treat as a one-line guardrail, not a separate workstream. +- **Per-cloud token providers (GCP/Azure/K8s) belong in adapter modules, never in sdk-core** `LEARN` · `docs/process` · effort S · confidence high + - *Verdict:* Verified and the single clearest 'do not copy their layout' lesson. 
All three openai-java providers need exactly the dependencies we forbid in core: GcpIdTokenProvider does HTTP to metadata.google.internal (GcpIdTokenProvider.kt:25-67), AzureManagedIdentityTokenProvider does jsonMapper.readValue of the IMDS JSON (AzureManagedIdentityTokenProvider.kt:51,98), K8sServiceAccountTokenProvider does Files.readAllBytes off a dedicated single-thread executor (K8sServiceAccountTokenProvider.kt:27-39,46) — and all three take (HttpClient, JsonMapper) because openai-java hard-depends on okhttp+Jackson in core. We deliberately do not. The agent's mapping is correct: sdk-core owns ONLY the BearerTokenProvider SPI (it already does, and correctly takes neither a transport nor a serde), and concrete cloud providers ship as thin adapter modules depending on sdk-core + a transport + sdk-serde-jackson, mirroring OkioIoProvider living in sdk-io-okio3. This is a documentation/guardrail item, not code to write now. + - *Do:* Record in docs/refs-comparison.md (or a short docs/auth.md) that workload-identity / cloud-metadata token providers are an adapter-module concern: each is its own Gradle module that returns a BearerTokenProvider, never added to sdk-core (doing so would drag a transport + Jackson into the zero-dep core). Do not build them until there is demand. Note the parallel to OkioIoProvider so the SPI-in-core / impls-in-adapters split is explicit for future contributors. +- **Add redacting toString() to KeyCredential and NamedKeyCredential for consistency with BearerToken** `LEARN` · `sdk-core` · effort S · confidence medium + - *Verdict:* Verified. BearerToken redacts its secret (BearerToken.kt:55 -> 'token=***') while openai-java's BearerTokenCredential exposes token() raw with no redaction anywhere (BearerTokenCredential.kt:35) — we are ahead and the agent uses them as the cautionary example correctly. 
Our KeyCredential (KeyCredential.kt:27) and NamedKeyCredential (NamedKeyCredential.kt:22) are plain classes with no toString override; today their default identity-hash toString does not leak, so this is pre-emptive defense-in-depth, not an active bug. Two nuances the agent slightly oversold: (1) the risk it guards against ('someone later makes them data classes') is hypothetical; the value is consistency + locking redaction in before that happens. (2) Adding a toString override IS binary-compatible and is a new method, so the apiCheck framing is fine, but it is a near-trivial change. Worth doing as a cheap consistency win, not a priority. + - *Do:* Add explicit override fun toString() to KeyCredential ('KeyCredential(apiKey=***, headerName=..., prefix=...)') and NamedKeyCredential ('NamedKeyCredential(name=, key=***)') to lock in secret redaction across all Credential types and pre-empt any future data-class conversion leaking the secret. Run apiDump. Low priority; bundle with other auth work. +- **Reuse Futures.unwrap instead of copying openai-java's unwrapCompletionException when building the async single-flight** `LEARN` · `sdk-core` · effort S · confidence medium · we partly do this + - *Verdict:* Demoted from the agent's standalone COPY finding to LEARN, and it should fold into finding 1 rather than stand alone. The agent's own caveat is the correct conclusion: we already have Futures.unwrap (util/Futures.kt:46) which peels BOTH CompletionException and ExecutionException to the cause (with cycle detection via a HashSet) — strictly more robust than openai-java's unwrapCompletionException (WorkloadIdentityAuth.kt:91-96), and already used at AsyncHttpClient.kt:138. So there is nothing to COPY for the unwrap half. 
The compare-and-clear single-flight idiom (finishRefresh, WorkloadIdentityAuth.kt:184-208) is real and worth modeling, but it is ~15 lines and is the cold-cache dedupe sub-mechanism of finding 1, not a separate deliverable — and 'lift verbatim with Apache-2.0 attribution' is overkill for a 15-line standard single-flight pattern that we will write against our own types (CompletableFuture, our lock) anyway. Keep as a note attached to finding 1, not its own work item. + - *Do:* When implementing AsyncBearerTokenAuthStep (finding 1): model the cold-cache dedupe on finishRefresh's compare-and-clear (only clear/complete refreshInFlight if it is still the current future), but write it against our own types; reuse Futures.unwrap for the exception-peel half — do not copy unwrapCompletionException. No verbatim lift, so no attribution needed. This is guidance for finding 1, not a separate task. + +**Considered & dropped** + +- ~~(Re-scoped, not dropped) COPY the compare-and-clear in-flight-future dedupe idiom~~ — Not dropped outright but demoted and merged into finding 1 (now LEARN, not COPY). Verified the mechanism (WorkloadIdentityAuth.kt:184-208) but: we already have the better unwrap half (Futures.unwrap, util/Futures.kt:46, used at AsyncHttpClient.kt:138), and the remaining single-flight is ~15 lines written against our own types — too small to warrant a standalone COPY finding or Apache-2.0 attribution. It is a sub-task of the async refresh work, captured as a note on finding 1. 
+- ~~Antipatterns / weAreAhead sections as findings~~: All five antipatterns verified accurate (JsonMapper+HttpClient threaded through SubjectTokenProvider.getToken at SubjectTokenProvider.kt:29; hardcoded endpoint consts TOKEN_EXCHANGE_URL at WorkloadIdentityAuth.kt:28, AZURE_IMDS_BASE_URL, GCP_METADATA_BASE_URL; no HTTPS check before bearer injection at WorkloadIdentityHttpClient.kt:18-20 and in ClientOptions.securityHeaders at ClientOptions.kt:740-745; no secret redaction in BearerTokenCredential.kt:35; our own UNCHECKED_CAST smell in BearerTokenAuthStep.kt:105-118). The weAreAhead items are verified as well (BearerToken redaction, AuthStep HTTPS check at AuthStep.kt:61, CrossOriginRedirectMarker strip at AuthStep.kt:69-76, simpler sync refresh, decoupled BearerTokenProvider SPI, full RFC 7616 DigestChallengeHandler with per-nonce nc counters/SHA-256/bounded eviction/SecureRandom cnonce). These are accurate context that already informs findings 1-6; they are not separate decision-ready actions, so they are not emitted as findings. + +**Do not copy** + +1) HARD JACKSON+OKHTTP IN CORE: SubjectTokenProvider.getToken(httpClient: HttpClient, jsonMapper: JsonMapper) (SubjectTokenProvider.kt:29) threads a concrete JsonMapper and their core HttpClient through the auth SPI. openai-java can do this because Jackson+okhttp are core deps; for us this is a direct constraint violation: our equivalent SPI (BearerTokenProvider) correctly returns a BearerToken and takes NO serde/transport, keeping sdk-core zero-dep. Do NOT add JsonMapper/HttpClient params to our provider SPI. 2) HARDCODED PRODUCTION ENDPOINTS as private consts: TOKEN_EXCHANGE_URL='https://auth.openai.com/oauth/token' (WorkloadIdentityAuth.kt:28), AZURE_IMDS_BASE_URL, GCP_METADATA_BASE_URL baked into the class. Fine for a single-API client; an antipattern for a toolkit, where any such URL must be a constructor/builder parameter. 
3) NO HTTPS CHECK BEFORE BEARER INJECTION: WorkloadIdentityHttpClient.execute (WorkloadIdentityHttpClient.kt:18-20) stamps `Authorization: Bearer` with no scheme check; ClientOptions.securityHeaders (ClientOptions.kt:745) likewise. Our AuthStep.process rejects non-HTTPS up front (AuthStep.kt:61) — do NOT regress to their looser behavior. 4) NO SECRET REDACTION: BearerTokenCredential (BearerTokenCredential.kt) exposes token() and never overrides toString; a token can leak via default stringification. Our BearerToken redacts (BearerToken.kt:55) — keep ours. 5) UNCHECKED_CAST GYMNASTICS for SAM null-handling in our own BearerTokenAuthStep.fetchFresh (BearerTokenAuthStep.kt:105-118) — the `as BearerToken?` + double @Suppress to dodge the compiler's intrinsic null check is clever but brittle; it is a smell worth a comment-backed test rather than a pattern to spread. Not from openai-java, but flagged since it sits in the compared file. + +**Where we're ahead** + +Several concrete places. (1) SECRET REDACTION: BearerToken.toString prints `token=***` (BearerToken.kt:55); openai-java's BearerTokenCredential has zero redaction. (2) HTTPS ENFORCEMENT before credential stamping: AuthStep.process throws on non-HTTPS (AuthStep.kt:61) and strips the cross-origin redirect marker so a caller credential is never replayed onto a server-chosen foreign host (AuthStep.kt:66-76, CrossOriginRedirectMarker); openai-java injects the bearer token with no scheme or cross-origin guard. (3) SIMPLER SYNC REFRESH: our BearerTokenAuthStep uses a @Volatile fast path + plain double-checked ReentrantLock (BearerTokenAuthStep.kt:77-94), avoiding their Condition + `refreshing` boolean + signalAll machinery (WorkloadIdentityAuth.kt:98-132) for the sync case — equivalent correctness, less state (we don't need waiters to block on a shared fetch because each missing-cache thread simply contends for the lock). 
(4) DECOUPLED/EXTENSIBLE AUTH SEAM: our BearerTokenProvider takes (scopes, params) and is a pluggable `fun interface` independent of any transport or serde (BearerTokenProvider.kt:40), so a consumer can back it by ANY token source; openai-java's provider SPI is welded to their HttpClient+JsonMapper. (5) RFC 7235/7616 SERVER-CHALLENGE auth (DigestChallengeHandler.kt: full RFC 7616, SHA-256/MD5-sess, per-nonce nc counters with bounded eviction, CSPRNG cnonce) — an entire auth axis openai-java does not implement at all. Net: on the OVERLAPPING surface (refreshable bearer auth), we are ahead on safety and simpler on sync; we trail ONLY on async refresh-ahead, which is finding 1. + +_Verifier notes:_ Overall the analysis is unusually accurate and well-grounded — nearly every file:line cite checks out and the central thesis is correct: our SDK already ships the token-provider SPI the prompt asks about (BearerTokenProvider fun interface + BearerToken with expiry/margin + BearerTokenAuthStep with @Volatile fast path + double-checked ReentrantLock), so the only real toolkit gap is the ASYNC refresh path. Verified directly: there is NO async auth step (fresh ls + git ls-files + grep all agree; only DefaultAsyncInstrumentationStep exists as a concrete async step), and AsyncHttpStep.kt:19-21 literally cites bearer-token refresh as its motivating use case. + +One cite caution worth recording: my very first `ls` of the auth dir transiently listed a 'WorkloadIdentityCredential.kt' that does NOT exist (confirmed absent by re-`ls`, `git ls-files`, and `grep -rln 'WorkloadIdentity' sdk-core/src` -> zero matches). The submitted analysis did NOT claim we have that file (it correctly attributes WorkloadIdentityCredential to openai-java only), so no finding rests on a false premise — but anyone re-running this should treat a one-off `ls` as unreliable and confirm with git ls-files/grep. 
+ +Two corrections to the agent's effort/framing that matter for execution: (1) Finding 4 (force-refresh on 401) is S-effort for the sync case, NOT a new M subsystem — AuthStep already provides authorizeRequestOnChallenge + the close-401-then-single-retry plumbing (AuthStep.kt:86-94,118-122); we only need BearerTokenAuthStep to override it and invalidate the cache. The cross-pillar marker-exception channel the agent describes is unnecessary and should be skipped. (2) Finding 1 should NOT 'port their TokenAction planner' wholesale — their planner exists to block waiters on a shared in-flight refresh (Condition + refreshing flag) which our sync design deliberately avoids; the only novel behavior worth lifting is BackgroundRefresh (return still-valid token instantly, refresh off-thread). The cold-cache single-flight dedupe is ~15 lines against our own types, reusing Futures.unwrap — no verbatim copy, no attribution needed (former finding 2 demoted/merged accordingly). + +Net priority for our toolkit: finding 1 (async refresh-ahead bearer auth) is the one substantive capability gap and a genuine Loom-safety concern; finding 4 (401 -> invalidate + one retry) is a real correctness gap but cheap given existing hooks; findings 3/4(docs)/5/6 are small guardrails/consistency wins. On the overlapping surface (refreshable bearer auth) we are ahead on safety (HTTPS enforcement, cross-origin guard, secret redaction) and simpler on the sync path; we trail ONLY on async refresh. + +--- + +## 14. CODEGEN RECIPE: generated model classes (builders, JsonField, unions, enums) + +**What it is** + +Stainless emits ~1122 model files / 850,299 LOC for the OpenAI surface; 837 of those files import JsonField, 262 are enum-as-open-class (`: Enum`), 183 are unions (`accept(visitor)`). 
Every emitted class is a hand-rolled immutable class (NOT a Kotlin data class) with a private constructor, a nested mutable `Builder internal constructor()`, `toBuilder()`, `@JvmStatic builder()`, generated `equals`/`hashCode`(often lazy)/`toString`, plus a `validate()`/`isValid()`/`validity()` triad. The architecture rests on ONE runtime abstraction in `core/Values.kt`: `JsonField` is a Kotlin `sealed class` with exactly two children — `KnownValue` (the typed value) and `JsonValue` (itself a sealed subclass with JsonMissing/JsonNull/JsonBoolean/JsonNumber/JsonString/JsonArray/JsonObject). Every object property is stored as `JsonField`, giving each field three orthogonal states — known-typed, raw-arbitrary-JSON, and missing/null — so an SDK older than the API can still round-trip unknown shapes losslessly. Three model archetypes: (1) plain objects (ChatCompletion, Body) — one `JsonField` per property + `additionalProperties: MutableMap` for unknown keys, dual constructors (a `@JsonCreator(mode=DISABLED)` canonical one and a `@JsonCreator` Jackson one defaulting every field to `JsonMissing.of()`); (2) unions (ChatCompletionContentPart, ChatCompletionTool, FunctionCall) — private ctor with one nullable field per variant + `_json: JsonValue?`, `@JvmStatic ofXxx` factories, a `Visitor` interface with an overridable `unknown(json)`, and a custom `Deserializer`/`Serializer` pair; (3) enums (ReasoningEffort) — open class wrapping `JsonField` with `@JvmField` constants, paired `Known` (no unknown) and `Value` (+`_UNKNOWN`) plain Kotlin enums, and `value()`/`known()`/`asString()`. Params types (ChatCompletionCreateParams) implement `core.Params` (only `_headers()`/`_queryParams()`), wrap a nested `Body` holding all JsonFields, and the OUTER `Builder` forwards every convenience setter into `body.(...)` so the request-body/header/query split is invisible to callers. 
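A minimal sketch of the dual-typed field idea described above may help. All names here (`Field`, `Known`, `Raw`, `InvalidData`, `ChatMessage`, `getOrNull`) are hypothetical, simplified stand-ins and not the openai-java declarations; the sketch also assumes only two raw variants and no Jackson annotations.

```kotlin
// Hypothetical, simplified stand-ins for the JsonField idea; not the openai-java source.
sealed class Field<out T : Any>

// The typed arm: the value parsed into the shape this SDK version expects.
class Known<out T : Any>(val value: T) : Field<T>()

// The raw arm: anything the SDK did not (or could not) type.
sealed class Raw : Field<Nothing>()
object Missing : Raw()
object Null : Raw()
data class RawString(val text: String) : Raw()
data class RawNumber(val number: Double) : Raw()

class InvalidData(message: String) : RuntimeException(message)

// Strict read: only a typed value is acceptable.
fun <T : Any> Field<T>.getRequired(name: String): T = when (this) {
    is Known -> this.value
    is Raw -> throw InvalidData("`$name` is missing, null, or not the expected type")
}

// Lenient read: missing/null collapse to null; a mismatched raw value still throws.
fun <T : Any> Field<T>.getOrNull(name: String): T? = when (this) {
    is Known -> this.value
    is Missing, is Null -> null
    is Raw -> throw InvalidData("`$name` holds raw JSON of an unexpected type")
}

// A generated-style object keeps the Field and exposes typed and raw getters side by side.
class ChatMessage(private val id: Field<String>) {
    fun id(): String = id.getRequired("id")
    fun _id(): Field<String> = id
}
```

With this shape, a payload whose `id` arrives as a number is preserved by `_id()` and only fails when a caller asks for the typed `id()`.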
+ +**How it works (line-level)** + +JsonField sealed dual-typing (Values.kt:39 `sealed class JsonField<out T : Any>`, :271 `sealed class JsonValue : JsonField<Nothing>()`, :402 `class KnownValue<T : Any>`). Read accessors collapse the three states: `getRequired` throws OpenAIInvalidDataException on missing/null (Values.kt:171-177), `getOptional` returns empty for missing/null (:179-192). Each generated object exposes both a typed getter `fun id(): String = id.getRequired("id")` (ChatCompletion.kt:88) and a raw getter `@JsonProperty("id") @ExcludeMissing fun _id(): JsonField<String> = id`. Dual constructor trick: canonical private ctor is `@JsonCreator(mode = JsonCreator.Mode.DISABLED)` (ChatCompletion.kt:36) so Jackson ignores it; a second `@JsonCreator` ctor (ChatCompletion.kt:50-80) takes `@JsonProperty(...) @ExcludeMissing field: JsonField<T> = JsonMissing.of()` per property and delegates with an empty `mutableMapOf()`. `@ExcludeMissing` (Values.kt:603-606) is `@JacksonAnnotationsInside @JsonInclude(CUSTOM, valueFilter = JsonField.IsMissing::class)`; the value filter `IsMissing.equals` returns true for any JsonMissing (Values.kt:241-246), which is how missing fields are dropped from output. additionalProperties round-trip: `@JsonAnySetter private fun putAdditionalProperty` + `@JsonAnyGetter @ExcludeMissing fun _additionalProperties()` returning `Collections.unmodifiableMap` (ChatCompletionContentPart.kt:369-377). Builder list-append idiom for `JsonField<MutableList<T>>`: `messages = (messages ?: JsonField.of(mutableListOf())).also { checkKnown("messages", it).add(message) }` (ChatCompletionCreateParams.kt:3161-3165); `checkKnown` (Check.kt:15) unwraps the KnownValue or throws if someone set the field to raw JSON. build() seals required fields and freezes lists: `checkRequired("messages", messages).map { it.toImmutable() }` (ChatCompletionCreateParams.kt:4366; checkRequired at Check.kt:11; toImmutable returns Collections.unmodifiableList at Utils.kt:17). 
UNION discriminated dispatch: deserializer reads `json.asObject()...get("type")?.asString()` then `when(type){"text"-> tryDeserialize(...)?.let{...}?: ChatCompletionContentPart(_json=json)}` (ChatCompletionContentPart.kt:273-304) — note it always stashes `_json` so an unparseable-but-known-type still survives. UNION best-match (no discriminator): builds `sequenceOf(tryDeserialize each variant)...filterNotNull().allMaxBy { it.validity() }`, then size 0→keep raw, 1→single, else→firstOrNull{isValid()} (FunctionCall, ChatCompletionCreateParams.kt:4830-4850; allMaxBy at Utils.kt:44). `validity()` is the scoring hook: a leaf object sums `if (field.asKnown().isPresent) 1 else 0` per field (ChatCompletionContentPart.File.FileObject.kt:723-726) and a literal-type discriminator scores `type.let { if (it == JsonValue.from("file")) 1 else 0 }` (:510). ENUM recipe: `class ReasoningEffort @JsonCreator private constructor(private val value: JsonField) : Enum` (ReasoningEffort.kt:22), `@JsonValue fun _value()` (:32), `@JvmField val HIGH = of("high")` (:44), `@JvmStatic fun of(value) = ReasoningEffort(JsonField.of(value))` (:48), paired `enum class Known` (:52) and `enum class Value{...,_UNKNOWN}` (:70), `known()` throws on unknown (:117), `value()` returns `_UNKNOWN` (:98). equals on enum is by wrapped value only (:171) so `of("high") == HIGH`. validate() memoizes via `private var validated` and returns `this` (apply) (ReasoningEffort.kt:142-149; same shape on every object). hashCode is eagerly computed on small/union types (ChatCompletionTool.kt:148 `Objects.hash(...)`) but `by lazy` on larger leaf objects (FileObject.kt:740-744). Params: outer setter `fun messages(messages) = apply { body.messages(messages) }` (ChatCompletionCreateParams.kt:803) forwards into the nested Body.Builder (built at :770 `private var body: Body.Builder = Body.builder()`); `_body()` returns the Body (:2098), `_headers()/_queryParams()` return additional H/Q (:2100-2102). + +**vs. 
our SDK** + +We have NO generated code and NO field-modeling runtime. Our only model style reference is hand-written: Request.kt (sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/request/Request.kt) is a `@ConsistentCopyVisibility data class private constructor` with a `RequestBuilder : Builder<Request>` (Request.kt:42,84) and a custom equals/hashCode that compares URL by external form (:52-71). Our generic builder contract is `interface Builder<T> { fun build(): T }` (sdk-core/.../generics/Builder.kt); the generated per-class Builder should implement this (openai-java's builders implement no such interface; they are nominally typed and only Java-fluent via `apply`). Our three-state field type is `Tristate<T>` (sdk-core/.../serde/Tristate.kt:35), a sealed class with `Absent`/`Null`/`Present(value)`, `fold(onAbsent,onNull,onPresent)` (:99) and `@JvmStatic` factories. Tristate maps 1:1 onto JsonMissing/JsonNull/KnownValue BUT lacks the FOURTH capability that makes JsonField powerful: a typed field cannot also hold arbitrary raw JSON of a MISMATCHED type. Tristate.Present is bounded `T : Any` and there is no `JsonValue` escape hatch, so a future-API value of the wrong shape would force a deserialization throw rather than survive in `_raw`. Our serde seam is `interface Serde { val serializer; val deserializer }` (sdk-core/.../serde/Serde.kt:18) with Jackson confined to sdk-serde-jackson, the OPPOSITE of openai-java, which hard-imports `com.fasterxml.jackson.*` directly in every model file (e.g. ChatCompletion.kt:5-15). refs-comparison.md already commits to "forward-compatible enums" with an UNKNOWN sentinel (docs/refs-comparison.md:410,465) and KotlinPoet emission of "models (immutable + Builder + @JsonDeserialize)" (:465), but it does NOT yet capture JsonField dual-typing, the validity()-scored union best-match, or the additionalProperties round-trip, all of which this analysis adds. 
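The gap between a three-state and a four-state field can be made concrete with a small sketch. `Tri` and `Quad` are hypothetical stand-ins (not our `Tristate` source and not a committed design), and `decodeCount` / `decodeCountLenient` are invented decoders for a single integer field, taking an already-parsed value as `Any?` purely for illustration.

```kotlin
// Hypothetical three-state field, shaped like the Absent/Null/Present split described above.
sealed class Tri<out T : Any> {
    object Absent : Tri<Nothing>()
    object Null : Tri<Nothing>()
    data class Present<out T : Any>(val value: T) : Tri<T>()
}

// Hypothetical four-state field: the same three arms plus a raw escape hatch.
sealed class Quad<out T : Any> {
    object Absent : Quad<Nothing>()
    object Null : Quad<Nothing>()
    data class Present<out T : Any>(val value: T) : Quad<T>()
    data class Raw(val json: Any) : Quad<Nothing>()
}

// Three states: a present value of the wrong shape has nowhere to live, so decoding throws.
fun decodeCount(present: Boolean, json: Any?): Tri<Int> = when {
    !present -> Tri.Absent
    json == null -> Tri.Null
    json is Int -> Tri.Present(json)
    else -> throw IllegalArgumentException("expected an integer, got: $json")
}

// Four states: the mismatched value is kept as-is for inspection or re-serialization.
fun decodeCountLenient(present: Boolean, json: Any?): Quad<Int> = when {
    !present -> Quad.Absent
    json == null -> Quad.Null
    json is Int -> Quad.Present(json)
    else -> Quad.Raw(json)
}
```

Whether our field model should grow a raw arm like `Quad.Raw` is the open design question; the sketch only shows why the three-state version has to throw on a wrong-shaped value.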
+ +**Recommendations (verified)** + +- **Generate unions as private-ctor + nullable-per-variant + Visitor with retained _json (NOT Kotlin sealed) — Java-8 safe by construction** `COPY` · `codegen` · effort M · confidence high + - *Verdict:* Mechanism verified exactly. ChatCompletionContentPart: private ctor with one nullable field per variant + `_json: JsonValue?` (:34-40); accept() dispatches on first non-null arm else visitor.unknown(_json) (:110-117); Visitor.unknown default throws OpenAIInvalidDataException (:265-267) so forward-compat is opt-in; @JvmStatic ofText/ofImageUrl/... factories (:214-231); equals/hashCode over the nullable arms via Objects.hash (:187-199). ChatCompletionTool is the 2-variant version (:24-29, :78-83, :192-194). The flat one-class-N-nullable-fields encoding is genuinely the right fit for us: we cannot use `sealed interface ... permits` (Java-8 banned) and this avoids it entirely. The `_json` retention on a successfully-typed variant is the clever part the author correctly highlights — even a known-discriminator-but-otherwise-mismatched payload round-trips because the raw node is kept alongside the parsed arm (Deserializer passes _json = json on BOTH success and failure paths, :280-303). The Visitor-with-default-unknown gives Java callers exhaustive matching without Kotlin `when`. Attribution note is correct (Apache-2.0; put the note in the generator template source, not emitted files). The one real divergence the author flags is right: their Deserializer/Serializer inner classes hard-use Jackson (BaseDeserializer, ObjectCodec, JsonNode), so our emitted unions must route through our Serde seam instead. Categorizing this COPY (liftable class SHAPE into a generator template) rather than ADOPT is the correct call since it's a concrete near-verbatim structure. 
+ - *Do:* Make this the canonical union template in the KotlinPoet generator: one class, N nullable backing fields + a `_raw` arm (our RawJson, not their JsonValue), @JvmStatic ofX factories, Optional-returning isX/asX accessors, accept(Visitor), and a Visitor interface whose unknown(...) default throws our InvalidData exception. Generate equals/hashCode via Objects.hash over the arms (NOT a data class). Retain the raw node on every successfully-parsed variant so unknown-shape payloads survive re-serialization. Emit the Deserializer/Serializer against our Serde SPI, not Jackson. +- **Generate enums as open-class-over-JsonField with paired Known / Value(+_UNKNOWN) enums** `ADOPT` · `codegen` · effort M · confidence high + - *Verdict:* Verified exactly against ReasoningEffort.kt. Open class (NOT `enum class`) wrapping JsonField (:22), implementing the empty marker interface `Enum` (defined at Utils.kt:100 — author called it `: Enum`, accurate). @JvmField constants NONE/MINIMAL/... = of("...") (:36-46), @JvmStatic of(String) for arbitrary values (:48), nested real enums Known (closed, :52-59) and Value (closed + _UNKNOWN, :70-81). value() returns _UNKNOWN for unrecognized strings (:90-99); known() throws OpenAIInvalidDataException (:109-118); asString() at :129; equals on the wrapped string (:166-172) so of("high")==HIGH. Java-8 clean (plain classes + plain enums). This directly satisfies refs-comparison.md:410 (verified verbatim: 'Forward-compatible enums ... emits an UNKNOWN sentinel rather than throwing') and :465 ('enums (forward-compatible with UNKNOWN)') — both already committed to in our roadmap, so this is the concrete emission recipe, not a new idea. The dual Known/Value split is a real ergonomic win the author justifies correctly: known() for callers who want compile-time exhaustiveness + accept the throw, value() for forward-compat callers who handle _UNKNOWN. 
Cost note (more code than `enum class`, 262 such classes exist) is accurate and acceptable for GENERATED code. The deserialize-never-throws / throw-only-in-known() contract is the right wiring. + - *Do:* Emit each spec enum as: open final class over our JsonField, @JvmField constants, @JvmStatic of(String), nested Known and Value(+_UNKNOWN) enums, value()/known()/asString(), equals/hashCode/toString delegating to the wrapped value. Wire deserialization to produce the wrapped raw string and NEVER throw; reserve throwing for known() and an explicit validate(). This is the direct realization of the already-committed refs-comparison.md:410/465 forward-compatible-enum line. +- **validate()/isValid() opt-in strictness triad is worth adopting; the validity()-scored best-match union path CONFLICTS with our documented oneOf stance — emit discriminated-only** `ADOPT` · `codegen` · effort M · confidence high · we partly do this + - *Verdict:* MECHANISM accurate but the author's framing INVERTS our established position and must be split. Verified: `private var validated` + memoized validate() (ReasoningEffort.kt:132-149; ChatCompletionTool.kt:85-112; File.kt:468-491), isValid() try/catch (:151-157), @JvmSynthetic internal validity() counting present-and-well-typed fields (ReasoningEffort.kt:164; File.validity() sums per-field presence + literal type=="file" check at :508-510; FileObject.validity at :723-726). allMaxBy IS at Utils.kt:44 (signature + body confirmed) and is invoked in 89 model files for no-discriminator best-match selection. The discriminated unions (ContentPart, Tool) switch on `type` and do NOT use allMaxBy — so the author's claim that codegen needs BOTH a discriminated deserializer and a best-match one is factually accurate for THEIR codebase. 
BUT the recommendation to adopt validity-scored best-match for US directly contradicts refs-comparison.md:211 (verified verbatim): 'prefer discriminator-driven; fall back to ordered candidate probing only with explicit hints; fail loudly on ambiguity. Airbyte's OneOfDeserializer ... silently picks first match ... a real data-corruption risk.' allMaxBy{validity()} IS a silent-pick-a-winner heuristic — exactly the class of behavior we already decided against. The author's 'we have no analog' is wrong: we have a documented, deliberate, OPPOSITE policy. So: weAlreadyDoIt=true for the design POSITION (our policy supersedes theirs). The validate()/isValid()/validity() machinery as an OPT-IN explicit strictness pass over leniently-deserialized data is still genuinely useful and we lack it — keep that. The benign-data-race note on `validated` (non-@Volatile boolean, worst case double-work) is accurate and acceptable; note it violates strict immutability so document or compute lazily. Antipattern #4 (validate() is NOT forward-compatible per their own KDoc at ReasoningEffort.kt:136-137, so it must stay opt-in and never auto-run in deserialize) is correct and load-bearing. + - *Do:* ADOPT the validate()/isValid() opt-in strictness triad and validity() recursive scorer (+ the ~15-line allMaxBy helper) into the generated runtime — leniently deserialize, defer strict type-checking to an explicit validate() call, never auto-validate in the deserialize path (their own KDoc warns validate() defeats forward-compat). But for union SELECTION, generate ONLY the discriminator-driven deserializer by default, consistent with refs-comparison.md:211. Do NOT make validity-scored best-match the default oneOf strategy — gate it behind an explicit spec hint and fail loudly on ambiguity, as our doc already mandates. Use validity() for diagnostics/validation, not silent variant arbitration. 
+- **Params shape: hide body/header/query split behind a flat outer Builder that forwards into a nested Body — small Params contract in sdk-core, fat classes in codegen** `ADOPT` · `both` · effort M · confidence medium + - *Verdict:* Verified. ChatCompletionCreateParams implements Params holding Body + additionalHeaders + additionalQueryParams (:67-72 region confirmed); Params interface is exactly two methods _headers()/_queryParams() (Params.kt:7-16). Outer Builder owns a Body.Builder and forwards each convenience setter: messages(...) -> body.messages(...) (confirmed in the 803 region), body(Body) for whole-body override; build() materializes _body()/_headers()/_queryParams() (confirmed in the 2090 region). The forwarding trick genuinely hides whether an OpenAPI parameter is body/header/query from the caller — the generator routes it from the spec's parameter `in:` location. Maps cleanly to refs-comparison.md:465 ('per-operation *Params classes'). The toolkit/codegen split the author proposes is the right shape: a tiny Params-like contract in sdk-core, fat generated classes in codegen output. Confidence is MEDIUM (not high) for ONE reason the author glosses: their Params contract (_headers/_queryParams returning their own Headers/QueryParams types) does NOT obviously fit OUR pipeline, which already feeds DispatchContext/RequestContext (http.context) and our own Headers — so the sdk-core Params analog must be designed to feed OUR context chain, not copied. The author flags this ('fit Params to feed those') but doesn't resolve it; the integration is where the real design work is. Double-builder roughly doubles per-op builder code (part of why this one file is 7704 lines) — acceptable in generated code only. Java-8 clean. 
+ - *Do:* Define a minimal sdk-core Params analog exposing exactly the request-construction inputs our pipeline needs, designed to feed our existing http.context (DispatchContext/RequestContext) and our Headers type — do NOT copy their Headers/QueryParams contract verbatim. Generate per-operation Params classes with flat-forwarding outer Builders over a nested Body; keep path/query/header routing in the generator, driven by the OpenAPI parameter `in:` location. Validate the contract against our context chain before committing the codegen template. +- **Adopt JsonField-style dual-typed fields as the codegen field model — extend Tristate, keep raw-JSON tree Jackson-free in sdk-core** `ADOPT` · `both` · effort L · confidence high + - *Verdict:* Every cited mechanism re-verified line by line and is accurate. JsonField is a Kotlin `sealed class` (Values.kt:39) with KnownValue (402) + JsonValue (271, itself sealed over Missing/Null/Boolean/Number/String/Array/Object). getRequired/getOptional at :171/:179 throw OpenAIInvalidDataException on type mismatch; @ExcludeMissing (:604-606) is the Jackson filter that drops JsonMissing from output. The four-state claim is correct and the gap in our Tristate is real: Tristate.Present is bounded `T:Any` (Tristate.kt:63) with NO raw-arbitrary-JSON arm, so a wrong-typed server value must throw rather than survive in `_raw`. Confirmed our sdk-core has ZERO JsonValue/RawJson/JsonNode references — we have no raw-json value tree at all. The author's key architectural correction is right and important: openai-java welds Jackson into JsonValue itself (JSON_MAPPER at Values.kt:319; convertValue at :360; fromJsonNode at :369 consumes Jackson's JsonNode), so copying their type verbatim drags Jackson into core. The mitigation (hand-rolled dep-free sealed RawJson tree in sdk-core, Jackson<->RawJson bridge in sdk-serde-jackson) is sound and is the ONLY way to honor our zero-dep constraint. 
Kotlin `sealed class` compiles to Java-8 bytecode (predates `permits`), so the encoding is Java-8-legal — the risk note is accurate. One caution the author understates: this raw tree must also be wired through the Serde seam's deserializer SPI (Serde.kt:18), not just a Jackson module, or the abstraction leaks. Effort L is right; this is the load-bearing runtime primitive everything else depends on. + - *Do:* Build a dep-free sdk-core `JsonField` sealed type with arms Known(value)/Missing/Null/Raw(node: RawJson), where RawJson is our own sealed value tree (Null/Bool/Number/String/Array/Object) with NO Jackson import. Generate every model property as JsonField with a typed accessor xxx() that throws our InvalidData equivalent and a raw _xxx() accessor that never throws. Keep Tristate as the PATCH-input convenience; map Tristate.Present/Null/Absent onto Known/Null/Missing and define JsonField as the superset. Put all Jackson<->RawJson conversion behind the sdk-serde-jackson Deserializer/Serializer SPI (Serde.kt:18), never in core. This is the prerequisite for every other codegen-model finding. +- **Do NOT mirror 850K LOC of fully-inlined per-class boilerplate — emit thin classes over a hand-written sdk-core model runtime** `SIMPLIFY` · `both` · effort L · confidence high + - *Verdict:* Quantitative claims independently re-measured and CONFIRMED: 1122 model files, 850,299 LOC, 837 import JsonField, 262 enum-as-Enum classes, 185 unions (author said 183 — a trivial 2-file undercount from grep heuristic differences, immaterial). The boilerplate-repetition claim is accurate: the additionalProperties getter/setter/putAll/remove block (~30 lines), validate/isValid/validity triad, equals/hashCode/toString, and dual @JsonCreator constructors are inlined verbatim into every object — File.validity (:508-510), FileObject.validity (:723-726), and the additionalProperties block (File.Builder :433-450, FileObject.Builder :654-674) are near-identical copies, confirming the pattern. 
Stainless tolerates this because regeneration is free and they don't gate coverage on generated code; the author correctly identifies that WE cannot — an 80% aggregate Kover floor and apiCheck over 850K generated LOC is untenable. The direction (push invariant machinery — additionalProperties handling, validate/validity combinators, JsonField read/write helpers — into a small hand-written runtime, emit only field-list + per-field accessors, target <100 LOC/model) is the correct toolkit-native simplification and is exactly what the dep-free JsonField runtime from finding #1 enables. The build-config flag (today only org.dexpace.sdk.core.testing.* is Kover-excluded; generated modules MUST be added to the aggregate-floor exclusion) is a real, must-decide-up-front item. Risk note (shared base class can leak into public API / complicate final equals) is legitimate; mitigation via free-function helpers or a stateless abstract base is sound. + - *Do:* Architect the generator to emit THIN classes over a hand-written sdk-core model runtime: a shared additionalProperties container helper, shared validate/validity combinators, and JsonField/RawJson read-write helpers (the same runtime introduced in finding #1). Target <100 LOC per generated model. Decide coverage/apiCheck policy before the first generator run: exclude generated modules from the aggregate Kover 80% floor (extend the current testing.* exclusion) and give them a separate binary-compat baseline. Prefer free-function helpers or a stateless abstract base to keep equals/hashCode and the public surface clean. +- **Split each field into typed accessor xxx() + raw escape-hatch _xxx() + arbitrary-JSON Builder overload; discriminator/const fields as defaulted raw values** `LEARN` · `codegen` · effort S · confidence high + - *Verdict:* Verified. 
ChatCompletion: typed id():String via getRequired (:88) vs raw _id() (analysis pointed to '120+'; the actual pattern is _object_() at :125 and _type-style raw getters — the typed-vs-raw split is real even if the exact line drifted). File.Builder: file(FileObject) typed (:408) AND file(JsonField) raw, KDoc'd 'primarily for setting the field to an undocumented or not yet supported value' (:410-417). Discriminator `type` emitted as JsonValue defaulting to JsonValue.from("file") in the builder (:398, :431) so the constant stays out of the canonical constructor yet remains overridable. ChatCompletionCreateParams.Builder.messages has the same typed+raw pair. The categorization as LEARN (a consequence/principle of the dual-typed field model rather than a standalone new capability) is correct — it falls out automatically once JsonField exists. The author's binary-compat caution is the sharpest and most useful point: 2 setters + 2 getters per field is a real apiCheck commitment, and our binary-compatibility-validator gate would produce enormous generated .api baselines. This correctly argues for a SEPARATE apiCheck baseline for generated modules (or excluding them) — a concrete build-config decision we must make before the first generator run. Effort S is right (it's a per-field template detail). + - *Do:* Adopt the typed+raw dual-accessor convention in the generator (typed xxx() + raw _xxx(), typed setter + JsonField-raw setter). Emit discriminator/const fields as defaulted raw values in the Builder, kept out of the canonical constructor. Decide UP FRONT how generated modules participate in binary-compatibility-validator: a dedicated generated-module .api baseline, regenerated with the code, distinct from the hand-written sdk-core baseline. 
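The dual-typed field model and the typed/raw accessor pair recommended above can be sketched roughly as follows. This is a minimal illustration, not the planned sdk-core API: `RawJson`, `JsonField`, `InvalidDataException`, and `FileRef` are hypothetical names chosen here, and the arm set simply mirrors the Known/Missing/Null/Raw split described in the findings.

```kotlin
// Hypothetical sketch: none of these types exist in sdk-core today.

/** Dependency-free raw JSON tree (deliberately no Jackson import). */
sealed class RawJson {
    object Null : RawJson()
    data class Bool(val value: Boolean) : RawJson()
    data class Num(val value: Double) : RawJson()
    data class Str(val value: String) : RawJson()
    data class Arr(val items: List<RawJson>) : RawJson()
    data class Obj(val fields: Map<String, RawJson>) : RawJson()
}

/** Stand-in for "our InvalidData equivalent". */
class InvalidDataException(message: String) : RuntimeException(message)

/** Four-state field: typed value, absent, explicit null, or wrong-typed raw JSON. */
sealed class JsonField<out T : Any> {
    data class Known<T : Any>(val value: T) : JsonField<T>()
    object Missing : JsonField<Nothing>()
    object Null : JsonField<Nothing>()
    data class Raw(val node: RawJson) : JsonField<Nothing>()

    /** Typed read: throws unless the field holds a well-typed value. */
    fun getRequired(name: String): T = when (this) {
        is Known -> value
        is Missing -> throw InvalidDataException("`$name` is not set")
        is Null -> throw InvalidDataException("`$name` is null")
        is Raw -> throw InvalidDataException("`$name` has an unexpected shape: $node")
    }
}

/** What a thin generated model might look like over that runtime. */
class FileRef(private val id: JsonField<String>) {
    /** Typed accessor: may throw on missing, null, or wrong-typed data. */
    fun id(): String = id.getRequired("id")

    /** Raw escape hatch: never throws, so unexpected server data survives. */
    fun _id(): JsonField<String> = id
}
```

Under these assumptions, deserialization would only ever construct `JsonField` arms and never throw; strictness stays confined to the typed accessor (and, later, an explicit `validate()`).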
+ +**Considered & dropped** + +- ~~(none dropped as inaccurate — all 7 findings survive verification)~~ — Every cited mechanism was re-read line by line and confirmed accurate; the quantitative claims (1122 files / 850,299 LOC / 837 JsonField / 262 enums / ~185 unions / allMaxBy at Utils.kt:44) were independently re-measured and match (union count 183 vs measured 185 is an immaterial grep-heuristic delta). No finding is already-done-by-us in a way that kills it, and all are applicable under our constraints once Jackson is kept out of core. The only substantive correction is to finding #4, which is RETAINED but split: the validate/isValid/validity triad is a genuine ADOPT, but the validity-scored best-match union path is NOT something to adopt as default because it conflicts with our documented oneOf stance (refs-comparison.md:211) — that conflict is captured in the finding's critique and recommendation rather than dropped, since the mechanism is still real and the discriminated path is worth generating. + +**Do not copy** + +1) Jackson is hard-imported into EVERY generated model and into core/Values.kt (ChatCompletion.kt:5-15, Values.kt:3-26, JSON_MAPPER at Values.kt:319) — directly violates our zero-dep-core constraint. We must keep the field model (JsonField/raw-JSON tree) Jackson-free in sdk-core and confine all Jackson to sdk-serde-jackson; copying their model verbatim would drag Jackson into core. 2) Fully-inlined boilerplate → 850K LOC / huge jar / unbounded compile time (see SIMPLIFY finding); do not mirror. 3) Each model's `private var validated` (ReasoningEffort.kt:132) is a mutable field on a nominally-immutable object and a benign data race; acceptable but note it breaks strict immutability — our explicit-API/immutability conventions would want it documented or @Volatile-free-by-design. 
4) `validate()`-not-forwards-compatible (their own KDoc, ReasoningEffort.kt:136-137) means calling validate() defeats the forward-compat the rest of the design buys — it must stay strictly opt-in, never auto-called in the deserialize path. 5) Reflection-based `functionToolFromClass`/StructuredOutputs (imported at ChatCompletionCreateParams.kt:28) pulls runtime JSON-schema reflection into the SDK — out of scope for a toolkit and a dependency we should not inherit. 6) Their builders implement NO common interface — fine for them, but we should make generated builders implement our `Builder` (generics/Builder.kt) so pipeline builder-folding can drive them. + +**Where we're ahead** + +1) sdk-core is genuinely Jackson-free: our serde seam is `interface Serde{serializer;deserializer}` (serde/Serde.kt:18) and Tristate lives in core with the Jackson mapping isolated in sdk-serde-jackson's TristateModule — openai-java cannot swap serializers at all (Jackson is welded into Values.kt and every model). 2) Our `Builder` contract (generics/Builder.kt) is a real abstraction their generated builders lack, letting generic pipeline steps fold ANY builder; theirs are nominally typed only. 3) Our Request.equals compares URL by `toExternalForm()` to avoid `java.net.URL`'s DNS-blocking equals (Request.kt:52-71) — a correctness subtlety their generated equals (plain field compare) never has to face but which shows our hand-written models are more deliberate about hazardous JDK semantics. 4) Tristate bounds Present to `T:Any` (Tristate.kt:63) to make the illegal Present(null) state unconstructible — a small type-safety win over a naive nullable wrapper. Net: our TOOLKIT layering (dep-free core + adapter serde + generic Builder) is architecturally cleaner; we are 'behind' only in that we have not yet emitted the field-model runtime (JsonField dual-typing) that their generated code depends on. + +_Verifier notes:_ Overall: this is a high-quality, accurate analysis. 
All file:line citations and all six quantitative claims verified against the actual code (1122 files, 850,299 LOC, 837 JsonField imports, 262 `: Enum` open classes, 185 unions [author 183], allMaxBy at core/Utils.kt:44 used in 89 model files, Enum is an empty marker interface at Utils.kt:100, getOrThrow at Utils.kt:13, checkRequired/checkKnown at Check.kt). Our sdk-core is confirmed to have ZERO JsonValue/RawJson/JsonNode references, so the central gap (Tristate is 3-state with no raw-arbitrary-JSON arm; Present is bounded T:Any at Tristate.kt:63) is real. The ONE material correction: finding #4 asserts 'we currently have no analog' for validity-scored best-match union deserialization and frames adopting it positively. That inverts our established position. refs-comparison.md:211 (verified verbatim) already commits to 'prefer discriminator-driven; fall back to ordered candidate probing only with explicit hints; fail loudly on ambiguity,' and explicitly flags Airbyte's silent-first-match as 'a real data-corruption risk.' OpenAI's allMaxBy{validity()} is precisely that silent-pick-a-winner heuristic. 
I retained the finding but split it: the validate()/isValid()/validity() opt-in strictness triad is a legitimate ADOPT (we lack it; lenient-deserialize + explicit-strict-pass is a good pattern, and their own KDoc at ReasoningEffort.kt:136-137 warns validate() must stay opt-in and never run in the deserialize path), while best-match union SELECTION should be flagged as conflicting with our policy and emitted discriminated-only by default. The author's most valuable contributions are the antipattern callouts that protect our constraints: (1) Jackson is hard-welded into core/Values.kt (JSON_MAPPER at :319, convertValue at :360, fromJsonNode consuming Jackson JsonNode at :369) and into every model, so copying their type verbatim drags Jackson into sdk-core; the dep-free RawJson sealed tree in core + Jackson bridge in sdk-serde-jackson is the correct and only constraint-honoring path. (2) The binary-compat / coverage-floor implications of generated code (dual accessors per field => huge .api baselines; 850K LOC can't meet an 80% aggregate Kover floor) are real build-config decisions that must be made before the first generator run: extend the current org.dexpace.sdk.core.testing.* Kover exclusion to generated modules and give them a separate apiCheck baseline. Findings #1 (dep-free JsonField runtime) and #7 (thin generated classes over that runtime) are the linchpins: they are the same runtime viewed from two angles, and #2/#3/#5/#6 all depend on #1 existing. Sequencing: build the dep-free JsonField/RawJson runtime first, then the per-archetype templates. Kotlin `sealed class` (no `permits`) compiles to Java-8 bytecode, so the sealed encoding is legal in our Java-8 modules; only `sealed interface ... permits` syntax is banned. The author got this right. + +--- + +## 15. 
CODEGEN RECIPE: services (blocking/async interface+impl split, withRawResponse) + client tree + +**What it is** + +openai-java generates, per API resource, FOUR types in TWO parallel package trees: `services/blocking/{Name}Service` (interface) + `{Name}ServiceImpl` (class), and `services/async/{Name}ServiceAsync` + `{Name}ServiceAsyncImpl`. Each interface nests a `WithRawResponse` interface; each Impl nests a `WithRawResponseImpl` class. The architecture is a strict two-tier delegation: the "cooked" Impl method does nothing but `withRawResponse().retrieve(params, requestOptions).parse()` (ModelServiceImpl.kt:43-53), while ALL real work — request building, dispatch, response handling — lives in the `WithRawResponseImpl` (ModelServiceImpl.kt:55-173). A raw method (1) `checkRequired` on positional params, (2) builds an `HttpRequest` via builder, (3) calls `.prepare(clientOptions, params, SecurityOptions...)` to fold in client/param headers+query+auth (PrepareRequest.kt:13-29), (4) `clientOptions.httpClient.execute(request, requestOptions)`, (5) runs the response through `errorHandler.handle(response).parseable { response.use { typedHandler.handle(it) }.also { validate } }` (ModelServiceImpl.kt:90-98). `parseable {}` wraps the raw `HttpResponse` as an `HttpResponseFor` whose `parse()` lazily deserializes (HttpResponseFor.kt:11-25). The root `OpenAIClient` interface (OpenAIClient.kt) is just ~30 lazy sub-service accessors + `async()`/`withRawResponse()`/`withOptions()`/`close()`; `OpenAIClientImpl` wires 48 `by lazy` service instances (OpenAIClientImpl.kt:72-134, both trees), injecting a one-time `clientOptionsWithUserAgent` (OpenAIClientImpl.kt:57-63). The whole client tree is a thin DI/lazy-init shell over `ClientOptions`, which carries the (already retry+logging-wrapped) `httpClient` (ClientOptions.kt:678-690). This is the precise blueprint our planned KotlinPoet generator must emit for the service layer. 
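A compact sketch of the two-tier shape described above, under stated assumptions: `RawResponse`, `ResponseFor`, `parseable`, `Model`, and `ModelService` here are hypothetical stand-ins written for illustration, not the openai-java types and not anything in sdk-core. The `send` function stands in for whatever dispatches the request (for us, presumably `pipeline.send(request)`).

```kotlin
import java.io.Closeable

// Hypothetical stand-in for a closeable HTTP response.
class RawResponse(
    val status: Int,
    val headers: Map<String, String>,
    private val payload: String,
) : Closeable {
    var closed = false
        private set

    fun body(): String {
        check(!closed) { "body already consumed" }
        return payload
    }

    override fun close() {
        closed = true
    }
}

/** Raw-tier result: status/headers are available without deserializing. */
interface ResponseFor<T> {
    val status: Int
    val headers: Map<String, String>
    fun parse(): T
}

/** Wraps a response so that decoding runs at most once, on first parse(). */
fun <T> RawResponse.parseable(decode: (RawResponse) -> T): ResponseFor<T> {
    val raw = this
    return object : ResponseFor<T> {
        // `use` closes the response after decoding, so parse() consumes the body.
        private val parsed: T by lazy { raw.use(decode) }
        override val status: Int get() = raw.status
        override val headers: Map<String, String> get() = raw.headers
        override fun parse(): T = parsed
    }
}

data class Model(val id: String)

class ModelService(private val send: (String) -> RawResponse) {
    private val raw = WithRawResponse()

    fun withRawResponse(): WithRawResponse = raw

    // Cooked tier: pure delegation, no logic of its own.
    fun retrieve(id: String): Model = withRawResponse().retrieve(id).parse()

    // Raw tier: the only place that builds, dispatches, and decodes.
    inner class WithRawResponse {
        fun retrieve(id: String): ResponseFor<Model> =
            send("/models/$id").parseable { Model(it.body().trim()) }
    }
}
```

In this sketch the headers remain readable after `parse()` only because they are a plain map; with a live response the body is closed by the first `parse()`, which is the lifecycle a caller would need documented.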
+ +**How it works (line-level)** + +TWO-TIER DELEGATION (the core trick): the cooked impl is a one-liner. `override fun retrieve(params, requestOptions): Model = withRawResponse().retrieve(params, requestOptions).parse()` (ModelServiceImpl.kt:43-45). Every cooked method is exactly this shape — zero logic duplicated between raw and cooked. + +RAW METHOD ANATOMY (ModelServiceImpl.kt:70-99): `checkRequired("model", params.model().getOrNull())` // positional-or-params guard with comment "We check here instead of in the params builder because this can be specified positionally or in the params class"; then `HttpRequest.builder().method(HttpMethod.GET).baseUrl(clientOptions.baseUrl()).addPathSegments("models", params._pathParam(0)).build().prepare(clientOptions, params, SecurityOptions.builder().bearerAuth(true).build())`. The DELETE variant additionally folds a body: `.apply { params._body().ifPresent { body(json(clientOptions.jsonMapper, it)) } }` (ModelServiceImpl.kt:154). + +PREPARE = the single funnel that keeps services thin (PrepareRequest.kt:18-29): `toBuilder().pathSegments(listOf()).addPathSegmentsForAzure(...).addPathSegments(*pathSegments...).putAllQueryParams(clientOptions.queryParams).replaceAllQueryParams(params._queryParams()).putAllHeaders(clientOptions.securityHeaders(security)).putAllHeaders(clientOptions.headers).replaceBearerTokenForAzure(clientOptions).replaceAllHeaders(params._headers()).build()`. Precedence is explicit: client query/headers via `putAll`, param query/headers via `replaceAll` (params win). Auth is injected as headers from `clientOptions.securityHeaders(security)` (ClientOptions.kt:736-779), NOT a pipeline step. + +PER-CALL HANDLER == our SERDE stage, cached per-method as a field: `private val retrieveHandler: Handler = jsonHandler(clientOptions.jsonMapper)` (ModelServiceImpl.kt:68). `jsonHandler` is `jsonMapper.readValue(response.body(), jacksonTypeRef())` wrapped to rethrow as `OpenAIInvalidDataException` (JsonHandler.kt:12-20). 
The error handler is a status-code `when` that throws typed exceptions (ErrorHandler.kt:46-93): 400→BadRequestException, 401→Unauthorized, 429→RateLimit, 5xx→InternalServer, else→UnexpectedStatusCode, each `.builder().headers(...).error(errorBodyHandler.handle(...)).build()`. + +RESPONSE PLUMBING (ModelServiceImpl.kt:90-98): `errorHandler.handle(response).parseable { response.use { retrieveHandler.handle(it) }.also { if (requestOptions.responseValidation!!) it.validate() } }`. Note `response.use {}` closes the body after parse; `parseable` then re-exposes status/headers/body delegating to the original. `HttpResponseFor.parseable` (HttpResponseFor.kt:11-25): anonymous `object : HttpResponseFor` with `private val parsed: T by lazy { parse() }` — deserialization is deferred until `.parse()` is first called. + +PAGE WRAPPING happens in the raw `list` (ModelServiceImpl.kt:129-135): after parsing the wire DTO `ModelListPageResponse`, it `.let { ModelListPage.builder().service(ModelServiceImpl(clientOptions)).params(params).response(it).build() }` — the page captures a fresh service instance so it can self-fetch next pages. + +LAZY SUB-SERVICE ACCESSORS (OpenAIClientImpl.kt:72-134): `private val models: ModelService by lazy { ModelServiceImpl(clientOptionsWithUserAgent) }` … `override fun models(): ModelService = models`. 48 such `by lazy` slots across cooked + WithRawResponseImpl. The WithRawResponseImpl reuses the SAME Impl class' nested type: `ModelServiceImpl.WithRawResponseImpl(clientOptions)` (OpenAIClientImpl.kt:242). + +USER-AGENT ONCE (OpenAIClientImpl.kt:57-63): `if (clientOptions.headers.names().contains("User-Agent")) clientOptions else clientOptions.toBuilder().putHeader("User-Agent", "${javaClass.simpleName}/Java ${getPackageVersion()}").build()`. The sync client is constructed with the ORIGINAL options (comment line 65: "Pass the original clientOptions so that this client sets its own User-Agent") so async and sync each stamp their own UA. 
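The lazy-accessor and User-Agent-once steps above could look roughly like this in a generated root client. All names (`ClientConfig`, `ModelsService`, `FilesService`, `RootClient`, the `version` parameter) are hypothetical, and the case-insensitive header check is an assumption made for this sketch rather than a statement about how openai-java compares header names.

```kotlin
// Hypothetical single options object threaded through the client tree.
data class ClientConfig(val headers: Map<String, String>)

class ModelsService(val config: ClientConfig)
class FilesService(val config: ClientConfig)

class RootClient(config: ClientConfig, version: String = "0.0.0") {
    // Stamp a default User-Agent exactly once, and only if the caller set none.
    private val effective: ClientConfig =
        if (config.headers.keys.any { it.equals("User-Agent", ignoreCase = true) }) {
            config
        } else {
            config.copy(headers = config.headers + ("User-Agent" to "RootClient/Kotlin $version"))
        }

    // Each sub-service is built on first access and then reused.
    private val models: ModelsService by lazy { ModelsService(effective) }
    private val files: FilesService by lazy { FilesService(effective) }

    fun models(): ModelsService = models
    fun files(): FilesService = files
}
```

`by lazy` defaults to synchronized initialization, so concurrent first calls to `models()` would still observe a single instance.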
+ +OVERLOAD EXPLOSION: `grep -c "fun retrieve"` = 12 per method in the interface alone (positional `model`, `model+params`, `model+params+opts`, `params`, `params+opts`, `model+opts`, each mirrored in WithRawResponse). Most are Kotlin default-arg convenience delegating to one canonical `(params, requestOptions)` abstract method. + +**vs. our SDK** + +OUR SDK HAS NO SERVICE LAYER AT ALL — `grep -rln "RequestOptions|interface Params|fun interface Handler|withRawResponse|responseValidation"` over `sdk-core/src/main/kotlin/` returns NONE FOUND. We stop one layer below them. Our stack: `HttpClient.execute(Request): Response` SPI (client/HttpClient.kt:46-62) + `AsyncHttpClient.executeAsync(Request): CompletableFuture` (client/AsyncHttpClient.kt:65-83); a stage-sorted `HttpPipeline.send(Request): Response` (http/pipeline/HttpPipeline.kt:41-45) that walks `HttpStep`s and dispatches to the transport. We have immutable `Request` (http/request/Request.kt:43, `@ConsistentCopyVisibility data class` + private ctor + `newBuilder()`) and `Response` (http/response/Response.kt:43, `Closeable`). MAPPING: their per-call `Handler` (deserialize + status→exception) is OUR SERDE stage + a future error-mapping stage in the pipeline; their `prepare()` header/query/auth folding is OUR AUTH + header steps + `http.context` promotion chain; their `RetryingHttpClient`/`LoggingHttpClient` wrappers (ClientOptions.kt:678-690) are OUR `DefaultRetryStep`/`DefaultInstrumentationStep` pillars (http/pipeline/steps/). CRITICAL DIFFERENCE IN ARCHITECTURE: they decorate the `HttpClient` itself (retry-wraps-logging-wraps-transport, all baked into `ClientOptions.build()`), and do auth+serde inline in the generated service; WE put retry/auth/logging/serde as ordered pillar STEPS inside the pipeline, so a generated service would call `pipeline.send(request)` and never touch retry/auth/serde itself. 
Their `ClientOptions` (ClientOptions.kt:34-140) is a 21-field god-object mixing transport, jsonMapper, auth credential, Azure config, timeouts, retries, log level — OUR equivalent is split across `config/Configuration.kt`, `http/pipeline/HttpPipelineBuilder.kt`, and the `Credential` family in `http/auth/`. Their `HttpResponseFor` raw/cooked split (HttpResponseFor.kt) has no analog in our code — and `docs/refs-comparison.md:408` already plans it ("Raw/Cooked client split … raw clients return [Response], cooked clients call .body()"). Their `Params` interface (Params.kt:7-16, just `_headers()`+`_queryParams()`) has no analog; our `docs/refs-comparison.md:399-400` instead plans an Expedia-style `*OperationParams` with `pathParams()/queryParams()/headers()` projections. Their `RequestOptions` per-call override (RequestOptions.kt) — `responseValidation` + `timeout`, with `applyDefaults` merging from client — has no analog; we'd route per-call overrides through `RequestContext` (http/context/RequestContext.kt) per `docs/refs-comparison.md:421`. + +**Recommendations (verified)** + +- **Lazy sub-service accessors via `by lazy`; root client is a DI/lazy shell over one options object, and the raw tree reuses the same nested `WithRawResponseImpl`** `COPY` · `codegen` · effort S · confidence high + - *Verdict:* Verified. OpenAIClientImpl.kt:72-134 are all `private val x: XService by lazy { XServiceImpl(clientOptionsWithUserAgent) }` + `override fun x() = x`; the WithRawResponseImpl tree (:213-303) reuses `XServiceImpl.WithRawResponseImpl(clientOptions)` rather than a parallel raw-service hierarchy — that genuinely halves the generated type count and is the non-obvious bit. The root interface (OpenAIClient.kt:45-219) is pure accessors + async()/withRawResponse()/withOptions()/close(). 48 lazy slots across sync+async impls confirmed. We have no client tree at all. `by lazy` compiles to a Java-8-safe lazy field, thread-safe by default. 
The only caveat is real and already in our docs: if/when we emit Java source, `by lazy` has no equivalent — use Square's `Suppliers.memoize` (refs-comparison.md Square section). For v1 the generator emits Kotlin, so `by lazy` is correct now. + - *Do:* Generator emits root client as: interface with `fun {res}(): {Res}Service` + `withRawResponse()`/`async()`/`close()`; impl with `private val {res} by lazy { {Res}ServiceImpl(config) }`. Reuse `{Res}ServiceImpl.WithRawResponseImpl` inside the root `WithRawResponseImpl` — do NOT emit a separate raw-service class hierarchy. Flag the `by lazy`-vs-`Suppliers.memoize` choice at the IR->emitter boundary for a future Java target. +- **Two-tier raw/cooked split: cooked method body is `withRawResponse().op(...).parse()` and nothing else; raw tier returns a lazy `HttpResponseFor`** `COPY` · `codegen` · effort M · confidence high + - *Verdict:* Verified line-by-line. ModelServiceImpl.kt:43-53 are pure one-liners (`override fun retrieve(params, requestOptions) = withRawResponse().retrieve(params, requestOptions).parse()`); ALL dispatch lives in the nested WithRawResponseImpl (:55-173). HttpResponseFor.kt:10-25 is the real trick: `parseable {}` returns an anon `HttpResponseFor` whose `parse()` is backed by `private val parsed: T by lazy { parse() }`, delegating statusCode/headers/body/close to the live response. The async mirror does the same via `.thenApply { it.parse() }` (ModelServiceAsyncImpl.kt:49). We have NO service layer and NO ResponseFor analog (grep for `parseable`/`ResponseFor`/`withRawResponse` over sdk-core returns nothing). Trim two pieces of hype: (1) 'cleaner than two parallel method bodies' — the naive alternative is a strawman; the real benefit is that headers/status access skips deserialization. (2) The claim is sound but note our `Response` is ALREADY `Closeable` (Response.kt) so the raw tier is nearly free — raw returns the Response-backed wrapper, cooked calls parse(). 
One subtlety to copy carefully: their `parse()` consumes the body inside `response.use { ... }` (ModelServiceImpl.kt:91-92), so calling parse() closes the response; a caller who wants raw headers must read them BEFORE parse(). Document that lifecycle. + - *Do:* Add to sdk-core a thin `ResponseFor` (extends/wraps our `Response`, adds `parse(): T`) plus an internal `Response.parseable(parse: () -> T)` using `by lazy`. Generator emits per resource: `{Res}Service` interface with nested `WithRawResponse`; impl whose cooked methods are exactly `withRawResponse().m(...).parse()`; a `WithRawResponseImpl` holding the real dispatch. The dispatch step body is: build Request -> `pipeline.send(request)` -> wrap as `parseable { serde.decode(response, typeToken) }`. Document the 'parse() consumes/closes the body; read raw headers first' contract in docs/codegen.md. Attribution: note openai-java (Apache-2.0) in the generator template comment if the `parseable`/`ResponseFor` shape is lifted near-verbatim. +- **Define a minimal `OperationParams` SPI (headers/query/path/body projections) as the toolkit<->generated-params contract that makes dispatch generic** `ADOPT` · `sdk-core` · effort S · confidence high + - *Verdict:* Verified: Params.kt:7-16 is exactly `interface Params { _headers(): Headers; _queryParams(): QueryParams }`; path/body reached by convention (`params._pathParam(0)` ModelServiceImpl.kt:81, `params._body()` :154). It is the contract that lets ONE prepare/dispatch path serve every operation. We have no analog (grep `interface Params` over sdk-core = none), and our docs already plan a richer Expedia-style `*OperationParams` with `pathParams()/queryParams()/headers()` (refs-comparison.md). This is the FOUNDATIONAL precondition for the thin-service design (findings #1/#2/#3 all assume a generic params contract). 
Two corrections to the analysis: (1) it says the SPI returns 'our Headers/QueryParams types' — our `QueryParam` is currently a stub (`internal class QueryParam { TODO() }`), so this SPI cannot be finalized until that type exists; sequence accordingly. (2) Correctly avoid the `_`-prefix pseudo-privacy (see antipatterns) — use clear public projection names per explicit-API rules. Keep it dependency-free (returns our own types, never Jackson nodes). + - *Do:* Add an `OperationParams` SPI to sdk-core exposing `headers(): Headers`, `queryParams(): `, `pathParams(): List` (or indexed), `body(): RequestBody?` — public, explicit-API-clean, zero-dep. Finalize the `QueryParam(s)` type first (it is a stub today). Generated `*Params` classes implement it; the generic `RequestDefaultsStep`/dispatch folds these. This unblocks the single-funnel + thin-service findings. +- **`withOptions(Consumer)` returns a NEW immutable service/client with modified options; original untouched** `ADOPT` · `both` · effort M · confidence medium + - *Verdict:* Verified: ModelServiceImpl.kt:40-41 and OpenAIClientImpl.kt:140-141 both do `Impl(clientOptions.toBuilder().apply(modifier::accept).build())`; interface KDoc states 'The original service is not modified.' (ModelService.kt:28-29). Genuinely useful and fits our immutability discipline. The stated precondition is REAL and stronger than the analysis implies: I checked — our `Configuration` has NO `toBuilder()`, and `ConfigurationBuilder` (config/ConfigurationBuilder.kt:21) is a fresh-only builder with no `from()`. There is no single immutable client-config object to clone; config is genuinely split across Configuration + HttpPipelineBuilder + the Credential family. So `withOptions` has nothing coherent to clone today. This is an ADOPT but gated behind consolidating a client-config type with `toBuilder()`. `java.util.function.Consumer` is Java-8 fine. 
Downgraded to medium confidence because the value depends on a config-consolidation decision we have not made and that cuts against our deliberate split-config design — flag that tension rather than presenting it as free. + - *Do:* Precondition (decision needed): introduce a single immutable client-config type with `toBuilder()`/`from()` that the generated client tree threads. ONLY then have the generator emit `withOptions(Consumer): {Res}Service` on every service + root client as `Impl(config.toBuilder().apply(modifier::accept).build())`. If we keep split config, drop this finding — withOptions is incoherent without a single cloneable config. Document the 'original not modified' contract. +- **Per-call `RequestOptions` (responseValidation + timeout) merged via `applyDefaults(from client)` — route per-call overrides through a context, not a parallel options type** `ADOPT` · `both` · effort M · confidence medium + - *Verdict:* Verified: RequestOptions.kt:5-55 carries `responseValidation: Boolean?` + `timeout: Timeout?`, defaults to `none()`, and `applyDefaults` does call-wins for validation (`responseValidation ?: options.responseValidation`) and timeout composition (`timeout.assign(options.timeout)`). Called at ModelServiceImpl.kt:88 `requestOptions.applyDefaults(RequestOptions.from(clientOptions))`, then `requestOptions.responseValidation!!` at :94 (the `!!` only safe because applyDefaults populated it). The merge-with-defaults semantic is the part worth lifting precisely (off-by-one config bugs live there). Two important corrections for our side: (1) Our transport SPI `HttpClient.execute(request)` takes ONLY a Request (HttpClient.kt:51) — there is NO per-call options channel into the transport, so a RequestOptions analog cannot ride the transport; it must ride the pipeline. (2) `RequestContext` (http/context/RequestContext.kt:24-28) is currently INSTRUMENTATION-ONLY (holds instrumentationContext + request + callKey, no timeout/validation). 
However our OTHER pipeline layer already has the seam: the recovery-aware `ExecutionPipeline.execute(request, context)` (ExecutionPipeline.kt:18,102) threads a context. So the per-call carrier exists conceptually; the work is choosing ONE carrier and adding timeout/validation/ad-hoc-header fields to it. Avoid their `Boolean?` + `!!` pattern — explicit-API favors non-null fields with documented defaults. Medium confidence because which context object is THE per-call carrier (RequestContext vs the execution-pipeline context) is an unmade design decision. + - *Do:* Pick ONE per-call carrier (likely extend `RequestContext`, or reuse the execution-pipeline `context`) and add per-call overrides: timeout, response-validation, ad-hoc headers, with an `applyDefaults(clientDefaults)` merge that copies the call-wins + timeout-composition semantics from RequestOptions.kt:23-29. Model fields as non-null with documented defaults (no `Boolean?` + `!!`). Generated service methods take an optional trailing carrier and merge before `pipeline.send()`. Do NOT introduce a parallel RequestOptions type alongside RequestContext (duplicate-config smell). +- **Single canonical method + convenience-overload fan-out (~12 per op) — copy the single-impl-point principle, deliberately emit FEWER overloads than Stainless** `SIMPLIFY` · `codegen` · effort M · confidence high + - *Verdict:* Verified exactly: `grep -c 'fun retrieve' ModelService.kt` = 12, and only ONE per operation is abstract (e.g. ModelService.kt:50-53 `retrieve(params, requestOptions): Model`); the other 11 are interface default methods normalizing to it (e.g. :36,46,56,59). The cross-product is real: 12 overloads x raw/cooked x sync/async ~= 48 signatures per operation. 
The analysis's direction is correct and important for US specifically: explicit-API strict mode forces an explicit return type on every emitted overload, and the binary-compatibility-validator freezes every one into `api/*.api` permanently — so over-generation is a self-inflicted maintenance tax we, unlike Stainless, pay forever. This is the rare 'theirs is more elaborate; do NOT follow' SIMPLIFY. Pre-1.0 is the time to set a lean policy. + - *Do:* Generator emits exactly ONE abstract canonical method per op `(params, RequestOptions/RequestContext)` plus a SMALL curated overload set (e.g. `(params)`, and a positional convenience only when there is a single required path param). Skip the positional x params x opts cross-product. Make overload breadth a generator config knob. Lean on Kotlin default args for the Kotlin surface; emit explicit Java overloads only for the curated set. Keep `api/*.api` lean. +- **Page object captures a fresh service instance to self-fetch next pages — we already do this BETTER via strategy + Page.nextPageRequest()** `LEARN` · `both` · effort S · confidence high · we partly do this + - *Verdict:* Their claim is accurate (ModelServiceImpl.kt:129-135 builds `ModelListPage` with `.service(ModelServiceImpl(clientOptions)).params(params).response(it)`; async at :144-151 also captures an executor). But this is a place WE ARE AHEAD, which the analysis correctly identifies. Verified our design: `Paginator` takes an `HttpClient` directly (Paginator.kt:86-93) and is stateless; `Page` is an interface exposing `items`/`hasNext`/`nextPageRequest(): Request?` (Page.kt:33-53) — a thin cursor+items holder that yields the next Request, never a service handle. `PaginatorIterator.advance()` (Paginator.kt:200-231) computes `nextRequest = page.nextPageRequest()` and closes each response after `strategy.parse()`, so no response/service is retained. 
Their service-capture means every page pins config+transport (a retention surprise for long-lived page refs); our strategy seam avoids it. So weAlreadyDoIt=true -> this is a LEARN that VALIDATES our approach, not an ADOPT. Only action: make the generator wire paginated ops to Paginator + a strategy, emitting Page impls that hold (items, hasNext, nextPageRequest) — explicitly do NOT copy Stainless service-capture. + - *Do:* No core change needed. Generator wires paginated operations to the existing `Paginator` + `PaginationStrategy` (pagination/), emitting `Page` implementations that return a next-page `Request` rather than capturing a service. Document the `Page.nextPageRequest()` contract as the generated-page target. Keep this as a guardrail note in docs/codegen.md ('do not embed a service in pages'). +- **A single `prepare()` funnel folds client+param headers/query/auth so generated service methods stay ~10 lines** `LEARN` · `both` · effort M · confidence high + - *Verdict:* Mechanism verified (PrepareRequest.kt:13-29) but the analysis's precedence summary ('client put, params replace') is incomplete. Actual order on the builder: clear path segments -> Azure path segments -> re-add path segments -> putAllQueryParams(client) -> replaceAllQueryParams(params) -> putAllHeaders(securityHeaders) -> putAllHeaders(client.headers) -> replaceBearerTokenForAzure -> replaceAllHeaders(params). So for HEADERS the layering is security < client < params (client overrides auth header, params override both), interleaved with Azure-specific bearer replacement; the analysis collapsed this to 'client put then params replace' and omitted that security headers go in FIRST and can be overridden by client headers. 
A real ARCHITECTURAL divergence the analysis understates: their HttpRequest is built from path-segments + baseUrl folded in at prepare; OUR `Request` already carries a fully-resolved `java.net.URL` (Request.kt:43, `url(String)`/`url(URL)` at :137,148) and our `QueryParam` is a stub (`internal class QueryParam { TODO() }`). So a literal `prepare(baseUrl, segments, query)` does not map onto our model — the toolkit funnel for us is 'merge default+per-call headers/query onto an already-formed Request', and query-param merging needs the QueryParam type finalized first. Lesson stands: expose ONE merge point so generated code never hand-merges; the natural home given our pillar-step design is a `RequestDefaultsStep` + the existing AUTH pillar. + - *Do:* Adopt option B (pipeline-native), not their literal prepare(). Generated service builds a bare Request (already-resolved URL or URL template the generator fills) and calls `pipeline.send()`. Add a `RequestDefaultsStep` that merges client-level default headers/query with documented precedence (default < per-call), and let the AUTH pillar inject credentials — generated services MUST NOT stamp auth/defaults. Precondition: finalize `QueryParam`/a `QueryParams` type (currently a stub) before query-merge can be generic. Capture the precedence table in docs. Do NOT copy `prepare`'s reflective Azure model-name extraction (see antipatterns). +- **Deserialize + error-map are cached `Handler` fields on the impl; in our design these are pipeline stages, so the generator emits only a type token** `LEARN` · `both` · effort M · confidence high + - *Verdict:* Accurate on their side: `jsonHandler(jsonMapper)` and `errorHandler(...)` are stored as fields (ModelServiceImpl.kt:58,68,101,139) and chained per call as `errorHandler.handle(response).parseable { response.use { typedHandler.handle(it) } }` (:90-98). 
JsonHandler.kt:12-20 hard-depends on Jackson (`jsonMapper.readValue(body, jacksonTypeRef())`); ErrorHandler.kt:46-92 is a hardcoded status `when` (400/401/403/404/422/429/5xx/else) building typed exceptions inline. Both FORBIDDEN in our zero-dep core. BUT the analysis's comparison overclaims that 'we have a SERDE pillar' — I checked http/pipeline/steps/ and there is NO serde step (only Auth/Retry/Redirect/Instrumentation/SetDate); CLAUDE.md mentions a SERDE pillar but it is not implemented in code. So the honest framing is: their per-method Handler maps onto a SERDE stage we have NOT built yet + an error-mapping stage. Crucially, we ALREADY have the error-mapping half done better than them: `HttpExceptionFactory.fromResponse(response)` (HttpExceptionFactory.kt:74-102) is a generic status->typed-subclass dispatch over a 16-subclass `HttpException` family whose base derives `retryable` once from `RetryUtils.isRetryable` (HttpException.kt:72) and whose KDoc explicitly anticipates codegen subclasses (HttpException.kt:25-26). The analysis cited only `HttpResponseException` (the IOException-based sibling) and missed the richer family. Lesson: build the SERDE stage (Serde SPI in adapter), keep error-mapping as a generic stage reusing HttpExceptionFactory; generator emits only the success type token. + - *Do:* Generated service passes an abstract type token (NOT a Jackson jacksonTypeRef) to the pipeline; a new SERDE pillar resolves it via the installed `Serde` SPI (decode lives in sdk-serde-jackson, never core). For errors, wrap `HttpExceptionFactory` in a generic error-mapping pipeline step (status range -> existing `HttpException` subclass), deferring typed error-body deserialization to the Serde adapter. Correct the internal note that a SERDE pillar already exists — it does not yet; this finding is partly 'build the stage we claim to have.' 
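The last finding above (type token in, generic SERDE and error-mapping stages out) can be sketched as follows. This is a minimal illustration under stated assumptions, not sdk-core code: `TypeToken`, `Serde`, `ToySerde`, `mapError`, `decodeOrThrow`, and the `Sketch*Exception` classes are all hypothetical names, and the status mapping only stands in for the real `HttpExceptionFactory` dispatch.

```kotlin
// Sketch only: every type below is hypothetical and stands in for SDK types.

/** Abstract type token a generated service would pass instead of a Jackson type reference. */
class TypeToken<T : Any>(val type: Class<T>)

/** Stand-in for a Serde SPI; a real adapter would live outside the zero-dependency core. */
interface Serde {
    fun <T : Any> decode(body: String, token: TypeToken<T>): T
}

/** Simplified stand-ins for a typed exception family keyed by status. */
open class SketchHttpException(val status: Int) : RuntimeException("HTTP $status")
class SketchNotFoundException : SketchHttpException(404)
class SketchServerException(status: Int) : SketchHttpException(status)

/** Generic error-mapping stage: status -> typed exception, or null on success. */
fun mapError(status: Int): SketchHttpException? = when (status) {
    in 200..299 -> null
    404 -> SketchNotFoundException()
    in 500..599 -> SketchServerException(status)
    else -> SketchHttpException(status)
}

/** Generic decode stage: map errors first, then resolve the token through the installed Serde. */
fun <T : Any> decodeOrThrow(status: Int, body: String, token: TypeToken<T>, serde: Serde): T {
    mapError(status)?.let { throw it }
    return serde.decode(body, token)
}

/** Toy Serde that understands only String and Int bodies, to keep the sketch runnable. */
object ToySerde : Serde {
    override fun <T : Any> decode(body: String, token: TypeToken<T>): T {
        val raw: Class<*> = token.type
        return when (raw) {
            String::class.java -> token.type.cast(body)
            Int::class.javaObjectType -> token.type.cast(body.trim().toInt())
            else -> throw IllegalArgumentException("unsupported type: ${raw.name}")
        }
    }
}

fun main() {
    // What a generated service method would reduce to: one token plus one call.
    val count: Int = decodeOrThrow(200, "42", TypeToken(Int::class.javaObjectType), ToySerde)
    println(count)

    // A 404 never reaches the Serde; the generic stage throws the typed exception.
    val failure = runCatching { decodeOrThrow(404, "", TypeToken(String::class.java), ToySerde) }
    println(failure.exceptionOrNull() is SketchNotFoundException)
}
```

The generated code in this shape references neither a JSON library nor a status table, so both can change without regenerating services.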
+ +**Considered & dropped** + +- ~~`clientOptionsWithUserAgent` injected once at the root, then threaded into every service unchanged~~ — Claim is accurate (OpenAIClientImpl.kt:57-66: UA computed once only if user hasn't set User-Agent; original options passed to the sibling async client at :66 so each stamps its own UA; X-Stainless-* telemetry added once in ClientOptions.build() at :634-643). But as a finding it is thin and largely confirmatory: it restates standard layering ('stamp cross-cutting default headers once at construction, not per call') that we already implement via the existing ClientIdentityStep pillar, and its only actionable content ('generator must not emit UA into services') is already covered by the broader pillar-steps principle captured in the 'weAreAhead' synthesis and finding #3. The one genuinely cute detail (pass ORIGINAL options to the sibling client so sync/async UAs differ) is a micro-optimization that only matters if we generate sibling sync+async root clients sharing one options object, which is downstream of decisions in findings #4/#5. Not decision-ready on its own; folded into #4 (client tree emission) and the weAreAhead note. Dropping to keep the set substantive. + +**Do not copy** + +1. ERRORPRONE DEPENDENCY INSIDE CORE: `import com.google.errorprone.annotations.MustBeClosed` (ModelService.kt:5, applied to all 17 raw methods). This drags a third-party annotation lib into the generated SERVICE layer purely to make `@MustBeClosed` static-analysis fire on raw `HttpResponseFor` methods. For us this violates sdk-core's zero-dep rule. If we want the 'raw response must be closed' signal, use a KDoc contract + our own marker, or rely on our `Response : Closeable` + `use {}` discipline — do NOT pull errorprone into core or into generated code. + +2. 
REFLECTION IN THE HOT REQUEST PATH: `Params.modelNameOrNull()` (PrepareRequest.kt:42-56) does `this::class.declaredFunctions.find { it.name == "model" }?.call(this)` on EVERY request to extract the model name for Azure path routing, swallowing all exceptions. Reflection-per-request is both a perf cost and a swallow-everything correctness hazard (docs/refs-comparison.md:447 already bans 'Reflection-driven serialization'). Our generated params expose typed projections (the `OperationParams` SPI above) — the path/model value must be a typed field, never reflected. + +3. 21-FIELD `ClientOptions` GOD-OBJECT (ClientOptions.kt:34-140): transport + jsonMapper + credential + Azure config + 3 timeouts + retries + log level + organization/project/webhookSecret all in one class, with provider-specific concerns (Azure, OpenAI org/project) hardcoded. For a TOOLKIT this is the wrong shape — provider specifics must not exist in core, and mixing transport/serde/auth/retry into one options bag couples concerns our pipeline deliberately separates into steps. Keep our split config; the generator should target a neutral config seam, not replicate this bag. + +4. OVERLOAD CROSS-PRODUCT (12 per op, see finding): 48 signatures/op across raw×cooked×sync×async is API-surface bloat our binary-compat validator must freeze forever. Antipattern for us specifically because of explicit-API + apiCheck. Emit a curated subset. + +5. AUTH AS INLINE HEADER-STAMPING, NOT A COMPOSABLE STEP (ClientOptions.securityHeaders ClientOptions.kt:736-779, called inside `prepare`): auth is a big `when (credential is ...)` that mutates headers during request prep, with the challenge/refresh story bolted onto wrapper HttpClients. This is less composable than our `AuthStep` pillar with `handleChallenge` (refs-comparison.md:181). Don't let generated services stamp auth inline; keep it in the AUTH pillar. + +6. 
`responseValidation!!` NON-NULL ASSERTION (ModelServiceImpl.kt:94, also `requestOptions` shadowing the param at :88) — works only by convention that `applyDefaults` populated it. Generated code asserting non-null on a nullable config field is fragile; our explicit-API discipline should make these fields non-null by construction. + +7. STAINLESS `_`-PREFIX PSEUDO-PRIVACY (`_headers()`, `_body()`, `_pathParam()`): these read as 'internal' but are public Java-callable methods. Our conventions mandate real `internal`/`@JvmSynthetic` — don't emit `_`-prefixed public API as a privacy signal. + +**Where we're ahead** + +Our PIPELINE-AS-STEPS architecture is structurally cleaner than their decorate-the-HttpClient + inline-in-service approach for the cross-cutting concerns. They bake retry/logging into wrapper `HttpClient`s at `ClientOptions.build()` (RetryingHttpClient wraps LoggingHttpClient wraps transport, ClientOptions.kt:678-690) AND do auth+serde inline in every generated service — so those concerns are split across two unrelated mechanisms and partly duplicated into generated code. We unify ALL of it (RETRY/AUTH/LOGGING/SERDE) as ordered pillar steps in one pipeline (http/pipeline/steps/), so a generated service stays genuinely thin (build Request → `pipeline.send()`) and never references retry/auth/serde. Concretely ahead: (a) our `HttpClient`/`AsyncHttpClient` are zero-dep `fun interface` SPIs with a documented close/cancellation contract (client/HttpClient.kt:46-62, AsyncHttpClient.kt:65-83); theirs is a Jackson+okhttp-coupled core. (b) Our `Request`/`Response` are `@ConsistentCopyVisibility data class` + private ctor + `newBuilder()` (Request.kt:43, Response.kt:43) — their `HttpRequest` is a hand-rolled builder with no value semantics and `HttpResponse` exposes a raw `InputStream` (HttpResponse.kt:21) rather than our seam-able `ResponseBody`/`Source`. 
(c) Our `HttpResponseException` precomputes `isRetryable` at construction (HttpResponseException.kt) and is designed as the `open` base for a typed-exception family — a cleaner foundation than their flat status-`when` that rebuilds typed exceptions inline per service (ErrorHandler.kt:46-93). (d) We hold the line on zero-dep core; their service+core layers hard-depend on Jackson + errorprone + okhttp, which is exactly the coupling our toolkit exists to avoid. The ONE thing they have and we don't yet — a generated service/client layer at all — is a gap to fill, not a place they're architecturally better. + +_Verifier notes:_ All 6 reference files plus PrepareRequest.kt, HttpResponseFor.kt, Params.kt, RequestOptions.kt, ErrorHandler.kt, JsonHandler.kt, ClientOptions.kt re-read line-by-line; every openai-java claim in the analysis is ACCURATE. Confirmed our SDK has NO service layer, no Params SPI, no ResponseFor/parseable, no withOptions, no responseValidation (exact grep over sdk-core returns nothing but substring noise in retry/instrumentation files). + +CORRECTIONS to the analysis worth surfacing to the parent: +1. The analysis comparison asserts 'we have a SERDE pillar' and lists `DefaultRetryStep`/`DefaultInstrumentationStep` as existing pillars — but http/pipeline/steps/ contains NO serde step (only Auth/Retry/Redirect/Instrumentation/SetDate). The SERDE stage is aspirational (CLAUDE.md mentions it; code does not implement it). Finding #3 is therefore partly 'build the stage we claim to have,' not 'mirror an existing one.' +2. The analysis cites only `HttpResponseException` (http/response/, extends IOException) as our typed-exception base. 
The stronger asset it MISSED is `HttpException` (http/response/exception/, extends RuntimeException) with a full 16-subclass per-status family + `HttpExceptionFactory.fromResponse()` generic dispatch (HttpExceptionFactory.kt:74-102) whose base derives `retryable` once from RetryUtils (HttpException.kt:72) and whose KDoc explicitly anticipates codegen subclasses (HttpException.kt:25-26). This is a real head start on the error-mapping stage and on the per-operation typed exceptions in refs-comparison.md. +3. Architectural divergence the analysis understated: their `HttpRequest` is path-segments + baseUrl folded at `prepare()`; our `Request` carries a fully-resolved `java.net.URL` (Request.kt:43; url(String)/url(URL) at :137,148) and `QueryParam` is a STUB (`internal class QueryParam { TODO() }`). A literal `prepare()` does not map onto our model, and the Params SPI / RequestDefaults merge cannot finalize query handling until QueryParam(s) is built. Finding #9 (Params SPI) and #2 (prepare funnel) both depend on finalizing that stubbed type — sequence it first. +4. `withOptions` (#5) precondition is firmer than stated: `Configuration` has no `toBuilder()` and `ConfigurationBuilder` has no `from()`, so there is no single cloneable client-config object today; the finding is gated on a config-consolidation decision that cuts against our deliberate split-config design. +5. Per-call overrides (#10): `HttpClient.execute(request)` takes ONLY a Request (no options channel), so any RequestOptions analog must ride the pipeline, not the transport. `RequestContext` is currently instrumentation-only, but the recovery-aware `ExecutionPipeline.execute(request, context)` already threads a context seam — the decision is which context becomes THE per-call carrier. 
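Correction 5 leaves open which context becomes the per-call carrier. Whichever is chosen, the merge itself could look roughly like the sketch below. `CallSettings`, `CallOverrides`, and their fields are hypothetical, the single `Duration` timeout simplifies the per-phase timeout composition, and the default values are placeholders. Nullability is confined to the override type, so the resolved settings never need `!!`.

```kotlin
import java.time.Duration

// Hypothetical names throughout; this is not the SDK's RequestContext.

/** Resolved settings: every field is non-null with a documented (placeholder) default. */
data class CallSettings(
    val timeout: Duration = Duration.ofSeconds(30),
    val validateResponse: Boolean = false,
    val headers: Map<String, String> = emptyMap(),
)

/** Per-call overrides: null means "the caller did not set this". */
data class CallOverrides(
    val timeout: Duration? = null,
    val validateResponse: Boolean? = null,
    val headers: Map<String, String> = emptyMap(),
) {
    /** Call-wins merge; per-call headers replace client defaults with the same name. */
    fun applyDefaults(defaults: CallSettings): CallSettings = CallSettings(
        timeout = timeout ?: defaults.timeout,
        validateResponse = validateResponse ?: defaults.validateResponse,
        headers = defaults.headers + headers,
    )
}

fun main() {
    val clientDefaults = CallSettings(headers = mapOf("X-Client" to "sdk"))
    val perCall = CallOverrides(
        timeout = Duration.ofMinutes(5),
        headers = mapOf("X-Trace" to "abc"),
    )
    // Merge once, before the request enters the pipeline.
    val resolved = perCall.applyDefaults(clientDefaults)
    println(resolved)
}
```

Because the merge returns a fully populated `CallSettings`, a pipeline step can read `resolved.timeout` directly, and the transport signature stays `execute(Request)`.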
+ +ANTIPATTERNS (all 7 in the analysis verified accurate): errorprone @MustBeClosed dragged into the generated service layer (ModelService.kt:5,126 etc.); reflection-per-request in `Params.modelNameOrNull()` (PrepareRequest.kt:42-56, swallows all exceptions); 21-field ClientOptions god-object with Azure/org/project hardcoded (ClientOptions.kt:34-140); overload cross-product (~12/op); auth as inline header-stamping in `securityHeaders` (ClientOptions.kt:736-779) wrapped onto decorator HttpClients rather than a composable step; `responseValidation!!` non-null assertion + param shadowing (ModelServiceImpl.kt:88,94); `_`-prefix pseudo-privacy on public Java-callable methods (Params.kt). Our conventions (zero-dep core, pillar steps, explicit-API, HttpExceptionFactory) already avoid all seven — these are guardrails for the generator, not changes to make. + +'weAreAhead' is broadly correct (pipeline-as-steps unifies retry/auth/logging/serde vs their split decorate-HttpClient + inline-in-service; zero-dep fun-interface SPIs; value-semantic Request/Response; strategy-based pagination that avoids service-capture). Two caveats: it overstates the SERDE pillar (not built) and undersells our exception story by citing the weaker of our two exception bases." + +--- + +## 16. okhttp transport implementation + +**What it is** + +openai-java's okhttp module is a thin adapter (one file, OkHttpClient.kt, 357 lines) plus two near-identical generated builder shells (OpenAIOkHttpClient/Async, ~460 lines each, 95% boilerplate Optional-overloads forwarding to ClientOptions.Builder). The adapter implements ONE interface, HttpClient (sync execute + async executeAsync on the same type, HttpClient.kt:7), translating HttpRequest↔okhttp3.Request and okhttp3.Response→HttpResponse via top-level extension functions (toRequest/toUrl/toRequestBody/toHttpResponse at OkHttpClient.kt:235-356). 
Crucially their architecture pushes retry, logging, lifecycle-safety, and per-request timeout OUT of the transport and into DECORATORS in core/http (RetryingHttpClient, LoggingHttpClient, PhantomReachableClosingHttpClient, WorkloadIdentityHttpClient) plus a RequestOptions param threaded through the SPI. The transport stays dumb: it only knows how to send one request and, per call, rebuild the OkHttpClient to apply per-request timeout overrides (newCall at OkHttpClient.kt:92-105). Our sdk-transport-okhttp is structurally richer and safer: OkHttpTransport implements BOTH HttpClient and AsyncHttpClient with explicit ownership-aware close(), interrupt restoration, four dedicated body adapters (Request/Response/SdkRequestBody/FileRequestBody), a RestrictedHeaders drop-list, and a proxy/non-proxy-host selector — but it omits two things they have: a per-request timeout seam and a phantom-reachable leak safety net, and it carries one latent correctness bug they explicitly guard. + +**How it works (line-level)** + +PER-REQUEST TIMEOUT VIA CLIENT REBUILD: OkHttpClient.newCall (OkHttpClient.kt:92-105) does `val clientBuilder = okHttpClient.newBuilder(); requestOptions.timeout?.let { clientBuilder.connectTimeout(it.connect())...callTimeout(it.request()) }; val client = clientBuilder.build(); return client.newCall(request.toRequest(client))`. okhttp's newBuilder() is a cheap shallow copy (shares dispatcher/pool/interceptor lists), so a per-call rebuild is acceptable; it is ONLY entered to apply a per-request Timeout — note the rebuilt `client` is also passed into toRequest() to inject X-Stainless-Read-Timeout/X-Stainless-Timeout headers (OkHttpClient.kt:244-259). EMPTY-BODY GUARD: toRequest (OkHttpClient.kt:235-239) `var body = body?.toRequestBody(); if (body == null && requiresBody(method)) body = "".toRequestBody()` — requiresBody returns true for POST/PUT/PATCH (OkHttpClient.kt:265-271). 
I disassembled okhttp-jvm-5.2.1 Request$Builder.method: at bytecode offset 64 it calls HttpMethod.requiresRequestBody and athrows IllegalArgumentException when the method needs a body but body==null — so this guard is load-bearing. STREAMING BODY (no buffering): toRequestBody (OkHttpClient.kt:283-296) returns an anonymous okhttp RequestBody whose `isOneShot() = !repeatable()` and `writeTo(sink) = writeTo(sink.outputStream())` — streams through an OutputStream shim, never buffers. RESPONSE STREAMING: toHttpResponse (OkHttpClient.kt:338-350) `body(): InputStream = body!!.byteStream()`, `close() = body!!.close()` — lazy stream, no buffering. ASYNC: executeAsync (OkHttpClient.kt:57-84) creates CompletableFuture, call.enqueue with a Callback completing/completingExceptionally, then `future.whenComplete { _, e -> if (e is CancellationException) call.cancel(); request.body?.close() }`. RETRY DECORATOR re-executes the SAME HttpRequest object in a while(true) loop (RetryingHttpClient.kt:44-73) and `response?.close()` before backoff (line 71) — relies on the body being repeatable to re-send. PHANTOM SAFETY NET: PhantomReachableClosingHttpClient (PhantomReachableClosingHttpClient.kt:12-26) wraps the client and calls closeWhenPhantomReachable(this, httpClient) in init; the impl (PhantomReachable.kt:31-56) reflectively resolves java.lang.ref.Cleaner.create()/register() and returns null (no-op) on Java 8 — so it self-degrades on our floor JDK. CLOSE: OkHttpClient.close (OkHttpClient.kt:86-90) unconditionally shuts the dispatcher, evicts the pool, closes the cache — no ownership flag, no idempotency latch, no try/catch. + +**vs. our SDK** + +PER-REQUEST TIMEOUT — THEY HAVE IT, WE DON'T: their SPI carries RequestOptions (HttpClient.kt:9-22) and the transport applies timeout.connect/read/write/request per call (OkHttpClient.kt:95-101). 
Our SPI is a bare fun interface HttpClient { execute(Request): Response } (sdk-core/.../client/HttpClient.kt:46-51) and AsyncHttpClient (AsyncHttpClient.kt:65-72) — no per-request options at all. Our OkHttpTransport.Builder fixes timeouts once at build (OkHttpTransport.kt:304-307); a caller cannot override the read timeout for a single slow streaming call without building a second transport. EMPTY-BODY BUG — THEY GUARD, WE DON'T: our RequestAdapter.adapt (internal/RequestAdapter.kt:55-56) does `val okhttpBody = request.body?.let { toOkHttpBody(it) }; builder.method(request.method.method, okhttpBody)` — a POST/PUT/PATCH with null SDK body passes null to okhttp's method(), which (proven by the 5.2.1 bytecode above) throws IllegalArgumentException. All seven POST tests in OkHttpTransportTest.kt (lines 93,125,178,210,569,628,659) supply a body, so the path is untested. PHANTOM SAFETY NET — THEY HAVE IT, WE DON'T: grep for Cleaner/PhantomReference/closeWhenPhantomReachable across our sdk-core and transport returns nothing. STREAMING — TIE, slight edge to us: both stream request bodies via OutputStream shim (their toRequestBody OkHttpClient.kt:294 vs our SdkRequestBodyAdapter.writeTo internal/SdkRequestBodyAdapter.kt:47-56), but we ADD a zero-copy file path (FileRequestBodyAdapter.kt:45-54 opens okio FileHandle and does sink.write(source, count), honoring position/count) that they lack entirely — they would route a file through the generic OutputStream. INTERRUPT HANDLING — WE WIN: our execute catches InterruptedIOException, re-asserts Thread.currentThread().interrupt(), rethrows (OkHttpTransport.kt:93-96); their execute (OkHttpClient.kt:48-54) catches only IOException, wraps as OpenAIIoException, and NEVER restores the interrupt flag — InterruptedIOException is an IOException subtype so it gets silently swallowed into a generic wrapper. 
ASYNC CANCELLATION RACE — WE WIN: our onResponse closes the adapted Response if future.complete returns false (OkHttpTransport.kt:130-132) — they never handle the adapt-vs-cancel race, so a response adapted after the user cancelled leaks its socket. CLOSE — WE WIN decisively: our close is ownership-aware (no-op for BYO clients, OkHttpTransport.kt:187-189), idempotent via AtomicBoolean CAS (line 184), and each of the three shutdown steps is individually try/caught and logged (lines 195-220); theirs (OkHttpClient.kt:86-90) is none of those and would close a user-supplied client. PROXY — WE WIN: we map ProxyOptions to HTTP/SOCKS, honor nonProxyHosts via a ProxySelector (OkHttpTransport.kt:396-417), and warn when an unsupported challengeHandler is set (lines 343-351); they only support a single Proxy + a ProxyAuthenticator SAM (OkHttpClient.kt:181-193). REASON-PHRASE/PROTOCOL/STATUS — WE WIN on fidelity: we preserve protocol via toSdkProtocol incl. QUIC/H2_PRIOR_KNOWLEDGE (ResponseAdapter.kt:96-104), reason phrase (message), and total Status.fromCode that never throws on vendor codes (Status.kt:207); their toHttpResponse keeps only statusCode/headers/body (HttpResponse.kt) — no protocol, no reason phrase exposed at the transport boundary. + +**Recommendations (verified)** + +- **POST/PUT/PATCH with a null body throws IllegalArgumentException from OkHttp — add an empty-body guard in RequestAdapter** `COPY` · `both` · effort S · confidence high + - *Verdict:* VERIFIED AND CONFIRMED AS A REAL BUG. I decompiled okhttp-jvm-5.0.0 (the exact version pinned in gradle/libs.versions.toml:6), not the 5.2.1 the original author cited. okhttp3.Request$Builder.method(String, RequestBody) bytecode offsets 59-121: `aload_2; ifnonnull 122` then if body==null calls HttpMethod.requiresRequestBody(method); if true it athrows IllegalArgumentException with literal message 'method must have a request body.'. 
HttpMethod.requiresRequestBody returns true for POST/PUT/PATCH (also PROPPATCH/REPORT, irrelevant to us). Our RequestAdapter.kt:55-56 does `request.body?.let{toOkHttpBody(it)}` then `builder.method(request.method.method, okhttpBody)` — passes null for a bodyless POST. Our Request.RequestBuilder.build() (Request.kt:252-258) does NOT validate that POST has a body — body is a plain nullable field — so a bodyless POST is constructible by any caller. Confirmed every POST/PUT/PATCH test in OkHttpTransportTest.kt (lines 93,125,178,210,569,628,659) supplies a body, so the crash path is untested. The exception is IllegalArgumentException, NOT IOException, so it bypasses the pipeline's IOException-oriented retry/error handling. The openai-java guard is real (OkHttpClient.kt:237-239 `if (body == null && requiresBody(method)) body = "".toRequestBody()` + requiresBody at 265-271). Category COPY is defensible — the 3-line guard is liftable near-verbatim — though it is fundamentally a correctness fix, not a portability nicety. NUANCE the original missed: openai-java's requiresBody covers only POST/PUT/PATCH; okhttp itself also requires bodies for PROPPATCH/REPORT, but neither is in our Method enum, so mirroring just {POST,PUT,PATCH} is correct for us. + - *Do:* In RequestAdapter.adapt, when request.body is null AND request.method is in {POST,PUT,PATCH}, pass an empty okhttp RequestBody (ByteArray(0).toRequestBody(null)) instead of null; otherwise keep null for bodyless GET/HEAD/etc. Add a regression test: POST with no body succeeds and sends a Content-Length: 0 request rather than throwing. Audit sdk-transport-jdkhttp for the analogous path (java.net.http.HttpRequest.BodyPublishers.noBody() is the equivalent — but verify, the JDK client does NOT throw on bodyless POST, so the jdkhttp transport may already be correct; do not blindly add a guard there). Apache-2.0 attribution is not warranted for a 3-line idiom this generic. 
+- **Cleaner-based phantom-reachable close as a last-resort leak backstop that no-ops on Java 8** `COPY` · `sdk-core` · effort M · confidence medium + - *Verdict:* ACCURATE and the technique fits our constraints precisely. PhantomReachable.kt:31-56 resolves java.lang.ref.Cleaner entirely by reflection (Class.forName + getMethod("create")/("register")) inside a lazy, and on Java 8 the Class.forName throws ReflectiveOperationException which is caught to yield a null function — the whole feature degrades to a no-op with no new dependency. PhantomReachableClosingHttpClient.kt:12-15 registers (this -> httpClient::close) in its init. I confirmed our sdk-core and sdk-transport-okhttp contain NO Cleaner/PhantomReference/Cleaner-based close (the one grep hit, ContextStore.kt:41, is an unrelated WeakReference KDoc). The caveat the original flagged is real and load-bearing: PhantomReachable.kt:15-17 has `check(observed !== closeable)` precisely because a Cleaner action that captures the observed object pins it and it never becomes phantom-reachable — any port MUST register the closeable/lambda WITHOUT capturing the wrapper being observed. PROPER SKEPTICISM: this is strictly a backstop and explicitly inferior to deterministic close, which we ALREADY do better than openai-java (idempotent + ownership-aware). Its marginal value for us is narrow because (a) Cleaner timing is nondeterministic — it does NOT bound fd/thread lifetime, only guarantees eventual reclamation, and (b) it must never fire on a BYO client. So the value is 'turns a forgotten SDK-owned transport's silent thread/socket leak into eventual GC-time reclamation', not 'fixes leaks'. Worth doing, but lower priority than findings 1-2. Category COPY is right (the file is liftable near-verbatim). 
+ - *Do:* Add an internal sdk-core util closeWhenPhantomReachable(observed: Any, closeable: AutoCloseable) ported from PhantomReachable.kt (reflective Cleaner, no-op on 8, keep the observed!==closeable check, strip OpenAIException and use IllegalStateException). Wire it ONLY for owned=true transports: in OkHttpTransport, when built via builder(), self-register so a never-closed SDK-owned client is eventually reclaimed; NEVER register for create() (BYO). Make the action capture only the closeable, never `this`. Attribution: openai-java is Apache-2.0 — if PhantomReachable.kt is lifted near-verbatim, add a short Apache-2.0 attribution note alongside the MIT header on that one file. +- **No per-request timeout override — the transport SPI cannot tune a single slow call** `ADOPT` · `sdk-core` · effort L · confidence medium + - *Verdict:* ACCURATE. openai-java's SPI carries RequestOptions with a Timeout (HttpClient.kt:9-22, defaulted to RequestOptions.none()), and OkHttpClient.newCall (OkHttpClient.kt:92-104) applies connect/read/write/call timeouts per call via okHttpClient.newBuilder().*Timeout(...).build(). Timeout.kt confirms the 4-phase model (connect=1min, read/write default to request(), request=10min). Our HttpClient.execute(Request) and AsyncHttpClient.executeAsync(Request) take only a Request (HttpClient.kt:51, AsyncHttpClient.kt:72); timeouts are frozen at OkHttpTransport.Builder.build() (OkHttpTransport.kt:304-307). So a caller cannot lengthen the read timeout for one streaming/SSE call without building a second transport, losing connection-pool reuse. This is a genuine gap and correctly a sdk-core SPI (toolkit-contract) decision, dependency-free (java.time.Duration is Java 8). 
HONEST COUNTERWEIGHTS the original undersold: (1) this is a sweeping change — it churns BOTH fun interfaces (SAM ergonomics + apiCheck/apiDump), every transport, AND every pipeline layer that calls execute, because per-request options must be threaded end-to-end to be useful, not just bolted onto the transport. openai-java threads RequestOptions through their entire core for exactly this reason. (2) For a toolkit, an arguably cleaner placement is to let per-request tuning ride the existing context chain (CallContext→ExchangeContext) rather than widening the SPI signature. The original treated 'overload the SPI' as the obvious design; it is one option, and the heavier one. Keep ADOPT, but scope it as a design decision, not a quick add. + - *Do:* Treat as a design item, not a drop-in. Decide the seam first: (a) widen the SPI with execute(Request, RequestOptions) + a defaulted none() overload to preserve SAM literals, or (b) carry a per-request Timeout on the context chain the transport already receives indirectly. Model the value as an immutable core Timeout/RequestOptions (Duration? per phase, private ctor + Builder, none() sentinel) with ZERO new deps. In OkHttpTransport, apply an override via client.newBuilder().*Timeout(...).build().newCall(...) only when present (see the rebuild-gating finding). Note effort is L, not M: the value type is trivial but threading it through pipelines + two transports + apiDump is the real cost. +- **When per-request timeouts land, gate the okhttp client rebuild behind 'override present' (do NOT copy their unconditional rebuild)** `SIMPLIFY` · `sdk-core` · effort S · confidence low · we partly do this + - *Verdict:* ACCURATE but CONTINGENT and largely redundant with the per-request-timeout finding. Verified OkHttpClient.newCall (OkHttpClient.kt:92-104) ALWAYS calls okHttpClient.newBuilder()...build() and client.newCall(...) 
even when requestOptions.timeout is null — the timeout application is conditional (inside `timeout?.let{}`) but the newBuilder().build() allocation is not. The original's stated reason is correct: they must pass the rebuilt client into toRequest(client) to read client.readTimeoutMillis/callTimeoutMillis for the X-Stainless-Read-Timeout / X-Stainless-Timeout headers (OkHttpClient.kt:244-259), so they need a client handle on every call regardless. We have neither vendor headers nor (yet) per-request timeouts, and our current OkHttpTransport.execute already takes the optimal path: client.newCall(adapt(request)) directly on the shared client (OkHttpTransport.kt:88-89), no rebuild. So the 'do NOT follow theirs' direction is right, but there is NOTHING to change today — this is purely a constraint on HOW we implement the ADOPT finding, and that finding's own recommendation ALREADY says 'skip the rebuild when no override is present.' This is a sub-bullet of the timeout work, not a standalone item; I keep it only to pin the implementation detail and the test obligation. Direction is the rare 'theirs is more wasteful' case, correctly identified. + - *Do:* Fold into the per-request-timeout work as an implementation constraint: branch in OkHttpTransport.execute/executeAsync — if no timeout override -> client.newCall(adapt(request)) on the shared client (current behavior, preserves pool reuse); else -> client.newBuilder().*Timeout(...).build().newCall(...). Add a test asserting the shared-client fast path is taken when no override is set (e.g. assert no new dispatcher/client is constructed). Not actionable on its own. +- **Decorator-over-SPI vs pipeline-over-SPI placement, and the shared body-replayability invariant** `LEARN` · `docs/process` · effort S · confidence low · we partly do this + - *Verdict:* ACCURATE but largely CONFIRMATORY — we already do the substantive part. 
openai-java keeps the okhttp adapter dumb and layers retry/logging/auth/leak-guard as HttpClient->HttpClient decorators (RetryingHttpClient.kt:26-33, PhantomReachableClosingHttpClient.kt:12, LoggingHttpClient, WorkloadIdentityHttpClient) chained in ClientOptions. We instead own these in the pipeline above the SPI (confirmed: DefaultRetryStep/RetryStep, DefaultRedirectStep, AuthStep, instrumentation steps all exist under http/pipeline/steps/ and pipeline/step/retry/). The 'shared invariant' the original highlighted — any retry layer may only re-send a replayable body — we ALREADY enforce on BOTH sides: pipeline/step/retry/RetryStep.kt:354 returns body.isReplayable() to gate retry, and SdkRequestBodyAdapter.isOneShot()==!isReplayable() (line 45) governs whether okhttp-internal retry may re-send. openai-java's RetryingHttpClient.isRetryable (RetryingHttpClient.kt:140-143) keys off request.body?.repeatable() — the same idea. So there is no capability gap and no redesign; the only deliverable is a one-paragraph architecture note. This is the weakest of the six findings — keep it only as a low-value docs breadcrumb. Do NOT adopt their decorator stack: it would duplicate pipeline responsibilities and re-introduce SPI coupling we deliberately avoid (correctly noted by the original). + - *Do:* Optional. If documenting, add one short paragraph to docs/pipelines.md contrasting decorator-over-SPI (openai-java) vs pipeline-over-SPI (us) and stating the invariant we already enforce: any retry layer (okhttp-internal via isOneShot, or pipeline via RetryStep) may only re-send a body whose isReplayable() is true; cite our retryOnConnectionFailure(false) rationale at OkHttpTransport.kt:311-317. Low priority — no code change, and the invariant is already implemented and commented in-code. 
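The body-replayability invariant, together with the body-less branch pinned by this patch, can be sketched as a single predicate. This is a minimal sketch under stated assumptions: `Request`, `RequestBody`, and the default method set below are simplified stand-ins, not the SDK's actual types or the real default of `RetrySettings.retryableMethods`.

```kotlin
// Hypothetical stand-ins for the SDK's request model.
interface RequestBody {
    fun isReplayable(): Boolean
}

data class Request(val method: String, val body: RequestBody?)

// Assumed default: the idempotent HTTP methods.
val retryableMethods: Set<String> = setOf("GET", "HEAD", "PUT", "DELETE", "OPTIONS")

fun isRetrySafe(request: Request, methods: Set<String> = retryableMethods): Boolean {
    val body = request.body
    return if (body != null) {
        body.isReplayable()       // with a body: only a replayable body may be re-sent
    } else {
        request.method in methods // no body: safety keys off method idempotency
    }
}

fun main() {
    val oneShot = object : RequestBody { override fun isReplayable() = false }
    val buffered = object : RequestBody { override fun isReplayable() = true }

    println(isRetrySafe(Request("POST", null)))     // false: bare POST is never re-sent
    println(isRetrySafe(Request("PUT", null)))      // true: body-less idempotent method
    println(isRetrySafe(Request("POST", buffered))) // true: body can be replayed
    println(isRetrySafe(Request("PUT", oneShot)))   // false: body is consume-once
}
```

The bare `POST` case is the one that is easy to miss: there is no payload to replay, yet the request is still not retried because the server may already have applied the side effect.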
+- **Generated okhttp client-builder is ~95% nullable+Optional+primitive overload boilerplate — codegen should template it, not hand-write it** `LEARN` · `codegen` · effort M · confidence medium + - *Verdict:* ACCURATE. Verified OpenAIOkHttpClient.kt and OpenAIOkHttpClientAsync.kt are both exactly 463 lines and differ only in OpenAIClient vs OpenAIClientAsync return types. Confirmed the overload triplet first-hand: maxIdleConnections(Int?) (the nullable primary), maxIdleConnections(Int) ('unboxed primitive overload exists for backwards compatibility'), and maxIdleConnections(Optional) ('orElse(null)') at OpenAIOkHttpClient.kt:~109-125, every knob delegating to clientOptions/the OkHttpClient.Builder. This is machine output and the right thing for a per-API client. The two takeaways are sound: (1) the nullable+Optional+primitive triplet is the Java-ergonomic surface worth templating once; (2) sync/async builders being near-duplicates argues for one codegen template parameterized by client type. Correctly scoped to FUTURE CODEGEN only — our hand-written OkHttpTransport.Builder rightly avoids this bloat (it is a toolkit transport, not a per-API client) so this does NOT touch sdk-core. PROPER CALIBRATION: this is a genuinely useful codegen note but it is not novel — it is the standard Stainless/Java-builder idiom; treat as a checklist item for docs/refs-comparison.md, not a discovery. One caveat for when we build it: the Optional overloads exist for Java fluency, but our zero-dep/Kotlin-first core has no Optional in its own builders — keep Optional overloads confined to GENERATED Java-facing client modules so they never leak into sdk-core or suggest that an Optional-based builder style belongs there. + - *Do:* Record in docs/refs-comparison.md (codegen section): the KotlinPoet client-builder template should emit the nullable + Optional + primitive overload triplet per config knob, and share ONE template across sync/async via a client-type parameter (avoid two emitters). 
Keep these overloads strictly in generated client modules; never replicate in sdk-core. Watch the binary-compatibility validator on the generated surface. + +**Considered & dropped** + +- ~~Antipatterns block (interrupt restoration, non-idempotent/non-ownership-aware close, unguarded close steps, forgiving-SPI assumption, Jackson/okhttp in core)~~ — Not standalone findings — these are 'we are ahead' contrasts, not recommendations to change our code. All FIVE are accurate after re-reading: (1) their execute (OkHttpClient.kt:50-51) catches only IOException and wraps as OpenAIIoException without Thread.interrupt() restoration; since InterruptedIOException is an IOException subtype the interrupt is swallowed — our execute (OkHttpTransport.kt:93-96) restores and rethrows, which is correct. (2) their close (OkHttpClient.kt:86-90) has no closed-latch and no owned flag — our AtomicBoolean CAS + owned guard (OkHttpTransport.kt:184-189) is correct. (3) their three close steps are unguarded — our per-step try/catch+log (OkHttpTransport.kt:195-220) is correct. The verdicts are right and worth keeping as guardrails ('do not regress to their shape'), but they prescribe no action and are folded into the notes. ONE SHARPENING for the record: the 'their close would close a BYO client' critique is overstated for openai-java's OWN architecture — OpenAIOkHttpClient.build() (lines 444-461) always constructs the okhttp client internally and never accepts a caller-supplied okhttp3.OkHttpClient, so their unconditional close is safe for THEM; it is only a hazard if the close() shape were lifted into a toolkit with a first-class BYO path like our create(). Kept as a caution, not a finding. +- ~~weAreAhead block (interrupt handling, close contract, async cancel/adapt race, zero-copy file upload, protocol/reason-phrase/status fidelity, richer proxy)~~ — Accurate but not actionable — these are strengths to preserve, not changes. 
Spot-verified each: async cancel/adapt race handled at OkHttpTransport.kt:130-132 (close adapted Response when future.complete loses to cancel) and ResponseAdapter.kt:80-86 (close okhttp response on any adaptation throw); zero-copy file path at FileRequestBodyAdapter.kt:45-54 (okio FileHandle + sink.write(source, count) honoring position/count) which openai-java lacks (their toRequestBody at OkHttpClient.kt:294 routes everything through sink.outputStream()); protocol fidelity incl. QUIC/H2_PRIOR_KNOWLEDGE at ResponseAdapter.kt:96-104 + reason phrase via message at line 76; total non-throwing Status.fromCode used at ResponseAdapter.kt:75; RestrictedHeaders drop-list (content-length/host/transfer-encoding) at RestrictedHeaders.kt:27-32; proxy with nonProxyHosts ProxySelector + unsupported-challengeHandler warning at OkHttpTransport.kt:339-417. All confirmed superior to openai-java's status-and-headers-only adapter. No recommendation attached, so not a decision-ready finding — captured as context in notes. + +**Do not copy** + +1) NO INTERRUPT RESTORATION (correctness): their execute (OkHttpClient.kt:48-54) catches only IOException and wraps it as OpenAIIoException, never restoring Thread.interrupt(). Since InterruptedIOException IS an IOException, an interrupted blocking read is silently re-labeled as a generic 'Request failed' with the interrupt flag cleared — exactly the Loom-hostile behavior our convention forbids (our OkHttpTransport.kt:93-96 does it right). Do NOT copy their catch shape. 2) NON-IDEMPOTENT, NON-OWNERSHIP-AWARE close() (resource safety): OkHttpClient.close (OkHttpClient.kt:86-90) unconditionally shuts the dispatcher, evicts the pool, and closes the cache with no closed-latch and no owned flag. For us this is doubly wrong: it would close a BYO client (violating our ownership contract) and would NPE/throw on a second close. Our AtomicBoolean+owned guard (OkHttpTransport.kt:183-221) is the correct model — do not regress to theirs. 
3) UNGUARDED close() steps (resource safety): they wrap none of the three shutdown calls in try/catch, so a SecurityException on executorService.shutdown() would skip evictAll() and cache.close(), leaking sockets/fds. Our per-step try/catch+log (OkHttpTransport.kt:195-220) is the right pattern. 4) DEPENDING ON A FORGIVING SPI CONTRACT: their toRequest empty-body guard exists because okhttp throws on null-body POST — a reminder that an adapter must defend against the transport's strictness, not assume the model layer is always well-formed (we currently DON'T defend — see the COPY finding, that's the bug). 5) JACKSON/okhttp HARD-WIRED INTO CORE (architecture, already known): RequestOptions.responseValidation, JsonMapper, Jackson-version checks all leak into the okhttp builder (OpenAIOkHttpClient.kt:197-207) — fine for a single-API client, an antipattern for us; keep serde out of the transport entirely (we already do). + +**Where we're ahead** + +Concretely ahead in five places. (1) INTERRUPT HANDLING: we catch InterruptedIOException, re-assert the interrupt, and rethrow (OkHttpTransport.kt:93-96); they swallow it into OpenAIIoException with the flag cleared (OkHttpClient.kt:50-51). (2) close() CONTRACT: ours is idempotent (AtomicBoolean CAS, OkHttpTransport.kt:184), ownership-aware (no-op for BYO, lines 187-189), and per-step fault-isolated+logged (lines 195-220); theirs (OkHttpClient.kt:86-90) is none of these and would close a user's client. (3) ASYNC CANCELLATION RACE: we close the adapted Response when future.complete loses to a cancel (OkHttpTransport.kt:130-132) and our ResponseAdapter closes the okhttp response on any adaptation failure (ResponseAdapter.kt:80-86), so no socket leaks on the cancel/adapt or adapt-throws races; their executeAsync (OkHttpClient.kt:57-84) has no such guard. 
(4) ZERO-COPY FILE UPLOAD: FileRequestBodyAdapter (FileRequestBodyAdapter.kt:45-54) streams via okio FileHandle with position/count for byte-range uploads; they have no file-specific path and would push files through a generic OutputStream. (5) FIDELITY + EXTENSIBILITY: we preserve protocol incl. QUIC/H2_PRIOR_KNOWLEDGE and the reason phrase (ResponseAdapter.kt:96-104), surface vendor status codes via a total non-throwing Status.fromCode (Status.kt:207), drop only the three connection-managed headers via an explicit RestrictedHeaders list (RestrictedHeaders.kt:27-32), and support SOCKS/HTTP proxies with nonProxyHosts + a discoverable warning for unsupported challenge handlers (OkHttpTransport.kt:339-417) — all richer than their single-Proxy, status-and-headers-only adapter. The two places they're ahead on capability (not quality) are the per-request timeout seam and the phantom-reachable backstop, both captured as findings. + +_Verifier notes:_ VERIFICATION METHOD: read all three cited openai-java files line by line (OkHttpClient.kt, OpenAIOkHttpClient.kt, OpenAIOkHttpClientAsync.kt) plus the dependencies the findings rely on (PhantomReachable.kt, PhantomReachableClosingHttpClient.kt, Timeout.kt, core/http/HttpClient.kt, RetryingHttpClient.kt); read all six of our okhttp transport sources and the relevant sdk-core SPIs/models/retry steps. For the load-bearing empty-body claim I decompiled okhttp-jvm-5.0.0 (our PINNED version per libs.versions.toml, not the 5.2.1 the original author used) and confirmed Request$Builder.method throws IllegalArgumentException when body==null and HttpMethod.requiresRequestBody(method)==true (POST/PUT/PATCH). + +NET: of six findings, all six are technically accurate, but their VALUE is uneven. KEEP and prioritize: (1) empty-body guard — a real, untested correctness bug in OUR code (RequestAdapter.kt:55-56 + permissive Request builder), high confidence, S effort. 
(2) per-request timeout — genuine SPI gap, but L effort not M (threads through pipelines + both transports + apiDump), a design decision not a quick add. (3) phantom-reachable backstop — fits our Java-8 reflection-degrade constraint, but strictly a nondeterministic backstop secondary to our already-superior explicit close; medium confidence. DOWNGRADE: (4) decorator-vs-pipeline is confirmatory — we ALREADY enforce the shared body-replayability invariant on both okhttp-internal (isOneShot) and pipeline (RetryStep.kt:354) sides; docs-only, low value. (5) codegen builder boilerplate is the standard Stainless idiom, useful codegen checklist note, not a discovery. (6) conditional-rebuild SIMPLIFY is a sub-bullet of finding 2 with nothing to change today (our execute already takes the no-rebuild fast path). DROPPED the antipatterns and weAreAhead blocks as non-actionable, but they were verified accurate with ONE correction: openai-java's own OpenAIOkHttpClient never accepts a BYO okhttp client (build() at lines 444-461 always constructs internally), so their unconditional/ownership-blind close() is safe FOR THEM and only becomes a hazard if its shape is lifted into a toolkit with a BYO path like our create() — the original overstated this slightly. + +KEY FILE:LINE EVIDENCE — BUG: RequestAdapter.kt:55-56 (null body passed to method), Request.kt:252-258 (no POST-body validation), okhttp 5.0.0 Request$Builder.method bytecode offsets 59-121 (athrow), HttpMethod.requiresRequestBody (POST/PUT/PATCH true). GAP: HttpClient.kt:51 + AsyncHttpClient.kt:72 (Request-only SPI) vs openai-java HttpClient.kt:9-22 + OkHttpClient.kt:92-104. BACKSTOP: openai-java PhantomReachable.kt:31-56 + 15-17 (capture caveat); we have none. ALREADY-DONE: RetryStep.kt:354 (isReplayable gate), SdkRequestBodyAdapter.kt:45 (isOneShot). + +--- + +## 17. 
Build / packaging / release / DX (gradle, codegen stats, spring starter, proguard, examples) + +**What it is** + +openai-java is a 6-module Gradle build (openai-java-core, -client-okhttp, -lib, -spring-boot-starter, -proguard-test, -example, plus the `openai-java` aggregator) whose DRYness comes entirely from `buildSrc` convention plugins (`openai.kotlin`, `openai.java`, `openai.publish`) applied by id in each module — so module scripts are 2-23 lines. The build is codegen-driven: `.stats.yml` records the OpenAPI spec URL+hash and `configured_endpoints: 261`; `release-please-config.json` + `.release-please-manifest.json` + `x-release-please-version`/`x-release-please-start-version` markers in `build.gradle.kts` and `README.md` drive fully-automated conventional-commit releases via the Stainless `trigger-release-please` action (create-releases.yml, cron 5am UTC + push-to-main). Distinctive packaging hardening: (1) a dedicated `openai-java-proguard-test` module that shadow-JARs the SDK, runs BOTH ProGuard 7.4.2 and R8 8.3.x over it using the consumer-facing keep rules shipped at `META-INF/proguard/openai-java-core.pro`, then executes a real round-trip test through the shrunk JAR (`tasks.test { dependsOn(testProGuard, testR8); enabled = false }`); (2) a Spring Boot starter with `@AutoConfiguration`, `@ConfigurationProperties`, a `fun interface` customizer seam, and IDE-autocomplete metadata. Their quality bar is otherwise LOW: no coverage tooling at all, no binary-compatibility validator (they fake it with a lint-based `detect-breaking-changes` that checks out old tests and recompiles), only ktfmt/palantir-format. 
OUR build is the inverse: stronger gates (detekt + ktlint + Kotlin binary-compatibility-validator with committed `.api` snapshots + 80% Kover floor wired into `check` + explicit-API strict + allWarningsAsErrors), but NO buildSrc (9 module scripts duplicate byte-identical publishing/signing POMs, 79-132 lines each), NO `.github` CI of any kind, NO release automation, a near-empty `gradle.properties`, and no proguard/shrink test, Spring starter, or example module. + +**How it works (line-level)** + +CONVENTION PLUGINS (the DRY engine we lack): buildSrc/src/main/kotlin/openai.kotlin.gradle.kts:4-7 `plugins { id("openai.java"); kotlin("jvm") }` composes plugins; openai.kotlin.gradle.kts:18-30 sets `freeCompilerArgs = listOf("-Xjvm-default=all","-Xjdk-release=1.8","-nowarn")`, `jvmTarget JVM_1_8`, `languageVersion/apiVersion KOTLIN_1_8`, `coreLibrariesVersion="1.8.0"` ONCE. openai.java.gradle.kts:20-23 `options.compilerArgs.add("-Werror"); options.release.set(8)` — `release 8` is the correct cross-compile flag (compiles against the JDK 8 API surface even on a JDK 21 toolchain, preventing newer-symbol leakage). openai.publish.gradle.kts:30-68 centralises ALL Maven-Central publishing (vanniktech plugin, in-memory GPG from env at lines 26-28, `publishToMavenCentral(SonatypeHost.CENTRAL_PORTAL)`, full POM) in ONE file. Each consuming module is then trivial: openai-java-core/build.gradle.kts:1-5 is just `plugins { id("java"); id("openai.kotlin"); id("openai.publish") }`; the aggregator openai-java/build.gradle.kts:6-8 is `dependencies { api(project(":openai-java-client-okhttp")) }`. PROGUARD/R8 SHRINK TEST: openai-java-proguard-test/build.gradle.kts:25-28 shadow-JARs test output + testRuntimeClasspath; :31-55 registers a `ProGuardTask` feeding it `configuration("../openai-java-core/src/main/resources/META-INF/proguard/openai-java-core.pro")` — i.e. 
it validates the EXACT keep rules shipped to consumers — and :41-51 conditionally adds `rt.jar` (Java<9) vs `java.base.jmod` (Java 9+) as libraryjars; :66-85 runs R8 via `com.android.tools.r8.R8` main with `--pg-conf` pointing at the same rules; :96-101 `tasks.test { dependsOn(testProGuard); dependsOn(testR8); enabled = false }` — the real JUnit test is disabled and replaced by execution through the shrunk JAR. The shipped rules openai-java-core.pro:5-6 `-keep class kotlin.reflect.** { *; }` / `-keep class kotlin.Metadata { *; }` and :29-32 `-keepclassmembers class com.openai.** { (...); @com.fasterxml.jackson.annotation.* *; }` exist solely because Jackson reflects over constructors+annotations. SPRING STARTER: OpenAIClientAutoConfiguration.kt:14-24 `@AutoConfiguration @ConditionalOnClass(OpenAIClient::class) @ConditionalOnMissingBean fun client(properties, customizers: ObjectProvider)`; :33 `customizers.orderedStream().forEach { it.customize(this) }` is the escape hatch; registered via META-INF/spring/...AutoConfiguration.imports (single line). OpenAIClientProperties.kt:9-19 `@ConfigurationProperties(prefix="openai") @ConstructorBinding` with `@Name("base-url")` kebab keys; additional-spring-configuration-metadata.json gives IDE autocomplete + defaultValue. OpenAIClientCustomizer.kt:7 is a `fun interface` (SAM) so Java/Kotlin lambdas work. RELEASE AUTOMATION: release-please-config.json:7-9 `"include-v-in-tag": true, "versioning": "prerelease", "prerelease": true`; :12 `"pull-request-title-pattern": "release: ${version}"`; :14-61 maps conventional-commit types→changelog sections; :63-66 `"extra-files": ["README.md","build.gradle.kts"]` so version bumps propagate to docs. build.gradle.kts:11 `version = "4.39.1" // x-release-please-version` is the magic-comment anchor. PERF: gradle.properties:1-3 `caching=true, configuration-cache=true, parallel=true`, :4 `daemon=false`, :6-18 a tuned `-Xmx8g -XX:TieredStopAtLevel=1 -XX:+UseStringDeduplication` JVM arg block. 
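The convention-plugin pattern described above can be sketched as one precompiled script plugin that owns publishing and signing. This is a minimal sketch using Gradle's built-in `maven-publish` and `signing` plugins rather than the vanniktech plugin openai-java uses. The file name `sdk.publish.gradle.kts`, the environment variable names, and the POM values are hypothetical, and `buildSrc/build.gradle.kts` is assumed to apply the `kotlin-dsl` plugin.

```kotlin
// buildSrc/src/main/kotlin/sdk.publish.gradle.kts (hypothetical file name)
plugins {
    `java-library`
    `maven-publish`
    signing
}

publishing {
    publications {
        create<MavenPublication>("maven") {
            from(components["java"])
            pom {
                // Resolved lazily so a module may set its description after applying the plugin.
                name.set(provider { project.name })
                description.set(provider { project.description })
                licenses {
                    license { name.set("MIT License") }
                }
            }
        }
    }
}

signing {
    // Only require signatures on CI, mirroring the per-module block this would replace.
    isRequired = System.getenv("CI") == "true"
    // SIGNING_KEY and SIGNING_PASSWORD are placeholder variable names.
    useInMemoryPgpKeys(System.getenv("SIGNING_KEY"), System.getenv("SIGNING_PASSWORD"))
    sign(publishing.publications["maven"])
}
```

A consuming module would then reduce to something like `plugins { id("sdk.publish") }`, so a POM or signing change becomes a one-file edit.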
+ +**vs. our SDK** + +OUR layout has NO buildSrc (confirmed: `test -d buildSrc` → DOES NOT EXIST). Consequence: every module repeats the full publishing+signing block. Verified byte-identical: `diff` of the `publishing {…}` blocks in sdk-async-reactor/build.gradle.kts vs sdk-transport-okhttp/build.gradle.kts → IDENTICAL. The same ~40-line MavenPublication POM (name/description/url/MIT-license/dexpace developer/scm) and the same ~8-line `signing { isRequired = (System.getenv("CI")=="true"); useInMemoryPgpKeys(...) }` block appear in all 9 scripts: sdk-core/build.gradle.kts:44-90, sdk-transport-jdkhttp/build.gradle.kts:67-112, sdk-async-virtualthreads/build.gradle.kts:66-111, etc. The detekt-disable workaround (a 20-line comment + `tasks.matching { it.name=="detekt" }.configureEach { enabled=false }`) is copy-pasted verbatim into sdk-transport-jdkhttp/build.gradle.kts:114-132 AND sdk-async-virtualthreads/build.gradle.kts:113-131. The JDK-11/21 toolchain override triple (kotlin jvmToolchain + java sourceCompatibility/toolchain + KotlinCompile jvmTarget) is duplicated across jdkhttp:32-48 and virtualthreads:33-49. Our root build.gradle.kts:97-156 does centralise compiler config via an `allprojects { plugins.withId("org.jetbrains.kotlin.jvm") {…} }` callback (jvmToolchain(8), explicitApi=Strict, jvmTarget JVM_1_8, allWarningsAsErrors, compileOnly slf4j) and :140-155 already does reproducible archives + manifest version stamping — that part is good — but publishing/signing was NOT lifted there. NOTE we use `jvmTarget.set(JvmTarget.JVM_1_8)` on a toolchain-8 build, whereas they use the cross-compile combo `release.set(8)` + `-Xjdk-release=1.8` on a toolchain-21 build; ours is correct for a true JDK-8 toolchain but cannot catch newer-symbol leakage the way `-Xjdk-release` does on the two modules where we deliberately compile on 11/21. 
VERSION STAMPING: our SdkInfo.kt:39-40 reads `Package.getImplementationVersion()`, fed by root build.gradle.kts:148-154 Jar-manifest stamping — equivalent to their openai.java.gradle.kts:25-31, and arguably cleaner (no codegen anchor comment). PROVENANCE: we have NO `.stats.yml` (correct — we are hand-written, not spec-generated; a `.stats.yml` would be meaningless for a toolkit). RELEASE: we have `release-please-config.json` + `.release-please-manifest.json` (`{".":"4.39.1"}` was theirs; ours exists per repo-root listing) BUT NO `.github/workflows` directory at all — so the config is inert; nothing runs it. QUALITY GATES: ours dominate — root build.gradle.kts:57-83 Kover `minBound(80)` wired to `check`, :32 binary-compatibility-validator with committed `.api` snapshots (9 of them under */api/*.api), :35-38 ktlint+detekt, plus explicit-API strict. openai-java has none of these: `grep kover|jacoco|coverage` over their buildSrc+core build → "NO coverage tooling"; no `.api` files exist; binary compat is approximated by scripts/detect-breaking-changes:14-22 (checks out the base-ref's model/service TESTS, then runs the linter — if old tests no longer compile, the API broke). + +**Recommendations (verified)** + +- **Tune gradle.properties: enable build cache + parallel now; trial configuration-cache separately** `COPY` · `docs/process` · effort S · confidence high + - *Verdict:* Verified. Our gradle.properties is a single line (`kotlin.code.style=official`), so org.gradle.caching/parallel/configuration-cache all default OFF. Theirs enables `org.gradle.caching=true`, `org.gradle.configuration-cache=true`, `org.gradle.parallel=true`, `org.gradle.daemon=false`, plus a heavy `-Xms2g -Xmx8g -XX:+UseParallelGC -XX:ReservedCodeCacheSize=1G ... -XX:CICompilerCount=4 -XX:+UseStringDeduplication` block. 
The analysis is right on every point: caching+parallel on a 9-module build is free wall-clock with near-zero risk; configuration-cache is the big win but can surface non-CC-safe task wiring and must be a separate, validated commit (note: our root build.gradle.kts uses `tasks.named(...).dependsOn` and `allprojects{}`/`subprojects{}` with `apply(plugin=...)` — the eager `apply` and cross-project `subprojects{}` config are the classic CC friction points, so expect to fix a few); and the `-Xmx8g`/`CICompilerCount=4` JVM block is sized for compiling 261 generated endpoints and would just waste memory on our small build. `org.gradle.daemon=false` is specifically their CI-flakiness mitigation (their ci.yml disables the daemon via GRADLE_OPTS) and belongs in CI env, not necessarily in the committed gradle.properties (a local daemon is desirable for dev). One forward-looking note the analysis got right: if we add the shrink-test module, its ProGuard/R8 tasks must opt out with `notCompatibleWithConfigurationCache(...)` exactly as proguard-test/build.gradle.kts:34,60,70 does. Confidence high. + - *Do:* Commit `org.gradle.caching=true` and `org.gradle.parallel=true` now. In a separate commit, trial `org.gradle.configuration-cache=true` and fix the reported incompatibilities (likely the eager `apply(plugin=...)` in subprojects{} and any task wiring that captures Project at execution time). Skip the aggressive JVM block — a modest `-Xmx2g` suffices; do NOT copy -Xmx8g/CICompilerCount. Put `org.gradle.daemon=false` only in CI's GRADLE_OPTS, not the shared properties file. +- **Stand up CI (.github/workflows) — our strong gates run nowhere; nothing enforces apiCheck/koverVerify/detekt on PRs** `ADOPT` · `docs/process` · effort S · confidence high + - *Verdict:* Verified and this is the highest-leverage item in the subsystem. Our repo has NO `.github` directory (confirmed: ls fails). 
Every gate we built — apiCheck (binary-compat-validator, 9 committed .api files), koverVerify minBound(80) wired to `check` (root build.gradle.kts:81-83), detekt 1.23.6, ktlint, explicit-API strict, allWarningsAsErrors — only fires when a human runs `./gradlew build` locally. A binary-compat validator that isn't in CI is decorative; the whole point is blocking the PR that breaks it. openai-java's ci.yml is a fine skeleton (lint/build/test/examples jobs, `setup-java` temurin with `java-version: |\n 8\n 21`, `cache: gradle`, per-job `timeout-minutes`, and `GRADLE_OPTS: -Dkotlin.compiler.execution.strategy=in-process` to dodge daemon flakiness). One correction to the analysis: their matrix is for their 8+21 bytecode targets; OURS spans 8/11/21, but because settings.gradle.kts:8-10 already applies the foojay toolchain resolver, the CI runner only needs ONE JDK (21) and Gradle auto-provisions 8 and 11 for the per-module toolchains — so our runner is SIMPLER than theirs, not a 3-way matrix. Their CI is also riddled with `github.repository == 'openai/...'`/Stainless-specific guards and a `pkg.stainless.com` artifact upload step — all vendor noise to ignore. The substance (one job running `./gradlew build`, which already chains apiCheck+koverVerify+detekt+ktlint) is correct and S-effort. Confidence high. + - *Do:* Add `.github/workflows/ci.yml` triggering on pull_request + push: checkout, `actions/setup-java` (temurin, single java-version 21 — foojay auto-provisions 8/11), `gradle/actions/setup-gradle` with caching, then `./gradlew build` (already runs apiCheck, koverVerify minBound(80), detekt, ktlint, allWarningsAsErrors). Optionally a fast `apiCheck`-only job for a crisp binary-compat failure signal. Set `GRADLE_OPTS=-Dorg.gradle.daemon=false` for CI stability. This converts our existing paid-for gates from decorative to enforced. 
+- **Add a runnable example module that assembles the toolkit end-to-end and runs in CI as a usability smoke test** `ADOPT` · `both` · effort M · confidence medium + - *Verdict:* Verified. openai-java-example/build.gradle.kts uses the `application` plugin with `mainClass = com.openai.example.${example}Example` selectable via `-Pexample=`, pins `options.release.set(11)` so examples may use List.of, and ci.yml runs `./gradlew :openai-java-example:run` in an examples job. The analysis's strongest and correct argument: for a TOOLKIT the hard part is *assembly* (install IoProvider, pick a transport, compose an HttpPipeline with exactly one of each pillar stage), which our README describes only in prose — explicit-API strict and binary-compat protect signatures, not whether the public surface is actually pleasant to wire together. A compiled example doubles as an integration smoke test that our per-module unit tests (scoped within a module, using test fixtures) never provide: it proves a downstream consumer can combine sdk-core + sdk-io-okio3 + sdk-transport-okhttp + a pipeline. Fair caveats: it's M-effort (must actually design the assembly the example demonstrates, which overlaps with the codegen/starter design question), and running it against a live endpoint in CI is flaky — point it at mockwebserver3 (already our transport test dependency) instead of a real API so the smoke test is hermetic. Keep it out of kover aggregation and out of publishing. The 'target both' is right: the hand-written example validates the toolkit now, and the pattern (an `-Pexample=` selector over sync/async/SSE/paging scenarios) is what generated SDKs should ship too. Confidence medium (clear value, but effort and design-coupling are real). 
+ - *Do:* Add `sdk-example` (application plugin, NOT published, excluded from kover) that wires OkioIoProvider + the okhttp transport + a full HttpPipeline (one REDIRECT/RETRY/AUTH/LOGGING/SERDE step each) and issues a request against an embedded mockwebserver3 server. Add a CI smoke job running `./gradlew :sdk-example:run` once CI exists. Use a `-Pexample=` selector to cover sync, async, SSE, and paging assemblies. This makes the README's 'compose a pipeline' claim a compiled, CI-verified artifact. +- **Add a test-only R8/ProGuard shrink-survival module and ship consumer keep-rules — higher value for us as a toolkit than for an app SDK** `ADOPT` · `both` · effort L · confidence medium + - *Verdict:* Mechanism verified. openai-java-proguard-test/build.gradle.kts shadow-JARs `:openai-java` (incl. test output), runs the GuardSquare ProGuardTask (7.4.2) AND android-tools R8 (8.3.37) over it feeding BOTH ./test.pro and the shipped openai-java-core/src/main/resources/META-INF/proguard/openai-java-core.pro, then `tasks.test { dependsOn(testProGuard, testR8); enabled = false }` defers to a real round-trip executed through each shrunk JAR (ProGuardCompatibilityTest.kt runs its own @Test methods via reflection because JUnit+R8 don't cooperate; it asserts the .pro is on the classpath, builds a client, and does a Jackson writeValueAsString→readValue roundtrip). The analysis's claim that OUR risk is real-but-different is CORRECT and I confirmed the two specific hooks: ClientLogger.kt:81 has a secondary constructor taking `kotlin.reflect.KClass<*>` (their .pro line `-keep class kotlin.reflect.** { *; }` exists precisely because R8 strips kotlin.reflect), and SdkInfo.kt:39 reads `Package.getImplementationVersion()` off the Jar manifest (R8 can drop manifest/package attributes → silent `dexpace-sdk/unknown` User-Agent regression). 
Their entire .pro is Jackson-reflection-centric (24 of 32 lines are `com.fasterxml.jackson.**` keeps) and is genuinely N/A to our zero-Jackson sdk-core — so we cannot copy their rules; we must author our own, which is why effort is correctly L. Honest pushback on priority: this is real but speculative until a consumer actually reports an R8 break — sdk-core has almost no reflection surface (only the one KClass ctor, which a consumer can trivially avoid by using the KClass-free constructor), and the manifest-version concern degrades gracefully to 'unknown', it doesn't crash. The genuinely exposed module is sdk-serde-jackson (it reflects), which is opt-in. So: worth doing, but rank it BELOW CI and gradle.properties; it is the most expensive item here for a failure mode we have not yet observed. Confidence medium (value clear, urgency unproven). + - *Do:* Add `sdk-shrink-test` (Java 8, test-only — shadow + R8 stay out of every published module, so zero-dep core is preserved). Shadow-JAR sdk-core + sdk-io-okio3 + sdk-transport-okhttp + sdk-serde-jackson; run R8 only (skip ProGuard — R8 is the Android default and doubling the matrix isn't worth it pre-1.0); round-trip an install-IoProvider → request → serde cycle and assert SdkInfo.sdkVersion is non-'unknown' through the shrunk JAR. Ship the discovered rules as `META-INF/proguard/sdk-core.pro` (likely just `-keepattributes` for the manifest + a kotlin.reflect keep for ClientLogger's KClass ctor) and per-adapter rules for the Jackson serde. Defer until after CI exists. +- **Extract a buildSrc/build-logic convention-plugin layer to kill the ~45-line publishing+signing block duplicated across all 9 module scripts** `SIMPLIFY` · `docs/process` · effort M · confidence high + - *Verdict:* Verified. 
openai-java composes three precompiled-script plugins from buildSrc (openai.java, openai.kotlin, openai.publish), so openai-java-core/build.gradle.kts has a 3-line plugins block and the whole POM/signing config lives once in buildSrc/src/main/kotlin/openai.publish.gradle.kts. Our side: `test -d buildSrc` and `-d build-logic` both fail; the publishing{...}+signing{...} block (sdk-core/build.gradle.kts:44-90, identical logic in all 9 — sdk-transport-okhttp:42-89, sdk-async-reactor:37-82, sdk-serde-jackson, etc.) repeats a ~40-line MavenPublication POM + ~8-line in-memory-PGP signing block verbatim. The block is near-identical, not strictly byte-identical (sdk-core carries an extra `// repositories...` comment; line offsets differ), so the analysis's 'diff = IDENTICAL' overstates it slightly — but the duplicated *logic* is real and the maintenance hazard is exactly as described: a Sonatype-host or POM-description change today is a 9-file edit. The detekt-disable block IS byte-identical across the two non-8 modules (sdk-transport-jdkhttp:114-132 == sdk-async-virtualthreads:113-131, confirmed verbatim), and the JDK-11/21 toolchain triple (kotlin jvmToolchain + java sourceCompatibility/toolchain + KotlinCompile jvmTarget) is duplicated (jdkhttp:32-48, virtualthreads:33-49). One nuance the analysis missed: our root build.gradle.kts ALREADY centralises the bulk of compiler/Jar/repo config via `allprojects{}`/`subprojects{}` callbacks (:85-175) and applies ktlint/detekt to every subproject — so the *only* substantial things still duplicated per-module are (a) publishing+signing, (b) the two-module detekt+toolchain override. That narrows the win vs openai-java (whose root build is nearly empty and pushes everything to buildSrc) but does not eliminate it. 
Note: openai.publish uses the vanniktech maven-publish plugin (a 3rd-party convenience), NOT raw maven-publish+signing like us; we should keep our zero-extra-plugin approach and just lift our own blocks into a precompiled script plugin. The analysis's byte-identical claim is downgraded to near-identical; the core recommendation stands. + - *Do:* Add a `build-logic` included build (config-cache-friendlier than buildSrc) with: `dexpace.published-module.gradle.kts` (the publishing+signing+POM block, parameterising only project.name/description), and a `dexpace.jvm-module-jdk11`/`-jdk21.gradle.kts` carrying the toolchain triple + detekt-disable once. Leave the existing root `allprojects{}` compiler/Jar/reproducibility config where it is (it already works and is not duplicated); this is a focused de-duplication of the two remaining repeated blocks, not a wholesale root-to-buildSrc migration. Keep raw maven-publish+signing (do not adopt vanniktech). libs.versions.toml stays as-is. +- **Document the --release/-Xjdk-release cross-compile discipline as the required guard IF we ever consolidate onto a single newer toolchain** `LEARN` · `docs/process` · effort S · confidence high + - *Verdict:* Verified and correctly scoped. openai-java compiles the whole project on a JDK 21 toolchain (openai.kotlin.gradle.kts jvmToolchain 21, openai.java.gradle.kts toolchain 21) but pins the API surface to 8 via javac `options.release.set(8)` and Kotlin `freeCompilerArgs += "-Xjdk-release=1.8"` (plus jvmTarget JVM_1_8, languageVersion/apiVersion KOTLIN_1_8, coreLibrariesVersion 1.8.0). `--release`/`-Xjdk-release` is strictly stronger than bare targetCompatibility/jvmTarget: it compiles against the JDK 8 *class library*, so a Java 9+ stdlib reference is a COMPILE error, not a runtime NoSuchMethodError.
Our CLAUDE.md 'Things That Will Bite You' documents exactly this hazard and our root build relies on `jvmTarget.set(JvmTarget.JVM_1_8)` on a true jvmToolchain(8) — which is genuinely SAFE today because a real JDK-8 compiler physically cannot resolve newer symbols (the analysis is right that our 7 true-8 modules don't have the problem). The residual exposure is only the two modules that deliberately compile on 11/21 (jdkhttp, virtualthreads), where review discipline (no InputStream.transferTo on the 11 module, etc.) is the only guard — but those modules WANT their newer symbols, so -Xjdk-release is inapplicable there too. So this is correctly LEARN/docs, not a code change: the insight matters only as a precondition if we ever collapse to one newer toolchain (e.g. to also sidestep the detekt-on-JDK-25 crash that currently forces detekt off on those two modules). Accurate, low-stakes, worth a doc line. Confidence high. + - *Do:* Add one paragraph to docs/architecture.md (Toolchain discipline) and/or CLAUDE.md: 'Our per-module true-JDK-toolchain approach is safe as-is. IF we ever consolidate onto a single newer toolchain for build speed or to resolve the detekt-1.23.x/JDK-25 crash, every Java-8-target module MUST then use javac `--release 8` + Kotlin `-Xjdk-release=1.8` (not bare targetCompatibility/jvmTarget) to make newer-stdlib leakage a compile error.' No change to current module scripts. +- **Spring Boot starter pattern (fun interface customizer + @ConditionalOnMissingBean) — design input for FUTURE codegen, not a core feature** `LEARN` · `codegen` · effort M · confidence medium + - *Verdict:* Code verified exactly as described. 
OpenAIClientAutoConfiguration.kt is `@AutoConfiguration @ConditionalOnClass(OpenAIClient) @EnableConfigurationProperties(...)` with a single `@Bean @ConditionalOnMissingBean` factory that builds the client from properties then runs `customizers.orderedStream().forEach { it.customize(this) }`; OpenAIClientCustomizer.kt is a one-method `fun interface`; OpenAIClientProperties.kt is an `@ConfigurationProperties(prefix="openai")` `@ConstructorBinding` data class with `@Name("kebab-key")` fields; additional-spring-configuration-metadata.json supplies IDE autocomplete + the base-url default. The `@ConditionalOnMissingBean` + ObjectProvider combo (zero-config default, fully overridable) is the genuinely liftable idea and is Java-8/Loom-safe (`fun interface` SAM is fine). BUT I am downgrading ADOPT→LEARN and re-scoping to codegen-only: (1) Their starter is generated and wires ONE concrete client; a hand-written `sdk-spring-boot-starter` in the toolkit would have nothing single to wire — for us the bean is {IoProvider + chosen transport + assembled HttpPipeline}, which is exactly the open design question, so a toolkit POC would be inventing the very API the codegen is supposed to define. (2) spring-boot-autoconfigure 2.7.18 is fine to keep in a starter module (respects zero-dep core), but building a throwaway POC now risks anchoring the codegen design prematurely. The right output today is a documented design note in docs/refs-comparison.md capturing the pattern (properties shape + single fun-interface customizer + ConditionalOnMissingBean) so the generator can emit a per-API starter later. The analysis's 'two-thirds of Java shops are Spring' adoption argument is sound but applies to GENERATED SDKs, not the toolkit. Recategorised ADOPT→LEARN, target both→codegen. + - *Do:* Do NOT build a toolkit starter module now. 
Record in docs/refs-comparison.md (codegen section) the starter shape to emit per generated API: `@ConfigurationProperties(prefix=)` data class, a `fun interface ClientCustomizer`, an `@AutoConfiguration` with `@Bean @ConditionalOnMissingBean` that assembles {IoProvider + transport + HttpPipeline} then applies `ObjectProvider.orderedStream().forEach{...}`, plus additional-spring-configuration-metadata.json. Keep spring deps confined to the generated starter module. Revisit as an ADOPT once the codegen MVP exists and the pipeline-assembly API is stable. +- **Decide on release automation from scratch — the release-please config the analysis assumed we have does NOT exist in our repo** `LEARN` · `docs/process` · effort M · confidence high · claim-qualified + - *Verdict:* The analysis's PREMISE is FACTUALLY WRONG and I am keeping this only as a corrected note. It claims we already committed `release-please-config.json` + `.release-please-manifest.json` ('ours exists per repo-root listing') and that they are 'inert' / a 'trap' to either wire up or delete. Both files are ABSENT from our repo (ls of both paths → 'No such file or directory'; the repo-root listing in the env block shows neither). So there is nothing to 'wire up' and nothing to 'delete' — the framing collapses. What IS verified on their side: release-please-config.json (release-type:simple, prerelease versioning, changelog-sections mapping feat/fix/perf/docs/etc.), .release-please-manifest.json `{".":"4.39.1"}`, the `// x-release-please-version` anchor at their build.gradle.kts:11, and `` markers throughout README.md — and their automation runs via the Stainless-hosted `stainless-api/trigger-release-please` action (create-releases.yml:20), which is vendor lock-in we'd replace with upstream googleapis/release-please-action. 
Net: this is a genuinely useful capability for an SDK cutting frequent alpha releases, and our `feat:`/`fix:`/`docs:` commit convention feeds it — but it is a GREENFIELD 'should we add this' decision, not a 'finish the half-built config' decision. Downgraded ADOPT→LEARN and gated firmly behind CI: release automation with no CI to run it is pointless, so this is strictly second. claimAccurate=false because the central factual claim about our repo state is false. + - *Do:* Treat as a future decision, AFTER CI lands and before the first non-alpha tag. If adopted: add release-please-config.json + .release-please-manifest.json fresh, an `x-release-please-version` anchor on `version = ...` in root build.gradle.kts:42 and README install snippets, and a `release-please.yml` using the UPSTREAM googleapis/release-please-action (never the Stainless fork), with `extra-files` pointing at those version locations. Do not present this as wiring up existing config — there is none. + +**Considered & dropped** + +- ~~antipattern: do not copy .stats.yml / codegen-provenance into sdk-core~~ — Verified accurate (.stats.yml records configured_endpoints:261 + openapi_spec_url/hash — meaningless for a hand-written toolkit) but it is an antipattern warning, not a decision-ready action item, and the constructive half (a generator should stamp provenance into GENERATED SDKs) is already implied by the codegen-design framing. No standalone change for us. Folded into notes. +- ~~antipattern: do not copy detect-breaking-changes recompile-old-tests approach~~ — Verified: scripts/detect-breaking-changes checks out the base-ref's models/services TEST dirs and runs ./scripts/lint, treating a compile failure as a break — a weak proxy that only catches breakage an existing test exercises. We already have the strictly-better binary-compatibility-validator with 9 committed .api files. This is a 'keep doing what we do' confirmation with zero action; not a finding. 
+- ~~weAreAhead summary (coverage / binary-compat / static analysis / warnings / reproducible builds / zero-dep core)~~ — Independently re-verified and TRUE on every point (no kover/jacoco anywhere in their build; no .api files; only ktfmt 0.61 + palantir-format, no detekt/explicit-API; global -nowarn in openai.kotlin.gradle.kts:24 vs our allWarningsAsErrors; our reproducible-archives block at root :140-143 has no openai-java equivalent; their core hard-deps Jackson 2.18 `api` + victools jsonschema while ours is SLF4J-compileOnly only). But it is a scoreboard, not an actionable finding — it prescribes no change. Captured in notes; not a verified finding requiring action. +- ~~antipattern: avoid -Xmx8g / CICompilerCount=4 JVM block and Stainless vendor lock-in~~ — Accurate but already absorbed into the gradle.properties finding (skip the heavy JVM block) and the CI/release findings (use upstream actions, not stainless-api/trigger-release-please or pkg.stainless.com). Redundant as a separate item. + +**Do not copy** + +(1) DO NOT copy `.stats.yml` or the codegen-provenance machinery into sdk-core — `.stats.yml:1-4` (`configured_endpoints: 261`, `openapi_spec_url`, `openapi_spec_hash`) only makes sense for a spec-generated client; our toolkit is hand-written and a spec hash would be a lie. It IS relevant to FUTURE CODEGEN: our generator should stamp an analogous provenance file (generator version + input contract hash) into GENERATED SDKs, not into the toolkit. (2) DO NOT copy their binary-compatibility approach (scripts/detect-breaking-changes:14-22 checks out the base-ref's TESTS and recompiles, treating a compile failure as a breaking change) — it is a weak proxy that only catches breakage exercised by an existing test and produces confusing 'lint failed' errors. Our committed `.api` snapshots + binary-compatibility-validator (root build.gradle.kts:32) are strictly better; keep ours. 
(3) DO NOT pull ProGuard/R8/shadow plugins into sdk-core or any shipped module — they belong in a test-only `sdk-shrink-test` module exactly as theirs is isolated in openai-java-proguard-test; dragging shadow into core would violate zero-dep. (4) DO NOT copy their `-nowarn`/`-Xsuppress-warning=DEPRECATION` posture (openai.kotlin.gradle.kts:23-25) — they globally suppress deprecation warnings because generated code references its own deprecated members; we run `allWarningsAsErrors` (root build.gradle.kts:116) which is far healthier for a hand-maintained toolkit. Generated code may need a scoped suppression, but never a global `-nowarn`. (5) DO NOT copy their `-Xmx8g -XX:CICompilerCount=4` JVM block wholesale (gradle.properties:6-18) — it is sized for compiling 261 endpoints of generated models; our 9-module build will waste memory. (6) Their reliance on the Stainless-hosted `trigger-release-please` action (create-releases.yml:20) and `pkg.stainless.com` artifact upload (ci.yml:77) is vendor lock-in irrelevant to us; use upstream release-please + Sonatype directly. + +**Where we're ahead** + +Our quality gates are categorically stronger and this is verifiable, not opinion. (1) COVERAGE: openai-java has NO coverage tooling at all (`grep kover|jacoco|coverage` over their buildSrc + root + core build → none); we enforce an aggregate 80% line floor wired into `check` (root build.gradle.kts:57-83, `minBound(80)`), so a plain `./gradlew build` fails on regressions. (2) BINARY COMPATIBILITY: they have NO `.api` snapshots and approximate API-break detection with a brittle recompile-old-tests lint (scripts/detect-breaking-changes:14-22); we run the Kotlin binary-compatibility-validator with 9 committed `.api` files (root build.gradle.kts:32, e.g. sdk-core/api/sdk-core.api) — real signature-level diffing. 
(3) STATIC ANALYSIS: they run only formatters (ktfmt 0.61 via openai.kotlin.gradle.kts:43, palantir-java-format via openai.java.gradle.kts:48); we run detekt (config/detekt.yml) AND ktlint AND explicit-API STRICT mode (root build.gradle.kts:104) — they have no explicit-API enforcement, so their generated public surface is implicit-public by Kotlin default. (4) WARNINGS: we compile with `allWarningsAsErrors=true` (root build.gradle.kts:116); they globally pass `-nowarn` (openai.kotlin.gradle.kts:24). (5) REPRODUCIBLE BUILDS: root build.gradle.kts:140-143 strips archive timestamps and sorts entries for byte-identical JARs across machines; openai-java has no equivalent in its build scripts. (6) ZERO-DEP CORE: our sdk-core has only SLF4J compileOnly (root build.gradle.kts:121) and pushes Jackson/Okio/coroutines into adapters; openai-java-core HARD-depends on Jackson 2.18 (`api`) + victools jsonschema (openai-java-core/build.gradle.kts:23-34) and pulls okhttp in for tests — their core is heavier and less composable. The only area where they are ahead is packaging and developer experience: the ProGuard/R8 shrink test, the Spring starter, CI, gradle.properties tuning, and the example module, all addressed in the findings above. On the actual correctness/quality bar, we lead decisively.
+
+_Verifier notes:_ VERIFICATION VERDICT: 7 of 8 findings substantively accurate; 1 (release-please, F5) has a FALSE premise and was corrected. Two ADOPTs were downgraded to LEARN on scope/urgency grounds.
+
+KEY CORRECTION — release-please (F5): The analysis asserts we already committed release-please-config.json + .release-please-manifest.json ('ours exists per repo-root listing') and frames the task as 'wire up the inert config or delete it'. BOTH FILES ARE ABSENT from our repo (verified: ls of both paths fails; repo-root listing in the env shows neither). Any downstream reader must NOT go looking for files to delete/wire — release automation is a greenfield decision, gated behind CI. claimAccurate=false is set on that finding for this reason.
+
+OTHER ACCURACY ADJUSTMENTS:
+- F1 buildSrc: the 'diff = byte-IDENTICAL' claim for publishing blocks is slightly overstated (sdk-core has a comment variation; offsets differ) — the duplicated LOGIC is real, the detekt-disable block IS byte-identical across the two non-8 modules, and our root build ALREADY centralises compiler/Jar/repo/ktlint/detekt config, so the remaining duplication is narrower than the analysis implies (publishing+signing in all 9; toolchain+detekt override in 2). Win is real but smaller than 'their root is empty, do the same'.
+- F4 CI matrix: ours need NOT mirror their 8+21 (or a 3-way 8/11/21) matrix — settings.gradle.kts foojay resolver lets a single JDK-21 runner auto-provision 8/11 for per-module toolchains. Simpler than the analysis suggested.
+- F2 proguard: the two specific risk hooks are CONFIRMED in our code — ClientLogger.kt:81 takes a `kotlin.reflect.KClass<*>` ctor arg, and SdkInfo.kt:39 reads `Package.getImplementationVersion()`. Real, but urgency is unproven (sdk-core's only reflection surface is that one KClass ctor, which has a KClass-free alternative; manifest-version degrades to 'unknown', doesn't crash). Ranked below CI/gradle.properties.
+
+RECOMMENDED SEQUENCING (cheapest-highest-leverage first): (1) CI ci.yml [S, makes all existing gates real — top priority]; (2) gradle.properties caching+parallel [S], then configuration-cache trial [S, separate commit]; (3) build-logic convention plugins for publishing/signing+toolchain/detekt [M]; (4) sdk-example assembled-toolkit smoke module [M, after CI]; (5) docs note on -Xjdk-release discipline [S]; (6) release-please decision [M, after CI, greenfield]; (7) Spring-starter SHAPE into docs/refs-comparison.md codegen section [LEARN, no module]; (8) sdk-shrink-test R8 module [L, last — speculative failure mode].
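As an illustration of sequencing item (5), the `--release`/`-Xjdk-release` guard could be sketched as below for a Java-8-target module compiled on a single newer toolchain. This is a hypothetical fragment, not a change to any current module script: the toolchain version 21 and the placement in a module-level `build.gradle.kts` are assumptions, and the flags mirror the ones the verdict attributes to openai-java. It has not been run against our build and may need adjusting for the Kotlin Gradle plugin version in use.

```kotlin
// Hypothetical build.gradle.kts for a Java-8-target module built on a newer JDK.
// Only relevant IF the per-module true-JDK-8 toolchain is ever replaced.
import org.jetbrains.kotlin.gradle.dsl.JvmTarget
import org.jetbrains.kotlin.gradle.tasks.KotlinCompile

plugins {
    // Assumption: the plugin version is supplied by the root build or version catalog.
    kotlin("jvm")
}

kotlin {
    // Assumption: 21 stands in for whichever single consolidated toolchain is chosen.
    jvmToolchain(21)
}

tasks.withType<JavaCompile>().configureEach {
    // javac --release 8: resolve symbols against the JDK 8 class library,
    // so a Java 9+ stdlib reference fails at compile time.
    options.release.set(8)
}

tasks.withType<KotlinCompile>().configureEach {
    compilerOptions {
        // Emit Java 8 bytecode...
        jvmTarget.set(JvmTarget.JVM_1_8)
        // ...and also compile Kotlin sources against the JDK 8 API surface.
        freeCompilerArgs.add("-Xjdk-release=1.8")
    }
}
```

With a guard of this shape, a call such as `InputStream.transferTo` from a Java-8-target module would be expected to fail compilation instead of surfacing later as a runtime `NoSuchMethodError`. The two modules that deliberately target 11/21 would not apply it, since they want their newer symbols.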
+
+WHERE WE LEAD (verified, not opinion): coverage floor (kover minBound(80) wired to check), binary-compat (9 committed .api snapshots vs their brittle recompile-old-tests lint), static analysis (detekt+ktlint+explicit-API-strict vs their formatters-only + global -nowarn), allWarningsAsErrors, reproducible archives, and a zero-dep sdk-core (SLF4J compileOnly) vs their Jackson-2.18-`api` + victools-jsonschema core. openai-java is ahead ONLY on packaging DX: CI, gradle.properties tuning, the R8/ProGuard shrink test, the Spring starter, and the example module — all captured as findings above.
+
+ANTIPATTERNS to avoid (verified): do not pull .stats.yml/codegen-provenance, shadow/ProGuard/R8 plugins, or any Jackson keep-rules into sdk-core or any published module; do not adopt their global -nowarn, their -Xmx8g/CICompilerCount JVM block, their recompile-old-tests breaking-change proxy, the vanniktech publish plugin, or the Stainless-hosted trigger-release-please action / pkg.stainless.com upload (use upstream googleapis/release-please-action + Sonatype directly).
+
+---
diff --git a/sdk-core/src/test/kotlin/org/dexpace/sdk/core/http/pipeline/steps/RetryStepTest.kt b/sdk-core/src/test/kotlin/org/dexpace/sdk/core/http/pipeline/steps/RetryStepTest.kt
index ac8b22c3..df883f35 100644
--- a/sdk-core/src/test/kotlin/org/dexpace/sdk/core/http/pipeline/steps/RetryStepTest.kt
+++ b/sdk-core/src/test/kotlin/org/dexpace/sdk/core/http/pipeline/steps/RetryStepTest.kt
@@ -691,7 +691,8 @@ class RetryStepTest {
         // bare POST (no payload to re-send) is non-idempotent, so it must NOT be retried even on a
         // retryable status — a second POST could duplicate a side effect the server already
         // applied. The 503 is returned as-is after exactly one attempt. This exercises the
-        // body == null branch of isRetrySafe on a non-idempotent method.
+        // body == null branch of isRetrySafe on a non-idempotent method. Mirrored by the
+        // `body-less POST is not retried` case in the pipeline.step.retry RetryStep suite.
         val fake =
             FakeHttpClient()
                 .enqueue { status(503) }
@@ -717,6 +718,7 @@ class RetryStepTest {
     fun `body-less PUT IS retried because PUT is idempotent`() {
         // Control for the body == null branch: with no body the gate falls through to method
         // idempotency. PUT is idempotent, so a body-less PUT is retry-safe and retries normally.
+        // Mirrored by the `body-less PUT is retried` case in the pipeline.step.retry RetryStep suite.
         val fake =
             FakeHttpClient()
                 .enqueue { status(503) }
diff --git a/sdk-core/src/test/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryStepTest.kt b/sdk-core/src/test/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryStepTest.kt
index 0b5b3ec3..08a484b4 100644
--- a/sdk-core/src/test/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryStepTest.kt
+++ b/sdk-core/src/test/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryStepTest.kt
@@ -167,6 +167,18 @@ class RetryStepTest {
             .build()
     }
 
+    private fun requestPostBodyless(): Request =
+        Request.builder()
+            .url("https://api.example.com/resource")
+            .method(Method.POST)
+            .build()
+
+    private fun requestPutBodyless(): Request =
+        Request.builder()
+            .url("https://api.example.com/resource")
+            .method(Method.PUT)
+            .build()
+
     private fun response(
         status: Int,
         headers: Headers = Headers.builder().build(),
@@ -248,6 +260,39 @@ class RetryStepTest {
         assertTrue(client.calls.isEmpty())
     }
 
+    @Test
+    fun `503 with body-less POST is not retried because POST is non-idempotent`() {
+        // Body-less retry safety keys off METHOD idempotency, not off the absence of a body. A
+        // bare POST has no payload to re-send, but it is non-idempotent, so it must NOT be retried
+        // — a second POST could duplicate a side effect the server already applied. Exercises the
+        // body == null branch of canRetry on a non-idempotent method. Mirrors the
+        // `body-less POST is NOT retried` case in the http.pipeline DefaultRetryStep suite.
+        val client = FakeClient()
+        val request = requestPostBodyless()
+        val step = RetryStep(client, zeroDelaySettings(InstantScheduler()), request)
+        val outcome = ResponseOutcome.Failure(httpException(SC_SERVICE_UNAVAILABLE))
+        val out = step.invoke(outcome)
+        assertSame(outcome, out, "Outcome must be unchanged — a body-less POST is non-idempotent")
+        assertTrue(client.calls.isEmpty(), "body-less POST must not be re-dispatched")
+    }
+
+    @Test
+    fun `503 with body-less PUT is retried because PUT is idempotent`() {
+        // Control for the body == null branch: with no body the gate falls through to method
+        // idempotency. PUT is idempotent, so a body-less PUT is retry-safe and retries normally —
+        // here the single retry succeeds. Mirrors the `body-less PUT IS retried` control in the
+        // http.pipeline DefaultRetryStep suite.
+        val ok = response(SC_OK)
+        val client = FakeClient(listOf(Canned.Ok(ok)))
+        val request = requestPutBodyless()
+        val step = RetryStep(client, zeroDelaySettings(InstantScheduler()), request)
+        val outcome = ResponseOutcome.Failure(httpException(SC_SERVICE_UNAVAILABLE))
+        val out = step.invoke(outcome)
+        assertTrue(out is ResponseOutcome.Success, "body-less PUT must retry — PUT is idempotent")
+        assertSame(ok, (out as ResponseOutcome.Success).response)
+        assertEquals(1, client.calls.size)
+    }
+
     @Test
     fun `404 is not retried`() {
         val client = FakeClient()