From dbc5b5b7fcaf654fed56b11f0eb2946c9332c4dc Mon Sep 17 00:00:00 2001 From: OmarAlJarrah Date: Mon, 15 Jun 2026 22:10:34 +0300 Subject: [PATCH 1/2] docs: clarify bounded body-logging preview and the stage-pipeline rationale MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two documentation gaps around HTTP instrumentation: Under BODY_AND_HEADERS logging, the response body capture is bounded to bodyPreviewMaxBytes, but the full body still flows to the consumer — a body larger than the cap is streamed in full while only a preview prefix is buffered. The logged response.body.size / response.body.preview fields therefore describe the captured preview, not necessarily the whole body, whereas response.content.length reports the body's true length. Document this in http-body-logging-and-concurrency.md and add a concise note to the HttpLogLevel.BODY_AND_HEADERS and HttpInstrumentationOptions KDoc so it is visible at the opt-in point, not just on LoggableResponseBody. Also explain why the stage-based http.pipeline layer uses an ordered Stage enum with single-step pillar stages (REDIRECT/RETRY/AUTH/LOGGING/SERDE) instead of nested HttpClient decorators: deterministic and inspectable run order, one place to reason about stage precedence, enforced pillar-uniqueness, and shared ordering across the sync and async runtimes — plus the one cost it buys, the PipelineNext.copy() re-drive contract that wrapping steps must honour. Added to pipelines.md with a cross-reference from architecture.md. --- docs/architecture.md | 4 ++ docs/http-body-logging-and-concurrency.md | 40 ++++++++++++++ docs/pipelines.md | 55 +++++++++++++++++++ .../steps/HttpInstrumentationOptions.kt | 12 ++++ .../core/http/pipeline/steps/HttpLogLevel.kt | 2 + 5 files changed, 113 insertions(+) diff --git a/docs/architecture.md b/docs/architecture.md index 32c8a5e3..f11f4f2f 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -271,6 +271,10 @@ Shipped pillar/step implementations live in `http.pipeline.steps`: `DefaultRedir `DefaultRetryStep`, `AuthStep` (+ `BearerTokenAuthStep` / `KeyCredentialAuthStep`), `DefaultInstrumentationStep`, and the redirect/retry option types. +For why this layer uses ordered stages with pillar-uniqueness rather than nested `HttpClient` +decorators — and the one cost that buys (the `next.copy()` re-drive contract) — see +[Pipeline Mechanism](pipelines.md#why-ordered-stages-not-nested-decorators). + #### Recovery-aware primitives (`org.dexpace.sdk.core.pipeline`) A lower-level layer that threads a sealed `ResponseOutcome` so recovery steps observe and diff --git a/docs/http-body-logging-and-concurrency.md b/docs/http-body-logging-and-concurrency.md index 951e8be4..1212a4f2 100644 --- a/docs/http-body-logging-and-concurrency.md +++ b/docs/http-body-logging-and-concurrency.md @@ -14,6 +14,7 @@ and the concurrency decisions behind them. - [Architecture](#architecture) - [LoggableRequestBody — Tee-Write Strategy](#loggablerequestbody--tee-write-strategy) - [LoggableResponseBody — Drain-Once Strategy](#loggableresponsebody--drain-once-strategy) + - [Logged body size vs. the body the consumer receives](#logged-body-size-vs-the-body-the-consumer-receives) - [Reading a Snapshot](#reading-a-snapshot) - [Internal Stream Utilities](#internal-stream-utilities) - [Concurrency Design](#concurrency-design) @@ -240,6 +241,45 @@ retains whatever bytes were read before the failure and caches the exception: post-mortem logging that records "what we got" alongside the exception. - `captureException` surfaces the cached exception (or `null`) without triggering a drain. +### Logged body size vs. the body the consumer receives + +When `HttpLogLevel.BODY_AND_HEADERS` is enabled, the instrumentation step +(`DefaultInstrumentationStep` / `DefaultAsyncInstrumentationStep`) wraps the response body in a +`LoggableResponseBody` bounded to `HttpInstrumentationOptions.bodyPreviewMaxBytes` (default +8 KiB, `DEFAULT_BODY_PREVIEW_MAX_BYTES`). Two consequences follow that are easy to miss when +reading the logs: + +**1. The body delivered downstream can be larger than the logged preview.** The cap bounds only +the in-memory *capture*, not the body. For a response larger than `bodyPreviewMaxBytes`, the +step buffers the preview prefix and the wrapper then streams the full body to the consumer — it +replays the captured prefix and continues from the live tail (see the bounded-capture diagram +above). The preview you see in the log is a prefix; the consumer still reads every byte. + +**2. The logged size fields measure different things.** The step emits two size-related fields +on the `http.response` event, and they are not the same number for an over-cap body: + +| Field | Source | What it reports | +|---------------------------|----------------------------------------------|---------------------------------------------------------------------------------| +| `response.body.size` | `loggableBody.snapshot(bodyPreviewMaxBytes)` | Size of the **captured preview** — bounded by `bodyPreviewMaxBytes` | +| `response.body.preview` | the same captured bytes, decoded as UTF-8 | The preview text (a prefix for an over-cap body) | +| `response.content.length` | `response.body.contentLength()` | The body's **true** length when the origin declared one (`Content-Length`); `-1` for unknown-length / streaming bodies | + +So `response.body.size` is the *captured/preview* size, **not** necessarily the full body size. +When a body exceeds the cap, `response.body.size` saturates near `bodyPreviewMaxBytes` while +`response.content.length` still shows the real length. Read `content.length` (not +`body.size`) when you need the full size, and treat `body.preview` as a prefix that may be +truncated. The two agree only when the whole body fit within the cap — exactly the case where +`contentLength()` itself returns the captured size (see **`contentLength()`** above). + +**Streaming / unknown-length bodies (async path).** `DefaultAsyncInstrumentationStep` skips the +capture entirely when `contentLength() < 0`, because the bounded drain would run on the +future-completion thread and a slow producer could stall it. Such bodies stream to the consumer +unwrapped, so they carry **no** `response.body.size` / `response.body.preview` fields at all — +absence of those fields is expected for chunked or streaming responses, not a logging bug. The +synchronous `DefaultInstrumentationStep` drains known-length and unknown-length bodies alike (it +runs on the caller's thread), but the size-vs-preview distinction above applies to it just the +same. + ### Reading a Snapshot The only logging output is a raw `ByteArray`: diff --git a/docs/pipelines.md b/docs/pipelines.md index 4130dc23..4bb6d1ce 100644 --- a/docs/pipelines.md +++ b/docs/pipelines.md @@ -19,6 +19,7 @@ composable request/response processing. - [Functional Interfaces](#functional-interfaces) - [Immutable Pipeline State](#immutable-pipeline-state) - [Step Ordering and Dependencies](#step-ordering-and-dependencies) + - [Why ordered stages, not nested decorators](#why-ordered-stages-not-nested-decorators) - [Usage Examples](#usage-examples) - [File Index](#file-index) @@ -476,6 +477,60 @@ Retry is **not** a request step — it lives in `ResponsePipeline.recoverySteps` the transport outcome and re-issue the request. Order recovery steps so retry runs before any status-to-exception mapping you do not want a transient failure to surface prematurely. +### Why ordered stages, not nested decorators + +> This decision concerns the stage-based `http.pipeline` layer (`HttpPipeline`, `HttpStep`, +> `Stage`) introduced under [Async Dispatch](#async-dispatch) and detailed in +> `docs/architecture.md`, not the recovery-aware `pipeline` primitives above. + +A common alternative to a step list is to nest cross-cutting concerns as `HttpClient` +decorators — `RedirectClient(RetryClient(AuthClient(LoggingClient(transport))))` — where each +wrapper calls `inner.execute(request)`. The `http.pipeline` layer deliberately uses an ordered +list of `HttpStep`s keyed by a `Stage` enum instead, with five **pillar** stages +(`REDIRECT` → `RETRY` → `AUTH` → `LOGGING` → `SERDE`) that admit exactly one step each. The +reasons: + +- **Deterministic, inspectable ordering.** `Stage.order` is the single source of truth for run + order: lower-ordered stages run first (closer to the caller), higher ones run last (closer to + the wire). `HttpPipelineBuilder.build()` flattens the stages in that fixed order, and the + resulting `HttpPipeline.steps` is an unmodifiable, ordered list you can read back to see + exactly what runs and in what sequence. A decorator tower encodes the same order implicitly in + constructor nesting, which is harder to assemble correctly and impossible to enumerate after + the fact. +- **One place to reason about precedence.** Because the order lives in the `Stage` enum rather + than scattered across nesting sites, "does auth run before or after the retry loop?" is + answered by reading one enum, not by tracing who-wrapped-whom across call sites. Sparse `order` + values (100s apart) and interleaved non-pillar stages (`PRE_AUTH`, `POST_LOGGING`, …) leave + room to slot user steps at a precise point without renumbering or rebuilding the tower. The + surgical `insertAfter` / `insertBefore` / `replace` edits operate against this declared order. +- **Pillar-uniqueness invariants.** Redirect, retry, auth, logging, and serde are concerns you + want *exactly one* of — two retry layers or two auth layers is almost always a bug. A pillar + stage enforces that: installing a second step in a pillar replaces the first and emits a + `pipeline.pillar.replaced` SLF4J warning (`HttpPipelineBuilder`). The shipped pillar steps go + further and lock their slot at the type level — `RedirectStep`, `RetryStep`, `AuthStep`, and + `InstrumentationStep` each declare `final override val stage`, so a subclass cannot relocate + itself out of its pillar. Nested decorators cannot express "there is exactly one auth layer"; + nothing stops a caller wrapping `AuthClient` twice. +- **Sync/async mirroring.** The async layer (`AsyncHttpStep`, `AsyncHttpPipeline`, + `AsyncHttpPipelineBuilder`) reuses the identical `Stage` semantics and shares the staging + policy via the internal `StagedSteps` helper, so a step occupies the same ordered slot in + both the blocking and the `CompletableFuture`-returning pipeline. Keeping order in the data + (`Stage`) rather than in the control flow (constructor nesting) is what lets both runtimes + share one ordering definition instead of each re-deriving it. + +**The cost: the re-drive contract.** A decorator re-invokes downstream with a plain +`inner.execute(request)`, which is hard to get wrong. A stage step instead receives a +`PipelineNext` and calls `next.process()`; a step that needs to drive the downstream chain **more +than once** — retry re-attempting, redirect following a hop, auth retrying after a 401 — must +call `next.copy().process()` rather than reusing `next`. `PipelineNext` advances an internal +cursor, so re-using the same instance resumes *past* the steps already visited on the previous +pass instead of re-running the whole tail. Forgetting `copy()` fails silently — the second +attempt skips steps rather than throwing — which is strictly more error-prone than a decorator's +re-call. The shipped pillar steps follow the contract (`DefaultRetryStep`, `DefaultRedirectStep`, +and `AuthStep` all re-drive via `next.copy().process()`); custom wrapping steps at the +REDIRECT / RETRY / AUTH stages must do the same. See `PipelineNext.copy` and the `HttpStep` +contract for the normative wording. + --- ## Usage Examples diff --git a/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/HttpInstrumentationOptions.kt b/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/HttpInstrumentationOptions.kt index 633a7b15..6e7b7ee7 100644 --- a/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/HttpInstrumentationOptions.kt +++ b/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/HttpInstrumentationOptions.kt @@ -47,6 +47,18 @@ import org.dexpace.sdk.core.instrumentation.metrics.NoopMeter * return large downloads, server-sent events, gRPC, or chunked encodings whose size is unknown * ahead of time. [HttpLogLevel.BODY_AND_HEADERS] is intended for diagnostic builds against * small JSON/text payloads. + * + * Because the capture is a bounded preview, the logged `response.body.size` / + * `response.body.preview` fields describe the **captured preview**, not necessarily the full + * body: for a body larger than [bodyPreviewMaxBytes] the consumer still receives every byte + * while those fields reflect only the preview prefix. The separate `response.content.length` + * field carries the body's true length when the origin declared one. See + * `docs/http-body-logging-and-concurrency.md` ("Logged body size vs. the body the consumer + * receives"). + * + * @property bodyPreviewMaxBytes Upper bound, in bytes, on the in-memory body capture under + * [HttpLogLevel.BODY_AND_HEADERS]. Bounds the preview, not the body the consumer sees; the + * logged `*.body.size` fields are capped by it. Defaults to [DEFAULT_BODY_PREVIEW_MAX_BYTES]. */ public class HttpInstrumentationOptions @JvmOverloads diff --git a/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/HttpLogLevel.kt b/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/HttpLogLevel.kt index 5e75cf26..9571664a 100644 --- a/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/HttpLogLevel.kt +++ b/sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/HttpLogLevel.kt @@ -27,6 +27,8 @@ public enum class HttpLogLevel { * * Capture is **bounded** to the configured preview size (`bodyPreviewMaxBytes`), so large * or streaming responses are not buffered whole — the caller still streams the remainder. + * Consequently the logged `response.body.size` / `response.body.preview` fields reflect the + * captured preview, not necessarily the full body, which can be larger than the preview. * See [HttpInstrumentationOptions] for the streaming and async-completion-thread caveats. */ BODY_AND_HEADERS, From ca37e4d8ad032d120bde455904bf98617bba5c6b Mon Sep 17 00:00:00 2001 From: OmarAlJarrah Date: Tue, 16 Jun 2026 21:13:43 +0300 Subject: [PATCH 2/2] docs: count SERDE as reserved and note the terminal SEND singleton MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Clarify that the ordered-stage rationale lists five cross-cutting pillar stages (SERDE currently reserved/unused) and that SEND is also a singleton slot but represents the transport hop, not a configurable pillar — matching the six isPillar entries in the Stage enum. --- docs/pipelines.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/pipelines.md b/docs/pipelines.md index 4bb6d1ce..f50d500d 100644 --- a/docs/pipelines.md +++ b/docs/pipelines.md @@ -486,9 +486,10 @@ status-to-exception mapping you do not want a transient failure to surface prema A common alternative to a step list is to nest cross-cutting concerns as `HttpClient` decorators — `RedirectClient(RetryClient(AuthClient(LoggingClient(transport))))` — where each wrapper calls `inner.execute(request)`. The `http.pipeline` layer deliberately uses an ordered -list of `HttpStep`s keyed by a `Stage` enum instead, with five **pillar** stages -(`REDIRECT` → `RETRY` → `AUTH` → `LOGGING` → `SERDE`) that admit exactly one step each. The -reasons: +list of `HttpStep`s keyed by a `Stage` enum instead, with five cross-cutting **pillar** stages +(`REDIRECT` → `RETRY` → `AUTH` → `LOGGING` → `SERDE`, the last currently reserved/unused) that +admit exactly one step each, plus the terminal `SEND` slot — also a singleton, but the transport +hop itself rather than a configurable pillar. The reasons: - **Deterministic, inspectable ordering.** `Stage.order` is the single source of truth for run order: lower-ordered stages run first (closer to the caller), higher ones run last (closer to