Skip to content

Sync instrumentation step eagerly drains unknown-length response bodies, stalling time-to-first-byte under BODY_AND_HEADERS #107

Description

@OmarAlJarrah

Summary

The synchronous instrumentation step (DefaultInstrumentationStep) and its async counterpart
(DefaultAsyncInstrumentationStep) handle unknown-length response bodies asymmetrically. The async
step deliberately skips body capture when contentLength() < 0; the sync step does not. As a result,
a streaming / unknown-length response (SSE, long-poll, chunked trickle) logged at
HttpLogLevel.BODY_AND_HEADERS blocks the caller's thread inside the step until the bounded preview
fills or the producer reaches EOF — turning time-to-first-byte into "however long the first
bodyPreviewMaxBytes take to arrive."

The asymmetry

The async step guards the bounded drain on the body's declared length:

sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/DefaultAsyncInstrumentationStep.kt:241-268

private fun wrapResponseForLogging(response: Response): Response {
    val responseBody = response.body
    if (!shouldCaptureBody() || responseBody == null) return response
    // The bounded drain below runs on the future-completion thread. For an unknown-length
    // (streaming) body the read could block on a slow/idle producer and stall the
    // completion thread, so we skip body capture entirely for contentLength() < 0 ...
    if (responseBody.contentLength() < 0L) return response          // <-- the skip
    val wrapped = LoggableResponseBody.bounded(responseBody, Io.provider, options.bodyPreviewMaxBytes.toLong())
    try {
        wrapped.snapshot(options.bodyPreviewMaxBytes)               // forces the bounded drain
    } catch (t: Throwable) { ... }
    return response.newBuilder().body(wrapped).build()
}

The sync step has the identical structure but no contentLength() < 0 guard:

sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/DefaultInstrumentationStep.kt:151-172

private fun wrapResponseForLogging(response: Response): Response {
    val responseBody = response.body
    if (!shouldCaptureBody() || responseBody == null) return response
    // (no length check here)
    val wrapped = LoggableResponseBody.bounded(responseBody, Io.provider, options.bodyPreviewMaxBytes.toLong())
    try {
        wrapped.snapshot(options.bodyPreviewMaxBytes)               // <-- eager drain on the caller thread
    } catch (t: Throwable) { ... }
    return response.newBuilder().body(wrapped).build()
}

The eager snapshot(...) is called synchronously inside process(...)
(DefaultInstrumentationStep.kt:127, wrapResponseForLogging(raw)), which runs on the caller's
thread before process returns the response.

Failure mode

For a streaming / unknown-length response under BODY_AND_HEADERS:

  • The async step returns immediately and the body streams to the caller unwrapped.
  • The sync step blocks in snapshot(options.bodyPreviewMaxBytes) until bodyPreviewMaxBytes
    (default 8 KiB, see HttpInstrumentationOptions.DEFAULT_BODY_PREVIEW_MAX_BYTES) have been read or
    the producer sends EOF. For an SSE / long-poll endpoint that trickles a few bytes per event, the
    caller does not get the response object back until the preview cap fills — first byte could be
    seconds or minutes late, gated entirely by producer pacing rather than network transfer.

The async step's own KDoc names exactly this hazard as the reason for its skip
(DefaultAsyncInstrumentationStep.kt:50-56): "an unknown-length (streaming) response body
(contentLength() < 0) is left unwrapped — draining it could block the completion thread on a
slow/idle producer." The same reasoning applies to the sync caller thread, but the guard is missing.

Why the existing docs don't cover this

HttpInstrumentationOptions documents the streaming caveat as async-only and frames the sync drain
as safe: "In the sync step the bounded drain happens eagerly inside the step. In the async
step the bounded drain ... is skipped for unknown-length (streaming) bodies"
(HttpInstrumentationOptions.kt:38-43). It does warn callers to prefer HEADERS/NONE for SSE and
unknown-size downloads, but it presents the eager sync drain as a deliberate, safe choice rather than
a stall risk — so a caller who keeps BODY_AND_HEADERS on for a mixed workload (small JSON plus the
occasional streaming endpoint) hits the stall with no warning.

Suggested fix

Apply the same guard to the sync step's wrapResponseForLogging:

if (responseBody.contentLength() < 0L) return response

This makes the two steps symmetric: an unknown-length body streams to the caller unwrapped, with no
body preview (the http.response event still carries headers, status, and
response.content.length = -1). Known-length bodies keep the bounded preview as today.

Update HttpInstrumentationOptions.kt so it no longer implies the eager sync drain is unconditionally
safe — note that the sync step also skips capture for unknown-length bodies.

Notes

  • response.content.length is already emitted from response.body?.contentLength() in both steps
    (DefaultInstrumentationStep.kt:209, DefaultAsyncInstrumentationStep.kt:305), so skipping the
    preview loses only the body bytes, not the size/headers.
  • This is independent of the open body-logging PRs: the charset-aware preview refactor
    (BodyPreview) moves preview decoding into a shared helper but does not add a length guard to the
    sync step, and the body-size docs PR documents field semantics without changing the sync drain.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingsdk-coresdk-core toolkit

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions