Skip to content

Improve yamux write backpressure#486

Open
StefanBratanov wants to merge 2 commits into
libp2p:developfrom
StefanBratanov:fix/yamux-write-backpressure
Open

Improve yamux write backpressure#486
StefanBratanov wants to merge 2 commits into
libp2p:developfrom
StefanBratanov:fix/yamux-write-backpressure

Conversation

@StefanBratanov

@StefanBratanov StefanBratanov commented Jun 5, 2026

Copy link
Copy Markdown
Collaborator

Summary

This PR improves yamux write backpressure by removing yamux’s private outbound data buffer and routing blocked writes
through Netty stream-channel backpressure instead.

Previously, yamux buffered blocked stream writes internally in ByteBufQueue, controlled by
maxBufferedConnectionWrites. Downstream users had to tune that value, and in Teku it needed to be set as high as 150
MiB to avoid write-buffer overflow. That made yamux unexpectedly memory intensive.

With this change, maxBufferedConnectionWrites is removed. Yamux no longer exposes a connection-level buffered-write
knob. When a stream cannot send because its yamux send window is exhausted, data remains pending in the stream
MuxChannel outbound buffer, the stream becomes unwritable, and the write futures remain pending. Writes resume when
a WINDOW_UPDATE is received.

Changes

  • Remove maxBufferedConnectionWrites from the yamux API.
  • Remove yamux’s internal ByteBufQueue send buffer.
  • Make muxed stream channels reflect:
    • parent Netty channel writability
    • yamux stream send-window availability
    • stream write-buffer watermarks
  • Keep stream writes pending while yamux cannot consume data due to flow control.
  • Retry pending writes on yamux WINDOW_UPDATE and parent channel writability changes.
  • Delay local FIN until already-flushed stream data has drained.
  • Preserve correct ByteBuf ownership for pending partial writes and ping echo responses.
  • Update yamux tests for pending writes, parent backpressure, and exhausted send-window behavior.

Notes

This should address the main yamux memory/backpressure issue that forced Teku to configure very large
maxBufferedConnectionWrites values. Applications should no longer need to choose a yamux buffered-write limit.

This also substantially addresses stream-level Channel.isWritable support for muxed streams, related to #282. It
does not fully implement the stricter #316 behavior of completing stream write futures only after bytes are written to
the underlying transport, but write futures now remain pending while yamux cannot consume data due to flow-control.

@StefanBratanov StefanBratanov force-pushed the fix/yamux-write-backpressure branch from 48b8fe0 to 16ebd01 Compare June 5, 2026 09:37
@StefanBratanov

Copy link
Copy Markdown
Collaborator Author

@Nashatyrev if of interest, you can have a look

@Nashatyrev

Copy link
Copy Markdown
Collaborator

I'm a bit nervous about significant changes in common classes (AbstractMuxHandler and MuxChannel). Do we really want to alter Mplex behavior here?

@StefanBratanov

StefanBratanov commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator Author

Revamp with Fable @Nashatyrev:

Your teammate's feedback was right on all three points — here's how each is addressed

  1. "Memory issue just moved elsewhere" — Correct: stalled writes sat in each child channel's ChannelOutboundBuffer, and
    the 64 KiB watermark only flips the writability flag — Netty never enforces a hard cap. The Netty-native part of your PR
    is still the right approach (write promises complete only on actual transmission, isWritable() works), but it's now backed
    by a hard limit (below).

  2. "No per-connection limit → DoS" — Reinstated maxBufferedConnectionWrites (10 MiB default), which also restores the
    public getYamux(maxBufferedConnectionWrites, ackBacklogLimit) API your PR had silently broken. The new enforcement is
    actually stronger than the old one: it measures real pending bytes across all child outbound buffers on every flush, so it
    also bounds parent TCP backpressure stalls — which the old sendBuffer accounting never covered at all. On overflow: RST
    the offending stream, fail its pending writes with WriteBufferOverflowMuxerException, close the stream channel so its
    memory is freed immediately.

  3. "Don't alter Mplex" — All backpressure logic (parent-writability gating, watermarks, writability flags,
    channelWritabilityChanged) now lives in YamuxHandler overrides. AbstractMuxHandler and MuxChannel kept only neutral hooks:
    onChildWrite returns bytes-consumed (Mplex always consumes fully, so it never stalls), and onChildRegistered defaults to
    no-op. Mplex behaves byte-for-byte as on develop, including writing through TCP backpressure as it always did.

Robustness fixes found during the rework

  • Removed retainedPendingWrite entirely — I traced every path: the retain/release pair was refcount-neutral everywhere
    because ChannelOutboundBuffer already owns pending messages (releases on remove/fail/close). The leak-checking test
    harness confirms no leaks without it, and MuxChannel is now much closer to its develop shape — which should ease your
    teammate's nerves about that class.
  • Limit check was unreachable in your version: doWrite broke out before onChildWrite whenever canWriteChild was false, so
    once a window was exhausted no limit could ever run. Stall decisions now flow through onChildWrite's return value, so the
    cap is checked on every flush.
  • Window-counter overflow: a hostile peer sending huge cumulative WINDOW_UPDATEs could wrap sendWindowSize negative and
    wedge the stream; the arithmetic is now clamped (signed deltas still work — your existing window-shrink test still
    passes).
  • Deferred-FIN counter leak: if a pending write failed during a deferred disconnect, the counter never hit zero and the
  • Deferred-FIN counter leak: if a pending write failed during a deferred disconnect, the counter never hit zero and the
    FIN was never sent. Fixed.
  • One Netty trap worth knowing: closing a child channel inside a flush fails its writes synchronously with
    ClosedChannelException, masking the overflow exception — the close is deferred to the event loop so promises fail with the
    real cause.

New tests

Overflow → RST + failed promises + closed stream (both stall causes: window exhausted and parent unwritable), child
unwritable on window exhaustion / writable again after update, and cumulative window-update overflow.

… to yamux

- Reinstate maxBufferedConnectionWrites (10 MiB default) and its public API on
  StreamMuxerProtocol.getYamux. The limit is now enforced against the actual
  bytes pending in the child channel outbound buffers, covering both
  send-window stalls and parent TCP backpressure stalls (the latter was never
  bounded before). On overflow the offending stream is reset, its pending
  writes are failed and the stream channel is closed, releasing the buffers.
- Move all writability/backpressure logic (parent writability gating, write
  buffer watermarks, user-defined writability flags) from the common classes
  into YamuxHandler. AbstractMuxHandler and MuxChannel keep only neutral
  extension points, so Mplex behaves exactly as before.
- Remove the redundant retainedPendingWrite ref-counting in MuxChannel: the
  ChannelOutboundBuffer owns pending messages and releases them on
  remove/fail/close on every path.
- Clamp send window arithmetic so cumulative window updates from a
  misbehaving remote cannot overflow the window counter.
- Fix the deferred-disconnect counter not being decremented when a pending
  write fails, which could leave the FIN frame unsent.
@StefanBratanov StefanBratanov force-pushed the fix/yamux-write-backpressure branch from 6d07d59 to 5db52d3 Compare June 11, 2026 08:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants