fix(op-devstack): eliminate rollup-boost RPC port-bind race#20938
Open
hdcesario-op wants to merge 2 commits into
The rollup-boost RPC port was pre-allocated by the Go harness
(net.Listen("127.0.0.1:0") -> Close() -> return port number), then
handed to the Rust subprocess via --rpc-port=N. The Rust binary had to
bind that port hundreds of milliseconds later (after Tokio init), during
which any other net.Listen(":0") in the test process could be handed
the same port by the kernel. Result: intermittent EADDRINUSE on
memory-all-opn-op-{reth,geth}, most visible on TestFlashblocksTransfer.
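The race described above can be reproduced in a few lines of Go. This is a minimal sketch, not the harness code: pickFreePort is a hypothetical name for the listen/close/return-port pattern, and the second listener in main stands in for whichever unrelated net.Listen(":0") caller happens to receive the port first.

```go
package main

import (
	"fmt"
	"net"
)

// pickFreePort is a hypothetical helper reproducing the racy pattern:
// bind to port 0, read the kernel-assigned port, then close the listener
// before the real owner has bound it.
func pickFreePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	port := l.Addr().(*net.TCPAddr).Port
	// After Close, the kernel may hand this port to any other caller.
	if err := l.Close(); err != nil {
		return 0, err
	}
	return port, nil
}

func main() {
	port, err := pickFreePort()
	if err != nil {
		panic(err)
	}
	addr := fmt.Sprintf("127.0.0.1:%d", port)

	// Stand-in for an unrelated listener that wins the port inside the window
	// between the harness's Close() and the subprocess's bind().
	thief, err := net.Listen("tcp", addr)
	if err != nil {
		fmt.Println("port was already taken by another process:", err)
		return
	}
	defer thief.Close()

	// Stand-in for the subprocess binding late: this fails because the port
	// is now held by the other listener.
	_, err = net.Listen("tcp", addr)
	fmt.Println("late bind error:", err)
}
```

Holding the listener open would close the window on the Go side, but it cannot be handed to a separate subprocess through a port-number flag, which is why the fix moves the bind into the subprocess.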
Adopt the same port-discovery pattern this binary already uses for its
flashblocks and debug ports:
- rollup-boost (cli.rs) logs "RPC server listening on <addr>" after
Server::build() returns, using the existing local_addr() API
(already used at proxy.rs:210, 842).
- op-devstack passes --rpc-port=0 and parses the bound address from
the log stream, mirroring the existing flashblocks parser.
Pre-allocation is removed entirely; cfg.RPCPort > 0 still pins a
specific port for callers that need it.
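As an illustration of the log-based discovery described above, the sketch below extracts the bound address from a line containing "RPC server listening on <addr>". It is an assumption-laden example rather than the op-devstack parser: parseRPCAddr and rpcListenMarker are hypothetical names, the sample log prefix is invented, and the real log stream may carry extra formatting (timestamps, ANSI colour codes) that would need stripping first.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// rpcListenMarker is the message prefix quoted in the PR description.
const rpcListenMarker = "RPC server listening on "

// parseRPCAddr (hypothetical) returns the host and port that follow the
// marker in a log line. ok is false if the marker is absent or the address
// is not a valid host:port with a non-zero port.
func parseRPCAddr(line string) (host string, port int, ok bool) {
	idx := strings.Index(line, rpcListenMarker)
	if idx < 0 {
		return "", 0, false
	}
	// Take the first whitespace-delimited token after the marker.
	fields := strings.Fields(line[idx+len(rpcListenMarker):])
	if len(fields) == 0 {
		return "", 0, false
	}
	h, p, err := net.SplitHostPort(fields[0])
	if err != nil {
		return "", 0, false
	}
	n, err := strconv.Atoi(p)
	if err != nil || n <= 0 || n > 65535 {
		return "", 0, false
	}
	return h, n, true
}

func main() {
	// The timestamp and level prefix here are made up for the example.
	line := "2026-05-21T00:00:00Z INFO RPC server listening on 127.0.0.1:41873"
	host, port, ok := parseRPCAddr(line)
	fmt.Println(host, port, ok)
}
```

Rejecting port 0 matters here: with --rpc-port=0 the only useful value is the port the kernel actually assigned, so a line that still reports 0 is treated as no match.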
Resolves the TODO at op-devstack/sysgo/rollup_boost.go:119-122.
Follow-up: op-devstack/sysgo/mixed_runtime.go:486 (kona-node
KONA_METRICS_PORT) uses the same pre-allocation pattern and should be
migrated separately once kona-node logs its bound metrics address.
The stdout parser callback is held by r.sub (via NewSubProcess) and can fire after Start() returns. tasks.Await does not require channel closure; it returns on the first value or when ctx is done. The select-default already drops duplicate emits while the channel is open, but if the channel is closed first, a send would panic. Drop the deferred closes and document the lifecycle. No functional change for the success path; this eliminates a latent panic if a late or duplicate log line ever races the deferred close.
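The channel behaviour this commit relies on can be shown in isolation. The sketch below is not the op-devstack code: emitOnce and sendAfterClosePanics are hypothetical helpers, and the buffer size of 1 is an assumption chosen so that the first emit succeeds without a waiting receiver.

```go
package main

import "fmt"

// emitOnce (hypothetical) attempts a non-blocking send. On a channel with
// a buffer of 1 and no receiver, the first call succeeds and later calls
// fall through to default, so duplicates are dropped.
func emitOnce(ch chan<- string, addr string) bool {
	select {
	case ch <- addr:
		return true
	default:
		return false
	}
}

// sendAfterClosePanics reports whether the same non-blocking send panics
// once the channel has been closed. The default case does not protect
// against this: a send on a closed channel panics even inside select.
func sendAfterClosePanics() (panicked bool) {
	ch := make(chan string, 1)
	close(ch)
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	emitOnce(ch, "127.0.0.1:41873")
	return false
}

func main() {
	ch := make(chan string, 1)
	fmt.Println("first emit delivered:", emitOnce(ch, "127.0.0.1:41873"))
	fmt.Println("duplicate emit delivered:", emitOnce(ch, "127.0.0.1:41873"))
	fmt.Println("received:", <-ch)
	fmt.Println("send after close panics:", sendAfterClosePanics())
}
```

Leaving the channel open costs nothing once the last reference is dropped, since the garbage collector reclaims it, whereas closing it while a producer may still fire turns a harmless duplicate into a crash.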
This was referenced May 21, 2026
pcw109550
approved these changes
May 21, 2026
Summary
- The rollup-boost RPC port was pre-allocated by the Go harness (net.Listen(":0") → Close() → return the port number) and then handed to the Rust subprocess via --rpc-port=N. Between the Go-side Close() and the Rust-side bind() (hundreds of ms later, after Tokio init), any other net.Listen(":0") in the same test process can be handed the same port. Result: intermittent EADDRINUSE on memory-all-opn-op-{reth,geth}, surfacing as either a fast fail (TestFlashblocksStream, TestFlashblocksTransfer with "TCP endpoint not ready within 5s") or a slow hang (TestFlashblocksTransfer with a 30-minute context timeout terminating at op-devstack/sysgo/rollup_boost.go:132).
- The flashblocks and debug ports already avoid this by binding to port 0 and logging the bound address. The RPC port was the lone outlier; the existing block comment at op-devstack/sysgo/rollup_boost.go:119–122 flagged exactly this gap as a TODO.
Closes #19883
Closes #19934