trickle: align Go publisher with SDK probe-and-fallback pattern #3926

Closed
rickstaa wants to merge 1 commit into trickle-next-endpoint from trickle-publisher-probe-fallback

Conversation

@rickstaa rickstaa commented May 8, 2026

Member

Stacked on #3925. Targets the trickle-next-endpoint branch; will retarget to master when #3925 merges.

Summary

Mirror the Python SDK's first-POST flow in the Go HTTP publisher:
GET {baseURL}/next to learn the server's nextWrite, fall back to slot 0
on probe failure. Establishes a single canonical pattern across the two
trickle clients in the Livepeer stack.

Why now

The Python SDK probes /next to learn the next-write slot before its
first POST. The Go publisher (used by gateway → orchestrator media-in)
historically just started at slot 0. With #3925 introducing the /next
endpoint, parity becomes possible — and worth taking, so we have one
canonical client-side pattern to maintain.

Behavior

| Probe outcome | Action |
| --- | --- |
| /next returns Lp-Trickle-Latest | Use the resolved slot |
| /next returns 400 / no header (pre-#3925 servers) | Fall back to slot 0 |
| Probe network error | Fall back to slot 0 |

Wire behavior on a fresh channel is identical to today — both old
(always-0) and new (probe-then-resolve) paths POST to /0, then /1,
/2, ... The probe adds one round-trip on session start.

Why fallback returns 0, not -1

For the same reason the SDK does: -1 would let the server resolve the
write to nextWrite, but the publisher has no read-back path to learn
which slot the server picked. The local counter would then advance from
-1 to 0 regardless of what the server chose, causing the second POST to
race the first at slot 0. Returning 0 keeps the publisher's local
counter in lockstep with the server.

Tests

trickle/trickle_test.go adds two tests via mock /next handlers:

  • TestTrickle_PublisherProbeFallback — server returns 400 on
    /next. Asserts first POST targets slot 0.
  • TestTrickle_PublisherProbeSuccess — server returns
    Lp-Trickle-Latest: 7 on /next. Asserts first POST targets slot 7.

The existing trickle test suite continues to pass against the configured
trickle server (which doesn't have /next on master) since the fallback
path is the existing behavior in disguise.

Relationship to #3884

Josh's #3884 builds the server side (/next route, Lp-Trickle-Seq on
POST responses, Lp-Trickle-Reset handling) but does not update the
Go HTTP publisher to consume any of it. This PR fills that client-side
gap.

Test plan

  • go build ./trickle/... passes
  • go test ./trickle/... passes (existing + new)
  • Manual: build orchestrator/gateway from this branch, run live-video-to-video against it, confirm media-in publishes correctly

Refs livepeer/livepeer-python-gateway#12.

🤖 Generated with Claude Code

@github-actions github-actions Bot added the go Pull requests that update Go code label May 8, 2026
@rickstaa rickstaa force-pushed the trickle-publisher-probe-fallback branch from 3e07d6f to cc202db Compare May 8, 2026 12:16
Mirror the Python SDK's first-POST flow in the Go publisher:
GET {baseURL}/next to learn the server's nextWrite, fall back to slot 0
on probe failure. Establishes a single canonical pattern across the two
trickle clients in the Livepeer stack.

Wire behavior on a fresh channel is unchanged — both old (always-0) and
new (probe-then-resolve) paths POST to /0 first, then /1, /2, ...
The probe adds one round-trip on session start. Against a server with
the /next route (#3925) it returns nextWrite cleanly; against a server
without it (today's master) the failure path resolves to 0, the same
starting slot.

The fallback explicitly returns 0 (not -1) for the same reason the
Python SDK does: -1 would let the server resolve the write to nextWrite,
but the publisher has no read-back path to learn which slot the server
picked. The local counter would then advance from -1 to 0 regardless of
what the server chose, causing the second POST to race the first at
slot 0. Returning 0 keeps the publisher's local counter in lockstep with
the server.

Tests in trickle_test.go cover both paths via mock /next handlers
(success returning Lp-Trickle-Latest, fallback returning 400). The
existing test suite continues to pass against the configured trickle
server (which doesn't have /next on master) since the fallback path is
the existing behavior in disguise.

Refs #3925, livepeer/livepeer-python-gateway#12.
@rickstaa rickstaa force-pushed the trickle-publisher-probe-fallback branch from cc202db to 58e6add Compare May 8, 2026 12:29
@rickstaa rickstaa closed this May 8, 2026
@github-project-automation github-project-automation Bot moved this from Triage to Done in Engineering Roadmap May 8, 2026