forwarding ability: add main algo and rpc #241
Conversation
The per-pair uptime walk derives forwarding ability for both (A→B) and (B→A) directions in a single chronological pass over the merged event stream. Two independent state copies, one rooted at each peer, accumulate each direction's uptime against its own threshold. Self-pair calls collapse to a single recorded ability, so the bidirectional shape serves every pair in the cross-product without a parallel single-direction path on the diagonal. A liquidity check that would otherwise be O(channels-per-peer) per tick becomes O(1) via four running balance sums that shadow each side's online inbound and outbound capacity. The merge loop adjusts these sums as events mutate channel state, and the per-tick threshold check reads the mins inline. Trace-level logging guards on log.Level() up front so the hot path does not allocate slog.Attr per event when the handler is silent.
Establish the boundary between the chanevents store and the upcoming forwarding analyzer plus the seed walk every pair calculation depends on. EventsSource is the read surface the analyzer consumes, and ForwardingAnalyzer carries it alongside an lnd handle so the upcoming driver can fold lnd's open and closed channel sets into the considered population. getInitialChannelState reconstructs a channel's state at startTime by seeding from the latest pre-window update and replaying any residual same-timestamp siblings the SQL keyset may have surfaced. The residual range is bounded by definition (events between two adjacent timestamps within a channel), so streaming would buy nothing over materialising the slice in one call. A guard error flags later-timestamp updates in the residual walk as schema drift.
Wire EffectiveUptime as the analyzer's public entry. The pipeline resolves channels to peers from the store, folds lnd's forwarding history into per-pair success amounts, augments the considered set with lnd's open and closed channels to hedge survivorship bias, seeds each channel's state at startTime, and dispatches every peer pair to the bidirectional walk. calculateAllPairsUptime walks the unordered cross-product (i, j>=i) with a lazy per-peer event cache. Each peer's events are fetched once and replayed across every pair that consumes them. The cached events live as a slice so the cross-pair merge can use a two-pointer walk instead of an iter-based merge that would need goroutine plus channel synchronisation per event.
Promote the existing test-store and test-DB constructors from *testing.T to testing.TB so the upcoming EffectiveUptime benchmark can share the same fixture path as the existing tests. testing.TB is the shared interface of *testing.T and *testing.B, so every current caller keeps type-checking unchanged.
Drive realistic K² pair walks against two fixture shapes. The single-channel-per-peer bench (100 peers × 1 channel × 200 events) exercises the bench infrastructure and matches the typical Lightning node profile. The multi-channel-per-peer bench (30 peers × 5 channels × 130 events) keeps the per-peer cache fill and the cross-pair merge on the hot path at the same time, which is the shape where the analyzer is most sensitive to algorithmic choices. A bench fixture is cached under $TMPDIR keyed by the bench parameters so re-runs skip the SQLite insert cost. On a fixture hit only an OS-level file copy and the lnd-stub channel-list reconstruction run.
Define the ForwardingAbility gRPC method and its request/response shape in faraday.proto. The wire schema decouples the chanevents-backed analyzer from clients (frcli, external operators) and provides the canonical surface for querying peer-pair forwarding velocity and effective uptime over a time window.
Wire the gRPC ForwardingAbility method to the chanevents-backed analyzer. Input validation rejects out-of-range times and percentiles before they reach the analyzer, surfacing typed gRPC status codes instead of panics. The ability:read macaroon entity gates access in line with the existing events:read pattern.
Add the forwardingability command to frcli and a regtest itest that exercises both the analyzer logic and the gRPC wiring end-to-end. The itest drives a circular-pair payment through a regtest channel and asserts the resulting velocity and uptime fraction over the wire.
Summary of Changes

This pull request introduces a new analysis capability to evaluate the historical routing performance of peer pairs. By calculating forwarding velocity and effective uptime, it provides operators with actionable insights into channel liquidity and reliability. The implementation includes the core analysis engine, necessary RPC definitions, CLI support, and robust testing to ensure the metrics are calculated correctly and efficiently.
Code Review
This pull request introduces a ForwardingAbility RPC and CLI command to analyze peer-to-peer routing performance by calculating forwarding velocity and effective uptime. The implementation includes a ForwardingAnalyzer that combines historical channel events with LND's forwarding history to avoid survivorship bias. Feedback focuses on scalability and performance, noting that the O(N^2) peer-pair matrix could exceed gRPC message limits and suggesting pagination or filtering. Reviewers also recommended pre-calculating balance sums to reduce initialization overhead, adding context cancellation checks in nested loops, and fixing a potential nil pointer dereference when the LND client is not initialized.
func (s *RPCServer) ForwardingAbility(ctx context.Context,
	req *frdrpc.ForwardingAbilityRequest) (
	*frdrpc.ForwardingAbilityResponse, error) {
The ForwardingAbility RPC returns a full matrix of peer pairs, which grows quadratically ($O(N^2)$) with the number of peers. For nodes with a significant number of peers (e.g., >200), the resulting response size will likely exceed the default gRPC message limit of 4MB, causing the RPC to fail. It is highly recommended to implement pagination or allow filtering by specific peers (e.g., peer_in or peer_out) to ensure scalability.
for j := i; j < len(peers); j++ {
	peerB := peers[j]
The nested loops over peers perform the full K² pair walk without checking for context cancellation, so a cancelled or timed-out RPC keeps computing until the walk completes. Consider checking ctx.Err() inside the inner loop:
for j := i; j < len(peers); j++ {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	peerB := peers[j]

var sumARemote, sumALocal, sumBRemote, sumBLocal btcutil.Amount
for _, s := range statesA {
	if s.online {
		sumARemote += s.remoteBalance
		sumALocal += s.localBalance
	}
}
for _, s := range statesB {
	if s.online {
		sumBRemote += s.remoteBalance
		sumBLocal += s.localBalance
	}
}
The initial sums for online channel balances are calculated inside calculateBothDirectionsUptime, which is called once per pair by calculateAllPairsUptime. These sums could instead be computed once per peer in calculateAllPairsUptime and passed into this function. This would reduce the initialization overhead from O(channels) on every pair to O(1).
forwardingAnalyzer := chanevents.NewForwardingAnalyzer(
	s.cfg.ChanEvents, s.cfg.Lnd,
)
The code does not check if s.cfg.Lnd.Client is initialized before passing it to the analyzer. If the LND client is missing, this will lead to a nil pointer dereference when the analyzer attempts to fetch forwarding history.
if s.cfg.Lnd.Client == nil {
return nil, status.Error(
codes.FailedPrecondition,
"lnd client not configured",
)
}
forwardingAnalyzer := chanevents.NewForwardingAnalyzer(
s.cfg.ChanEvents, s.cfg.Lnd,
)
Implements a pair-wise uptime and forwarding velocity algorithm.