
forwarding ability: add main algo and rpc #241

Open
bitromortac wants to merge 8 commits into
lightninglabs:faraday-forwarding-ability from
bitromortac:2604-fwd-prep-11

Conversation

@bitromortac
Contributor

Implements a pair-wise uptime and forwarding velocity algorithm.

The per-pair uptime walk derives forwarding ability for both (A→B) and
(B→A) directions in a single chronological pass over the merged event
stream. Two independent state copies, one rooted at each peer, accumulate
each direction's uptime against its own threshold. Self-pair calls
collapse to a single recorded ability, so the bidirectional shape serves
every pair in the cross-product without a parallel single-direction path
on the diagonal.

A liquidity check that would otherwise be O(channels-per-peer) per tick
becomes O(1) via four running balance sums that shadow each side's
online inbound and outbound capacity. The merge loop adjusts these sums
as events mutate channel state, and the per-tick threshold check reads
the mins inline. Trace-level logging guards on log.Level() up front so
the hot path does not allocate slog.Attr per event when the handler is
silent.

Establish the boundary between the chanevents store and the upcoming
forwarding analyzer plus the seed walk every pair calculation depends
on. EventsSource is the read surface the analyzer consumes, and
ForwardingAnalyzer carries it alongside an lnd handle so the upcoming
driver can fold lnd's open and closed channel sets into the considered
population.

getInitialChannelState reconstructs a channel's state at startTime by
seeding from the latest pre-window update and replaying any residual
same-timestamp siblings the SQL keyset may have surfaced. The residual
range is bounded by definition (events between two adjacent timestamps
within a channel), so streaming would buy nothing over materialising
the slice in one call. A guard error flags later-timestamp updates in
the residual walk as schema drift.

Wire EffectiveUptime as the analyzer's public entry. The pipeline
resolves channels to peers from the store, folds lnd's forwarding
history into per-pair success amounts, augments the considered set with
lnd's open and closed channels to hedge survivorship bias, seeds each
channel's state at startTime, and dispatches every peer pair to the
bidirectional walk.

calculateAllPairsUptime walks the unordered cross-product (i, j>=i) with
a lazy per-peer event cache. Each peer's events are fetched once and
replayed across every pair that consumes them. The cached events live as
a slice so the cross-pair merge can use a two-pointer walk instead of an
iter-based merge that would need goroutine plus channel synchronisation
per event.

Promote the existing test-store and test-DB constructors from
*testing.T to testing.TB so the upcoming EffectiveUptime benchmark can
share the same fixture path as the existing tests. testing.TB is the
shared interface of *testing.T and *testing.B, so every current caller
keeps type-checking unchanged.

Drive realistic K² pair walks against two fixture shapes. The
single-channel-per-peer bench (100 peers × 1 channel × 200 events)
exercises the bench infrastructure and matches the typical Lightning
node profile. The multi-channel-per-peer bench (30 peers × 5 channels
× 130 events) keeps the per-peer cache fill and the cross-pair merge
on the hot path at the same time, which is the shape where the
analyzer is most sensitive to algorithmic choices.

A bench fixture is cached under $TMPDIR keyed by the bench parameters
so re-runs skip the SQLite insert cost. On a fixture hit only an
OS-level file copy and the lnd-stub channel-list reconstruction run.

Define the ForwardingAbility gRPC method and its request/response shape
in faraday.proto. The wire schema decouples the chanevents-backed
analyzer from clients (frcli, external operators) and provides the
canonical surface for querying peer-pair forwarding velocity and
effective uptime over a time window.

Wire the gRPC ForwardingAbility method to the chanevents-backed
analyzer. Input validation rejects out-of-range times and percentiles
before they reach the analyzer, surfacing typed gRPC status codes
instead of panics. The ability:read macaroon entity gates access in
line with the existing events:read pattern.

Add the forwardingability command to frcli and a regtest itest that
exercises both the analyzer logic and the gRPC wiring end-to-end. The
itest drives a circular-pair payment through a regtest channel and
asserts the resulting velocity and uptime fraction over the wire.
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new analysis capability to evaluate the historical routing performance of peer pairs. By calculating forwarding velocity and effective uptime, it provides operators with actionable insights into channel liquidity and reliability. The implementation includes the core analysis engine, necessary RPC definitions, CLI support, and robust testing to ensure the metrics are calculated correctly and efficiently.

Highlights

  • Core Algorithm: Implemented the ForwardingAnalyzer to compute pair-wise forwarding velocity and effective uptime by fusing historical channel events with live lnd state.
  • RPC Interface: Added the ForwardingAbility RPC endpoint to allow querying routing performance metrics for peer pairs.
  • CLI Support: Integrated a new command into frcli to facilitate easy access to the forwarding ability metrics.
  • Testing: Added comprehensive unit tests, benchmarks, and an integration test to ensure accuracy and performance of the new analysis logic.


@gemini-code-assist bot left a comment


Code Review

This pull request introduces a ForwardingAbility RPC and CLI command to analyze peer-to-peer routing performance by calculating forwarding velocity and effective uptime. The implementation includes a ForwardingAnalyzer that combines historical channel events with LND's forwarding history to avoid survivorship bias. Feedback focuses on scalability and performance, noting that the O(N^2) peer-pair matrix could exceed gRPC message limits and suggesting pagination or filtering. Reviewers also recommended pre-calculating balance sums to reduce initialization overhead, adding context cancellation checks in nested loops, and fixing a potential nil pointer dereference when the LND client is not initialized.

Comment on lines +18 to +20
func (s *RPCServer) ForwardingAbility(ctx context.Context,
req *frdrpc.ForwardingAbilityRequest) (
*frdrpc.ForwardingAbilityResponse, error) {

high

The ForwardingAbility RPC returns a full matrix of peer pairs, which grows quadratically ($O(N^2)$) with the number of peers. For nodes with a significant number of peers (e.g., >200), the resulting response size will likely exceed the default gRPC message limit of 4MB, causing the RPC to fail. It is highly recommended to implement pagination or allow filtering by specific peers (e.g., peer_in or peer_out) to ensure scalability.

Comment thread chanevents/analyzer.go
Comment on lines +449 to +450
for j := i; j < len(peers); j++ {
peerB := peers[j]

medium

The nested loops over peers perform $O(N^2)$ pair-wise calculations. For nodes with a large number of peers, this process can be time-consuming. It is recommended to check for context cancellation within the loops to ensure the operation can be aborted if the client disconnects or the request times out.

		for j := i; j < len(peers); j++ {
			if err := ctx.Err(); err != nil {
				return nil, err
			}

			peerB := peers[j]

Comment thread chanevents/analyzer.go
Comment on lines +631 to +643
var sumARemote, sumALocal, sumBRemote, sumBLocal btcutil.Amount
for _, s := range statesA {
if s.online {
sumARemote += s.remoteBalance
sumALocal += s.localBalance
}
}
for _, s := range statesB {
if s.online {
sumBRemote += s.remoteBalance
sumBLocal += s.localBalance
}
}

medium

The initial sums for online channel balances are calculated inside calculateBothDirectionsUptime, which is called $O(N^2)$ times. Since the initial state for each peer is constant across all pair-wise calculations, these sums can be pre-calculated once per peer in calculateAllPairsUptime and passed into this function. This would reduce the initialization overhead from $O(N^2 \cdot C)$ to $O(N \cdot C + N^2)$, where $C$ is the number of channels per peer.

Comment on lines +61 to +63
forwardingAnalyzer := chanevents.NewForwardingAnalyzer(
s.cfg.ChanEvents, s.cfg.Lnd,
)
medium

The code does not check if s.cfg.Lnd.Client is initialized before passing it to the analyzer. If the LND client is missing, this will lead to a nil pointer dereference when the analyzer attempts to fetch forwarding history.

	if s.cfg.Lnd.Client == nil {
		return nil, status.Error(
			codes.FailedPrecondition,
			"lnd client not configured",
		)
	}

	forwardingAnalyzer := chanevents.NewForwardingAnalyzer(
		s.cfg.ChanEvents, s.cfg.Lnd,
	)
