Fix P2P connection deadlines and credential separation #117

Merged
mateeullahmalik merged 1 commit into master from fix/p2p-connection
Aug 8, 2025

@mateeullahmalik
Collaborator

  • Fixed P2P nodes incorrectly marking active nodes as inactive due to stale connection deadlines
  • Resolved "message authentication failed" errors from shared ephemeral keys between client/server
  • Fixed "conn write: i/o timeout" errors from pooled connections retaining old deadlines

Root Causes

1. Shared KeyExchanger Cache

Client and server credentials shared the same KeyExchanger instance, causing ephemeral key collisions. When a node acted as both client and server, the shared key storage led to authentication failures with "ephemeral private key not found" errors.

2. Missing Per-Operation Deadlines

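Connections were created with a 10-minute deadline that was never updated for individual operations. Because a deadline is an absolute point in time, pooled connections carried whatever remained of the original deadline rather than the timeout of the current operation. When checkNodeActivity sent pings expecting 10-second timeouts, the mismatch caused timeouts and nodes were incorrectly marked inactive.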

3. Wrong Connection Layer for Deadlines

The connWrapper was setting deadlines on the raw TCP connection instead of the ALTS encrypted connection layer, causing deadline operations to be ineffective.

Changes

Connection Management (network.go)

  • Added per-operation deadline refresh before each RPC call
  • Split single tc into clientTC and serverTC for proper credential separation
  • Ensures pooled connections get fresh deadlines matching the operation timeout
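
The per-operation refresh described above can be illustrated with a minimal sketch. This is not the code from network.go: callWithDeadline and staleThenFresh are hypothetical names, and net.Pipe stands in for a pooled P2P connection. It assumes only standard net.Conn deadline semantics, where a deadline is an absolute time that stays in force until it is set again.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// callWithDeadline is a hypothetical helper: it resets the deadline on a
// (possibly pooled) connection before the operation, so the operation gets
// its own timeout instead of inheriting an older absolute deadline.
func callWithDeadline(conn net.Conn, timeout time.Duration, payload []byte) error {
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return fmt.Errorf("set deadline: %w", err)
	}
	_, err := conn.Write(payload)
	return err
}

// staleThenFresh reproduces the failure mode and the fix on an in-memory pipe.
// It reports whether a write under an expired deadline timed out, and returns
// the error from the same write after the deadline was refreshed.
func staleThenFresh() (staleTimedOut bool, freshErr error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	// Drain the server side so that writes on the client side can complete.
	go func() {
		buf := make([]byte, 64)
		for {
			if _, err := server.Read(buf); err != nil {
				return
			}
		}
	}()

	// Simulate a pooled connection whose deadline has already passed.
	_ = client.SetDeadline(time.Now().Add(-time.Second))
	_, staleErr := client.Write([]byte("ping"))
	staleTimedOut = errors.Is(staleErr, os.ErrDeadlineExceeded)

	// Refreshing the deadline per operation makes the same connection usable.
	freshErr = callWithDeadline(client, 10*time.Second, []byte("ping"))
	return staleTimedOut, freshErr
}

func main() {
	timedOut, freshErr := staleThenFresh()
	fmt.Println("write with stale deadline timed out:", timedOut)
	fmt.Println("write after refresh returned error:", freshErr)
}
```

The point of the sketch is that the connection itself is healthy in both writes; only the leftover deadline differs.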

Credential Isolation (lumeratc.go)

  • Modified KeyExchanger cache key to include side (0=client, 1=server)
  • Prevents ephemeral key collision between client and server operations
  • Each side now maintains independent key storage
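
A rough sketch of a side-aware cache key follows. The types here (Side, cacheKey, exchangerCache, and the stub keyExchanger) are assumptions made for illustration and do not reproduce lumeratc.go; only the 0=client, 1=server convention comes from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// Side follows the 0=client, 1=server convention from the description.
type Side int

const (
	ClientSide Side = 0
	ServerSide Side = 1
)

// keyExchanger is a hypothetical stand-in for the real KeyExchanger. It only
// holds a map where ephemeral private keys would be stored.
type keyExchanger struct {
	ephemeralKeys map[string][]byte
}

// cacheKey includes the side, so the client and server roles of the same
// local identity never resolve to the same exchanger.
type cacheKey struct {
	identity string
	side     Side
}

// exchangerCache hands out one keyExchanger per (identity, side) pair.
type exchangerCache struct {
	mu      sync.Mutex
	entries map[cacheKey]*keyExchanger
}

func newExchangerCache() *exchangerCache {
	return &exchangerCache{entries: make(map[cacheKey]*keyExchanger)}
}

// get returns the cached exchanger for the pair, creating it on first use.
func (c *exchangerCache) get(identity string, side Side) *keyExchanger {
	c.mu.Lock()
	defer c.mu.Unlock()

	k := cacheKey{identity: identity, side: side}
	if kx, ok := c.entries[k]; ok {
		return kx
	}
	kx := &keyExchanger{ephemeralKeys: make(map[string][]byte)}
	c.entries[k] = kx
	return kx
}

func main() {
	cache := newExchangerCache()
	client := cache.get("node-1", ClientSide)
	server := cache.get("node-1", ServerSide)

	fmt.Println("client and server share an exchanger:", client == server)
	fmt.Println("client lookup is stable:", client == cache.get("node-1", ClientSide))
}
```

With the side in the key, a node that dials out and accepts connections under the same identity keeps two independent key stores, while repeated lookups for one side still reuse the cached instance.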

Deadline Layer Fix (conn_pool.go)

  • Updated SetDeadline, SetReadDeadline, SetWriteDeadline to use secureConn
  • Ensures deadlines are set on the ALTS encryption layer, not raw TCP
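
The delegation can be sketched as below. Only the names connWrapper and secureConn appear in the description; the rawConn field, the recordingConn test double, and deadlineLayers are assumptions for illustration, and the real conn_pool.go may be structured differently.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// connWrapper is a simplified stand-in for the pooled connection type.
// rawConn (an assumed field name) is the TCP socket; secureConn is the
// encrypted connection layered on top of it.
type connWrapper struct {
	rawConn    net.Conn
	secureConn net.Conn
}

// All three deadline setters delegate to the encrypted layer.
func (w *connWrapper) SetDeadline(t time.Time) error {
	return w.secureConn.SetDeadline(t)
}

func (w *connWrapper) SetReadDeadline(t time.Time) error {
	return w.secureConn.SetReadDeadline(t)
}

func (w *connWrapper) SetWriteDeadline(t time.Time) error {
	return w.secureConn.SetWriteDeadline(t)
}

// recordingConn is a test double that notes whether any deadline was set on
// it. The embedded net.Conn is left nil because only the deadline methods
// are called in this sketch.
type recordingConn struct {
	net.Conn
	deadlineSet bool
}

func (r *recordingConn) SetDeadline(time.Time) error      { r.deadlineSet = true; return nil }
func (r *recordingConn) SetReadDeadline(time.Time) error  { r.deadlineSet = true; return nil }
func (r *recordingConn) SetWriteDeadline(time.Time) error { r.deadlineSet = true; return nil }

// deadlineLayers sets a deadline through the wrapper and reports which layer
// received it.
func deadlineLayers() (rawSet, secureSet bool) {
	raw := &recordingConn{}
	secure := &recordingConn{}
	w := &connWrapper{rawConn: raw, secureConn: secure}

	_ = w.SetDeadline(time.Now().Add(10 * time.Second))
	return raw.deadlineSet, secure.deadlineSet
}

func main() {
	rawSet, secureSet := deadlineLayers()
	fmt.Println("deadline set on raw layer:", rawSet)
	fmt.Println("deadline set on secure layer:", secureSet)
}
```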

DHT Initialization (dht.go)

  • Created separate server credentials alongside existing client credentials
  • Both credentials are passed to Network initialization
  • Proper separation of incoming vs outgoing connection handling

Impact

  • Active nodes are no longer falsely marked inactive during health checks
  • Bidirectional P2P communication works without authentication errors
  • Connection pooling operates correctly with appropriate timeouts

@mateeullahmalik mateeullahmalik merged commit 5fb4930 into master Aug 8, 2025
7 checks passed
@mateeullahmalik mateeullahmalik deleted the fix/p2p-connection branch August 12, 2025 11:25
@mateeullahmalik mateeullahmalik restored the fix/p2p-connection branch August 12, 2025 11:25
@mateeullahmalik mateeullahmalik deleted the fix/p2p-connection branch August 12, 2025 11:25