
Fix lldp_syncd force-repopulate cascade on high-scale systems#80

Closed
ZhaohuiS wants to merge 2 commits into sonic-net:master from ZhaohuiS:fix/lldp-syncd-incremental-update

Conversation


@ZhaohuiS ZhaohuiS commented Apr 16, 2026

  • Description:

Fix LLDP neighbor table instability on high-scale systems (500+ ports) while preserving APPL_DB flush recovery.

Problem 1 (PR #71 scenario): When APPL_DB is flushed externally (e.g. orchagent restart), lldp_syncd's in-memory cache still has full data. For interfaces like eth0 that stay up, is_only_time_mark_modified() returns True, so only lldp_rem_time_mark gets written - leaving APPL_DB incomplete.

Problem 2 (issue #26568): PR #71 fixed this by force-repopulating ALL changed interfaces when new or deleted interfaces exist. But on high-scale systems (SN5640, 512 ports) with VM topology, new neighbors arrive every 10s sync cycle during convergence. This triggers force-repopulate every cycle - cascading storm (3412+ force repopulates, 125 deletes, table never stabilizes).

Root cause: using "new or deleted interfaces exist" as a proxy for "APPL_DB might be stale" is too broad - the condition fires every cycle during normal convergence.
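The over-broad branching can be sketched like this (a simplified illustration of the behavior described above; function names and arguments are hypothetical, not the actual lldp_syncd code):

```python
# Illustrative sketch of the PR #71 branching described above, NOT the
# real daemon code. 'new', 'deleted', 'changed' are sets of interface
# names computed by diffing the in-memory cache against the new scan.
def sync_prev(new, deleted, changed, repopulate, update_time_mark):
    if new or deleted:
        # Too broad: ANY new/deleted neighbor this cycle forces a full
        # delete + re-add of EVERY changed interface, even ones whose
        # only change is lldp_rem_time_mark.
        for intf in changed:
            repopulate(intf)
    else:
        for intf in changed:
            update_time_mark(intf)
```

During convergence on a 500+ port system, `new` is non-empty on nearly every 10-second cycle, so the repopulate branch runs continuously - the cascade described above.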

  • How to fix:

Instead of branching on new or deleted, verify APPL_DB completeness per-interface before using the time_mark-only shortcut:

  1. exists() - check if APPL_DB has the entry
  2. get_all() - check if it has more than just lldp_rem_time_mark (>1 field)
  3. If complete - just update lldp_rem_time_mark (efficient, no churn)
  4. If missing/incomplete - full delete + hmset (handles flush scenario)
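The four steps above can be sketched as follows (a minimal illustration of the described logic; the helper names and the `db` interface are assumptions, not the exact sonic-dbsyncd API):

```python
# Hedged sketch of the per-interface APPL_DB completeness check
# described above; names are illustrative, not the real daemon code.
TIME_MARK_FIELD = "lldp_rem_time_mark"

def appl_db_entry_complete(db, table_key):
    """True if APPL_DB already holds a full entry for table_key."""
    if not db.exists(table_key):      # step 1: entry present at all?
        return False
    fields = db.get_all(table_key)    # step 2: more than time_mark?
    return len(fields) > 1

def sync_interface(db, table_key, full_entry):
    if appl_db_entry_complete(db, table_key):
        # step 3: entry is complete -> cheap path, refresh time_mark only
        db.set(table_key, TIME_MARK_FIELD, full_entry[TIME_MARK_FIELD])
    else:
        # step 4: missing or stale -> full delete + hmset (flush recovery)
        db.delete(table_key)
        db.hmset(table_key, full_entry)
```

The key property is that the decision is made per interface from APPL_DB itself, not inferred globally from the presence of new/deleted neighbors.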

This handles both scenarios:

  • APPL_DB flush: entry missing - full repopulate
  • High-scale convergence: entry complete - just time_mark - no cascade

Also fixes mock set() to properly emulate Redis HSET (update one field, not replace entire hash), and adds regression tests for both scenarios.
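The mock fix matters because Redis HSET updates a single field in a hash while leaving the others intact; a mock that replaces the whole hash hides the flush-recovery bug. A hypothetical test double illustrating the difference (not the project's actual mock class):

```python
# Hypothetical mock illustrating the HSET semantics fix described
# above; the real test mock in this repo may be structured differently.
class MockTable:
    def __init__(self):
        self.data = {}

    def set_buggy(self, key, field, value):
        # Old behavior: replaces the entire hash, so a time_mark-only
        # update silently drops every other field - the test could not
        # detect an incomplete APPL_DB entry.
        self.data[key] = {field: value}

    def set(self, key, field, value):
        # Redis HSET semantics: update one field, keep the rest.
        self.data.setdefault(key, {})[field] = value
```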

  • Test:

Verified on DUT str5-sn5640-6 (Mellanox SN5640, 512 ports, VM topology t0-isolated-d256u256s2):

| Metric | Before (PR #71 code) | After (this fix) |
|---|---|---|
| delete_table | 125 | 0 |
| force_repopulate | 3412+ | 0 |
| Convergence | Never stable (oscillating 137<->235) | Monotonic 0->108->118->145->160->202 (stable in 3.5 min) |

New unit tests:


When new or deleted LLDP neighbors were detected, lldp_syncd's sync()
method force-repopulated ALL changed interfaces (delete+re-add in
APP_DB), bypassing the is_only_time_mark_modified optimization.

On high-scale systems (500+ ports), this caused a cascading
repopulation storm during LLDP convergence: new neighbors arriving
each 10-second sync cycle triggered a force-repopulate of all existing
entries, producing continuous LLDP table churn that never stabilized.

This change removes the if/else branching based on new/deleted
presence and always applies incremental updates for changed
interfaces. New and deleted interfaces are still handled correctly.

Tested on Mellanox SN5640 (512 ports, 200+ VM LLDP neighbors):
- Before fix: 125 deletes, 3412+ force repopulates, never stable
- After fix: 0 deletes, 0 unnecessary repopulates, stable in 3.5 min

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@ZhaohuiS ZhaohuiS marked this pull request as draft April 17, 2026 07:12
The previous fix (PR sonic-net#71) force-repopulated ALL changed interfaces when
new/deleted interfaces existed, to handle the case where APPL_DB was
flushed externally but the daemon cache still had full data. However,
on high-scale systems (500+ ports), this caused a cascading repopulation
storm during LLDP convergence (issue sonic-buildimage#26568).

Instead of using new/deleted as a proxy for APPL_DB staleness, verify
APPL_DB completeness per-interface: check exists() + get_all() before
using the time_mark-only optimization. If APPL_DB is missing or has
only one field (stale), do full repopulate. This handles both:

1. APPL_DB flush scenario (PR sonic-net#71): entry missing -> full hmset
2. High-scale convergence (#26568): entry complete -> just time_mark

Also fix mock set() to properly emulate Redis HSET (update one field
instead of replacing entire hash), and add regression tests for both
scenarios.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@ZhaohuiS

DUT Verification - Combined Fix on SN5640 (512 ports)

DUT: str5-sn5640-6 (Mellanox-SN5640-C512S2, t0-isolated-d256u256s2)
Image: SONiC master (6.12.41+deb13)
Test: ran sudo systemctl restart lldp, applied the fix, then monitored with: while true; do lldpctl -f json | jq '.lldp.interface | length'; sleep 1; done

Convergence Pattern (monotonic, NO oscillation)

07:49:04 ->   0  (lldpd starting)
07:49:32 -> 124  (first batch of VM neighbors)
07:49:46 -> 125  (stable ~90s)
07:51:02 -> 188  (second wave)
07:52:32 -> 192  (third wave)
07:53:02 -> 186  (minor TTL expiry, 6 VMs)
07:54:02 -> 203  (more neighbors)
07:55:02 -> 198  (stable)
Final    -> 172  (normal VM TTL churn)

Syslog Event Counts

| Event | Count | Notes |
|---|---|---|
| Add new | 238 | Legitimate new neighbors during convergence |
| Delete table_key | 66 | TTL-expired neighbors removed |
| Repopulate for changed | 9 | Real field changes (e.g. lldp_rem_index updated) |
| Repopulate stale | 0 | APPL_DB had complete data; time_mark-only shortcut worked correctly |
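The event counts above can be tallied from syslog with a small helper like this (the matched substrings are taken from the event names in the table and are assumptions - the daemon's exact log format may differ):

```python
# Hypothetical syslog tally; the event substrings are assumed from the
# table above and may not match lldp_syncd's actual log lines.
from collections import Counter

EVENTS = ("Add new", "Delete table_key",
          "Repopulate for changed", "Repopulate stale")

def count_lldp_events(lines):
    """Count occurrences of each known event substring in log lines."""
    counts = Counter()
    for line in lines:
        for ev in EVENTS:
            if ev in line:
                counts[ev] += 1
    return counts
```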

Key Findings

  1. No cascading storm - count rises monotonically, no oscillation (previously saw 137<->235 ping-pong with master code)
  2. Repopulate stale = 0 - APPL_DB verification logic correctly detected complete entries and used the efficient time_mark-only update path
  3. Only 9 real changes out of 238 adds - the fix avoids force-repopulating unchanged interfaces, which was the root cause of the cascade
  4. Deletes are legitimate - TTL expiries from VM neighbors with varying LLDP intervals, not cascade-induced

Comparison: Before vs After

| Metric | Before (master) | After (this PR) |
|---|---|---|
| Convergence pattern | Oscillating 137<->235 | Monotonic 0->124->188->198 |
| Delete storms | Hundreds per cycle | 66 total (TTL only) |
| Force repopulate | All changed interfaces | Only stale/missing entries |
| Steady state | Never reached | Stable within ~5 min |

Tested by: @ZhaohuiS

@ZhaohuiS ZhaohuiS closed this Apr 20, 2026
@ZhaohuiS

The LLDP issue for the high-scale testbed is fixed by sonic-net/sonic-buildimage#26873 instead.
