Skip to content

fix(registry): add lock protection for serviceListeners map access#3442

Open
Aias00 wants to merge 7 commits into
apache:developfrom
Aias00:worktree-fix-service-listeners-lock
Open

fix(registry): add lock protection for serviceListeners map access#3442
Aias00 wants to merge 7 commits into
apache:developfrom
Aias00:worktree-fix-service-listeners-lock

Conversation

@Aias00

@Aias00 Aias00 commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Problem

serviceListeners map in serviceDiscoveryRegistry was read/written without holding the lock, while the same struct's serviceMappingListeners already had proper lock protection via s.lock. This could cause data races when concurrent subscribe/unsubscribe operations access the map.

Unprotected access locations

  • SubscribeURL (line 365): read s.serviceListeners[serviceNamesKey] — no lock
  • SubscribeURL (line 381): write s.serviceListeners[serviceNamesKey] = listener — no lock
  • UnSubscribe (line 203): read s.serviceListeners[serviceNamesKey] — no lock

Contrast with protected serviceMappingListeners

  • findMappedServices: s.lock.Lock() / s.lock.Unlock()
  • stopListen: s.lock.Lock() / s.lock.Unlock()

Fix

  • SubscribeURL: Use s.lock.Lock() / s.lock.Unlock() to protect the check-then-act read+write of serviceListeners. A write lock is required (not RLock) because the method follows a check-then-act pattern — it first reads to check if a listener exists, then conditionally creates and writes one. Using RLock would allow concurrent reads but not prevent the TOCTOU race.
  • UnSubscribe: Use s.lock.RLock() / s.lock.RUnlock() for the read-only access of serviceListeners, consistent with read-lock usage elsewhere in the codebase.

Testing

  • go build ./registry/servicediscovery/...
  • go vet ./registry/servicediscovery/...
  • go test ./registry/servicediscovery/... -count=1

serviceListeners map was read/written without holding the lock in
SubscribeURL and UnSubscribe, while the same struct's serviceMappingListeners
already had proper lock protection. This could cause data races when
concurrent subscribe/unsubscribe operations access the map.

- SubscribeURL: use s.lock.Lock() to protect the check-then-act read+write
  of serviceListeners (write lock needed to prevent TOCTOU race)
- UnSubscribe: use s.lock.RLock() for the read access of serviceListeners

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 17, 2026 08:03

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR improves concurrency safety in the service discovery registry by adding mutex protection around accesses to serviceListeners.

Changes:

  • Add RLock/RUnlock when reading serviceListeners during UnSubscribe
  • Add Lock/Unlock when reading/updating serviceListeners during SubscribeURL
Comments suppressed due to low confidence (2)

registry/servicediscovery/service_discovery_registry.go:1

  • The lock is manually unlocked at the end of the block; this is fragile if a future change introduces an early return/panic path in the locked region. Consider using defer s.lock.Unlock() immediately after s.lock.Lock() to make the critical section robust against future control-flow changes.
/*

registry/servicediscovery/service_discovery_registry.go:1

  • The mutex is held while doing potentially expensive work (listener construction, iterating services, and calling Register). This can unnecessarily block other goroutines that need serviceListeners. Consider narrowing the critical section to only the map get/set (e.g., read under lock, create/register outside lock, then write back under lock with a double-check) to reduce contention.
/*

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov-commenter

codecov-commenter commented Jun 17, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 91.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.93%. Comparing base (60d1c2a) to head (936016a).
⚠️ Report is 861 commits behind head on develop.

Files with missing lines Patch % Lines
...try/servicediscovery/service_discovery_registry.go 91.66% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3442      +/-   ##
===========================================
+ Coverage    46.76%   53.93%   +7.17%     
===========================================
  Files          295      461     +166     
  Lines        17172    35487   +18315     
===========================================
+ Hits          8031    19141   +11110     
- Misses        8287    14838    +6551     
- Partials       854     1508     +654     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Alanxtl Alanxtl added ☢️ Bug 3.3.2 version 3.3.2 labels Jun 18, 2026
@Alanxtl Alanxtl self-assigned this Jun 18, 2026
protocol = url.Protocol
}
protocolServiceKey := url.ServiceKey() + ":" + protocol
s.lock.Lock()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unlock都跑到line 385 了,这太远了,一旦出现 panic 就死锁了,要善于使用 defer + lambda 表达式 啊,童子。我教你个技巧:
func() {
m.lock()
defer m.unlock()
xxx
}()

Aias00 and others added 3 commits June 18, 2026 23:06
…deadlock on panic

Wrap lock/unlock in anonymous functions with defer to ensure the lock
is always released even if a panic occurs between Lock() and Unlock(),
as suggested in PR review.

Co-Authored-By: Claude <noreply@anthropic.com>
Resolved conflict in registry/servicediscovery/service_discovery_registry.go

The conflict combined:
- develop's sortServices() for generating serviceNamesKey
- PR's lock protection for accessing serviceListeners map
@Aias00 Aias00 force-pushed the worktree-fix-service-listeners-lock branch from 8713ece to 936016a Compare June 27, 2026 05:47
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3.3.2 version 3.3.2 ☢️ Bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants