
use sync cond #302

Merged
matthyx merged 3 commits into main from fix/mapmutex-eviction
Mar 27, 2026

Conversation

Contributor

@YakirOren YakirOren commented Mar 24, 2026

Summary by CodeRabbit

Release Notes

  • Refactor

    • Improved context cancellation responsiveness in concurrent lock operations
    • Optimized lock acquisition mechanism for enhanced concurrency efficiency
    • Streamlined error handling by consolidating context validation errors
  • Tests

    • Expanded test coverage for high-contention scenarios and cancellation edge cases
    • Added verification of proper resource cleanup under concurrent access

Signed-off-by: Yakir Oren <yakiroren@gmail.com>
@coderabbitai

coderabbitai bot commented Mar 24, 2026

Warning

Rate limit exceeded

@YakirOren has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 11 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f87db4cb-b62d-4f74-820a-b9042b58d20c

📥 Commits

Reviewing files that changed from the base of the PR and between e321228 and d5f714f.

📒 Files selected for processing (2)
  • pkg/utils/mutex.go
  • pkg/utils/mutex_test.go
📝 Walkthrough

Walkthrough

This pull request refactors the MapMutex synchronization mechanism, replacing backoff-based timeout handling with condition variable–based synchronization. The internal lock bookkeeping changes from ref-counted locks to a per-key keyState structure, and context validation is simplified by removing unnecessary exported error types.

Changes

Changes by cohort and file:

  • Dependency Management (go.mod): Demoted github.com/cenkalti/backoff/v5 from the direct require block to the indirect one; removed an unused OpenTelemetry logging test dependency.
  • Mutex Implementation (pkg/utils/mutex.go): Replaced backoff-based lock acquisition with condition-variable synchronization. Refactored internal state from a refCountedLock map to a per-key keyState holding a sync.Cond and reader/writer counters. Simplified context validation and removed the ContextNotCancellableError, ContextNoTimeoutError, and TimeOutError exports.
  • Mutex Tests (pkg/utils/mutex_test.go): Updated test contexts to use a shared cleanup pattern. Consolidated the eviction tests into TestMapMutex_EvictionWhileHeldAndAfterUnlock. Updated internal state assertions to check len(m.keys) under m.mu with polling. Added tests for the nil-context error and a high-contention single-key scenario.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • slashben

Poem

🐰 From backoff loops to Cond we hop,
State machines now sync without a stop,
Readers and writers dance with grace,
As mutexes evict without a trace! 🔒✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'use sync cond' directly reflects the main architectural change: replacing the timeout/backoff-based locking mechanism with sync.Cond-based condition-variable synchronization in the MapMutex implementation.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate, so the docstring coverage check was skipped.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@YakirOren YakirOren force-pushed the fix/mapmutex-eviction branch from a131958 to 47bce0f on March 24, 2026 at 08:28
@github-actions

Summary:

  • License scan: failure
  • Credentials scan: failure
  • Vulnerabilities scan: failure
  • Unit test: success
  • Go linting: failure

@YakirOren YakirOren requested a review from Copilot March 24, 2026 08:40

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/utils/mutex.go`:
- Around line 11-15: The current keyState allows new readers to starve queued
writers because RLock only checks s.writer; modify the locking logic so writers
signal their intent and readers defer if writers are waiting: in Lock (and any
write-acquiring code path) increment s.waiters before blocking and decrement it
after acquiring (and before setting s.writer=true), and change RLock to block
when either s.writer is true OR s.waiters>0 (i.e., treat queued writers as
active), ensuring waiters is properly updated in Unlock/try paths and that all
Cond broadcasts/signals remain correct; update references in the keyState
struct, Lock, Unlock, RLock, and RUnlock methods to use waiters to prevent
reader admission while a writer is queued.
- Around line 71-85: The waiter can acquire the lock after its context was
canceled because we only check ctx.Err() while waiting; fix by re-checking the
context immediately after canProceed() becomes true and before decrementing
s.waiters or calling onAcquire: if ctx.Err() != nil then decrement s.waiters,
call m.maybeEvict(key, s), unlock m.mu, close(stop) and return ctx.Err(); apply
the same additional pre-acquisition ctx.Err() guard to the other fast-paths in
the same file that call onAcquire (the blocks around where onAcquire() is
invoked) to ensure canceled contexts never proceed.
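The writer-priority fix described in the first finding can be illustrated with a minimal, self-contained sketch. The pendingWriters field and the other identifiers here are illustrative stand-ins, not the PR's actual names, and context handling is omitted to keep the admission logic visible:

```go
package main

import (
	"fmt"
	"sync"
)

// rwState demonstrates writer-priority admission: new readers are held
// back while any writer is queued (pendingWriters > 0), so a stream of
// readers cannot starve a waiting writer.
type rwState struct {
	mu             sync.Mutex
	cond           *sync.Cond
	readers        int
	writer         bool
	pendingWriters int
}

func newRWState() *rwState {
	s := &rwState{}
	s.cond = sync.NewCond(&s.mu)
	return s
}

func (s *rwState) RLock() {
	s.mu.Lock()
	// Block while a writer holds the lock OR one is queued.
	for s.writer || s.pendingWriters > 0 {
		s.cond.Wait()
	}
	s.readers++
	s.mu.Unlock()
}

func (s *rwState) RUnlock() {
	s.mu.Lock()
	s.readers--
	if s.readers == 0 {
		s.cond.Broadcast()
	}
	s.mu.Unlock()
}

func (s *rwState) Lock() {
	s.mu.Lock()
	s.pendingWriters++ // signal intent before blocking
	for s.writer || s.readers > 0 {
		s.cond.Wait()
	}
	s.pendingWriters--
	s.writer = true
	s.mu.Unlock()
}

func (s *rwState) Unlock() {
	s.mu.Lock()
	s.writer = false
	s.cond.Broadcast()
	s.mu.Unlock()
}

func main() {
	s := newRWState()
	s.RLock()
	done := make(chan struct{})
	go func() { s.Lock(); s.Unlock(); close(done) }()
	// Once the goroutine increments pendingWriters, any new RLock would
	// block until the queued writer gets its turn.
	s.RUnlock()
	<-done
	fmt.Println("writer acquired after reader released")
}
```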

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 991c5165-87d9-4e77-a1e7-45ca49ef7139

📥 Commits

Reviewing files that changed from the base of the PR and between a53f3f7 and e321228.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • go.mod
  • pkg/utils/mutex.go
  • pkg/utils/mutex_test.go


Copilot AI left a comment


Pull request overview

Refactors pkg/utils.MapMutex to use sync.Cond-based waiting instead of TryLock + backoff polling, aiming to improve cancellation responsiveness and contention behavior, and updates/extends related tests and module metadata.

Changes:

  • Replaced per-key sync.RWMutex + backoff retry with a sync.Cond-driven wait/notify implementation and per-key state tracking.
  • Updated/expanded mutex tests to cover eviction behavior, cancellation/timeout scenarios, and high-contention stress cases.
  • Ran dependency tidy-related updates in go.mod/go.sum (e.g., backoff/v5 no longer a direct dependency).

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.

Reviewed files:
  • pkg/utils/mutex.go: Reimplements the keyed mutex using sync.Cond and state counters; removes backoff polling.
  • pkg/utils/mutex_test.go: Updates tests to align with the new internals and adds high-contention/cancellation coverage.
  • go.mod: Adjusts direct vs. indirect requirements after the refactor/tidy.
  • go.sum: Updates sums to match the module graph after tidy.


Comment on lines +57 to +66
```go
// Spawn a goroutine to broadcast on context cancellation so that
// Cond.Wait wakes up and can observe ctx.Err(). The goroutine exits
// immediately in all cases (no leak).
stop := make(chan struct{})
go func() {
	select {
	case <-ctx.Done():
		m.mu.Lock()
		s.cond.Broadcast()
		m.mu.Unlock()
	case <-stop:
	}
}()
```

Copilot AI Mar 24, 2026


lockSlow spawns a new goroutine per contended Lock/RLock to broadcast on ctx.Done(). Under high contention this can create significant goroutine churn. Since this repo targets Go 1.25, consider using context.AfterFunc(ctx, ...) (and stopping it on exit) to avoid per-wait goroutine creation while still waking Cond.Wait() on cancellation.

assert.Error(t, err, "should fail due to context timeout")

// Release the original lock
// Release the original lock — the cleanup goroutine will acquire then release.

Copilot AI Mar 24, 2026


The comment mentions a “cleanup goroutine” acquiring then releasing after Unlock, but the current MapMutex implementation performs eviction synchronously in Unlock/RUnlock (no cleanup goroutine). Please update/remove this comment to match the implementation so future readers aren’t misled.

Suggested change
- // Release the original lock — the cleanup goroutine will acquire then release.
+ // Release the original lock so any state associated with the timed-out waiter can be cleaned up.

Signed-off-by: Yakir Oren <yakiroren@gmail.com>
@github-actions

Summary:

  • License scan: failure
  • Credentials scan: failure
  • Vulnerabilities scan: failure
  • Unit test: success
  • Go linting: failure


Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 4 changed files in this pull request and generated 6 comments.



Comment on lines +258 to +265
// Give the writer time to enter the wait loop.
time.Sleep(10 * time.Millisecond)

// A new reader must NOT slip past the pending writer.
shortCtx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
defer cancel()
err = m.RLock(shortCtx, "key")
assert.Error(t, err, "new reader should block while a writer is pending")

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The writer-priority test relies on fixed sleeps (10ms/50ms) for ordering, which can be flaky on slow/loaded CI runners. Consider synchronizing deterministically (e.g., wait until pendingWriters/waiters is observed >0 under m.mu, or use an explicit channel/barrier) before asserting the new reader blocks.

Comment on lines +63 to +83
```go
stop := make(chan struct{})
go func() {
	select {
	case <-ctx.Done():
		m.mu.Lock()
		s.cond.Broadcast()
		m.mu.Unlock()
	case <-stop:
	}
}()

for !canProceed() {
	if ctx.Err() != nil {
		s.waiters--
		if onCancel != nil {
			onCancel()
		}
		m.maybeEvict(key, s)
		m.mu.Unlock()
		close(stop)
		return ctx.Err()
```

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In lockSlow, the cancellation wake-up relies on a helper goroutine doing select { case <-ctx.Done(): Broadcast; case <-stop: }. When lockSlow exits via the ctx.Err() path it closes stop after unlocking, and if the helper goroutine hasn't started yet it may observe both ctx.Done() and stop as ready and randomly choose the stop branch, skipping the Broadcast. That can leave other waiters asleep even though state changed (e.g., pendingWriters decremented), causing a potential indefinite block. Ensure cancellation always triggers a Broadcast (e.g., Broadcast explicitly in the ctx.Err() branch after updating state, or restructure using context.AfterFunc so the callback is guaranteed to run on cancellation).

Comment on lines +60 to +72
```go
// Spawn a goroutine to broadcast on context cancellation so that
// Cond.Wait wakes up and can observe ctx.Err(). The goroutine exits
// immediately in all cases (no leak).
stop := make(chan struct{})
go func() {
	select {
	case <-ctx.Done():
		m.mu.Lock()
		s.cond.Broadcast()
		m.mu.Unlock()
	case <-stop:
	}
}()
```

Copilot AI Mar 24, 2026
The reason will be displayed to describe this comment to others. Learn more.

lockSlow spawns a dedicated goroutine for every contended Lock/RLock attempt to watch ctx.Done and broadcast. Under high contention with long timeouts, this can create large numbers of parked goroutines and increase scheduling/memory overhead. Consider replacing this with context.AfterFunc (stopped on successful acquire) or another mechanism that avoids one goroutine per waiter.

Comment on lines +133 to +137
m.mu.Lock()
s := m.keys[key]
s.writer = false
s.cond.Broadcast()
m.maybeEvict(key, s)

Copilot AI Mar 24, 2026

Unlock assumes m.keys[key] is present and will panic with a nil-pointer dereference if Unlock is called for an unknown/evicted key. Even if you want to panic on misuse (like sync.Mutex), it would be better to check for missing state and panic with a clear message rather than a nil deref.
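An explicit misuse check of the kind suggested might look like this. It is a sketch, not the PR's code; the panic message and struct fields are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

type keyState struct {
	cond   *sync.Cond
	writer bool
}

type MapMutex struct {
	mu   sync.Mutex
	keys map[string]*keyState
}

// Unlock panics with a descriptive message on misuse (unknown or
// already-evicted key) instead of crashing with a nil dereference,
// mirroring how sync.Mutex panics on Unlock of an unlocked mutex.
func (m *MapMutex) Unlock(key string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	s, ok := m.keys[key]
	if !ok {
		panic(fmt.Sprintf("utils: Unlock of unlocked MapMutex key %q", key))
	}
	s.writer = false
	s.cond.Broadcast()
}

func main() {
	m := &MapMutex{keys: map[string]*keyState{}}
	defer func() {
		fmt.Println("recovered:", recover())
	}()
	m.Unlock("missing")
}
```

The same existence check applies to RUnlock, which has the identical failure mode.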

Comment on lines +142 to +145
m.mu.Lock()
s := m.keys[key]
s.readers--
if s.readers == 0 {

Copilot AI Mar 24, 2026

RUnlock assumes m.keys[key] is present and will panic with a nil-pointer dereference if RUnlock is called for an unknown/evicted key. Consider an explicit existence check and a clear panic/error to make misuse easier to diagnose.

Comment on lines +251 to +253
err := m.Lock(ctx, "key")
require.NoError(t, err)
close(writerDone)

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using require.NoError inside this goroutine is risky: if it fails, the goroutine will exit before close(writerDone), and the main test goroutine will block forever on <-writerDone. Prefer sending the error back to the main goroutine (or use t.Error/assert and ensure writerDone is always closed via defer) so the test can fail without hanging.

Suggested change
- err := m.Lock(ctx, "key")
- require.NoError(t, err)
- close(writerDone)
+ defer close(writerDone)
+ err := m.Lock(ctx, "key")
+ assert.NoError(t, err)
+ if err != nil {
+     return
+ }

Contributor

@matthyx matthyx left a comment


very good one 👍

@matthyx matthyx merged commit ab80dc3 into main Mar 27, 2026
11 checks passed
@matthyx matthyx deleted the fix/mapmutex-eviction branch March 27, 2026 12:48