
add features provider #142

Merged
YakirOren merged 6 commits into main from skip-alert-by-feature-flag on Mar 30, 2026

Conversation

@YakirOren (Contributor) commented Mar 22, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Runtime alerts can now be selectively controlled per account based on feature flags. The system dynamically fetches feature flags from a remote features service and automatically refreshes them at configurable intervals. Includes fail-open resilience to ensure graceful degradation if the features service is unavailable.
  • Chores

    • Updated dependencies.

Signed-off-by: Yakir Oren <yakiroren@gmail.com>
@coderabbitai bot commented Mar 22, 2026

📝 Walkthrough

This PR introduces a FeaturesProvider system that conditionally skips runtime alerts based on cached feature flags fetched from a remote HTTP server. The provider integrates into the backend adapter lifecycle (initialization, registration during account start, unregistration during stop) and is consulted by the client before ingesting alerts. Access keys are now propagated through context for feature-flag validation.

Changes

Cohort / File(s) Summary
Features Provider Implementation
adapters/backend/v1/skip_alerts.go, adapters/backend/v1/skip_alerts_test.go
New FeaturesProvider maintains concurrent per-account cache of feature flags, performs periodic HTTP fetches with configured interval, and decides alert-skipping based on configured feature flag name. Tests verify cache behavior, HTTP error handling, header validation, and async refresh cycles.
Backend Adapter & Client Integration
adapters/backend/v1/adapter.go, adapters/backend/v1/client.go
Adapter now initializes and manages FeaturesProvider lifecycle; registers accounts on client start and unregisters on stop. Client receives FeaturesProvider dependency and uses it alongside existing alert-skip list to determine whether to skip runtime alerts.
Configuration & Context
config/config.go, domain/context.go
Added FeaturesProviderConfig struct with FeatureFlagName and RefreshIntervalSeconds fields, and new ContextKeyAccessKey constant for propagating access keys through request context.
Authentication & Utilities
cmd/server/authentication/authentication.go, utils/utils.go
Authentication middleware now stores incoming access key in context; new utility function AccessKeyFromContext extracts the value for downstream use.
Dependency Updates
go.mod, tests/go.mod
Updated golang.org/x/sync from v0.19.0 to v0.20.0 and golang.org/x/sys from v0.40.0 to v0.42.0.

Sequence Diagram(s)

sequenceDiagram
    participant Adapter
    participant FeaturesProvider
    participant HTTPServer as Features HTTP Server
    participant Client
    participant MessageProducer

    Adapter->>FeaturesProvider: RegisterAccount(account, accessKey)
    FeaturesProvider->>FeaturesProvider: Store credentials in concurrent cache
    FeaturesProvider->>HTTPServer: StartRefreshLoop: GET /api/v1/features<br/>(X-API-KEY, X-API-ACCOUNT headers)
    HTTPServer-->>FeaturesProvider: Return feature list (e.g., [threat-detection])
    FeaturesProvider->>FeaturesProvider: Update cache entry

    Client->>FeaturesProvider: ShouldSkipAlertsFrom(account)
    alt Feature flag present in cache
        FeaturesProvider-->>Client: false (send alert)
    else Feature flag absent from cache
        FeaturesProvider-->>Client: true (skip alert)
    else Account not cached or unavailable
        FeaturesProvider-->>Client: false (fail-open, send alert)
    end

    Client->>MessageProducer: Send or skip runtime alert

    Adapter->>FeaturesProvider: UnregisterAccount(account)
    FeaturesProvider->>FeaturesProvider: Remove account from cache
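The three branches in the diagram's alt block amount to a small fail-open lookup. A minimal sketch, assuming the provider caches a per-account feature-name slice in a sync.Map (the real FeaturesProvider's types and names may differ):

```go
package main

import (
	"fmt"
	"sync"
)

// featureFlagName is a hypothetical stand-in for the configured
// FeatureFlagName (e.g. "threat-detection").
const featureFlagName = "threat-detection"

// skipDecider models the decision logic from the diagram.
type skipDecider struct {
	cache sync.Map // account -> []string of enabled features
}

// ShouldSkipAlertsFrom returns true only when the account is cached
// and the configured flag is absent; unknown accounts fail open.
func (d *skipDecider) ShouldSkipAlertsFrom(account string) bool {
	v, ok := d.cache.Load(account)
	if !ok {
		return false // fail-open: no data, keep sending alerts
	}
	for _, f := range v.([]string) {
		if f == featureFlagName {
			return false // flag present: send alerts
		}
	}
	return true // account known, flag absent: skip alerts
}

func main() {
	var d skipDecider
	d.cache.Store("acc-enabled", []string{"threat-detection"})
	d.cache.Store("acc-disabled", []string{})
	fmt.Println(d.ShouldSkipAlertsFrom("acc-enabled"))  // false (send)
	fmt.Println(d.ShouldSkipAlertsFrom("acc-disabled")) // true (skip)
	fmt.Println(d.ShouldSkipAlertsFrom("acc-unknown"))  // false (fail-open)
}
```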

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 With feature flags now cached and bright,
We hop through alerts with refreshed sight,
Access keys flow through context's stream,
Skipping alerts—a clever dream!
HTTP fetches dance all day,
Concurrent maps light up the way. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage — ⚠️ Warning: docstring coverage is 6.67%, below the required 80.00% threshold. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)

Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
Title Check — ✅ Passed: the title 'add features provider' accurately describes the main change, introducing a new FeaturesProvider component that enables feature-flag-based runtime alert skipping across multiple files.


@coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (4)
tests/synchronizer_integration_test.go (1)

578-578: Integration setup currently bypasses the new access-key feature path.

Line 578 passes "" as access key, so the integration suite likely won’t exercise feature-provider registration/skip behavior. Consider adding one integration path with a non-empty access key (or a focused integration test) to catch regressions in this PR’s main flow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/synchronizer_integration_test.go` at line 578, The integration test
currently constructs the adapter with an empty access key by calling
NewBackendAdapter(ctx, pulsarProducer, config.Backend{}, ""), which bypasses the
new access-key code path; update the test to exercise the access-key flow by
adding a variant that calls NewBackendAdapter with a non-empty access key string
(or add a focused integration test that constructs the adapter with a real/stub
access key) so feature-provider registration/skip behavior is exercised (refer
to NewBackendAdapter and the access-key argument).
adapters/backend/v1/skip_alerts_test.go (2)

15-25: Unchecked error from JSON encoding.

The json.NewEncoder(w).Encode() call can fail, and the error is silently ignored. While this is test code, handling the error ensures tests fail clearly if encoding issues occur.

✨ Proposed fix
 		w.Header().Set("Content-Type", "application/json")
-		json.NewEncoder(w).Encode(featuresResponse{Features: features})
+		if err := json.NewEncoder(w).Encode(featuresResponse{Features: features}); err != nil {
+			t.Errorf("failed to encode response: %v", err)
+		}
 	}))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/backend/v1/skip_alerts_test.go` around lines 15 - 25, In
newTestFeaturesServer wrap the call to
json.NewEncoder(w).Encode(featuresResponse{Features: features}) and handle its
error: check the returned error and call t.Fatalf or t.Errorf with a descriptive
message if it is non-nil so the test fails clearly on encoding problems; locate
this in the http.HandlerFunc inside newTestFeaturesServer and replace the
unchecked Encode(...) invocation with error handling that reports the error via
the testing.T methods.

103-118: Consider adding a test for the refresh loop behavior.

The current tests only exercise the immediate fetch triggered by RegisterAccount. A test verifying that StartRefreshLoop periodically updates cached features would provide confidence in the refresh mechanism.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/backend/v1/skip_alerts_test.go` around lines 103 - 118, Add a test
that verifies the periodic refresh: create the test server
(newTestFeaturesServer) with an initial feature set, create the provider via
NewFeaturesProvider, call RegisterAccount to populate cache, then start the
refresh loop using StartRefreshLoop (or the provider method that begins
background refreshes). Update the test server to return a different feature set
after a short delay, wait with assert.Eventually for ShouldSkipAlertsFrom (or
other feature query methods) to reflect the updated features, and finally stop
the refresh loop/cleanup; reference NewFeaturesProvider, RegisterAccount,
StartRefreshLoop, ShouldSkipAlertsFrom, and UnregisterAccount/StopRefreshLoop to
locate implementation points.
adapters/backend/v1/skip_alerts.go (1)

97-101: Synchronous refresh in Range callback may cause delays.

Each refreshAccount call is synchronous with a 10-second timeout. With many registered accounts, the refresh loop iteration could take a very long time, and accounts at the end of the iteration may have significantly stale data.

⚡ Proposed fix using concurrent refresh with bounded parallelism
 			p.cache.Range(func(key, value any) bool {
 				entry := value.(*cachedAccount)
-				p.refreshAccount(key.(string), entry.accessKey)
+				go p.refreshAccount(key.(string), entry.accessKey)
 				return true
 			})

For bounded parallelism, consider using a semaphore or worker pool pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/backend/v1/skip_alerts.go` around lines 97 - 101, The Range callback
currently calls p.refreshAccount synchronously for each cachedAccount which
blocks the iteration and can cause long delays; change this to perform
concurrent refreshes with bounded parallelism: inside the p.cache.Range loop
spawn goroutines that call p.refreshAccount(key.(string), entry.accessKey),
limit concurrency using a semaphore (buffered channel) or worker pool, and use a
sync.WaitGroup to wait for all spawned goroutines to finish before returning;
ensure errors from refreshAccount are handled/logged and that any per-refresh
timeout/context remains applied within refreshAccount so the concurrent calls
remain safe.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@adapters/backend/v1/adapter.go`:
- Around line 142-148: When b.featuresProvider is non-nil and
utils.AccessKeyFromContext(ctx) returns an empty string, add a warning log so
the silent failure is visible: detect the empty accessKey in the existing block
that currently calls b.featuresProvider.RegisterAccount(incomingId.Account,
accessKey) and call the appropriate logger (use the same logger used in this
file) to warn that RegisterAccount was skipped for incomingId.Account due to
missing access key; keep the RegisterAccount call when accessKey is present and
do not change NewClient or b.skipAlertsFrom behavior.

In `@adapters/backend/v1/skip_alerts.go`:
- Around line 46-57: The read of existing.features in RegisterAccount races with
concurrent writes from refreshAccount; protect access by adding synchronization
on cachedAccount (e.g., add a mu sync.RWMutex on the cachedAccount struct or use
an atomic.Value for the features field) and acquire the appropriate lock when
reading existing.features in RegisterAccount and when updating it inside
refreshAccount (and any other writers). Ensure p.cache.Load/Store still stores
*cachedAccount and that all code paths consistently use the new lock/atomic
access for features to eliminate the data race.
- Around line 20-23: cachedAccount.features is mutated concurrently because
sync.Map only protects map operations; fix by making access to features
synchronized: either (A) change cachedAccount to include a sync.RWMutex (e.g.,
add mu sync.RWMutex) and acquire mu.Lock()/mu.Unlock() when writing in
refreshAccount and mu.RLock()/mu.RUnlock() when reading in ShouldSkipAlertsFrom,
or (B) make cachedAccount immutable and avoid in-place mutation by constructing
a new cachedAccount with the updated features and atomically storing it back
into the sync.Map (use Load/Store pattern) from refreshAccount so readers in
ShouldSkipAlertsFrom can safely read without locks.

In `@config/config.go`:
- Around line 84-87: FeaturesProviderConfig.RefreshIntervalSeconds must be
validated to reject zero and negative values to avoid passing invalid durations
to time.NewTicker; add a validation check when loading/constructing the
FeaturesProviderConfig (e.g., in the config unmarshalling or the
FeaturesProvider constructor) that returns an error if RefreshIntervalSeconds <=
0, referencing the FeaturesProviderConfig struct and ensuring callers like
StartRefreshLoop handle/propagate that error rather than relying on implicit
defaults.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 92536a24-a774-4904-b478-2b189efe57f4

📥 Commits

Reviewing files that changed from the base of the PR and between 1f37a05 and 5dd6b1d.

⛔ Files ignored due to path filters (2)
  • go.sum is excluded by !**/*.sum
  • tests/go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • adapters/backend/v1/adapter.go
  • adapters/backend/v1/client.go
  • adapters/backend/v1/skip_alerts.go
  • adapters/backend/v1/skip_alerts_test.go
  • cmd/server/authentication/authentication.go
  • config/config.go
  • domain/context.go
  • go.mod
  • tests/go.mod
  • tests/synchronizer_integration_test.go
  • utils/utils.go

Comment on lines +142 to +148
if b.featuresProvider != nil {
if accessKey := utils.AccessKeyFromContext(ctx); accessKey != "" {
b.featuresProvider.RegisterAccount(incomingId.Account, accessKey)
}
}

client := NewClient(b.producer, b.skipAlertsFrom, b.featuresProvider)

⚠️ Potential issue | 🟡 Minor

Silent failure when access key is missing from context.

When accessKey is empty, the account is not registered with the features provider, but no warning is logged. This could lead to confusing behavior where alerts are not skipped as expected (fail-open), without any indication in the logs.

📝 Proposed fix to add logging
 	if b.featuresProvider != nil {
 		if accessKey := utils.AccessKeyFromContext(ctx); accessKey != "" {
 			b.featuresProvider.RegisterAccount(incomingId.Account, accessKey)
+		} else {
+			logger.L().Warning("access key not found in context, features provider will not track this account",
+				helpers.String("account", incomingId.Account))
 		}
 	}

Signed-off-by: Yakir Oren <yakiroren@gmail.com>
@YakirOren YakirOren force-pushed the skip-alert-by-feature-flag branch from 8b8e021 to 6dade36 Compare March 22, 2026 15:42
Signed-off-by: Yakir Oren <yakiroren@gmail.com>
@github-actions

Summary:

  • License scan: success
  • Credentials scan: failure
  • Vulnerabilities scan: failure
  • Unit test: success
  • Go linting: failure

Signed-off-by: Yakir Oren <yakiroren@gmail.com>
… mutating

Signed-off-by: Yakir Oren <yakiroren@gmail.com>
@YakirOren YakirOren force-pushed the skip-alert-by-feature-flag branch from 9192791 to 844568c Compare March 22, 2026 16:03
@github-actions

Summary:

  • License scan: success
  • Credentials scan: failure
  • Vulnerabilities scan: failure
  • Unit test: success
  • Go linting: failure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@adapters/backend/v1/skip_alerts.go`:
- Around line 82-87: The code only guards interval == 0 before creating the
ticker, but negative durations will cause a panic; update the check in the block
that sets the default to ensure non-positive values are handled (change the
condition from interval == 0 to interval <= 0) so that before calling
utils.NewStdTicker(interval) any interval <= 0 is replaced (e.g., with
time.Hour); locate the interval variable and the utils.NewStdTicker call in
skip_alerts.go to apply this change.
- Around line 97-100: The refresh path can overwrite a newer access key with
results from an in-flight refresh: modify refreshAccount so after fetching
refreshed credentials it checks the current cache entry's accessKey before
writing; use p.cache.Load (or sync.Map.CompareAndSwap if available) to compare
the stored cachedAccount.accessKey with the accessKey used for this refresh and
only update p.cache when they match, otherwise discard the refresh results so
RegisterAccount / cachedAccount rotations won't be clobbered and
ShouldSkipAlertsFrom sees the most recent credentials.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cbc42fbe-7a91-42f1-8090-55f78d9cc95d

📥 Commits

Reviewing files that changed from the base of the PR and between 6dade36 and c2d27e0.

⛔ Files ignored due to path filters (2)
  • go.sum is excluded by !**/*.sum
  • tests/go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • adapters/backend/v1/skip_alerts.go
  • go.mod
  • tests/go.mod
✅ Files skipped from review due to trivial changes (2)
  • tests/go.mod
  • go.mod

Comment on lines +82 to +87
if interval == 0 {
interval = time.Hour
}

ticker := utils.NewStdTicker(interval)


⚠️ Potential issue | 🟠 Major


Guard all non-positive refresh intervals before creating ticker.

Line 82 only handles interval == 0. If the config produces a negative RefreshIntervalSeconds, the resulting negative duration will cause time.NewTicker to panic (standard Go ticker APIs panic on d <= 0).

Suggested fix
 func (p *FeaturesProvider) StartRefreshLoop(ctx context.Context, interval time.Duration) {
-	if interval == 0 {
+	if interval <= 0 {
 		interval = time.Hour
 	}
 
 	ticker := utils.NewStdTicker(interval)

Comment on lines +97 to +100
p.cache.Range(func(key, value any) bool {
entry := value.(*cachedAccount)
p.refreshAccount(key.(string), entry.accessKey)
return true

⚠️ Potential issue | 🟠 Major


Prevent stale refresh results from overwriting newer credentials.

When an account is re-registered with a different access key while a refresh is in-flight, the older refresh can overwrite the cache with stale credentials. For example: RegisterAccount("acc1", "keyA") spawns an async refresh, then RegisterAccount("acc1", "keyB") updates the cache; if the first refresh completes after the second registration, line 115 unconditionally overwrites the cache back to keyA. This causes ShouldSkipAlertsFrom to use features fetched with the wrong (revoked/rotated) credentials, potentially leading to incorrect skip decisions.

Add compare-and-swap logic to line 115 to discard refresh results when the access key has been rotated:

Suggested fix
 func (p *FeaturesProvider) refreshAccount(account, accessKey string) {
 	features, err := p.fetchFeatures(account, accessKey)
 	if err != nil {
 		logger.L().Warning("failed to fetch features",
 			helpers.String("account", account),
 			helpers.Error(err))
 		return
 	}
 
-	p.cache.Store(account, &cachedAccount{accessKey: accessKey, features: features})
+	for {
+		currentVal, ok := p.cache.Load(account)
+		if !ok {
+			return // account was unregistered
+		}
+		current := currentVal.(*cachedAccount)
+		if current.accessKey != accessKey {
+			return // stale refresh result; account key was rotated
+		}
+
+		updated := &cachedAccount{accessKey: accessKey, features: features}
+		if p.cache.CompareAndSwap(account, current, updated) {
+			return
+		}
+		// concurrent update occurred; retry with latest entry
+	}

@github-actions

Summary:

  • License scan: success
  • Credentials scan: failure
  • Vulnerabilities scan: failure
  • Unit test: success
  • Go linting: failure

@YakirOren YakirOren merged commit f97aa6b into main Mar 30, 2026
24 checks passed
@matthyx matthyx moved this to To Archive in KS PRs tracking Mar 31, 2026