fix: deep copy eBPF data for each syscall sub-event to prevent memory issues#770
fix: deep copy eBPF data for each syscall sub-event to prevent memory issues#770yugal07 wants to merge 1 commit intokubescape:mainfrom
Conversation
copying eBPF data for each sub-event Signed-off-by: yugal07 <yashsadhwani544@gmail.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThe PR refactors event lifecycle management: the syscall tracer now creates independent deep copies of event data for each sub-event and releases the original after processing, while the container watcher unconditionally releases all events after handling, including syscall events, removing prior conditional skip logic. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@matthyx Please Take a look. Thanks |
| if enrichedEvent.Event.GetEventType() != utils.SyscallEventType { | ||
| enrichedEvent.Event.Release() // at this time we should not need the event anymore | ||
| } | ||
| enrichedEvent.Event.Release() // at this time we should not need the event anymore |
There was a problem hiding this comment.
are we sure we can drop the condition here?
There was a problem hiding this comment.
In callback:
Data: event.Datasource.DeepCopy(event.Data), // each sub-event owns its own copy
...
event.Release() // original released here, after all sub-events are created
Each sub-event now holds an independent deep copy. When the worker pool calls:
enrichedEvent.Event.Release()
...it's releasing that sub-event's own copy, not shared memory. No other sub-event is affected.
So yes, dropping the condition is correct and intentional. The condition was only there to paper over the
shared-pointer problem. Now that every sub-event owns its data, the worker pool can safely call Release()
uniformly for all event types.
What do you think?
|
@YakirOren can you check your benchmarks to see the impact? |
|
heap-with-fix.prof.gz - quay.io/yakiro/node-agent:yugal-fix-deepcopy both ran for 10m while a script that generates syscall anomalies was running
|
the shell script seems light, In heavy production environments, I think the inuse space might drop more than 26 percent, and even more pronounced since the leaked objects would accumulate faster. Are these benchmarks good? There seems to be a tradeoff happening |
copying eBPF data for each sub-event
Overview
When the syscall tracer gets an event, it fans it out into multiple sub-events (one per syscall).
The problem was all those sub-events were sharing the same eBPF data pointer — there was even
a
WARNINGcomment in the code about it. So when one sub-event released the data, the rest wereleft pointing at freed memory.
To patch over this,
Release()was being skipped for syscall events entirely in the worker pool,which just traded a use-after-free for a memory leak.
The fix: each sub-event now gets its own deep copy via
event.Datasource.DeepCopy(event.Data),and the original is released after all sub-events are created. With that in place, the special-case
skip of
Release()in the worker pool isn't needed anymore, so that's removed too.Files Changes
pkg/containerwatcher/v2/tracers/syscall.go— deep copy data per sub-event, release original afterpkg/containerwatcher/v2/container_watcher.go— remove syscall-specificRelease()skipHow to Test
Run the existing tracer tests:
go test ./pkg/containerwatcher/v2/tracers/..
Summary by CodeRabbit