Problem
ActionsClient.doWatch in actions/k8s/client.go does not track a ResourceVersion or pass one when re-establishing the watch. When the watch channel closes (server-side timeout, network blip, etc.), watchLoop retries by calling doWatch again, which starts a fresh watch with no ResourceVersion. The API server then replays all existing CRs as ADDED events, resulting in duplicate RecordAction gRPC calls.
This happens even without a server restart: Kubernetes watches have a server-side timeout (typically 5-10 minutes), after which the watch channel closes and the client has to reconnect.
Current Mitigation
An OppoBloomFilter was added in #6987 to deduplicate RecordAction calls at the application layer, which prevents the duplicate DB writes. However, the root cause (unnecessary event replays) remains.
Proposed Fix
Track the ResourceVersion of the last seen event and pass it via ListOptions when restarting the watch. On reconnect, the API server then sends only events that occurred after that version. Handle 410 Gone (resource version too old / compacted) by resetting the stored ResourceVersion and doing a full relist.
References
actions/k8s/client.go: doWatch and watchLoop methods