Skip to content

Refactor AppK8sClient.Watch to use controller-runtime informer cache #7251

@pingsutw

Description

@pingsutw

What

AppK8sClient.Watch in app/internal/k8s/app_client.go currently implements its own watch loop with manual reconnect, exponential backoff, and resourceVersion tracking (see drainWatcher / reconnect loop around lines 269-290, added in #7188).

We should replace this hand-rolled logic with the controller-runtime informer cache (sigs.k8s.io/controller-runtime/pkg/cache), which is already imported in this file.

Why

Informers (reflectors) already handle all of the concerns the current code is reinventing:

  • Reconnect / backoff — the reflector does exponential backoff on watch errors automatically.
  • List + Watch atomicity — the reflector performs the initial List and then Watch using the returned resourceVersion, so no events are lost between the two calls. This also resolves InternalAppService.Watch can drop events during initial snapshot replay #7250 (events being dropped between Watch subscription and snapshot replay in InternalAppService.Watch).
  • Snapshot replay — listing from the informer's in-memory cache is synchronous and cheap, instead of hitting the API server on every client connect.
  • 410 Gone / resourceVersion expiry — the reflector relists automatically.
  • Shared across consumers — one informer per GVK+namespace scope, shared by many Watch clients, instead of one API-server watch per client.

Maintaining our own reconnect/backoff/resourceVersion state machine is duplicate work and a correctness risk; the upstream implementation is battle-tested.

Proposed change

  • Create (or reuse) a controller-runtime cache.Cache for servingv1.Service scoped to the namespaces we care about.
  • Replace the raw client.Watch + drainWatcher + reconnect loop with:
    • cache.List for the initial snapshot.
    • An EventHandler (AddFunc/UpdateFunc/DeleteFunc) that fans out events to the per-client channel returned from Watch.
  • Remove drainWatcher, the watchState / backoff helpers, and the manual resourceVersion tracking once the informer path is in place.
  • Update InternalAppService.Watch to rely on the informer-backed snapshot (which eliminates the List/Watch ordering concern from InternalAppService.Watch can drop events during initial snapshot replay #7250).

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions