refactor: replace galactic-agent with galactic-router (controller-runtime) #120

Draft
privateip wants to merge 1 commit into main from refactor/galactic-router-replace-agent

@privateip


Summary

Replace the gRPC-based galactic-agent DaemonSet with a controller-runtime-based galactic-router that watches Cosmos BGP CRDs directly.

Changes

Removed

  • cmd/galactic-agent — old agent binary with cobra CLI
  • internal/agent — gRPC server, agent run loop
  • internal/bootstrap — BGPProvider CR lifecycle
  • internal/gobgp — embedded GoBGP server (moved/rewritten under internal/runtime/gobgp)

Added

  • cmd/galactic-router — router entry point; reads NODE_NAME/ROUTER_ROLE env vars
  • internal/controller — controller-runtime reconcilers (BGPRouter, BGPPeer, BGPAdvertisement, BGPPolicy, Secret, Node) with field index registration and status helpers
  • internal/reconcile — CRD → DesiredRouter translation (node/role checks, secret resolution, IPv6 next-hop)
  • internal/runtime — RouterRuntime interface + RuntimeFactory pattern (GoBGP tenant, FRR fabric stub)
  • internal/model — internal BGP model types
  • internal/hash — SHA-256 change detection over DesiredRouter to suppress redundant GoBGP Apply calls
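
For reviewers, the change-detection idea behind internal/hash can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `DesiredRouter` fields, the `Hash` function, and the `Applier` type are all hypothetical stand-ins, and the real package may serialize the model differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// DesiredRouter is a hypothetical, simplified stand-in for the PR's model type.
type DesiredRouter struct {
	ASN      uint32
	RouterID string
	Peers    []string
	Prefixes []string
}

// Hash returns a hex SHA-256 digest that does not depend on slice order.
// Slices are copied before sorting so the caller's data is left untouched.
func Hash(d DesiredRouter) string {
	peers := append([]string(nil), d.Peers...)
	prefixes := append([]string(nil), d.Prefixes...)
	sort.Strings(peers)
	sort.Strings(prefixes)

	h := sha256.New()
	fmt.Fprintf(h, "asn=%d\nrouterID=%q\n", d.ASN, d.RouterID)
	for _, p := range peers {
		fmt.Fprintf(h, "peer=%q\n", p) // %q quotes values so fields cannot run together
	}
	for _, p := range prefixes {
		fmt.Fprintf(h, "prefix=%q\n", p)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// Applier remembers the last applied digest and skips identical states.
type Applier struct {
	lastHash string
	Applies  int // counts calls that would reach the BGP runtime
}

// Apply reports whether the desired state differed from the last applied one.
func (a *Applier) Apply(d DesiredRouter) bool {
	h := Hash(d)
	if h == a.lastHash {
		return false // no-op: nothing changed since the last Apply
	}
	a.Applies++ // a real implementation would call the runtime here
	a.lastHash = h
	return true
}

func main() {
	a := &Applier{}
	d := DesiredRouter{ASN: 65001, RouterID: "10.0.0.1", Peers: []string{"fd00::2", "fd00::1"}}
	fmt.Println(a.Apply(d))
	d.Peers = []string{"fd00::1", "fd00::2"} // same peers, different order
	fmt.Println(a.Apply(d))
	d.ASN = 65002
	fmt.Println(a.Apply(d))
}
```

In this sketch the first and third calls apply and the second is suppressed, because reordering the peers does not change the digest.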

Updated

  • Deployment manifests (deploy/galactic-router/) — DaemonSet, RBAC, ServiceAccount
  • Dockerfile — builds both galactic-cni and galactic-router binaries
  • ContainerLab config — references galactic-router image and BGPRouter CRDs
  • Docs — AGENTS.md, ARCHITECTURE.md, CONVENTIONS.md, cni-sequence.md, agent-startup.md
  • Devcontainer — health port 5000 (gRPC), removed 8081/9443
  • go.mod — added controller-runtime, cosmos API deps

Key architectural decisions

  • Lazy GoBGP start: GoBGP starts on first BGPRouter reconcile (listenPort=-1, outbound-only). ASN/RouterID changes trigger full Reconfigure.
  • gRPC health on :5000: Liveness/readiness probes via gRPC health protocol. No HTTP health endpoint.
  • Hash-based no-op suppression: SHA-256 over sorted DesiredRouter prevents redundant Apply calls.
  • RuntimeFactory pattern: ROUTER_ROLE=tenant → GoBGP, ROUTER_ROLE=fabric → FRR stub (Phase 2).
  • CRD-driven, no sidecar: CNI writes BGPAdvertisement CRD; router reconciler picks it up. No in-node gRPC.
  • EVPN Type 5 deferred: Missing Route Distinguisher in current cosmos API — returns ErrMissingRouteDistinguisher.

Co-Authored-By: Claude <noreply@anthropic.com>

@privateip privateip requested a review from a team as a code owner June 21, 2026 13:45
@privateip privateip requested a review from ronggur June 21, 2026 13:45
@privateip privateip marked this pull request as draft June 21, 2026 13:47
…time)

Replace the gRPC-based galactic-agent DaemonSet with a controller-runtime based galactic-router.

Key changes:

- Remove internal/agent, internal/bootstrap, internal/gobgp packages
- Add internal/controller with BGPRouter, BGPPeer, BGPAdvertisement, BGPPolicy, Secret, Node reconcilers
- Add internal/reconcile for CRD-to-DesiredRouter translation
- Add internal/runtime with RuntimeFactory pattern (GoBGP tenant, FRR fabric stub)
- Add internal/model for internal BGP types and internal/hash for change detection
- Update deployment manifests, Dockerfile, containerlab config, and docs
- Switch health probes to gRPC on port 5000; remove HTTP health and webhook ports
- GoBGP starts lazily on first BGPRouter reconcile (listenPort=-1, outbound-only)
- Hash-based no-op suppression prevents redundant GoBGP Apply calls
@privateip privateip force-pushed the refactor/galactic-router-replace-agent branch from b87216d to a653dc1 Compare June 21, 2026 13:57