Skip to content

RFC: RDCP protocol evolution — v1.5 (typed controls) and v2.0 (policies/orchestration) #80

@mojoatomic

Description

@mojoatomic

This RFC captures the evolution of RDCP itself (protocol, taxonomy, semantics). The self‑building admin UI is a separate consumer and is not in scope here.

Big picture

  • RDCP v1.x expands the Control vocabulary, not the protocol shape
    • Same endpoints/flow; richer, typed category parameters
    • Discovery publishes typed param schemas + operational hints; clients (and UIs/agents) auto‑adapt
  • RDCP v2.x introduces Policies + Orchestration as first‑class complements to Controls
    • Controls are imperative (do X now, TTL optional)
    • Policies are declarative, evaluated continuously (keep Y true while condition Z holds)
    • Governance (human/AI) sits over both, with simulation, risk scoring, approvals, and audit

v1.5 — typed controls + schema‑driven discovery (no protocol shape change)

  • Category taxonomy (in @rdcp.dev/core)
    • logging | profiling | feature_flag | circuit_breaker | resource_limit | security_policy
  • Discovery publishes per‑category param schemas + operational hints
    • JSON Schema (or zod→JSON) describing parameters and constraints
    • Hints: type, overhead (low/med/high), constraints (e.g., max_duration), safe_defaults
  • Guardrails
    • TTL mandatory for profiling (hard max enforced by server)
    • Risk scoring in responses; dry‑run/simulate mode that returns proposed deltas
    • Multi‑tenant scoping and budgets (publish budget hints for profiling/limits)
  • Status enhancements
    • Active controls inventory: who/when/why/ttl_remaining, overhead counters
    • Clear view of ‘what’s on’ and when it auto‑disables

Examples

  • Profiling (typed control)
{
  "categories": {
    "CPU_PROFILING": { "enabled": true, "sample_rate": 0.1, "duration": "5m" },
    "MEMORY_PROFILING": { "enabled": true, "heap_snapshots": true }
  }
}
  • Feature Flags (typed control)
{
  "categories": {
    "FEATURE_NEW_CHECKOUT": {
      "enabled": true,
      "rollout_percentage": 25,
      "tenant_override": { "enterprise": true }
    },
    "FEATURE_BETA_UI": { "enabled": false, "reason": "performance issues" }
  }
}

v2.0 — policies and orchestration

  • Policies (declarative desired state)
    • Express conditions/targets and desired truth; server evaluates continuously
    • Policy state in status (compliant/drift), rollout progress
  • Orchestrator semantics for multi‑service atomicity
    • idempotencyKey, transactionId, staged rollouts, compensations on partial failure
  • Governance hooks
    • simulate (what‑if), risk score, two‑person rule/approval windows for high‑risk categories
    • Extended audit with reasoning context (esp. for AI‑initiated actions)

Additional capability classes (v2.x family)

  • circuit_breaker: thresholds, decay windows, half‑open rules
  • resource_limit: QPS/concurrency/memory ceilings, priority bands
  • security_policy: auth/session/claims/IP constraints, crypto rotation windows
  • cost_control: dynamic throttling based on spend rate; publish cost hints in discovery

Safety that must not regress

  • TTL required with hard max for profiling/high‑overhead categories
  • Always‑available kill switch for features (instant revert)
  • Budgets per node/tenant; reject or cap controls that exceed published budgets
  • Immutability of audit logs with full context: who/what/why/diffs/ttl/rollback

Discovery schema details (for clients/UIs/agents)

  • type: profiling | feature_flag | circuit_breaker | resource_limit | security_policy | logging
  • paramsSchema: JSON Schema for validation and auto‑render
  • hints: overhead, max_duration, safe_defaults, rollout_capable, segmentation support
  • backward compatible: all new fields optional; old clients degrade gracefully

AI governance (vision; protocol‑agnostic but enabled by the above)

  • Universal control language: agents can reason across all services via the same typed surface
  • Atomic ops across categories/services; idempotent, auditable
  • Closed loop: simulate → plan → execute → monitor → learn (with risk scoring/approval gates)

Positioning

  • RDCP becomes a typed, schema‑driven control plane in v1.x
  • RDCP becomes a policy‑aware, orchestrated governance layer in v2.x
  • Self‑building UI, CLIs, and AI orchestrators are just consumers; they evolve independently

Back‑compat & versioning

  • Discovery advertises support and publishes schemas; older clients remain functional with minimal UI
  • ui/policy evolution uses versioned blocks (e.g., ui.version) to gate new features

Open questions

  • JSON Schema vs zod→JSON as canonical param schema for discovery?
  • Minimum surface to standardize for circuit_breaker/resource_limit/security_policy v1?
  • Risk scoring/dry‑run response format (how to report ‘why’ a change is high‑risk)?
  • Where to host orchestrator semantics (server vs dedicated orchestrator)?

This RFC is a protocol roadmap only; implementation proposals can be split by capability class (profiling, feature_flag, etc.) and can proceed as v1.5 PRs once the taxonomy + discovery schema approach is accepted.

Metadata

Metadata

Assignees

No one assigned

    Labels

    rfcRequest for Comments / design proposalroadmapRoadmap / future directions

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions