RFC: RDCP protocol evolution — v1.5 (typed controls) and v2.0 (policies/orchestration)

This RFC captures the evolution of RDCP itself (protocol, taxonomy, semantics). The self‑building admin UI is a separate consumer and is not in scope here.

Big picture
- RDCP v1.x expands the Control vocabulary, not the protocol shape
  - Same endpoints/flow; richer, typed category parameters
  - Discovery publishes typed param schemas + operational hints; clients (and UIs/agents) auto‑adapt
- RDCP v2.x introduces Policies + Orchestration as first‑class complements to Controls
  - Controls are imperative (do X now, TTL optional)
  - Policies are declarative, evaluated continuously (keep Y true while condition Z holds)
  - Governance (human/AI) sits over both, with simulation, risk scoring, approvals, and audit
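
The difference between the two can be sketched in TypeScript. Every name below (`Control`, `Policy`, `evaluatePolicy`, `errorRate`, the `inactive` state) is hypothetical and only illustrates the semantics; this RFC does not define a wire format for policies.

```typescript
// Hypothetical shapes for illustration; not part of any published RDCP schema.
type Params = Record<string, unknown>;

// Imperative: "do X now", optionally expiring after ttlSeconds.
interface Control {
  kind: "control";
  category: string;
  params: Params;
  ttlSeconds?: number;
}

// Declarative: "keep `desired` applied while `condition` holds".
interface Policy<M> {
  kind: "policy";
  category: string;
  condition: (metrics: M) => boolean;
  desired: Params;
}

// "inactive" (condition not met) is an assumption added for this sketch.
type PolicyState = "compliant" | "drift" | "inactive";

// One evaluation tick: compare the desired params with what is currently applied.
function evaluatePolicy<M>(policy: Policy<M>, metrics: M, applied: Params): PolicyState {
  if (!policy.condition(metrics)) return "inactive";
  const matches = Object.entries(policy.desired).every(
    ([key, value]) => JSON.stringify(applied[key]) === JSON.stringify(value),
  );
  return matches ? "compliant" : "drift";
}

// A one-shot control versus a continuously evaluated policy.
const enableDebugNow: Control = {
  kind: "control",
  category: "logging",
  params: { level: "debug" },
  ttlSeconds: 300,
};

const debugWhileErrorsAreHigh: Policy<{ errorRate: number }> = {
  kind: "policy",
  category: "logging",
  condition: (m) => m.errorRate > 0.05,
  desired: { level: "debug" },
};
```

A server would run something like `evaluatePolicy` on every tick and report the result in status, while a control is applied once and then simply expires.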

v1.5 — typed controls + schema‑driven discovery (no protocol shape change)
- Category taxonomy (in @rdcp.dev/core)
  - logging | profiling | feature_flag | circuit_breaker | resource_limit | security_policy
- Discovery publishes per‑category param schemas + operational hints
  - JSON Schema (or zod→JSON) describing parameters and constraints
  - Hints: type, overhead (low/med/high), constraints (e.g., max_duration), safe_defaults
- Guardrails
  - TTL mandatory for profiling (hard max enforced by server)
  - Risk scoring in responses; dry‑run/simulate mode that returns proposed deltas
  - Multi‑tenant scoping and budgets (publish budget hints for profiling/limits)
- Status enhancements
  - Active controls inventory: who/when/why/ttl_remaining, overhead counters
  - Clear view of ‘what’s on’ and when it auto‑disables
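
As a rough illustration of the TTL guardrail and the `ttl_remaining` status field, the following TypeScript sketch rejects profiling requests that omit a duration or exceed the server's hard maximum. The helper names, the `s`/`m`/`h` duration grammar, and the choice to reject rather than clamp are assumptions, not decisions made by this RFC.

```typescript
// Hypothetical helpers; the RFC only states that profiling requires a TTL
// and that the server enforces a hard maximum.

// Parses durations written like "5m" (seconds, minutes, or hours).
function parseDurationSeconds(text: string): number | null {
  const match = /^(\d+)([smh])$/.exec(text);
  if (!match) return null;
  const amount = Number(match[1]);
  const unitSeconds = match[2] === "h" ? 3600 : match[2] === "m" ? 60 : 1;
  return amount * unitSeconds;
}

type TtlDecision =
  | { ok: true; ttlSeconds: number }
  | { ok: false; reason: string };

// Server-side check applied before a profiling control is accepted.
function enforceProfilingTtl(duration: string | undefined, hardMaxSeconds: number): TtlDecision {
  if (duration === undefined) {
    return { ok: false, reason: "TTL is mandatory for profiling" };
  }
  const requested = parseDurationSeconds(duration);
  if (requested === null || requested <= 0) {
    return { ok: false, reason: `invalid duration: ${duration}` };
  }
  if (requested > hardMaxSeconds) {
    return { ok: false, reason: `duration exceeds hard max of ${hardMaxSeconds}s` };
  }
  return { ok: true, ttlSeconds: requested };
}

// ttl_remaining for the active-controls inventory; never negative.
function ttlRemainingSeconds(enabledAtMs: number, ttlSeconds: number, nowMs: number): number {
  const elapsedSeconds = Math.floor((nowMs - enabledAtMs) / 1000);
  return Math.max(0, ttlSeconds - elapsedSeconds);
}
```

A server that prefers capping could return the hard maximum instead of rejecting; either way the decision should be visible in the response.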

Examples
- Profiling (typed control)
```json
{
  "categories": {
    "CPU_PROFILING": { "enabled": true, "sample_rate": 0.1, "duration": "5m" },
    "MEMORY_PROFILING": { "enabled": true, "heap_snapshots": true, "duration": "5m" }
  }
}
```
- Feature Flags (typed control)
```json
{
  "categories": {
    "FEATURE_NEW_CHECKOUT": {
      "enabled": true,
      "rollout_percentage": 25,
      "tenant_override": { "enterprise": true }
    },
    "FEATURE_BETA_UI": { "enabled": false, "reason": "performance issues" }
  }
}
```
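
One way a server might evaluate the feature-flag parameters above is sketched below in TypeScript. The precedence order (`enabled: false` wins, then `tenant_override`, then `rollout_percentage`) and the hash-based bucketing are assumptions made for illustration; the RFC does not specify evaluation semantics.

```typescript
// Mirrors the fields used in the feature-flag example; evaluation rules are hypothetical.
interface FeatureFlagParams {
  enabled: boolean;
  rollout_percentage?: number; // 0..100
  tenant_override?: Record<string, boolean>;
}

// FNV-1a style 32-bit hash over UTF-16 code units, used only to get a stable bucket.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function isFeatureOn(
  flagName: string,
  flag: FeatureFlagParams,
  tenant: string,
  subjectId: string,
): boolean {
  if (!flag.enabled) return false; // a disabled flag overrides everything else
  const override = flag.tenant_override?.[tenant];
  if (override !== undefined) return override;
  const percentage = flag.rollout_percentage ?? 100;
  const bucket = fnv1a(`${flagName}:${subjectId}`) % 100; // 0..99
  return bucket < percentage;
}
```

Hashing the flag name together with the subject keeps each subject's assignment stable across requests while avoiding the same subjects landing in every partial rollout.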

v2.0 — policies and orchestration
- Policies (declarative desired state)
  - Express conditions/targets and desired truth; server evaluates continuously
  - Policy state in status (compliant/drift), rollout progress
- Orchestrator semantics for multi‑service atomicity
  - idempotencyKey, transactionId, staged rollouts, compensations on partial failure
- Governance hooks
  - simulate (what‑if), risk score, two‑person rule/approval windows for high‑risk categories
  - Extended audit with reasoning context (esp. for AI‑initiated actions)
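
The orchestrator semantics could look roughly like the TypeScript sketch below, which uses a saga-style compensation loop as one possible reading of "compensations on partial failure". The `Step` and `TransactionResult` shapes and the in-memory idempotency map are hypothetical, and failures inside `compensate` are deliberately not handled here.

```typescript
// Hypothetical shapes; only idempotencyKey and transactionId come from the RFC text.
interface Step {
  service: string;
  apply: () => Promise<void>;
  compensate: () => Promise<void>;
}

interface TransactionResult {
  transactionId: string;
  status: "committed" | "rolled_back";
  applied: string[];
  failedAt?: string;
}

// Outcomes recorded per idempotencyKey so that retries do not re-apply steps.
const completedTransactions = new Map<string, TransactionResult>();

async function runTransaction(
  idempotencyKey: string,
  transactionId: string,
  steps: Step[],
): Promise<TransactionResult> {
  const previous = completedTransactions.get(idempotencyKey);
  if (previous) return previous; // replayed request: return the recorded outcome

  const done: Step[] = [];
  let failedAt: string | undefined;
  for (const step of steps) {
    try {
      await step.apply();
      done.push(step);
    } catch {
      failedAt = step.service;
      break;
    }
  }

  if (failedAt !== undefined) {
    // Partial failure: undo the steps that did succeed, newest first.
    for (const step of [...done].reverse()) {
      await step.compensate();
    }
  }

  const result: TransactionResult =
    failedAt === undefined
      ? { transactionId, status: "committed", applied: done.map((s) => s.service) }
      : { transactionId, status: "rolled_back", applied: [], failedAt };
  completedTransactions.set(idempotencyKey, result);
  return result;
}
```

This gives all-or-nothing behaviour only to the extent that every step has a reliable compensation; a staged rollout would call `runTransaction` once per stage.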

Additional capability classes (v2.x family)
- circuit_breaker: thresholds, decay windows, half‑open rules
- resource_limit: QPS/concurrency/memory ceilings, priority bands
- security_policy: auth/session/claims/IP constraints, crypto rotation windows
- cost_control: dynamic throttling based on spend rate; publish cost hints in discovery

Safety that must not regress
- TTL required with hard max for profiling/high‑overhead categories
- Always‑available kill switch for features (instant revert)
- Budgets per node/tenant; reject or cap controls that exceed published budgets
- Immutability of audit logs with full context: who/what/why/diffs/ttl/rollback
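
The "reject or cap" budget rule can be illustrated with a small TypeScript sketch. The function name, the single numeric budget (for example, profiling seconds per tenant), and the `mode` switch are assumptions for illustration only.

```typescript
// Hypothetical budget check; units are whatever the published budget hint uses.
type OverBudgetMode = "reject" | "cap";

interface BudgetDecision {
  accepted: boolean;
  grantedValue: number;
  note?: string;
}

function applyBudget(
  requested: number,
  used: number,
  budget: number,
  mode: OverBudgetMode,
): BudgetDecision {
  const remaining = Math.max(0, budget - used);
  if (requested <= remaining) {
    return { accepted: true, grantedValue: requested };
  }
  if (mode === "cap" && remaining > 0) {
    // Grant what is left and say so, so the audit log records the reduction.
    return { accepted: true, grantedValue: remaining, note: `capped from ${requested} to ${remaining}` };
  }
  return { accepted: false, grantedValue: 0, note: `exceeds remaining budget of ${remaining}` };
}
```

Returning the note alongside the decision lets the same value feed both the client response and the audit record.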

Discovery schema details (for clients/UIs/agents)
- type: profiling | feature_flag | circuit_breaker | resource_limit | security_policy | logging
- paramsSchema: JSON Schema for validation and auto‑render
- hints: overhead, max_duration, safe_defaults, rollout_capable, segmentation support
- backward compatible: all new fields optional; old clients degrade gracefully
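
A discovery entry along these lines might look like the TypeScript sketch below. The `CategoryDescriptor` type, the `renderMode` helper, and the concrete hint values (`high`, `15m`) are hypothetical; only the field names `type`, `paramsSchema`, and `hints` come from the list above.

```typescript
type ControlType =
  | "logging"
  | "profiling"
  | "feature_flag"
  | "circuit_breaker"
  | "resource_limit"
  | "security_policy";

interface CategoryDescriptor {
  name: string;
  // Everything below is optional so that older discovery payloads still parse.
  type?: ControlType;
  paramsSchema?: Record<string, unknown>; // a JSON Schema document
  hints?: {
    overhead?: "low" | "med" | "high";
    max_duration?: string;
    safe_defaults?: Record<string, unknown>;
    rollout_capable?: boolean;
  };
}

// Example values are illustrative, not normative.
const cpuProfiling: CategoryDescriptor = {
  name: "CPU_PROFILING",
  type: "profiling",
  paramsSchema: {
    type: "object",
    required: ["enabled", "duration"],
    properties: {
      enabled: { type: "boolean" },
      sample_rate: { type: "number", minimum: 0, maximum: 1 },
      duration: { type: "string", pattern: "^[0-9]+[smh]$" },
    },
    additionalProperties: false,
  },
  hints: { overhead: "high", max_duration: "15m", safe_defaults: { enabled: false } },
};

// A client picks the richest rendering the descriptor supports and
// falls back to a plain on/off toggle when the typed fields are absent.
function renderMode(descriptor: CategoryDescriptor): "typed-form" | "toggle-only" {
  return descriptor.type !== undefined && descriptor.paramsSchema !== undefined
    ? "typed-form"
    : "toggle-only";
}
```

Because every new field is optional, a descriptor such as `{ name: "DATABASE" }` from an older server still type-checks and renders as a simple toggle.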

AI governance (vision; protocol‑agnostic but enabled by the above)
- Universal control language: agents can reason across all services via the same typed surface
- Atomic ops across categories/services; idempotent, auditable
- Closed loop: simulate → plan → execute → monitor → learn (with risk scoring/approval gates)

Positioning
- RDCP becomes a typed, schema‑driven control plane in v1.x
- RDCP becomes a policy‑aware, orchestrated governance layer in v2.x
- Self‑building UI, CLIs, and AI orchestrators are just consumers; they evolve independently

Back‑compat & versioning
- Discovery advertises support and publishes schemas; older clients remain functional with minimal UI
- ui/policy evolution uses versioned blocks (e.g., ui.version) to gate new features

Open questions
- JSON Schema vs zod→JSON as canonical param schema for discovery?
- Minimum surface to standardize for circuit_breaker/resource_limit/security_policy v1?
- Risk scoring/dry‑run response format (how to report ‘why’ a change is high‑risk)?
- Where to host orchestrator semantics (server vs dedicated orchestrator)?

This RFC is a protocol roadmap only; implementation proposals can be split by capability class (profiling, feature_flag, etc.) and can proceed as v1.5 PRs once the taxonomy + discovery schema approach is accepted.
