Design & implement safety guardrails (§21) #122

@AlexChesser

Summary

Design and implement safety guardrails — allowlists, blocklists, and output filtering for pipeline execution.

Parent issue: #105 — Tier 3, Priority #11

Why

Enterprise adoption requires content safety controls. Allowlists constrain what tools/actions steps can use, blocklists prevent forbidden patterns in prompts or outputs, and output filtering catches sensitive data leakage. Not exciting, but a hard gate for certain buyers.
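To make the three controls concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names (`tool_allowed`, `find_blocked`, `redact`), the example tool names, and the example patterns are hypothetical and do not come from the spec or the existing runner.

```python
import re

# Hypothetical example values; the real syntax and pattern format are undecided.
ALLOWED_TOOLS = {"read_file", "search"}
BLOCKED_PATTERNS = [re.compile(r"(?i)ignore previous instructions")]
SENSITIVE_PATTERNS = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]  # SSN-like shape


def tool_allowed(tool: str) -> bool:
    """Allowlist: a step may only use tools that are explicitly permitted."""
    return tool in ALLOWED_TOOLS


def find_blocked(text: str) -> list[str]:
    """Blocklist: return the source of every forbidden pattern found in text."""
    return [p.pattern for p in BLOCKED_PATTERNS if p.search(text)]


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Output filter: replace sensitive-looking spans with a placeholder."""
    for p in SENSITIVE_PATTERNS:
        text = p.sub(placeholder, text)
    return text
```

The allowlist is default-deny (anything not listed is refused), while the blocklist and output filter are default-allow and only act on matches. Whether that asymmetry is wanted is one of the open design questions.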

Design Decisions Needed

  • Guardrail types: allowlist (permitted tools/actions), blocklist (forbidden patterns), and output filters?
  • YAML syntax: a pipeline-level "safety:" block? Per-step overrides?
  • Pattern format: regex, glob, keywords, or all three?
  • Enforcement: pre-execution (block the prompt), post-execution (filter the response), or both?
  • On violation: abort, sanitize and continue, or log and warn?
  • Interaction with the tool permissions that already exist in runner config
  • Whether guardrails are inheritable via FROM (see #111, "Implement FROM inheritance (§7)")
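One possible answer to several of these questions can be sketched in Python. The sketch assumes the YAML has already been parsed into plain dicts, that a pipeline-level safety mapping can be overridden key by key at the step level, and that patterns are regexes. The key names (`blocklist`, `on_violation`), the class and function names, and the override rule itself are all hypothetical; none of this is decided.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


class GuardrailViolation(Exception):
    """Raised when the policy says to abort on a violation."""


@dataclass
class SafetyPolicy:
    blocklist: list = field(default_factory=list)  # regex sources (assumed format)
    on_violation: str = "abort"  # "abort" | "sanitize" | "warn"


def resolve_policy(pipeline_safety: dict, step_safety: Optional[dict] = None) -> SafetyPolicy:
    """Step-level keys replace pipeline-level keys (one possible rule)."""
    merged = {**pipeline_safety, **(step_safety or {})}
    return SafetyPolicy(
        blocklist=list(merged.get("blocklist", [])),
        on_violation=merged.get("on_violation", "abort"),
    )


def enforce(text: str, policy: SafetyPolicy, warnings: list) -> str:
    """Check a prompt (pre-execution) or a response (post-execution)."""
    hits = [p for p in policy.blocklist if re.search(p, text)]
    if not hits:
        return text
    if policy.on_violation == "abort":
        raise GuardrailViolation(f"blocked patterns: {hits}")
    if policy.on_violation == "sanitize":
        for p in hits:
            text = re.sub(p, "[BLOCKED]", text)
        return text
    if policy.on_violation == "warn":
        warnings.append(f"blocked patterns: {hits}")
        return text
    raise ValueError(f"unknown on_violation: {policy.on_violation!r}")
```

Calling `enforce` once on the prompt and once on the response would cover the "both" enforcement option with a single code path. A merge rule where steps may only tighten the pipeline policy, never loosen it, is an alternative worth weighing against the simple override shown here.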

Spec Reference

  • Referenced in spec/core/s21*.md as an exploratory feature
  • Detailed spec section needed

Acceptance Criteria

  • Spec section authored
  • Guardrails can be declared in pipeline YAML
  • Violations are detected and handled per policy
  • Guardrail events recorded in turn log
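For the last criterion, a guardrail event could be recorded roughly as follows. The turn log format is not described in this issue, so the sketch assumes one JSON object per line; the field names, the `"guardrail_event"` type tag, and the `GuardrailEvent` class are hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class GuardrailEvent:
    step_id: str    # step in which the guardrail fired
    phase: str      # "pre" (prompt) or "post" (response)
    rule: str       # identifier of the guardrail that matched
    action: str     # what the policy did: abort, sanitize, warn
    timestamp: str  # ISO 8601, UTC


def make_event(step_id, phase, rule, action, now=None) -> GuardrailEvent:
    """Build an event; `now` is injectable so tests can pin the timestamp."""
    now = now or datetime.now(timezone.utc)
    return GuardrailEvent(step_id, phase, rule, action, now.isoformat())


def to_log_line(event: GuardrailEvent) -> str:
    """Serialize as a single JSON line with an assumed "type" tag."""
    return json.dumps({"type": "guardrail_event", **asdict(event)}, sort_keys=True)
```

Logging the rule identifier rather than the matched text avoids copying the sensitive content that the guardrail was meant to keep out of the log.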
