15 changes: 15 additions & 0 deletions docs/hooks.md
@@ -86,6 +86,21 @@ hooks.Register(engine.AfterToolExec, func(ctx context.Context, hctx *engine.Hook
})
```

## Skill Guardrail Hooks

When skills declare guardrails in their `SKILL.md` frontmatter, the runner registers four hooks that enforce skill-specific security policies across the entire agent loop:

| Hook Point | Guardrail Type | Behavior |
|------------|---------------|----------|
| `BeforeLLMCall` | `deny_prompts` | Blocks user messages that probe agent capabilities (e.g., "what tools can you run") |
| `AfterLLMCall` | `deny_responses` | Replaces LLM responses that enumerate internal binary names |
| `BeforeToolExec` | `deny_commands` | Blocks `cli_execute` commands matching deny patterns (e.g., `kubectl get secrets`) |
| `AfterToolExec` | `deny_output` | Blocks or redacts `cli_execute` output matching deny patterns (e.g., Secret manifests) |

These hooks complement the global guardrail hooks (secrets/PII scanning) and fire in addition to them. Skill guardrails are loaded from build artifacts or parsed at runtime from `SKILL.md` — no `forge build` step is required.

For pattern syntax and configuration, see [Skill Guardrails](security/guardrails.md#skill-guardrails).

## Audit Logging

The runner registers `AfterLLMCall` hooks that emit structured audit events for each LLM interaction. Audit fields include:
2 changes: 1 addition & 1 deletion docs/runtime.md
@@ -203,7 +203,7 @@ For details on session persistence, context window management, compaction, and l

The engine fires hooks at key points in the loop. See [Hooks](hooks.md) for details.

The runner registers four hook groups: logging, audit, progress, and guardrail hooks. The guardrail `AfterToolExec` hook scans tool output for secrets and PII, redacting or blocking before results enter the LLM context. See [Tool Output Scanning](security/guardrails.md#tool-output-scanning).
The runner registers five hook groups: logging, audit, progress, global guardrail hooks, and skill guardrail hooks. The global guardrail `AfterToolExec` hook scans tool output for secrets and PII, redacting or blocking before results enter the LLM context. Skill guardrail hooks enforce domain-specific rules declared in `SKILL.md` — blocking commands, redacting output, intercepting capability enumeration probes, and replacing binary-enumerating responses. Skill guardrails are loaded from build artifacts or parsed directly from `SKILL.md` at runtime (no `forge build` required). See [Tool Output Scanning](security/guardrails.md#tool-output-scanning) and [Skill Guardrails](security/guardrails.md#skill-guardrails).

## Streaming

75 changes: 75 additions & 0 deletions docs/security/guardrails.md
@@ -110,6 +110,81 @@ Additionally, `cmd.Dir` is set to `workDir` so relative paths in subprocess exec
| `jq '.' /tmp/data.json` | Allowed — system path outside `$HOME` |
| `ls ./data/` | Allowed — within workDir |

## Skill Guardrails

Skills can declare domain-specific guardrails in their `SKILL.md` frontmatter under `metadata.forge.guardrails`. These complement the global guardrails with rules authored by skill developers to enforce least-privilege and prevent capability enumeration.

### Guardrail Types

| Type | Hook Point | Direction | Behavior |
|------|-----------|-----------|----------|
| `deny_commands` | `BeforeToolExec` | Inbound | Blocks `cli_execute` commands matching a regex pattern |
| `deny_output` | `AfterToolExec` | Outbound | Blocks or redacts `cli_execute` output matching a regex pattern |
| `deny_prompts` | `BeforeLLMCall` | Inbound | Blocks user messages matching a regex (capability enumeration probes) |
| `deny_responses` | `AfterLLMCall` | Outbound | Replaces LLM responses matching a regex (binary name leaks) |

### SKILL.md Configuration

```yaml
metadata:
  forge:
    guardrails:
      deny_commands:
        - pattern: '\bget\s+secrets?\b'
          message: "Listing Kubernetes secrets is not permitted"
        - pattern: '\bauth\s+can-i\b'
          message: "Permission enumeration is not permitted"
      deny_output:
        - pattern: 'kind:\s*Secret'
          action: block
        - pattern: 'token:\s*[A-Za-z0-9+/=]{40,}'
          action: redact
      deny_prompts:
        - pattern: '\b(approved|allowed|available)\b.{0,40}\b(tools?|binaries|commands?)\b'
          message: "I help with Kubernetes cost analysis. Ask about cluster costs."
      deny_responses:
        - pattern: '\b(kubectl|jq|awk|bc|curl)\b.*\b(kubectl|jq|awk|bc|curl)\b.*\b(kubectl|jq|awk|bc|curl)\b'
          message: "I can analyze cluster costs. What would you like to know?"
```

### Pattern Details

**`deny_commands`** — Patterns match against the reconstructed command line (`binary arg1 arg2 ...`). Only fires for `cli_execute` tool calls.

**`deny_output`** — Patterns match against tool output text. The `action` field controls behavior:

| Action | Behavior |
|--------|----------|
| `block` | Returns an error, preventing the output from entering the LLM context |
| `redact` | Replaces matched text with `[BLOCKED BY POLICY]` and logs a warning |

**`deny_prompts`** — Patterns are compiled with case-insensitive matching (`(?i)`). Designed to catch capability enumeration probes like "what are the approved tools" or "list available binaries". The `message` field provides a redirect response.

**`deny_responses`** — Patterns are compiled with case-insensitive and dot-matches-newline flags (`(?is)`). Designed to catch LLM responses that enumerate internal binary names. When matched, the entire response is replaced with the `message` text.

### Aggregation

When multiple skills declare guardrails, patterns are aggregated and deduplicated across all active skills. The `SkillGuardrailEngine` runs all patterns from all skills as a single enforcement layer.
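
One plausible way to aggregate and deduplicate rules is sketched below. It is not the `SkillGuardrailEngine` implementation: the `rule` type and `aggregate` function are hypothetical, and the choice to deduplicate on the pattern string alone (keeping the first message seen) is an assumption made for this example.

```go
package main

import "fmt"

// rule is a hypothetical pattern/message pair declared by a skill.
type rule struct {
	Pattern string
	Message string
}

// aggregate merges the rules of several skills into one list, keeping the
// first occurrence of each pattern and preserving declaration order.
func aggregate(skills ...[]rule) []rule {
	seen := make(map[string]bool)
	var merged []rule
	for _, rules := range skills {
		for _, r := range rules {
			if seen[r.Pattern] {
				continue
			}
			seen[r.Pattern] = true
			merged = append(merged, r)
		}
	}
	return merged
}

func main() {
	costSkill := []rule{
		{`\bget\s+secrets?\b`, "Listing Kubernetes secrets is not permitted"},
	}
	triageSkill := []rule{
		{`\bget\s+secrets?\b`, "Secrets are off limits"},
		{`\bauth\s+can-i\b`, "Permission enumeration is not permitted"},
	}
	for _, r := range aggregate(costSkill, triageSkill) {
		fmt.Printf("%s => %s\n", r.Pattern, r.Message)
	}
}
```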

### Runtime Fallback

Skill guardrails fire both with and without `forge build`:

- **With build:** Guardrails are serialized into `policy-scaffold.json` during `forge build` and loaded at runtime.
- **Without build:** When no build artifact exists, the runner parses `SKILL.md` files at startup and loads the guardrails from them directly.

This ensures guardrails are always active during development (`forge run`) without requiring a full build cycle.

## File Protocol Blocking

The `cli_execute` tool blocks arguments containing `file://` URLs (case-insensitive). This prevents filesystem reads through commands such as `curl file:///etc/passwd`, which would otherwise bypass path validation because `looksLikePath()` does not treat `file://` URLs as filesystem paths.

| Input | Result |
|-------|--------|
| `curl file:///etc/passwd` | Blocked — `file://` protocol detected |
| `curl FILE:///etc/shadow` | Blocked — case-insensitive check |
| `curl http://example.com` | Allowed — only `file://` is blocked |
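
A case-insensitive substring check is enough to reproduce the three rows above. The sketch below is a minimal approximation; `containsFileURL` and `firstBlockedArg` are hypothetical helper names, not functions from `cli_execute.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// containsFileURL reports whether arg contains "file://" in any letter case.
func containsFileURL(arg string) bool {
	return strings.Contains(strings.ToLower(arg), "file://")
}

// firstBlockedArg returns the first argument that carries a file:// URL.
func firstBlockedArg(args []string) (string, bool) {
	for _, a := range args {
		if containsFileURL(a) {
			return a, true
		}
	}
	return "", false
}

func main() {
	for _, args := range [][]string{
		{"file:///etc/passwd"},
		{"FILE:///etc/shadow"},
		{"http://example.com"},
	} {
		if bad, blocked := firstBlockedArg(args); blocked {
			fmt.Printf("curl %s: blocked (%s)\n", args[0], bad)
		} else {
			fmt.Printf("curl %s: allowed\n", args[0])
		}
	}
}
```

Checking for the substring anywhere in the argument, rather than only as a prefix, also covers forms such as `--url=file:///etc/passwd`.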

## Audit Events

Guardrail evaluations are logged as structured audit events:
33 changes: 23 additions & 10 deletions docs/security/overview.md
@@ -8,14 +8,18 @@ Forge's security is organized in layers, each addressing a different threat surf

```
┌──────────────────────────────────────────────────────────────┐
│ Guardrails │
│ Skill Guardrails │
│ (deny commands/output/prompts/responses per skill) │
├──────────────────────────────────────────────────────────────┤
│ Global Guardrails │
│ (content filtering, PII, jailbreak) │
├──────────────────────────────────────────────────────────────┤
│ Egress Enforcement │
│ (EgressEnforcer + EgressProxy + NetworkPolicy) │
├──────────────────────────────────────────────────────────────┤
│ Execution Sandboxing │
│ (env isolation, binary allowlists, arg validation) │
│ (env isolation, binary allowlists, arg validation, │
│ file:// blocking, shell denylist) │
├──────────────────────────────────────────────────────────────┤
│ Secrets Management │
│ (AES-256-GCM, Argon2id, per-agent isolation) │
@@ -112,17 +116,22 @@ Skill scripts run via `SkillCommandExecutor` (`forge-cli/tools/exec.go`):

### CLIExecuteTool

The `cli_execute` tool (`forge-cli/tools/cli_execute.go`) provides 7 security layers:
The `cli_execute` tool (`forge-cli/tools/cli_execute.go`) provides 12 security layers:

| # | Layer | Detail |
|---|-------|--------|
| 1 | **Binary allowlist** | Only pre-approved binaries can execute |
| 2 | **Binary resolution** | Binaries are resolved to absolute paths via `exec.LookPath` at startup |
| 3 | **Argument validation** | Rejects arguments containing `$(`, backticks, or newlines |
| 4 | **Timeout** | Configurable per-command timeout (default: 120s) |
| 5 | **No shell** | Uses `exec.CommandContext` directly — no shell expansion |
| 6 | **Environment isolation** | Only `PATH`, `HOME`, `LANG`, explicit passthrough vars, and proxy vars |
| 7 | **Output limits** | Configurable max output size (default: 1MB) to prevent memory exhaustion |
| 1 | **Shell denylist** | Shell interpreters (`bash`, `sh`, `zsh`, etc.) filtered at construction and blocked at execution |
| 2 | **Binary allowlist** | Only pre-approved binaries can execute |
| 3 | **Binary resolution** | Binaries are resolved to absolute paths via `exec.LookPath` at startup |
| 4 | **Argument validation** | Rejects arguments containing `$(`, backticks, newlines, or `file://` URLs |
| 5 | **File protocol blocking** | Blocks `file://` URLs (case-insensitive) to prevent filesystem traversal |
| 6 | **Path confinement** | Path arguments inside `$HOME` but outside `workDir` are blocked |
| 7 | **Timeout** | Configurable per-command timeout (default: 120s) |
| 8 | **No shell** | Uses `exec.CommandContext` directly — no shell expansion |
| 9 | **Working directory** | `cmd.Dir` set to `workDir` for relative path resolution |
| 10 | **Environment isolation** | Only `PATH`, `HOME`, `LANG`, explicit passthrough vars, and proxy vars |
| 11 | **Output limits** | Configurable max output size (default: 1MB) to prevent memory exhaustion |
| 12 | **Skill guardrails** | Skill-declared `deny_commands` and `deny_output` patterns via hooks |

### Configuration

@@ -158,6 +167,10 @@ For full details, see **[Build Signing & Verification](signing.md)**.

The guardrail engine checks inbound and outbound messages against policy rules including content filtering, PII detection, and jailbreak protection. Guardrails run in `enforce` (blocking) or `warn` (logging) mode.

### Skill Guardrails

Skills can declare domain-specific guardrails in their `SKILL.md` frontmatter. These guardrails operate at four hook points: blocking unauthorized commands (`deny_commands`), blocking or redacting sensitive output (`deny_output`), intercepting capability enumeration probes (`deny_prompts`), and replacing binary-enumerating LLM responses (`deny_responses`). Skill guardrails fire at runtime without requiring `forge build`.

For full details, see **[Content Guardrails](guardrails.md)**.

---
48 changes: 48 additions & 0 deletions docs/skills.md
@@ -356,6 +356,54 @@ This registers three tools:

Requires: `jq`. Egress: `cdn.tailwindcss.com`, `esm.sh`.

## Skill Guardrails

Skills can declare domain-specific guardrails in their `SKILL.md` frontmatter to enforce security policies at runtime. These guardrails operate at four interception points in the agent loop, preventing unauthorized commands, data exfiltration, capability enumeration, and binary name disclosure.

### Configuration

Add a `guardrails` block under `metadata.forge` in `SKILL.md`:

```yaml
metadata:
  forge:
    guardrails:
      deny_commands:
        - pattern: '\bget\s+secrets?\b'
          message: "Listing Kubernetes secrets is not permitted"
      deny_output:
        - pattern: 'kind:\s*Secret'
          action: block
        - pattern: 'token:\s*[A-Za-z0-9+/=]{40,}'
          action: redact
      deny_prompts:
        - pattern: '\b(approved|allowed|available)\b.{0,40}\b(tools?|binaries)\b'
          message: "I help with K8s cost analysis. Ask about cluster costs."
      deny_responses:
        - pattern: '\b(kubectl|jq|awk|bc|curl)\b.*\b(kubectl|jq|awk|bc|curl)\b.*\b(kubectl|jq|awk|bc|curl)\b'
          message: "I can analyze cluster costs. What would you like to know?"
```

### Guardrail Types

| Type | Direction | Purpose |
|------|-----------|---------|
| `deny_commands` | Input | Block `cli_execute` commands matching patterns (e.g., `kubectl get secrets`) |
| `deny_output` | Output | Block or redact tool output matching patterns (e.g., Secret manifests, tokens) |
| `deny_prompts` | Input | Block user messages probing agent capabilities (e.g., "what tools can you run") |
| `deny_responses` | Output | Replace LLM responses that enumerate internal binary names |

### Capability Enumeration Prevention

The `deny_prompts` and `deny_responses` guardrails form a layered defense against capability enumeration attacks:

1. **Input-side** (`deny_prompts`) — Intercepts user messages that probe for available tools, binaries, or commands and redirects to the skill's functional description
2. **Output-side** (`deny_responses`) — Catches LLM responses that list 3+ binary names and replaces the entire response with a functional capability description
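
The output-side step can be illustrated with the example `deny_responses` pattern from the configuration above. This is a sketch only: `filterResponse` is a hypothetical helper, and the `(?is)` prefix follows the flag behavior described in the guardrails reference rather than anything confirmed from the engine source.

```go
package main

import (
	"fmt"
	"regexp"
)

// names is the alternation used three times in the example pattern.
const names = `\b(kubectl|jq|awk|bc|curl)\b`

// threeNames matches a response that mentions a listed binary at least
// three times, possibly across several lines.
var threeNames = regexp.MustCompile(`(?is)` + names + `.*` + names + `.*` + names)

// filterResponse returns message in place of the whole response when the
// pattern matches, and the untouched response otherwise.
func filterResponse(re *regexp.Regexp, message, response string) string {
	if re.MatchString(response) {
		return message
	}
	return response
}

func main() {
	const msg = "I can analyze cluster costs. What would you like to know?"

	fmt.Println(filterResponse(threeNames, msg, "I ran kubectl and piped it through jq."))
	fmt.Println(filterResponse(threeNames, msg, "Available tools:\n- kubectl\n- jq\n- curl"))
}
```

The pattern counts mentions rather than distinct names, so a response that repeats the same binary three times is also replaced; skill authors who want to allow that would need a different pattern.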

Additionally, skill `Description()` methods and system prompt catalog entries use generic descriptions instead of listing binary names.

For full details on guardrail types, pattern syntax, and runtime behavior, see [Content Guardrails — Skill Guardrails](security/guardrails.md#skill-guardrails).

## Skill Instructions in System Prompt

Forge injects the **full body** of each skill's SKILL.md into the LLM system prompt. This means all detailed operational instructions — triage steps, detection heuristics, output structure, safety constraints — are directly available in the LLM's context without requiring an extra `read_skill` tool call.
20 changes: 11 additions & 9 deletions docs/tools.md
@@ -59,7 +59,7 @@ Provider selection: `WEB_SEARCH_PROVIDER` env var, or auto-detect from available

## CLI Execute

The `cli_execute` tool provides security-hardened command execution with 10 security layers:
The `cli_execute` tool provides security-hardened command execution with 12 security layers:

```yaml
tools:
@@ -73,16 +73,18 @@ tools:

| # | Layer | Detail |
|---|-------|--------|
| 1 | **Shell denylist** | Shell interpreters (`bash`, `sh`, `zsh`, `dash`, `ksh`, `csh`, `tcsh`, `fish`) are unconditionally blocked — they defeat the no-shell design |
| 1 | **Shell denylist** | Shell interpreters (`bash`, `sh`, `zsh`, `dash`, `ksh`, `csh`, `tcsh`, `fish`) are filtered out at construction time and unconditionally blocked at execution — they defeat the no-shell design |
| 2 | **Binary allowlist** | Only pre-approved binaries can execute |
| 3 | **Binary resolution** | Binaries are resolved to absolute paths via `exec.LookPath` at startup |
| 4 | **Argument validation** | Rejects arguments containing `$(`, backticks, or newlines |
| 5 | **Path confinement** | Path arguments inside `$HOME` but outside `workDir` are blocked (see [Path Containment](security/guardrails.md#path-containment)) |
| 6 | **Timeout** | Configurable per-command timeout (default: 120s) |
| 7 | **No shell** | Uses `exec.CommandContext` directly — no shell expansion |
| 8 | **Working directory** | `cmd.Dir` set to `workDir` so relative paths resolve within the agent directory |
| 9 | **Environment isolation** | Only `PATH`, `HOME`, `LANG`, explicit passthrough vars, proxy vars, and `OPENAI_ORG_ID` (when set). `HOME` is overridden to `workDir` to prevent `~` expansion from reaching the real home directory |
| 10 | **Output limits** | Configurable max output size (default: 1MB) to prevent memory exhaustion |
| 4 | **Argument validation** | Rejects arguments containing `$(`, backticks, newlines, or `file://` URLs |
| 5 | **File protocol blocking** | Arguments containing `file://` (case-insensitive) are blocked to prevent filesystem traversal via `curl file:///etc/passwd` (see [File Protocol Blocking](security/guardrails.md#file-protocol-blocking)) |
| 6 | **Path confinement** | Path arguments inside `$HOME` but outside `workDir` are blocked (see [Path Containment](security/guardrails.md#path-containment)) |
| 7 | **Timeout** | Configurable per-command timeout (default: 120s) |
| 8 | **No shell** | Uses `exec.CommandContext` directly — no shell expansion |
| 9 | **Working directory** | `cmd.Dir` set to `workDir` so relative paths resolve within the agent directory |
| 10 | **Environment isolation** | Only `PATH`, `HOME`, `LANG`, explicit passthrough vars, proxy vars, and `OPENAI_ORG_ID` (when set). `HOME` is overridden to `workDir` to prevent `~` expansion from reaching the real home directory |
| 11 | **Output limits** | Configurable max output size (default: 1MB) to prevent memory exhaustion |
| 12 | **Skill guardrails** | Skill-declared `deny_commands` and `deny_output` patterns block/redact command inputs and outputs (see [Skill Guardrails](security/guardrails.md#skill-guardrails)) |

## File Create
