Behavioral contracts for AI agent skills. ActionBox generates, reviews, and enforces security boundaries so your AI skills can only do what they're supposed to.
ActionBox is an OpenClaw plugin. It reads a skill's definition (SKILL.md), uses an LLM to produce a strict behavioral contract (ACTIONBOX.md), and then enforces that contract at runtime — flagging or blocking any tool call that falls outside the declared scope.
AI agent skills are powerful — they can read files, call APIs, send messages, and execute code. But with power comes risk. A misconfigured or compromised skill could:
- Read your SSH keys or AWS credentials
- Exfiltrate data to unauthorized servers
- Delete files it has no business touching
- Invoke tools far outside its intended purpose
ActionBox solves this by creating a behavioral contract for each skill: a machine-readable document that declares exactly what the skill is allowed to do, and nothing more.
  SKILL.md                   ACTIONBOX.md                 Runtime
┌──────────┐   generate    ┌──────────────┐   enforce   ┌──────────┐
│  Skill   │──────────────>│  Behavioral  │────────────>│ Allow /  │
│  Defn.   │   (2-pass     │  Contract    │   (check    │ Block /  │
└──────────┘    LLM gen)   └──────────────┘    every    │ Alert    │
                                  │            call)    └──────────┘
                                  v
                             Human Review
Three steps:
- Generate: ActionBox reads a skill's SKILL.md and uses a two-pass LLM pipeline to produce a conservative behavioral contract
- Review: A human inspects and approves the contract
- Enforce: At runtime, every tool call is checked against the contract; violations trigger alerts or blocks
ActionBox doesn't blindly trust its own output. Generation uses two LLM passes:
| Pass | Role | What it does |
|---|---|---|
| Pass 1 | Conservative Generator | Reads the skill definition and produces the strictest reasonable contract |
| Pass 2 | Adversarial Reviewer | Red-teams the contract for over-permissiveness, missing denials, and scope escape vectors |
The result is a contract that's been both authored and attacked before a human ever sees it.
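The two passes can be sketched as one small function. This is a minimal illustration rather than the plugin's actual generator: the LlmCall type, the prompt strings, and generateTwoPass are hypothetical names, and the model call is injected and synchronous here for brevity, whereas a real LLM call would be asynchronous.

```typescript
// Hypothetical signature for a model call: (system prompt, input) -> model output.
type LlmCall = (systemPrompt: string, input: string) => string;

interface GeneratedContract {
  yaml: string;                    // contract text returned by the model
  adversariallyReviewed: boolean;  // true when pass 2 ran
}

// Pass 1 drafts the strictest reasonable contract.
// Pass 2 red-teams that draft and returns a tightened version.
function generateTwoPass(
  skillMd: string,
  llm: LlmCall,
  skipReview = false,
): GeneratedContract {
  const draft = llm(
    "Produce the strictest reasonable behavioral contract for this skill.",
    skillMd,
  );
  if (skipReview) {
    return { yaml: draft, adversariallyReviewed: false };
  }
  const tightened = llm(
    "Red-team this contract for over-permissiveness, missing denials, " +
      "and scope escape vectors. Return a corrected contract.",
    `SKILL.md:\n${skillMd}\n\nDraft contract:\n${draft}`,
  );
  return { yaml: tightened, adversariallyReviewed: true };
}
```

Either way, the output still goes to a human for review; the second pass only raises the quality of what the reviewer sees.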
Instead of requiring exact tool names in contracts (which are brittle and hard to guess), ActionBox supports capability descriptions — conceptual labels like "Google Calendar read-only access" or "Shell or command execution."
The generator produces allowedCapabilities and deniedCapabilities as the primary output. At runtime, when a tool call doesn't match any exact tool name, ActionBox uses an LLM (Haiku by default) to classify whether the tool call aligns with the declared capabilities.
Tool call arrives
│
├─ 1. Exact tool name checks (fast, deterministic)
│ globalDeniedTools, toolIndex lookups
│
├─ 2. If no exact match AND capabilities exist:
│ Check capability cache → if miss, call LLM
│ Denied capability matched → violation (critical)
│ No allowed capability matched → violation (high)
│
└─ 3. Filesystem/network checks (unchanged)
Results are cached by tool name for the enforcer's lifetime, so each tool is classified only once.
Contracts can use both systems simultaneously — allowedTools/deniedTools for fast deterministic enforcement with known names, and allowedCapabilities/deniedCapabilities for flexible LLM-evaluated matching.
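The check order and the per-tool cache described above could look roughly like this. It is a sketch under stated assumptions: ToolCallChecker, Verdict, and the Classifier signature are hypothetical, the classifier stands in for the LLM call (which would be asynchronous in practice), and the filesystem/network step is left out.

```typescript
type CapabilityResult = "allowed" | "denied" | "unmatched";

type Verdict =
  | { ok: true }
  | { ok: false; severity: "critical" | "high"; reason: string };

// Hypothetical stand-in for the LLM classification call.
type Classifier = (toolName: string) => CapabilityResult;

class ToolCallChecker {
  // Classification results are kept for the lifetime of this object.
  private readonly cache = new Map<string, CapabilityResult>();
  private readonly deniedTools: Set<string>;
  private readonly allowedTools: Set<string>;
  private readonly classify: Classifier;

  constructor(deniedTools: string[], allowedTools: string[], classify: Classifier) {
    this.deniedTools = new Set(deniedTools);
    this.allowedTools = new Set(allowedTools);
    this.classify = classify;
  }

  check(toolName: string): Verdict {
    // 1. Exact tool name checks: fast and deterministic.
    if (this.deniedTools.has(toolName)) {
      return { ok: false, severity: "critical", reason: "denied tool" };
    }
    if (this.allowedTools.has(toolName)) {
      return { ok: true };
    }
    // 2. No exact match: consult the cache, classify only on a miss.
    let result = this.cache.get(toolName);
    if (result === undefined) {
      result = this.classify(toolName);
      this.cache.set(toolName, result);
    }
    if (result === "denied") {
      return { ok: false, severity: "critical", reason: "denied capability matched" };
    }
    if (result === "unmatched") {
      return { ok: false, severity: "high", reason: "no allowed capability matched" };
    }
    return { ok: true };
  }
}
```

Because the cache is keyed by tool name only, a second call to the same tool with different arguments reuses the first classification in this sketch.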
OpenClaw loads all eligible skills simultaneously — there's no "one skill at a time." ActionBox handles this by building a global policy from all loaded contracts:
- Tool attribution: each contract lists its allowedTools. ActionBox builds an index mapping tool names to the contracts that claim them. When a tool call comes in, it finds the claiming contract(s) and checks their rules.
- Capability matching: if no exact tool name match is found, ActionBox falls back to LLM-based capability classification using allowedCapabilities and deniedCapabilities from all contracts.
- Global denials: if a tool is in any contract's deniedTools and not in any contract's allowedTools, it's globally blocked. Filesystem denied patterns (like ~/.ssh/**) and network denied hosts (like *.onion) from ALL contracts are merged and always enforced.
- Generous on allows: if multiple skills claim a tool, the call is permitted as long as ANY claiming contract's rules allow it. This avoids false positives in multi-skill environments.
┌─────────────────────────┐
│ Global Policy │
│ │
Contract A ────────>│ Tool Index │
Contract B ────────>│ Global Denied Tools │──── before_tool_call ──── Block / Allow
Contract C ────────>│ Global Denied Paths │──── after_tool_call ──── Log / Alert
│ Global Denied Hosts │
│ Allowed Capabilities │
│ Denied Capabilities │
└─────────────────────────┘
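As a rough sketch of the merge rules for tools, the snippet below builds a tool index and a global deny set from several contracts. The field names toolIndex and globalDeniedTools follow the policy object shown in the programmatic API; the Map/Set shapes, ContractTools, buildToolPolicy, and isPermitted are assumptions made for illustration.

```typescript
interface ContractTools {
  skillId: string;
  allowedTools: string[];
  deniedTools: string[];
}

interface ToolPolicy {
  toolIndex: Map<string, string[]>; // tool name -> claiming skill IDs
  globalDeniedTools: Set<string>;
}

function buildToolPolicy(contracts: ContractTools[]): ToolPolicy {
  const toolIndex = new Map<string, string[]>();
  const deniedSomewhere: string[] = [];

  for (const contract of contracts) {
    for (const tool of contract.allowedTools) {
      const claimers = toolIndex.get(tool) ?? [];
      claimers.push(contract.skillId);
      toolIndex.set(tool, claimers);
    }
    deniedSomewhere.push(...contract.deniedTools);
  }

  // Globally denied: denied by some contract and allowed by none.
  const globalDeniedTools = new Set(
    deniedSomewhere.filter((tool) => !toolIndex.has(tool)),
  );
  return { toolIndex, globalDeniedTools };
}

// "Generous on allows": permitted if ANY claiming contract's rules allow the call.
// Unclaimed tools would go to capability matching in ActionBox; this sketch
// simply returns false for them.
function isPermitted(
  toolName: string,
  policy: ToolPolicy,
  contractAllows: (skillId: string) => boolean,
): boolean {
  if (policy.globalDeniedTools.has(toolName)) return false;
  const claimers = policy.toolIndex.get(toolName) ?? [];
  return claimers.some(contractAllows);
}
```

Note that a tool denied by one contract but allowed by another is not globally denied; it is attributed to the contract that claims it.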
npm install @openclaw/plugin-actionbox

# For a single skill
openclaw actionbox generate calendar-sync
# For all skills
openclaw actionbox generate-all

openclaw actionbox review calendar-sync
# Mark as reviewed
openclaw actionbox review calendar-sync --reviewer "alice"

openclaw actionbox audit

┌────────────────┬─────┬──────────┬─────────┬─────────┬────────┐
│ Skill │ Box │ Reviewed │ Drift │ Allowed │ Denied │
├────────────────┼─────┼──────────┼─────────┼─────────┼────────┤
│ calendar-sync │ yes │ yes │ ok │ 4 │ 5 │
│ github-triage │ yes │ yes │ ok │ 5 │ 6 │
│ slack-standup │ yes │ no │ DRIFTED │ 4 │ 6 │
│ data-pipeline │ no │ - │ - │ 0 │ 0 │
└────────────────┴─────┴──────────┴─────────┴─────────┴────────┘
1 skill(s) missing ActionBox.
1 skill(s) with drift detected.
1 skill(s) not yet reviewed.
Add ActionBox to your OpenClaw plugin config:
plugins:
  actionbox:
    mode: monitor # "monitor" or "enforce"
    skillsDir: skills # directory containing skill definitions
    alertChannel: actionbox-alerts # channel for violation alerts
    autoGenerate: false # auto-generate boxes for new skills
    generatorModel: claude-sonnet-4-5-20250929 # model for generation
    driftCheckInterval: 300000 # drift check interval (ms), default 5 min

| Option | Default | Description |
|---|---|---|
| mode | monitor | monitor logs violations; enforce blocks them |
| skillsDir | skills | Where to find skill directories |
| alertChannel | actionbox-alerts | Messaging channel for violation alerts |
| autoGenerate | false | Auto-generate contracts for new skills |
| generatorModel | claude-sonnet-4-5-20250929 | Which model generates contracts |
| driftCheckInterval | 300000 | How often to check for skill definition changes (ms) |
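For readers wiring the plugin up programmatically, the options and defaults listed above can be expressed as a typed object. The following is only a sketch: ActionBoxConfig, DEFAULTS, and resolveConfig are hypothetical names, and the validation rules are assumptions rather than documented plugin behavior.

```typescript
type Mode = "monitor" | "enforce";

interface ActionBoxConfig {
  mode: Mode;
  skillsDir: string;
  alertChannel: string;
  autoGenerate: boolean;
  generatorModel: string;
  driftCheckInterval: number; // milliseconds
}

// Default values taken from the options table.
const DEFAULTS: ActionBoxConfig = {
  mode: "monitor",
  skillsDir: "skills",
  alertChannel: "actionbox-alerts",
  autoGenerate: false,
  generatorModel: "claude-sonnet-4-5-20250929",
  driftCheckInterval: 300000,
};

// Merge user-supplied options over the defaults and reject obviously bad values.
function resolveConfig(user: Partial<ActionBoxConfig> = {}): ActionBoxConfig {
  const merged: ActionBoxConfig = { ...DEFAULTS, ...user };
  const mode: string = merged.mode; // widen: values parsed from YAML are unchecked
  if (mode !== "monitor" && mode !== "enforce") {
    throw new Error(`Invalid mode: ${mode}`);
  }
  if (!(merged.driftCheckInterval > 0)) {
    throw new Error("driftCheckInterval must be a positive number of milliseconds");
  }
  return merged;
}
```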
Each skill gets a contract file that lives alongside its SKILL.md. Here's what one looks like:
version: "1.0"
skillId: calendar-sync
skillName: Calendar Sync
# Conceptual capability descriptions (LLM-evaluated at runtime)
allowedCapabilities:
- Google Calendar read-only access
- Local task management (create and update)
- Slack messaging for meeting notifications
deniedCapabilities:
- Shell or command execution
- Calendar event modification or deletion
- File deletion
- Direct HTTP requests to arbitrary hosts
# Exact tool names (optional — for fast deterministic enforcement)
allowedTools:
- name: google_calendar_read
reason: Required to fetch events from Google Calendar API
- name: task_create
reason: Required to create local task entries
- name: slack_send_message
reason: Required to send meeting notifications
deniedTools:
- name: shell_exec
reason: Calendar sync has no need for shell execution
- name: google_calendar_write
reason: Skill is read-only — must never modify events
# Filesystem access boundaries
filesystem:
readable:
- "./config/calendar.yaml"
- "./data/tasks.json"
writable:
- "./data/tasks.json"
denied:
- "~/.ssh/**"
- "~/.aws/**"
- "**/.env"
# Network access boundaries
network:
allowedHosts:
- "calendar.google.com"
- "*.googleapis.com"
- "*.slack.com"
deniedHosts:
- "*.onion"
# Behavioral guardrails
behavior:
summary: >-
Reads events from Google Calendar and syncs to local tasks.
Sends Slack notifications for upcoming meetings.
Read-only access to calendar.
neverDo:
- Delete or modify Google Calendar events
- Execute shell commands
- Access SSH keys or AWS credentials
maxToolCalls: 20
# Drift detection metadata
drift:
skillHash: e3b0c44298fc1c14... # SHA-256 of SKILL.md at generation time
generatedAt: "2025-01-15T10:00:00.000Z"
generatorModel: claude-sonnet-4-5-20250929
reviewed: true
reviewedBy: security-team
reviewedAt: "2025-01-16T14:30:00.000Z"

| Severity | What triggers it | Example |
|---|---|---|
| Critical | Denied tool used, denied capability matched, denied filesystem path accessed | Skill calls shell_exec, tool matches "Shell execution" denied capability, reads ~/.ssh/id_rsa |
| High | Unlisted tool/capability, filesystem or network rule violated | Skill calls tool matching no allowed capability, writes outside allowed dirs |
| Medium | Tool call limit exceeded | Skill makes 50 calls when limit is 20 |
| Low | Minor behavioral anomalies | Unusual argument patterns |
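The severity table maps naturally onto a lookup. In the sketch below only the denied_tool type name appears elsewhere in this README; the remaining type names, SEVERITY_BY_TYPE, and worstSeverity are illustrative assumptions.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

// Only "denied_tool" is taken from this README; the other names are illustrative.
type ViolationType =
  | "denied_tool"
  | "denied_capability"
  | "denied_path"
  | "unlisted_tool"
  | "unlisted_capability"
  | "filesystem_violation"
  | "network_violation"
  | "tool_call_limit"
  | "behavioral_anomaly";

const SEVERITY_BY_TYPE: Record<ViolationType, Severity> = {
  denied_tool: "critical",
  denied_capability: "critical",
  denied_path: "critical",
  unlisted_tool: "high",
  unlisted_capability: "high",
  filesystem_violation: "high",
  network_violation: "high",
  tool_call_limit: "medium",
  behavioral_anomaly: "low",
};

const RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2, critical: 3 };

// Returns the most severe level among the given violations, or undefined if none.
function worstSeverity(types: ViolationType[]): Severity | undefined {
  let worst: Severity | undefined;
  for (const type of types) {
    const severity = SEVERITY_BY_TYPE[type];
    if (worst === undefined || RANK[severity] > RANK[worst]) {
      worst = severity;
    }
  }
  return worst;
}
```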
Runtime enforcement is optional and designed for specific situations where you need hard guardrails on tool calls. You can disable it entirely by leaving ActionBox in monitor mode (the default), or escalate to enforce mode when you need to actively block violations.
Monitor mode (default): violations are logged via after_tool_call, but skill execution continues. Good for rollout and tuning.
Enforce mode: violations block the tool call via before_tool_call before it executes. Both hooks are active in this mode: before_tool_call for blocking, after_tool_call for logging.
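The difference between the two modes comes down to what each hook does with a list of violations. The handlers below are a simplified sketch: the hook names come from this README, but the function signatures, the Violation and HookDecision shapes, and the choice to block on any violation regardless of severity are assumptions.

```typescript
type Mode = "monitor" | "enforce";

interface Violation {
  severity: "critical" | "high" | "medium" | "low";
  type: string;
  message: string;
}

interface HookDecision {
  block: boolean;
  blockReason?: string;
}

// before_tool_call: only enforce mode turns violations into a block.
function beforeToolCall(mode: Mode, violations: Violation[]): HookDecision {
  if (mode === "enforce" && violations.length > 0) {
    return {
      block: true,
      blockReason: violations.map((v) => v.message).join("; "),
    };
  }
  return { block: false };
}

// after_tool_call: violations are logged in either mode.
function afterToolCall(violations: Violation[], log: (line: string) => void): void {
  for (const v of violations) {
    log(`[${v.severity}] ${v.type}: ${v.message}`);
  }
}
```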
Recommendation: Even if you disable runtime enforcement, we recommend keeping context injection enabled (it's on by default). Context injection gives the LLM awareness of its behavioral contract before it acts, which prevents most violations from happening in the first place. Runtime enforcement is a safety net; context injection is the first line of defense.
ActionBox hashes each SKILL.md when generating a contract. A background service periodically re-hashes and alerts if the skill definition has changed since the contract was generated. This ensures contracts don't go stale.
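Drift detection reduces to comparing a stored hash with a fresh one. A minimal sketch follows, assuming the hash function is passed in (the package exports a sha256 helper that could fill that role); checkDrift, startDriftWatch, and the DriftResult shape are hypothetical.

```typescript
type HashFn = (content: string) => string;

interface DriftMetadata {
  skillHash: string;   // hash of SKILL.md at generation time
  generatedAt: string;
}

interface DriftResult {
  drifted: boolean;
  expectedHash: string;
  actualHash: string;
}

// Re-hash the current SKILL.md and compare with the hash stored in the contract.
function checkDrift(meta: DriftMetadata, currentSkillMd: string, hash: HashFn): DriftResult {
  const actualHash = hash(currentSkillMd);
  return {
    drifted: actualHash !== meta.skillHash,
    expectedHash: meta.skillHash,
    actualHash,
  };
}

// Periodic re-check, mirroring the driftCheckInterval option (default 5 minutes).
// Returns a function that stops the watcher.
function startDriftWatch(
  readSkillMd: () => string,
  meta: DriftMetadata,
  hash: HashFn,
  onDrift: (result: DriftResult) => void,
  intervalMs = 300000,
): () => void {
  const timer = setInterval(() => {
    const result = checkDrift(meta, readSkillMd(), hash);
    if (result.drifted) onDrift(result);
  }, intervalMs);
  return () => clearInterval(timer);
}
```

When drift is reported, the contract should be regenerated and reviewed again, since the approved contract no longer describes the current skill definition.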
ActionBox doesn't just enforce contracts reactively — it also injects behavioral directives into each agent's context before execution begins. This gives the LLM driving each skill awareness of its contract before it ever makes a tool call.
Context injection is on by default and works regardless of enforcement mode. We recommend leaving it enabled in all configurations: it's the most effective way to keep agents within their intended scope, because the LLM self-regulates rather than being blocked after the fact.
ActionBox hooks into OpenClaw's before_agent_start event. When an agent session starts, ActionBox builds a structured XML directive block from all loaded contracts and injects it via prependContext, which places the directive before the system prompt.
<actionbox-directive>
<skill name="calendar-sync">
<purpose>Reads events from Google Calendar and syncs to local tasks...</purpose>
<principles>
<principle>Prefer read-only operations when possible</principle>
</principles>
<always-do>
<rule>Verify calendar event data before creating tasks</rule>
</always-do>
<never-do>
<rule>Delete or modify Google Calendar events</rule>
<rule>Execute shell commands</rule>
</never-do>
<allowed-capabilities>
<capability>Google Calendar read-only access</capability>
<capability>Local task management (create and update)</capability>
<capability>Slack messaging for meeting notifications</capability>
</allowed-capabilities>
<denied-capabilities>
<capability>Shell or command execution</capability>
<capability>Calendar event modification or deletion</capability>
</denied-capabilities>
<allowed-tools>google_calendar_read, task_create, task_update, slack_send_message</allowed-tools>
<denied-tools>shell_exec, file_delete, google_calendar_write, google_calendar_delete, http_request</denied-tools>
<filesystem>
<readable>./config/calendar.yaml, ./data/tasks.json</readable>
<writable>./data/tasks.json, ./data/tasks.json.bak</writable>
<denied>~/.ssh/**, ~/.aws/**, **/.env, **/.env.*, **/credentials*, **/secret*</denied>
</filesystem>
<network>
<allowed>calendar.google.com, *.googleapis.com, slack.com, *.slack.com</allowed>
<denied>*.onion, *.tor</denied>
</network>
</skill>
</actionbox-directive>

All loaded contracts are included in a single directive block, giving the LLM full awareness of every active behavioral contract.
Contracts support both negative constraints and positive guidance:
| Field | Purpose | Example |
|---|---|---|
| behavior.alwaysDo | Positive behavioral guidance | "Verify calendar event data before creating tasks" |
| behavior.principles | High-level operating principles | "Prefer read-only operations when possible" |
| behavior.neverDo | Actions the skill must never take | "Delete or modify Google Calendar events" |
These fields are optional and backward-compatible — existing contracts without alwaysDo or principles will continue to work.
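To show how optional fields can stay backward-compatible, here is a small renderer that emits a section only when the corresponding field is present and non-empty. It is not the package's buildSkillDirective: renderSkillBehavior, renderSection, and escapeXml are hypothetical helpers, and whether the real builder escapes XML is an assumption.

```typescript
interface BehaviorSection {
  summary: string;
  neverDo: string[];
  alwaysDo?: string[];    // optional: older contracts omit it
  principles?: string[];  // optional: older contracts omit it
}

function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Emits nothing for a missing or empty list, so absent fields leave no empty tags.
function renderSection(tag: string, itemTag: string, items: string[] | undefined): string[] {
  if (!items || items.length === 0) return [];
  return [
    `  <${tag}>`,
    ...items.map((item) => `    <${itemTag}>${escapeXml(item)}</${itemTag}>`),
    `  </${tag}>`,
  ];
}

function renderSkillBehavior(skillName: string, behavior: BehaviorSection): string {
  const lines = [
    `<skill name="${escapeXml(skillName)}">`,
    `  <purpose>${escapeXml(behavior.summary)}</purpose>`,
    ...renderSection("principles", "principle", behavior.principles),
    ...renderSection("always-do", "rule", behavior.alwaysDo),
    ...renderSection("never-do", "rule", behavior.neverDo),
    `</skill>`,
  ];
  return lines.join("\n");
}
```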
import { buildDirective, buildSkillDirective } from "@openclaw/plugin-actionbox";
// Build directive for all loaded boxes
const directive = buildDirective(enforcer.getAllBoxes());
// Build directive for a single box
const skillXml = buildSkillDirective(box);

Generate an ACTIONBOX.md for a single skill.
openclaw actionbox generate calendar-sync
openclaw actionbox generate calendar-sync --skip-review # skip adversarial pass

Generate contracts for all skills in the skills directory.
openclaw actionbox generate-all

Show a table of all skills with contract coverage, review status, and drift detection.
openclaw actionbox audit

Display current enforcement mode, configuration, and recent violations.
openclaw actionbox status

Display a contract and optionally mark it as human-reviewed.
openclaw actionbox review calendar-sync
openclaw actionbox review calendar-sync --reviewer "alice"

ActionBox exports its core modules for use in your own code:
import {
ActionBoxEnforcer,
matchToolCall,
buildGlobalPolicy,
checkFilesystemAccess,
checkNetworkAccess,
extractPaths,
extractHosts,
parseActionBox,
serializeActionBox,
generateActionBox,
parseSkillMd,
sha256,
CapabilityMatcherCache,
classifyToolCall,
} from "@openclaw/plugin-actionbox";
// Load contracts and enforce
const enforcer = new ActionBoxEnforcer("monitor");
await enforcer.loadBoxes(["./skills/calendar-sync", "./skills/github-triage"]);
// Check a tool call (params are extracted automatically, capabilities evaluated via LLM)
const violations = await enforcer.check("shell_exec", { command: "rm -rf /" });
// => [{ severity: "critical", type: "denied_tool", ... }]
// Access the global policy directly
const policy = enforcer.getPolicy();
console.log(policy.globalDeniedTools); // tools denied across all contracts
console.log(policy.toolIndex); // tool name → claiming skill IDs
console.log(policy.allAllowedCapabilities); // union of all allowed capabilities
console.log(policy.allDeniedCapabilities); // union of all denied capabilities

actionbox/
├── src/
│ ├── plugin.ts # Main plugin entry point
│ ├── types.ts # Core TypeScript types
│ ├── openclaw-sdk.d.ts # Mock SDK type definitions
│ ├── generator/
│ │ ├── parser.ts # SKILL.md frontmatter parsing
│ │ ├── prompts.ts # Two-pass LLM prompt templates
│ │ └── generate.ts # Generation orchestration
│ ├── injector/
│ │ └── directive-builder.ts # XML directive builder for context injection
│ ├── enforcer/
│ │ ├── param-extractor.ts # Extract paths/hosts from tool params
│ │ ├── policy.ts # Global policy engine (multi-skill merge)
│ │ ├── path-matcher.ts # Glob matching for paths and hosts
│ │ ├── matcher.ts # Tool call → policy violation matching
│ │ ├── capability-matcher.ts # LLM-based capability classification with caching
│ │ └── enforcer.ts # Enforcer class with caching
│ ├── alerter/
│ │ ├── formatters.ts # Plain text / Markdown / Slack formatters
│ │ └── alerter.ts # Alert dispatch
│ ├── cli/
│ │ ├── generate.ts # generate / generate-all commands
│ │ ├── audit.ts # audit command
│ │ ├── status.ts # status command
│ │ └── review.ts # review command
│ └── utils/
│ ├── hash.ts # SHA-256 hashing
│ ├── config.ts # Config + skill directory discovery
│ └── yaml.ts # ACTIONBOX.md YAML parse/serialize
├── tests/
│ ├── fixtures/ # Sample SKILL.md and ACTIONBOX.md files
│ ├── generator.test.ts # Parsing and prompt construction tests
│ ├── enforcer.test.ts # Enforcer class tests
│ ├── matcher.test.ts # Violation matching tests
│ ├── path-matcher.test.ts # Path and host matching tests
│ ├── injector.test.ts # Directive builder tests
│ └── capability-matcher.test.ts # Capability classification tests
├── examples/
│ └── boxes/ # Example ACTIONBOX.md contracts
│ ├── calendar.actionbox.md
│ ├── github-triage.actionbox.md
│ └── slack-standup.actionbox.md
├── openclaw.plugin.json # OpenClaw plugin manifest
├── package.json
├── tsconfig.json
├── tsup.config.ts
└── vitest.config.ts
git clone https://github.com/nikos118/actionbox.git
cd actionbox
npm install

npm run build

npm test # run once
npm run test:watch # watch mode

npm run typecheck

Tests across 6 test files covering:
- Generator — SKILL.md parsing, prompt construction, YAML extraction
- Enforcer — Box loading, caching, async checking, drift detection, capability policy
- Matcher — Multi-skill policy matching, tool attribution, filesystem/network violations
- Path Matcher — Glob patterns, host wildcards, edge cases
- Injector — XML directive building, context injection, capability rendering, backward compatibility
- Capability Matcher — Cache behavior, classification prompts, LLM integration with mocked API
The examples/boxes/ directory contains sample contracts for common skill types:
| Example | Description |
|---|---|
| calendar.actionbox.md | Calendar sync: read-only calendar access, scoped file writes, Slack notifications |
| github-triage.actionbox.md | Issue triage: read issues, add labels/comments, no close/delete |
| slack-standup.actionbox.md | Standup bot: read/send messages, no channel management |
Contributions are welcome. Please open an issue first to discuss what you'd like to change.
- Fork the repo
- Create your feature branch (git checkout -b feature/amazing-feature)
- Run tests (npm test)
- Commit your changes
- Push to the branch and open a Pull Request
The Unlicense — public domain. Do whatever you want with it.