Merged
32 changes: 32 additions & 0 deletions docs/agents-runtime-actions.md
@@ -0,0 +1,32 @@
# AGENTS Runtime Actions

`AGENTS.md` command entries are now registered as invokable runtime tools.

## Command Registration

- Every command in `## Commands` is registered as `agents:<name>`.
- If no built-in tool already uses the same name, an unprefixed alias (`<name>`) is also registered.
- Duplicate command names in `AGENTS.md` are resolved deterministically: first definition wins and a warning is emitted.
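
The registration rules above can be sketched roughly as follows. This is a minimal illustration, not the project's actual registry code: `registerAgentsCommands`, the `Registration` type, and the sample command list are hypothetical names introduced here, and the real `ToolRegistry` may structure this differently.

```typescript
// Hypothetical types for illustration; the real registry API may differ.
type AgentCommand = { name: string; command: string };
type Registration = { registered: Map<string, AgentCommand>; warnings: string[] };

function registerAgentsCommands(
  commands: AgentCommand[],
  builtinNames: ReadonlySet<string>
): Registration {
  const registered = new Map<string, AgentCommand>();
  const warnings: string[] = [];
  for (const cmd of commands) {
    const prefixed = `agents:${cmd.name}`;
    if (registered.has(prefixed)) {
      // First definition wins; later duplicates are reported and skipped.
      warnings.push(`Duplicate AGENTS command "${cmd.name}" ignored.`);
      continue;
    }
    registered.set(prefixed, cmd);
    // The unprefixed alias is added only when no built-in tool owns the name.
    if (!builtinNames.has(cmd.name)) {
      registered.set(cmd.name, cmd);
    }
  }
  return { registered, warnings };
}

// Sample input: one duplicate and one name that collides with a built-in.
const { registered, warnings } = registerAgentsCommands(
  [
    { name: 'test', command: 'pnpm test' },
    { name: 'exec-command', command: 'echo shadowed' },
    { name: 'test', command: 'pnpm vitest' },
  ],
  new Set(['exec-command'])
);
```

With this input, `test` is reachable both as `agents:test` and `test`, the colliding `exec-command` entry is reachable only as `agents:exec-command`, and the second `test` definition produces a single warning.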

## Resolution Rules

- `agents:<name>` always resolves to AGENTS command lookup.
- Unknown `agents:<name>` references return a deterministic not-found result with no shell execution.
- If `<name>` collides with an existing built-in tool, built-in lookup wins and AGENTS command remains available as `agents:<name>`.
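
One way to picture these lookup rules is a single resolver that returns a tagged result. The sketch below is an assumption-laden illustration: `resolveTool`, the `ToolLookup` union, and the summary strings are hypothetical and are not taken from the project's registry.

```typescript
// Hypothetical result shape; the actual ToolResult in the codebase differs.
type ToolLookup =
  | { kind: 'builtin'; tool: string }
  | { kind: 'agents'; tool: string; command: string }
  | { kind: 'not-found'; tool: string; summary: string };

function resolveTool(
  requested: string,
  builtins: ReadonlySet<string>,
  agentsCommands: ReadonlyMap<string, string>
): ToolLookup {
  const prefix = 'agents:';
  if (requested.startsWith(prefix)) {
    // Prefixed names bypass built-ins and only consult AGENTS commands.
    const command = agentsCommands.get(requested.slice(prefix.length));
    return command === undefined
      ? { kind: 'not-found', tool: requested, summary: `Unknown AGENTS command: ${requested}` }
      : { kind: 'agents', tool: requested, command };
  }
  if (builtins.has(requested)) {
    // On a name collision the built-in tool wins.
    return { kind: 'builtin', tool: requested };
  }
  const aliased = agentsCommands.get(requested);
  return aliased === undefined
    ? { kind: 'not-found', tool: requested, summary: `Unknown tool: ${requested}` }
    : { kind: 'agents', tool: `agents:${requested}`, command: aliased };
}

const builtinTools = new Set(['exec-command']);
const agentsTable = new Map([
  ['test', 'pnpm test'],
  ['exec-command', 'echo shadowed'],
]);
const lookupKinds = ['agents:test', 'test', 'exec-command', 'agents:exec-command', 'agents:missing']
  .map((requested) => resolveTool(requested, builtinTools, agentsTable).kind);
```

The not-found branch returns plain data and never reaches a shell, which is the property the second rule asks for.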

## Execution and Policy

- AGENTS commands execute through the same command executor as `exec-command`.
- Side effects are classified from the resolved shell command (`read`, `write`, `destructive`, `network`).
- Policy checks run before process execution.
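- In interactive mode, a mutating command returns an approval-required decision and does not run until approval is granted.
- In automation mode, a write command runs without interactive approval only when the policy allowlist explicitly permits it.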
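
The policy gate described in this section might look roughly like the sketch below. It rests on several assumptions that the source does not confirm: that every side effect other than `read` counts as mutating, that automation mode denies anything not on the write allowlist, and that the allowlist is matched by exact command string. `decidePolicy` and its options are hypothetical names; only the `PolicyOutcome` field names mirror `getPolicyOutcome` in `src/cli/app.tsx`.

```typescript
type SideEffect = 'read' | 'write' | 'destructive' | 'network';
type Mode = 'interactive' | 'automation';
type PolicyOutcome = {
  allowed: boolean;
  requiresApproval: boolean;
  reason: string;
  sideEffect: SideEffect;
};

// Hypothetical decision function; the real DefaultPolicyEngine may differ.
function decidePolicy(
  command: string,
  sideEffect: SideEffect,
  options: { mode: Mode; approvalGranted?: boolean; writeAllowlist?: readonly string[] }
): PolicyOutcome {
  if (sideEffect === 'read') {
    return { allowed: true, requiresApproval: false, reason: 'read-only command', sideEffect };
  }
  if (options.mode === 'interactive') {
    // Assumption: anything that is not a read needs explicit approval.
    return options.approvalGranted
      ? { allowed: true, requiresApproval: false, reason: 'approved by user', sideEffect }
      : { allowed: false, requiresApproval: true, reason: 'approval required', sideEffect };
  }
  // Assumption: automation only permits writes that are explicitly allowlisted.
  const allowlisted =
    sideEffect === 'write' && (options.writeAllowlist ?? []).indexOf(command) !== -1;
  return allowlisted
    ? { allowed: true, requiresApproval: false, reason: 'allowlisted write', sideEffect }
    : { allowed: false, requiresApproval: false, reason: 'not on automation allowlist', sideEffect };
}

// Two-phase interactive flow: the first call asks for approval, the second runs.
const beforeApproval = decidePolicy('pnpm format', 'write', { mode: 'interactive' });
const afterApproval = decidePolicy('pnpm format', 'write', {
  mode: 'interactive',
  approvalGranted: true,
});
const automated = decidePolicy('pnpm format', 'write', {
  mode: 'automation',
  writeAllowlist: ['pnpm format'],
});
const blocked = decidePolicy('rm -rf dist', 'destructive', {
  mode: 'automation',
  writeAllowlist: ['pnpm format'],
});
```

Because the decision is computed before any process is spawned, a denied or approval-pending outcome can be returned to the caller without side effects.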

## Observability Payload

Tool results and runtime observability include structured fields:

- `actionName`
- `resolvedCommand`
- `policyOutcome`
- `executionSummary`
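
As an illustration of these fields, a payload for a successful run might be typed and populated as below. The `policyOutcome` shape follows `getPolicyOutcome` in `src/cli/app.tsx` from this change; the `AgentsActionPayload` type name and all sample values are hypothetical.

```typescript
type PolicyOutcome = {
  allowed: boolean;
  requiresApproval: boolean;
  reason: string;
  sideEffect: 'read' | 'write' | 'destructive' | 'network';
};

// Hypothetical type name; it covers only the four documented fields.
type AgentsActionPayload = {
  actionName: string;
  resolvedCommand: string | null;
  policyOutcome: PolicyOutcome | null;
  executionSummary: string;
};

// Sample values are made up for illustration.
const samplePayload: AgentsActionPayload = {
  actionName: 'agents:test',
  resolvedCommand: 'pnpm test',
  policyOutcome: {
    allowed: true,
    requiresApproval: false,
    reason: 'read-only command',
    sideEffect: 'read',
  },
  executionSummary: 'command exited with code 0',
};

// The payload is plain data, so it serializes cleanly for traces and transcripts.
const serializedPayload = JSON.stringify(samplePayload);
```
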
24 changes: 0 additions & 24 deletions openspec/changes/agents-md-runtime-actions/tasks.md

This file was deleted.

@@ -0,0 +1,24 @@
## 1. Command Registration

- [x] 1.1 Extend AGENTS config loading/validation to detect duplicate command names and emit deterministic warnings.
- [x] 1.2 Add runtime registration logic that converts `AGENTS.md` command entries into invokable runtime action descriptors.
- [x] 1.3 Define and implement command lookup/identifier resolution behavior (including unknown-command error responses).

## 2. Execution and Policy Integration

- [x] 2.1 Implement an AGENTS command action adapter that executes through the existing command execution path.
- [x] 2.2 Ensure AGENTS command actions classify side effects and invoke policy decisions before process execution.
- [x] 2.3 Enforce interactive approval for mutating AGENTS commands and preserve automation allowlist behavior.

## 3. Observability and UX

- [x] 3.1 Extend trace/transcript payloads to include AGENTS action name, resolved command, policy outcome, and execution summary.
- [x] 3.2 Add consistent user-facing summaries for success, denial, and not-found outcomes of AGENTS command actions.
- [x] 3.3 Document runtime command behavior and resolution rules in user/developer docs.

## 4. Verification

- [x] 4.1 Add/update unit tests for AGENTS command parsing, duplicate-name handling, and registration.
- [x] 4.2 Add runtime/policy tests for approval-required, allowlisted automation execution, and blocked execution scenarios.
- [x] 4.3 Add observability/result-shape tests for successful execution, policy denial, and unknown command references.
- [x] 4.4 Run `pnpm test`, `pnpm typecheck`, and `pnpm lint` to verify implementation stability.
48 changes: 48 additions & 0 deletions openspec/specs/agents-runtime-actions/spec.md
@@ -0,0 +1,48 @@
# agents-runtime-actions Specification

## Purpose
Define how runtime discovers AGENTS commands from `AGENTS.md`, registers them as invokable actions, evaluates policy and approvals before execution, and emits structured deterministic execution outcomes.
## Requirements
### Requirement: Runtime SHALL Register AGENTS Commands as Invokable Actions
The runtime SHALL load command entries from `AGENTS.md` and register each valid command as an invokable runtime action before agent execution begins.

#### Scenario: Commands available after config load
- **WHEN** a workspace contains `AGENTS.md` with one or more valid command entries
- **THEN** runtime action discovery includes each command entry by its declared name

#### Scenario: Missing AGENTS file
- **WHEN** no `AGENTS.md` file exists in the workspace
- **THEN** runtime action discovery proceeds without AGENTS command actions and without fatal error

### Requirement: Runtime SHALL Execute AGENTS Commands Through Policy-Gated Command Execution
The system SHALL route AGENTS command actions through the standard command execution path and SHALL evaluate policy/approval before running shell commands.

#### Scenario: Mutating command requires approval in interactive mode
- **WHEN** an AGENTS command action resolves to a mutating shell command and mode is interactive
- **THEN** the policy engine returns an approval-required decision before command execution

#### Scenario: Automation allowlist permits safe write
- **WHEN** an AGENTS command action resolves to a write command in automation mode and policy allowlist explicitly permits it
- **THEN** the command is executed without additional interactive approval

### Requirement: Runtime SHALL Produce Structured Results for AGENTS Command Actions
For every AGENTS command action execution attempt, the runtime SHALL produce structured result data including action name, resolved command, policy decision outcome, and execution summary.

#### Scenario: Successful execution emits structured summary
- **WHEN** an AGENTS command action executes successfully
- **THEN** trace/transcript records include the AGENTS command name, executed command string, and summarized stdout/stderr outcome

#### Scenario: Policy-blocked execution emits structured denial
- **WHEN** policy denies an AGENTS command action
- **THEN** trace/transcript records include denial reason and no command process is started

### Requirement: Runtime SHALL Handle Invalid or Unknown AGENTS Command References Deterministically
The runtime SHALL return a deterministic error when an AGENTS command action reference is unknown, malformed, or conflicts in ways that prevent safe execution.

#### Scenario: Unknown command name
- **WHEN** the agent requests execution of an AGENTS command name that is not registered
- **THEN** runtime returns a not-found error result with no shell execution

#### Scenario: Duplicate command names in AGENTS definition
- **WHEN** `AGENTS.md` defines duplicate command names
- **THEN** runtime applies documented resolution behavior consistently and emits a structured warning
86 changes: 81 additions & 5 deletions src/cli/app.tsx
@@ -3,6 +3,7 @@ import TextInput from 'ink-text-input';
import { useMemo, useRef, useState } from 'react';
import type { AgentOrchestrator } from '../agent/orchestrator';
import type { TraceStore } from '../observability/traces';
import type { TranscriptStore } from '../observability/transcripts';
import type { ToolRegistry } from '../tools/registry';
import type { ToolInvocation, ToolResult } from '../tools/schemas';
import {
@@ -20,6 +21,7 @@ type AppProps = {
orchestrator: AgentOrchestrator;
tools: ToolRegistry;
traces: TraceStore;
transcripts: TranscriptStore;
};

type PendingApproval = {
@@ -46,7 +48,7 @@ const APPROVAL_COMMANDS: SlashCommand[] = [
{ command: '/help', description: 'Show available slash commands' },
];

export function ChatApp({ orchestrator, tools, traces }: AppProps) {
export function ChatApp({ orchestrator, tools, traces, transcripts }: AppProps) {
const [value, setValue] = useState('');
const [output, setOutput] = useState<string>('');
const [busy, setBusy] = useState(false);
@@ -118,8 +120,11 @@ export function ChatApp({ orchestrator, tools, traces }: AppProps) {
}

const invocation = checkpoint.toolPlan[index];
const sensitive = isSensitiveAction(invocation);
if (sensitive) {
const policyResult = await tools.invoke(invocation, { mode: 'interactive' });
const policyOutcome = getPolicyOutcome(policyResult);

let result = policyResult;
if (policyOutcome?.requiresApproval) {
if (!(await transitionPhase('awaiting_approval'))) {
setOutput('Error: invalid lifecycle transition while awaiting approval.');
return;
@@ -154,13 +159,31 @@ export function ChatApp({ orchestrator, tools, traces }: AppProps) {
setOutput('Error: invalid lifecycle transition while re-entering execution.');
return;
}

result = await tools.invoke(invocation, {
mode: 'interactive',
approvalGranted: true,
});
}

const result = await tools.invoke(invocation);
assistantMessage += `\n${formatToolResult(result)}`;
const observabilityPayload = buildExecutionPayload(invocation, result);
await traces.write({
timestamp: new Date().toISOString(),
type: 'tool.execution',
sessionId: SESSION_ID,
payload: observabilityPayload,
});
await transcripts.write({
timestamp: new Date().toISOString(),
sessionId: SESSION_ID,
role: 'system',
text: formatToolResult(result),
payload: observabilityPayload,
});
completed.add(index);

if (sensitive) {
if (policyOutcome?.requiresApproval || isSensitiveAction(invocation)) {
await saveCheckpoint({
...checkpoint,
phase: lifecycleRef.current.getPhase(),
@@ -371,6 +394,59 @@ function formatToolResult(result: ToolResult): string {
return `Tool ${result.tool} failed: ${result.summary}${result.stderr ? ` (${result.stderr.trim()})` : ''}`;
}

function getPolicyOutcome(result: ToolResult): {
allowed: boolean;
requiresApproval: boolean;
reason: string;
sideEffect: ToolInvocation['sideEffect'];
} | null {
const candidate = result.payload.policyOutcome;
if (!candidate || typeof candidate !== 'object') {
return null;
}
const policyOutcome = candidate as Record<string, unknown>;
if (
typeof policyOutcome.allowed !== 'boolean' ||
typeof policyOutcome.requiresApproval !== 'boolean' ||
typeof policyOutcome.reason !== 'string' ||
(policyOutcome.sideEffect !== 'read' &&
policyOutcome.sideEffect !== 'write' &&
policyOutcome.sideEffect !== 'destructive' &&
policyOutcome.sideEffect !== 'network')
) {
return null;
}
return {
allowed: policyOutcome.allowed,
requiresApproval: policyOutcome.requiresApproval,
reason: policyOutcome.reason,
sideEffect: policyOutcome.sideEffect,
};
}

function buildExecutionPayload(
invocation: ToolInvocation,
result: ToolResult
): Record<string, unknown> {
const policyOutcome = getPolicyOutcome(result);
const resolvedCommand =
typeof result.payload.resolvedCommand === 'string'
? result.payload.resolvedCommand
: typeof result.payload.command === 'string'
? result.payload.command
: null;
return {
actionName:
typeof result.payload.actionName === 'string' ? result.payload.actionName : invocation.tool,
tool: invocation.tool,
resolvedCommand,
policyOutcome,
executionSummary: result.summary,
ok: result.ok,
exitCode: result.exitCode,
};
}

function formatHelpText(commands: SlashCommand[]): string {
const lines = ['Available slash commands:'];
for (const command of commands) {
7 changes: 6 additions & 1 deletion src/cli/commands/chat.tsx
@@ -45,6 +45,11 @@ export async function runChatCommand(prompt?: string): Promise<void> {
}

render(
<ChatApp orchestrator={runtime.orchestrator} tools={runtime.tools} traces={runtime.traces} />
<ChatApp
orchestrator={runtime.orchestrator}
tools={runtime.tools}
traces={runtime.traces}
transcripts={runtime.transcripts}
/>
);
}
11 changes: 10 additions & 1 deletion src/cli/runtime.ts
@@ -1,4 +1,5 @@
import { AgentOrchestrator } from '../agent/orchestrator';
import { loadAgentsConfig } from '../config/agents-loader';
import { createDb } from '../db/client';
import { runMigrations } from '../db/migrate';
import { OptionalOtelExporter } from '../observability/otel';
@@ -12,6 +13,7 @@ import { ToolRegistry } from '../tools/registry';
export async function createRuntime() {
await runMigrations();
const db = await createDb();
const agentsConfig = await loadAgentsConfig(process.cwd());
const provider = createProviderAdapter(detectProvider());
const policyEngine = new DefaultPolicyEngine(createDefaultApprovalPolicy());
const orchestrator = new AgentOrchestrator({
@@ -28,6 +30,13 @@ export async function createRuntime() {
traces: new TraceStore(),
transcripts: new TranscriptStore(),
otel: new OptionalOtelExporter(),
tools: new ToolRegistry(),
tools: new ToolRegistry({
policyEngine,
defaultMode: 'interactive',
agentsConfig,
onWarning: (message) => {
console.warn(`[agents-config] ${message}`);
},
}),
};
}
46 changes: 43 additions & 3 deletions src/config/agents-loader.ts
@@ -6,9 +6,18 @@ export type AgentCommand = {
command: string;
};

export type AgentsConfigWarning = {
type: 'duplicate-command';
commandName: string;
keptIndex: number;
ignoredIndexes: number[];
message: string;
};

export type AgentsConfig = {
commands: AgentCommand[];
hooks: Array<{ event: string; command: string }>;
warnings: AgentsConfigWarning[];
};

function parseSectionLines(content: string, heading: string): string[] {
@@ -35,21 +44,52 @@ export async function loadAgentsConfig(cwd: string): Promise<AgentsConfig> {
const path = join(cwd, 'AGENTS.md');
const content = await readFile(path, 'utf8').catch(() => '');
if (!content) {
return { commands: [], hooks: [] };
return { commands: [], hooks: [], warnings: [] };
}

const commandLines = parseSectionLines(content, '## Commands');
const hookLines = parseSectionLines(content, '## Hooks');

const commands = commandLines
const parsedCommands = commandLines
.map((line) => line.match(/^[-*]\s*`?([^:`]+)`?\s*:\s*(.+)$/))
.filter((match): match is RegExpMatchArray => Boolean(match))
.map((match) => ({ name: match[1].trim(), command: match[2].trim() }));

const warnings: AgentsConfigWarning[] = [];
const commandNameToFirstIndex = new Map<string, number>();
const duplicateIndexes = new Map<string, number[]>();
for (const [index, command] of parsedCommands.entries()) {
const name = command.name;
const existing = commandNameToFirstIndex.get(name);
if (existing === undefined) {
commandNameToFirstIndex.set(name, index);
continue;
}
const duplicates = duplicateIndexes.get(name) ?? [];
duplicates.push(index);
duplicateIndexes.set(name, duplicates);
}

const commands = parsedCommands.filter((command, index) => {
const firstIndex = commandNameToFirstIndex.get(command.name);
return firstIndex === index;
});

for (const [commandName, ignoredIndexes] of duplicateIndexes.entries()) {
const keptIndex = commandNameToFirstIndex.get(commandName) ?? 0;
warnings.push({
type: 'duplicate-command',
commandName,
keptIndex,
ignoredIndexes,
message: `Duplicate AGENTS command "${commandName}" found at indexes ${ignoredIndexes.join(', ')}; using index ${keptIndex}.`,
});
}

const hooks = hookLines
.map((line) => line.match(/^[-*]\s*`?([^:`]+)`?\s*:\s*(.+)$/))
.filter((match): match is RegExpMatchArray => Boolean(match))
.map((match) => ({ event: match[1].trim(), command: match[2].trim() }));

return { commands, hooks };
return { commands, hooks, warnings };
}
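
To make the entry pattern used by `loadAgentsConfig` easier to review, here is a small standalone sketch that applies the same regular expression to a few sample lines. `parseEntry` and the sample lines are hypothetical; the sketch assumes each line has already been trimmed by the section parser.

```typescript
// Same pattern the loader applies to both "## Commands" and "## Hooks" lines.
const ENTRY_PATTERN = /^[-*]\s*`?([^:`]+)`?\s*:\s*(.+)$/;

// Hypothetical helper: returns null for lines that are not list entries.
function parseEntry(line: string): { name: string; command: string } | null {
  const match = line.match(ENTRY_PATTERN);
  return match ? { name: match[1].trim(), command: match[2].trim() } : null;
}

const parsedSamples = [
  '- `test`: pnpm test', // backtick-quoted name
  '* lint : pnpm lint', // asterisk bullet, space before the colon
  '- docs: curl https://example.com/docs', // later colons stay in the command
  'not a list item', // no bullet, so no match
].map(parseEntry);
```

The name capture stops at the first colon, so commands containing URLs or other colons survive intact, while a name that itself contains a colon cannot be expressed in this format.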
1 change: 1 addition & 0 deletions src/observability/transcripts.ts
@@ -7,6 +7,7 @@ export type TranscriptEntry = {
sessionId: string;
role: 'user' | 'assistant' | 'system';
text: string;
payload?: Record<string, unknown>;
};

export class TranscriptStore {