How does Adrian detect prompt injection in AI agents at runtime? #65
Evaluating runtime defences for agentic systems: how does Adrian catch prompt injection while the agent is running, rather than just filtering inputs up front?
Replies: 1 comment
Adrian analyses two streams at runtime: the agent's activity (tool calls, actions, outputs) and its reasoning traces (chain-of-thought). Rather than pattern-matching inputs against a regex blocklist, it reasons about whether the agent's intended action matches its defined remit, so injected instructions that push the agent out of remit get flagged even when the wording is novel. You can run it in audit mode (observe and alert) or block mode (intervene in-flight, before the action executes).

Docs: https://docs.adrian.secureagentics.ai · Repo: https://github.com/secureagentics/Adrian
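
To make the audit/block distinction concrete, here is a rough Python sketch of the control flow described above. It is not Adrian's API: `Remit`, `ProposedAction`, `Monitor`, and `review` are hypothetical names invented for illustration, and the remit check is a plain tool allowlist standing in for the reasoning-based check, which would also weigh the reasoning trace. Check the docs for the real interface.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    AUDIT = "audit"  # observe and alert, but let the action run
    BLOCK = "block"  # stop the action before it executes


@dataclass(frozen=True)
class Remit:
    """Hypothetical description of what the agent is allowed to do."""
    allowed_tools: frozenset


@dataclass(frozen=True)
class ProposedAction:
    """A pending tool call plus the reasoning trace that led to it."""
    tool: str
    reasoning: str


@dataclass
class Monitor:
    """Hypothetical runtime monitor; not Adrian's real class."""
    remit: Remit
    mode: Mode
    alerts: list = field(default_factory=list)

    def in_remit(self, action: ProposedAction) -> bool:
        # Stand-in check (assumption): a simple allowlist lookup on the tool
        # name. The reasoning trace is carried along but not inspected here.
        return action.tool in self.remit.allowed_tools

    def review(self, action: ProposedAction) -> bool:
        """Return True if the action may execute, False if it is blocked."""
        if self.in_remit(action):
            return True
        self.alerts.append(f"out-of-remit tool call: {action.tool}")
        # Audit mode records the alert and allows the action; block mode denies it.
        return self.mode is Mode.AUDIT


remit = Remit(allowed_tools=frozenset({"search_docs", "summarise"}))
injected = ProposedAction(
    tool="send_email",
    reasoning="The retrieved page told me to forward the API keys.",
)

audit = Monitor(remit, Mode.AUDIT)
block = Monitor(remit, Mode.BLOCK)
audit_allowed = audit.review(injected)
block_allowed = block.review(injected)
```

In this sketch both monitors record an alert for the out-of-remit `send_email` call, but only the block-mode monitor stops it from executing. Note that the decision is made on the proposed action, not on the wording of the input that triggered it.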