ToolProof v0.4.0

Agent tool verification. Pre-execution gating. Eval-driven optimization.

License: MIT | Python 3.9+ | Tests: 71 passing | OpenClaw Plugin

Agents lie about tool calls. ToolProof catches them before and after execution.

Landing Page


The Problem

AI agents claim they searched a database, read a file, or called an API. Under adversarial conditions, they hallucinate the results 91.1% of the time. They report data that was never returned. They reference tool calls that never happened. They execute destructive commands without authorization.

No tool on the market detects this.

ToolProof does two things:

  1. Pre-execution gating -- AEGIS-style policy enforcement that blocks dangerous tool calls before they run.
  2. Post-execution verification -- signed receipts that prove what actually happened, cross-referenced against what the agent claims.

Install

pip install toolproof

Quick Start

Zero-config: wrap any command

# Run your agent through ToolProof. All tool calls recorded automatically.
toolproof wrap -- python my_agent.py

# Proxy mode: sit between agent and tool server
toolproof proxy --target http://localhost:3000

# Import from Claude Code sessions
toolproof import-claude

# Import from Hermes agent
toolproof import-hermes --profile nashir

# Import from OpenClaw
toolproof import-openclaw

# Import from everything
toolproof import-all

SDK auto-patch (one line)

import toolproof
toolproof.patch_openai()   # Patches globally, zero config
toolproof.patch_anthropic() # Same for Anthropic

# Now every API call with tools generates signed receipts
import openai
client = openai.OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Search the database"}],
    tools=[...],
)
# Receipts already recorded at ~/.toolproof/receipts.jsonl

Manual wrapping

from toolproof import ToolProxy, ReceiptStore, Verifier, AgentClaim

store = ReceiptStore()
proxy = ToolProxy(store, secret="your-key")

# Wrap your tools (search_database is your own function)
safe_search = proxy.wrap(search_database)
result = safe_search(query="SELECT * FROM users")

# Verify claims
verifier = Verifier(store, secret="your-key")
claim = AgentClaim(tool_name="search_database", arguments={"query": "SELECT * FROM users"})
verification = verifier.verify_claim(claim)
print(verification.verdict)  # VERIFIED, UNVERIFIED, or TAMPERED

Pre-execution gating

from toolproof.gate import Gate, Policy

policy = Policy.load()  # from ~/.toolproof/policy.json
gate = Gate(policy)

decision = gate.check("Bash", {"command": "rm -rf /"})
# Decision(action="block", reason="Destructive shell command")

decision = gate.check("Read", {"file_path": "/src/main.py"})
# Decision(action="allow")

Block, allow, or hold tool calls for human review. Policy-driven. No code changes in your agent.


Commands

17 CLI commands.

Command                          What it does
toolproof analyze                Run trust analytics on receipt history
toolproof ci                     One-shot CI trust check
toolproof clear                  Clear all receipts
toolproof config                 Configure settings
toolproof feedback               Generate actionable feedback for agent frameworks
toolproof github-action          Print GitHub Action template
toolproof import-all             Import from all sources
toolproof import-claude          Import from Claude Code sessions
toolproof import-hermes          Import from Hermes agent logs
toolproof import-openclaw        Import from OpenClaw skill logs
toolproof inspect <id>           Inspect a specific receipt
toolproof proxy --target <url>   Start HTTP proxy that records tool calls
toolproof report                 Show all recorded receipts
toolproof status                 Show receipt store status
toolproof verify <file>          Verify agent claims against receipts
toolproof watch                  Live monitoring dashboard
toolproof wrap -- <command>      Run command with automatic recording

Architecture

18 Python Modules

toolproof/
  __init__.py          # Package entry, SDK patch exports
  __main__.py          # python -m toolproof
  analytics.py         # Trust analytics, pattern detection, cost hotspots
  claude_reader.py     # Claude Code session log parser
  cli.py               # Click CLI, 17 commands
  display.py           # Rich terminal output
  feedback.py          # Actionable feedback generator for agent frameworks
  gate.py              # Pre-execution gating (AEGIS-style policy enforcement)
  html_report.py       # Standalone dark-theme HTML trust reports
  http_proxy.py        # HTTP proxy, protocol auto-detection
  interceptors.py      # Tool call interception layer
  proxy.py             # Function-level tool wrapping
  receipt.py           # Signed execution receipts, JSONL store
  safepath.py          # Path validation, traversal prevention
  sdk_patch.py         # OpenAI/Anthropic SDK monkey-patching
  trust.py             # Trust scoring, grading, risk assessment
  verifier.py          # Claim verification engine
  watch.py             # Live monitoring dashboard

Recording

Agent --> ToolProof Proxy --> Actual Tool
             |                    |
             |  signed receipt    |
             |<-------------------|
             v
        Receipt Store (signed JSONL)

ToolProof intercepts tool calls through one of these methods:

  1. HTTP proxy -- sits between agent and tool server, records everything
  2. SDK patch -- monkey-patches OpenAI/Anthropic SDKs to auto-record
  3. Function wrapper -- wraps individual functions
  4. Log import -- reads existing logs from Claude Code, Hermes, OpenClaw

Pre-Execution Gating

Agent wants to call tool X with args Y
             |
             v
        Gate checks against policy
             |
        allow / block / hold
             |
             v
        Tool executes (or doesn't)

AEGIS-style policy enforcement. Define rules in ~/.toolproof/policy.json. Block destructive commands, restrict file access, require human approval for sensitive operations. Aligned with Microsoft Agent Governance Toolkit patterns and W3C Agentic Integrity Verification draft.
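
A minimal sketch of what ~/.toolproof/policy.json might contain. The schema below is an illustrative assumption, not the documented format; the actual field names may differ, so treat this as a shape, not a reference:

{
  "default": "allow",
  "rules": [
    {"tool": "Bash", "action": "block", "reason": "Destructive shell command"},
    {"tool": "Read", "action": "allow"},
    {"tool": "Write", "action": "hold", "reason": "Require human review"}
  ]
}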

Verification

Agent claims "I called X with Y and got Z"
             |
             v
        Verifier cross-references against receipts
             |
             v
        Verdict: VERIFIED / UNVERIFIED / TAMPERED

Eval-Driven Optimization Loop

Run agent --> Record receipts --> Analyze patterns --> Generate feedback --> Improve agent
    ^                                                                          |
    |__________________________________________________________________________|

The analytics module finds which tools get hallucinated most, which models produce the lowest trust scores, and where token costs concentrate. The feedback module turns those findings into actionable config changes for Hermes profiles, OpenClaw config, and system prompts.

This is eval-driven development applied to tool calling. Measure. Find patterns. Improve systematically. Close the loop.
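
One turn of the loop, using only commands documented in this README:

# run the agent with automatic recording
toolproof wrap -- python my_agent.py
# find hallucination and cost patterns in the receipts
toolproof analyze
# turn findings into concrete config changes, apply them, run again
toolproof feedback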

Verdicts

Verdict      Meaning
VERIFIED     Claim matches a receipt
UNVERIFIED   No matching receipt (possible hallucination)
TAMPERED     Receipt exists but claim doesn't match (definite hallucination)

Trust Score

trust_score = verified / (verified + unverified + tampered)

Grade   Score    Risk
A       95%+     LOW
B       85-94%   LOW
C       70-84%   MEDIUM
D       50-69%   MEDIUM
F       <50%     HIGH
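
The mapping is simple enough to restate in code. A minimal sketch following the formula and table above (the function name is ours, not part of the toolproof API):

def trust_grade(verified: int, unverified: int, tampered: int) -> tuple[float, str, str]:
    """Compute trust_score and map it to the grade/risk table above."""
    total = verified + unverified + tampered
    score = verified / total if total else 0.0
    if score >= 0.95:
        grade, risk = "A", "LOW"
    elif score >= 0.85:
        grade, risk = "B", "LOW"
    elif score >= 0.70:
        grade, risk = "C", "MEDIUM"
    elif score >= 0.50:
        grade, risk = "D", "MEDIUM"
    else:
        grade, risk = "F", "HIGH"
    return score, grade, risk

# e.g. 9 verified, 1 unverified, 0 tampered -> (0.9, "B", "LOW")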

Token Cost Tracking

Every receipt records execution duration. Analytics aggregates cost by tool, model, session, and time window. Find expensive calls. Find broken caching. Find the 20% of tool calls eating 80% of your budget.
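
Receipts are plain JSONL (see the receipt format below), so ad-hoc aggregation outside toolproof analyze is straightforward. A sketch that sums recorded duration per tool, assuming the default store at ~/.toolproof/receipts.jsonl:

import json
from collections import defaultdict
from pathlib import Path

totals: dict[str, float] = defaultdict(float)
path = Path.home() / ".toolproof" / "receipts.jsonl"
for line in path.read_text().splitlines():
    receipt = json.loads(line)
    totals[receipt["tool_name"]] += receipt.get("duration_ms") or 0.0

# most expensive tools first
for tool, ms in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{tool:30s} {ms:10.1f} ms")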


HTTP Proxy

The proxy sits between your agent and its tools. It forwards every request, records a signed receipt, and the agent never knows it is being watched.

# Proxy to a local tool server
toolproof proxy --port 8080 --target http://localhost:3000

# Proxy OpenAI API calls
toolproof proxy --port 9090 --target https://api.openai.com

# Proxy Hermes tool server
toolproof proxy --target http://localhost:5001

# Proxy OpenClaw
toolproof proxy --target http://localhost:8000

The proxy auto-detects and parses:

  • OpenAI chat completions (tool_calls)
  • Anthropic messages (tool_use blocks)
  • MCP JSON-RPC (tools/call)
  • Hermes skill executions
  • OpenClaw commands
  • Generic REST endpoints

Wrap Command

Run any agent command with automatic interception:

# Wrap sets up a proxy and forwards env vars automatically
toolproof wrap -- python agent.py
toolproof wrap -- node bot.js
toolproof wrap --target http://localhost:5001 -- hermes run --profile nashir

The child process gets OPENAI_BASE_URL, ANTHROPIC_BASE_URL, and HTTP_PROXY pointed at the ToolProof proxy.
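
Inside the wrapped process you can confirm the redirection with a two-line check (variable names as listed above):

import os
# wrap points these at the local ToolProof proxy before spawning the child
print(os.environ.get("OPENAI_BASE_URL"), os.environ.get("ANTHROPIC_BASE_URL"))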

Import from Agent Platforms

Claude Code

# Import recent sessions
toolproof import-claude

# Import specific session
toolproof import-claude --session abc123

# Import more sessions
toolproof import-claude --limit 20

Reads ~/.claude/projects/ JSONL files. Extracts every tool_use/tool_result pair as a signed receipt.

Hermes

# Import all profiles
toolproof import-hermes

# Import specific profile
toolproof import-hermes --profile nashir
toolproof import-hermes --profile mkhlab

OpenClaw

toolproof import-openclaw

All at once

toolproof import-all

Analytics and Feedback

# Run analytics on your receipt history
toolproof analyze

# Generate actionable feedback for your agent framework
toolproof feedback

Analytics finds:

  • Which tools get hallucinated most
  • Which models produce the lowest trust
  • Cost hotspots (broken caching, expensive repeated calls)
  • Failure patterns by time of day, session, source

Feedback generates specific config changes for:

  • Hermes profiles (skill weights, model selection)
  • OpenClaw config (tool permissions, routing)
  • System prompts (add verification instructions)
  • Generic JSON for any framework

HTML Reports

# Print to stdout
toolproof report --html > trust-report.html

# Write to file
toolproof report --html --output report.html

Generates a standalone dark-theme HTML page with:

  • Trust score card (grade, risk level)
  • Verification results table
  • Tool execution summary
  • Duration and error stats

CI Integration

One-shot check

# Pass if trust >= 80% and at least 1 receipt
toolproof ci --min-trust 0.8

# Strict: 90% trust, minimum 10 receipts
toolproof ci --min-trust 0.9 --min-receipts 10

# JSON for scripts
toolproof ci --json-output

Exit codes: 0 = pass, 1 = fail.

Live watch

# Watch in real-time
toolproof watch

# Watch with threshold (exits 1 if trust drops below)
toolproof watch --min-trust 0.8 --timeout 60

GitHub Action

# Print the template
toolproof github-action

The emitted template contains a step like:

- name: Verify tool calls
  run: toolproof ci --min-trust 0.8 --min-receipts 5

OpenClaw Native Plugin

ToolProof ships as a native OpenClaw plugin, publishable to ClawHub.

openclaw/
  clawhub.json       # ClawHub package manifest
  extensions/        # OpenClaw extension points
  hooks/             # Pre/post execution hooks
  skills/            # ToolProof skills for OpenClaw agents

Install into OpenClaw and every skill execution gets a signed receipt. Pre-execution gating applies to OpenClaw commands. Feedback writes directly to OpenClaw config.

Execution Receipt Format

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": 1712400000.0,
  "tool_name": "search_database",
  "arguments": {"query": "SELECT * FROM users"},
  "response": [{"id": 1, "name": "Alice"}],
  "error": null,
  "duration_ms": 142.3,
  "hash": "a1b2c3d4e5f6...",
  "hmac_sig": "9a8b7c6d5e4f..."
}

Receipts are hashed with SHA-256. An optional HMAC-SHA256 signature with a secret key makes them tamper-evident.
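
What that means mechanically, as a sketch: recompute the HMAC over the receipt body and compare. The exact fields and serialization covered by the signature are an assumption here; toolproof verify is the authoritative check.

import hashlib
import hmac
import json

def check_receipt(receipt: dict, secret: str) -> bool:
    """Recompute the HMAC-SHA256 over the receipt body and compare.

    ASSUMPTION: the signature covers the sorted-key JSON of all fields
    except hash/hmac_sig; the real canonicalization may differ.
    """
    body = {k: v for k, v in receipt.items() if k not in ("hash", "hmac_sig")}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["hmac_sig"])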

Security

Hardened through two rounds of adversarial pen-testing; all bypasses found were fixed.

  • Receipts are hashed with SHA-256, optionally signed with HMAC-SHA256
  • Pre-execution gating blocks dangerous tool calls before they run
  • Path traversal prevention on all file operations
  • Config stored at ~/.toolproof/config.json with 0600 permissions
  • Secret keys are never printed in full
  • Proxy does not modify request or response content
  • All receipt data is stored locally
  • No external telemetry, no phone-home

Built With

This project stands on the shoulders of specific people and projects:

  • @anthropic / Claude Code -- where the problem was first discovered. Watching Claude claim it ran tools that never executed is what started this.
  • @OpenAI -- the tool_use format standard that every agent framework now follows. ToolProof parses it natively.
  • @steipete (Peter Steinberger) / @OpenClaw -- native plugin platform. ToolProof ships as a first-class OpenClaw plugin with ClawHub publishing.
  • @karpathy (Andrej Karpathy) -- eval-driven development philosophy. The analytics and feedback loop in ToolProof is directly inspired by his approach: measure everything, find patterns, improve systematically. No vibes.
  • @LangChainAI -- tool scoping patterns that informed how ToolProof intercepts and classifies tool calls.
  • Microsoft Agent Governance Toolkit -- patterns for policy-driven agent control that shaped the gating module.
  • AEGIS Research -- pre-execution firewall concept. ToolProof's gating is an implementation of this idea.
  • W3C Agentic Integrity Verification -- draft specification for agent transparency and verifiability that ToolProof aligns with.
  • Saudi AI Community -- testing, feedback, and the push to ship it.

Community

Built with input from the Saudi AI Community.

License

MIT -- Musa the Carpenter
