Python reference implementation of Trustmark Certified v0.9, published by Elephant Accountability LLC.
Trustmark Certified v0.9 is a Rotten-Tomatoes-style aggregate trust score for AI agents. It scores agents themselves across two independent axes: Security and Capability.
Trustmark Certified scores agents. EVI scores vendors. The two standards are separate.
Live leaderboard and methodology: https://eaccountability.org/trustmark
| Axis | Formula |
|---|---|
| Security | 0.40 x Secure + 0.25 x Private + 0.20 x Auditable + 0.15 x Compliant |
| Capability | 0.35 x Capable + 0.25 x Efficient + 0.25 x Reliable + 0.15 x Usable |
| Composite | 0.50 x Security + 0.50 x Capability |
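The weights in the table can be checked by hand with a few lines of Python. This is a minimal sketch that uses only the published weights; the names `weighted_axis_score` and `composite_score` are illustrative and are not part of the `trustmark` package. It reuses the subscores from the `acmebot` example later in this README. The library may round or adjust scores differently (for instance in heuristic mode), so treat the figures as a check on the formulas rather than as its exact output.

```python
# Weights copied from the table above. All names here are illustrative,
# not the trustmark package's API.
SECURITY_WEIGHTS = {"secure": 0.40, "private": 0.25, "auditable": 0.20, "compliant": 0.15}
CAPABILITY_WEIGHTS = {"capable": 0.35, "efficient": 0.25, "reliable": 0.25, "usable": 0.15}


def weighted_axis_score(subscores, weights):
    """Return the weighted sum of 0-100 subscores for one axis."""
    missing = set(weights) - set(subscores)
    if missing:
        raise ValueError(f"missing subscores: {sorted(missing)}")
    return sum(weights[name] * subscores[name] for name in weights)


def composite_score(security, capability):
    """Equal-weight average of the two axis scores."""
    return 0.50 * security + 0.50 * capability


security = weighted_axis_score(
    {"secure": 82, "private": 71, "auditable": 65, "compliant": 80},
    SECURITY_WEIGHTS,
)
capability = weighted_axis_score(
    {"capable": 90, "efficient": 75, "reliable": 88, "usable": 70},
    CAPABILITY_WEIGHTS,
)
composite = composite_score(security, capability)

print(f"Security={security:.2f} Capability={capability:.2f} Composite={composite:.2f}")
# Security=75.55 Capability=82.75 Composite=79.15
```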
Grade thresholds: A+ (95-100), A (85-94), B (70-84), C (55-69), D (40-54), E (25-39), F (0-24).
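The thresholds are listed as integer ranges, so the handling of a fractional score such as 84.6 is not spelled out. The sketch below assumes each band's lower bound is an inclusive cut-off (a score of 85 or more is an A, anything below is a B); that rule and the name `grade_for` are assumptions for illustration, not the package's confirmed behavior.

```python
# Lower bound of each band from the thresholds above, highest first.
# Assumption: a band starts at its lower bound, so 84.6 is a B, not an A.
GRADE_FLOORS = [
    (95, "A+"),
    (85, "A"),
    (70, "B"),
    (55, "C"),
    (40, "D"),
    (25, "E"),
    (0, "F"),
]


def grade_for(score: float) -> str:
    """Map a 0-100 score to a letter grade (illustrative helper)."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score!r}")
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    raise AssertionError("unreachable: the last floor is 0")


for sample in (100, 94.9, 84.6, 55, 24.99):
    print(sample, grade_for(sample))
```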
The following agents appear on the Trustmark Certified public leaderboard (scores pending formal evaluation):
| Agent | Vendor | Category |
|---|---|---|
| ChatGPT-4o | OpenAI | General-purpose LLM |
| Claude Sonnet 4.6 | Anthropic | General-purpose LLM |
| Gemini 2.5 Pro | Google DeepMind | General-purpose LLM |
| Microsoft Copilot | Microsoft | Enterprise assistant |
| Perplexity Sonar | Perplexity AI | Search-augmented LLM |
| SAP Ariba Joule | SAP | Procurement agent |
| Coupa Navi | Coupa Software | Procurement agent |
| Workday Illuminate | Workday | HR/Finance agent |
| Oracle Procurement AI | Oracle | Procurement agent |
Trustmark Certified aggregates existing frameworks rather than reinventing them. See the `crosswalk/` directory for detailed mappings.
| Framework | Role in Trustmark |
|---|---|
| OWASP Top 10 for LLM Applications 2025 | Security rubric: injection, auth, secrets |
| OWASP Agentic AI Top 10 (2026 draft) | Security rubric: memory poisoning, excessive agency, supply chain |
| NIST AI Risk Management Framework 1.0 | Both axes: governance, measurement, risk response |
| ISO/IEC 42001:2023 | Security rubric: compliance, data management, traceability |
| SOC 2 Type II (AICPA TSC) | Security rubric: audit logs, availability, access controls |
| MITRE ATLAS v4 | Both axes: adversarial ML threats, performance transparency |
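For quick lookups, the axis column of the table can be restated as a small mapping. This is only a sketch of the table itself: the dictionary layout and the helper `frameworks_for_axis` are hypothetical and do not describe the file format used in `crosswalk/`.

```python
# Framework -> axes it informs, transcribed from the table above.
# "Both axes" is recorded as both "security" and "capability".
FRAMEWORK_AXES = {
    "OWASP Top 10 for LLM Applications 2025": {"security"},
    "OWASP Agentic AI Top 10 (2026 draft)": {"security"},
    "NIST AI Risk Management Framework 1.0": {"security", "capability"},
    "ISO/IEC 42001:2023": {"security"},
    "SOC 2 Type II (AICPA TSC)": {"security"},
    "MITRE ATLAS v4": {"security", "capability"},
}


def frameworks_for_axis(axis: str) -> list[str]:
    """List the frameworks that feed the given axis, in table order."""
    return [name for name, axes in FRAMEWORK_AXES.items() if axis in axes]


print(frameworks_for_axis("capability"))
```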
```bash
pip install trustmark

# With heuristic HTTP probing support (quoted so shells such as zsh
# do not treat the brackets as a glob pattern):
pip install "trustmark[httpx]"
```

Requires Python 3.10 or later.
```python
from trustmark import compute_trustmark, TrustmarkResult

result: TrustmarkResult = compute_trustmark(
    security_subscores={"secure": 82, "private": 71, "auditable": 65, "compliant": 80},
    capability_subscores={"capable": 90, "efficient": 75, "reliable": 88, "usable": 70},
    agent_id="acmebot",
)
print(result)
# Trustmark v0.9 | acmebot | B (76.1) | Security=75.7 (B) Capability=83.5 (A) mode=heuristic

svg = result.badge_svg()
```

```python
from trustmark import load_rubric

scored = load_rubric({
    "secure": {"auth_oauth_pkce": True, "injection_tested_published": True},
    "capable": {"task_accuracy_published_benchmark": True},
})
```

```bash
# Heuristic probe against public docs
trustmark grade --agent chatgpt-4o --docs-url https://platform.openai.com/docs/overview

# Score from evidence file
trustmark grade --agent acmebot --evidence acmebot-evidence.json --json

# Browse framework citations
trustmark frameworks list --family owasp-llm
trustmark frameworks show owasp-llm-2025-llm01
```

- Code: MIT (see LICENSE)
- Specification: CC BY 4.0 -- https://eaccountability.org/trustmark/spec/v0.9