elephant-accountability/trustmark-python

trustmark-python

Python reference implementation of Trustmark Certified v0.9, published by Elephant Accountability LLC.

Trustmark Certified v0.9 is a Rotten-Tomatoes-style aggregate trust score for AI agents. It scores agents themselves across two independent axes: Security and Capability.

Trustmark Certified scores agents. EVI scores vendors. The two standards are separate.

Live leaderboard and methodology: https://eaccountability.org/trustmark


Two-Axis Model

| Axis | Formula |
| --- | --- |
| Security | 0.40 x Secure + 0.25 x Private + 0.20 x Auditable + 0.15 x Compliant |
| Capability | 0.35 x Capable + 0.25 x Efficient + 0.25 x Reliable + 0.15 x Usable |
| Composite | 0.50 x Security + 0.50 x Capability |
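
As a rough illustration of the formulas above, the sketch below computes both axis scores and the composite by hand. The names `SECURITY_WEIGHTS`, `CAPABILITY_WEIGHTS`, `weighted_axis`, and `composite` are hypothetical helpers written for this example and are not part of the trustmark package; the packaged library may round or adjust scores in ways not described here.

```python
# Weights copied from the Two-Axis Model table. All names in this block are
# illustrative helpers, not the trustmark package's API.
SECURITY_WEIGHTS = {"secure": 0.40, "private": 0.25, "auditable": 0.20, "compliant": 0.15}
CAPABILITY_WEIGHTS = {"capable": 0.35, "efficient": 0.25, "reliable": 0.25, "usable": 0.15}


def weighted_axis(subscores: dict[str, float], weights: dict[str, float]) -> float:
    """Return the weighted sum of 0-100 sub-scores for one axis."""
    missing = weights.keys() - subscores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    return sum(weights[name] * subscores[name] for name in weights)


def composite(security: float, capability: float) -> float:
    """Equal-weight average of the two axis scores."""
    return 0.50 * security + 0.50 * capability


security = weighted_axis(
    {"secure": 82, "private": 71, "auditable": 65, "compliant": 80}, SECURITY_WEIGHTS
)
capability = weighted_axis(
    {"capable": 90, "efficient": 75, "reliable": 88, "usable": 70}, CAPABILITY_WEIGHTS
)
print(f"{security:.2f} {capability:.2f} {composite(security, capability):.2f}")
# prints: 75.55 82.75 79.15
```

Each weight set sums to 1.0, so an axis score stays on the same 0-100 scale as its sub-scores.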

Grade thresholds: A+ (95-100), A (85-94), B (70-84), C (55-69), D (40-54), E (25-39), F (0-24).
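
The thresholds can be read as a simple lookup from score to letter. The sketch below is one possible reading, not the package's implementation: `GRADE_FLOORS` and `grade_for` are hypothetical names, and it assumes each band's lower bound is an inclusive cutoff, so a fractional score such as 84.9 falls into the lower band (B). The thresholds as stated do not say how scores between integer bands are handled.

```python
# Lower bounds taken from the grade thresholds above, highest first.
# Assumption: a score belongs to the highest band whose floor it reaches.
GRADE_FLOORS = [(95, "A+"), (85, "A"), (70, "B"), (55, "C"), (40, "D"), (25, "E"), (0, "F")]


def grade_for(score: float) -> str:
    """Map a 0-100 score to a letter grade using inclusive lower bounds."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    raise AssertionError("unreachable: 0 is the lowest floor")


for sample in (97, 85, 84.9, 54.9, 24.9):
    print(sample, grade_for(sample))
```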


Public Baseline

The following agents appear on the Trustmark Certified public leaderboard (scores pending formal evaluation):

| Agent | Vendor | Category |
| --- | --- | --- |
| ChatGPT-4o | OpenAI | General-purpose LLM |
| Claude Sonnet 4.6 | Anthropic | General-purpose LLM |
| Gemini 2.5 Pro | Google DeepMind | General-purpose LLM |
| Microsoft Copilot | Microsoft | Enterprise assistant |
| Perplexity Sonar | Perplexity AI | Search-augmented LLM |
| SAP Ariba Joule | SAP | Procurement agent |
| Coupa Navi | Coupa Software | Procurement agent |
| Workday Illuminate | Workday | HR/Finance agent |
| Oracle Procurement AI | Oracle | Procurement agent |

Standards Crosswalk

Trustmark Certified aggregates existing frameworks rather than reinventing them. See the crosswalk/ directory for detailed mappings.

| Framework | Role in Trustmark |
| --- | --- |
| OWASP Top 10 for LLM Applications 2025 | Security rubric: injection, auth, secrets |
| OWASP Agentic AI Top 10 (2026 draft) | Security rubric: memory poisoning, excessive agency, supply chain |
| NIST AI Risk Management Framework 1.0 | Both axes: governance, measurement, risk response |
| ISO/IEC 42001:2023 | Security rubric: compliance, data management, traceability |
| SOC 2 Type II (AICPA TSC) | Security rubric: audit logs, availability, access controls |
| MITRE ATLAS v4 | Both axes: adversarial ML threats, performance transparency |

Install

pip install trustmark
# With heuristic HTTP probing support:
# (quoted so shells such as zsh do not treat the brackets as a glob)
pip install "trustmark[httpx]"

Requires Python 3.10 or later.


Usage

Score from pre-computed sub-scores

from trustmark import compute_trustmark, TrustmarkResult

result: TrustmarkResult = compute_trustmark(
    security_subscores={"secure": 82, "private": 71, "auditable": 65, "compliant": 80},
    capability_subscores={"capable": 90, "efficient": 75, "reliable": 88, "usable": 70},
    agent_id="acmebot",
)
print(result)
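# Prints a one-line summary in the form:
#   Trustmark v0.9 | acmebot | <grade> (<composite>) | Security=<score> (<grade>)  Capability=<score> (<grade>) mode=<mode>
# With the weights in the Two-Axis Model, these inputs work out to
# Security = 75.55 (B), Capability = 82.75 (B), and composite = 79.15 (B).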

svg = result.badge_svg()

Score from rubric-item evidence JSON

from trustmark import load_rubric

scored = load_rubric({
    "secure": {"auth_oauth_pkce": True, "injection_tested_published": True},
    "capable": {"task_accuracy_published_benchmark": True},
})
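
The CLI section refers to an evidence file passed with --evidence. The sketch below writes the same rubric-item dict to a JSON file using only the standard library. It assumes the evidence file uses the same nested shape as the dict passed to `load_rubric`, which is not confirmed here; `write_evidence` is a hypothetical helper, and the file is written to the system temp directory for the example.

```python
import json
import tempfile
from pathlib import Path

# Same nested shape as the dict passed to load_rubric above:
# sub-score name -> rubric item id -> whether evidence exists.
# Assumption: the CLI's --evidence file follows this shape.
evidence = {
    "secure": {"auth_oauth_pkce": True, "injection_tested_published": True},
    "capable": {"task_accuracy_published_benchmark": True},
}


def write_evidence(path: Path, data: dict[str, dict[str, bool]]) -> None:
    """Check that every rubric item is a boolean, then write pretty-printed JSON."""
    for dimension, items in data.items():
        for item_id, present in items.items():
            if not isinstance(present, bool):
                raise TypeError(f"{dimension}.{item_id} must be true or false")
    path.write_text(json.dumps(data, indent=2, sort_keys=True) + "\n", encoding="utf-8")


out_path = Path(tempfile.gettempdir()) / "acmebot-evidence.json"
write_evidence(out_path, evidence)

# Read the file back to confirm it round-trips unchanged.
round_tripped = json.loads(out_path.read_text(encoding="utf-8"))
print(round_tripped == evidence)
```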

CLI

# Heuristic probe against public docs
trustmark grade --agent chatgpt-4o --docs-url https://platform.openai.com/docs/overview

# Score from evidence file
trustmark grade --agent acmebot --evidence acmebot-evidence.json --json

# Browse framework citations
trustmark frameworks list --family owasp-llm
trustmark frameworks show owasp-llm-2025-llm01

License
