trustmark-python

Python reference implementation of Trustmark Certified v0.9, published by Elephant Accountability LLC.

Trustmark Certified v0.9 is a Rotten-Tomatoes-style aggregate trust score for AI agents. It scores agents themselves across two independent axes: Security and Capability.

Trustmark Certified scores agents. EVI scores vendors. The two standards are separate.

Live leaderboard and methodology: https://eaccountability.org/trustmark

Two-Axis Model

Axis	Formula
Security	0.40 x Secure + 0.25 x Private + 0.20 x Auditable + 0.15 x Compliant
Capability	0.35 x Capable + 0.25 x Efficient + 0.25 x Reliable + 0.15 x Usable
Composite	0.50 x Security + 0.50 x Capability

Grade thresholds: A+ (95-100), A (85-94), B (70-84), C (55-69), D (40-54), E (25-39), F (0-24).
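
The weights and thresholds above can be checked by hand with a short sketch. It does not use the trustmark package. The function names, the sample inputs, and the handling of fractional scores that fall between two bands (for example 94.5) are assumptions made for illustration, not the library's confirmed behaviour.

```python
# Illustrative sketch only. Weights and grade bands are copied from the
# tables above; names and boundary handling are assumed, not taken from
# the trustmark package.

SECURITY_WEIGHTS = {"secure": 0.40, "private": 0.25, "auditable": 0.20, "compliant": 0.15}
CAPABILITY_WEIGHTS = {"capable": 0.35, "efficient": 0.25, "reliable": 0.25, "usable": 0.15}

# (lower bound, grade), highest first. Assumption: a score earns the first
# grade whose lower bound it meets, so 94.5 would grade as A, not A+.
GRADE_FLOORS = [(95, "A+"), (85, "A"), (70, "B"), (55, "C"), (40, "D"), (25, "E"), (0, "F")]


def weighted_axis(subscores, weights):
    """Return the weighted sum of 0-100 sub-scores for one axis."""
    missing = set(weights) - set(subscores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    return sum(weights[name] * subscores[name] for name in weights)


def composite(security, capability):
    """Equal-weight average of the two axis scores."""
    return 0.50 * security + 0.50 * capability


def grade(score):
    """Map a 0-100 score to a letter grade using the assumed lower bounds."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    return next(letter for floor, letter in GRADE_FLOORS if score >= floor)


# Hypothetical sample sub-scores for a single agent.
security = weighted_axis(
    {"secure": 82, "private": 71, "auditable": 65, "compliant": 80}, SECURITY_WEIGHTS
)
capability = weighted_axis(
    {"capable": 90, "efficient": 75, "reliable": 88, "usable": 70}, CAPABILITY_WEIGHTS
)
overall = composite(security, capability)

print(f"Security={security:.2f} ({grade(security)})")
print(f"Capability={capability:.2f} ({grade(capability)})")
print(f"Composite={overall:.2f} ({grade(overall)})")
```

With these sample inputs the weighted sums are 75.55 for Security, 82.75 for Capability, and 79.15 for the composite, all of which fall in the B band.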

Public Baseline

The following agents appear on the Trustmark Certified public leaderboard (scores pending formal evaluation):

Agent	Vendor	Category
ChatGPT-4o	OpenAI	General-purpose LLM
Claude Sonnet 4.6	Anthropic	General-purpose LLM
Gemini 2.5 Pro	Google DeepMind	General-purpose LLM
Microsoft Copilot	Microsoft	Enterprise assistant
Perplexity Sonar	Perplexity AI	Search-augmented LLM
SAP Ariba Joule	SAP	Procurement agent
Coupa Navi	Coupa Software	Procurement agent
Workday Illuminate	Workday	HR/Finance agent
Oracle Procurement AI	Oracle	Procurement agent

Standards Crosswalk

Trustmark Certified aggregates existing frameworks rather than reinventing them. See the crosswalk/ directory for detailed mappings.

Framework	Role in Trustmark
OWASP Top 10 for LLM Applications 2025	Security rubric: injection, auth, secrets
OWASP Agentic AI Top 10 (2026 draft)	Security rubric: memory poisoning, excessive agency, supply chain
NIST AI Risk Management Framework 1.0	Both axes: governance, measurement, risk response
ISO/IEC 42001:2023	Security rubric: compliance, data management, traceability
SOC 2 Type II (AICPA TSC)	Security rubric: audit logs, availability, access controls
MITRE ATLAS v4	Both axes: adversarial ML threats, performance transparency

Install

pip install trustmark
# With heuristic HTTP probing support (quoted so that shells such as zsh do not expand the brackets):
pip install "trustmark[httpx]"

Requires Python 3.10 or later.

Usage

Score from pre-computed sub-scores

from trustmark import compute_trustmark, TrustmarkResult

result: TrustmarkResult = compute_trustmark(
    security_subscores={"secure": 82, "private": 71, "auditable": 65, "compliant": 80},
    capability_subscores={"capable": 90, "efficient": 75, "reliable": 88, "usable": 70},
    agent_id="acmebot",
)
print(result)
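# Trustmark v0.9 | acmebot | B (79.2) | Security=75.6 (B)  Capability=82.8 (B) mode=heuristic
# (weighted sums per the Two-Axis Model: Security 75.55, Capability 82.75, Composite 79.15; displayed rounding may differ)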

svg = result.badge_svg()

Score from rubric-item evidence JSON

from trustmark import load_rubric

scored = load_rubric({
    "secure": {"auth_oauth_pkce": True, "injection_tested_published": True},
    "capable": {"task_accuracy_published_benchmark": True},
})
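
The CLI's --evidence flag takes a JSON file. A minimal sketch of producing one with the standard library follows. It assumes the file uses the same shape as the mapping passed to load_rubric, which this README does not confirm. The validate_evidence helper and the KNOWN_SUBSCORES set are hypothetical and not part of the trustmark package.

```python
import json
from pathlib import Path

# Assumption: the evidence file mirrors the mapping passed to load_rubric,
# i.e. {sub-score name: {rubric item id: bool}}. The real schema may differ.
evidence = {
    "secure": {"auth_oauth_pkce": True, "injection_tested_published": True},
    "capable": {"task_accuracy_published_benchmark": True},
}

# The eight sub-score names from the Two-Axis Model.
KNOWN_SUBSCORES = {
    "secure", "private", "auditable", "compliant",
    "capable", "efficient", "reliable", "usable",
}


def validate_evidence(data):
    """Hypothetical pre-flight check; raises ValueError on malformed input."""
    for subscore, items in data.items():
        if subscore not in KNOWN_SUBSCORES:
            raise ValueError(f"unknown sub-score: {subscore!r}")
        if not isinstance(items, dict):
            raise ValueError(f"{subscore!r} must map rubric item ids to booleans")
        for item_id, value in items.items():
            if not isinstance(item_id, str) or not isinstance(value, bool):
                raise ValueError(f"{subscore}.{item_id} must be a string key with a boolean value")


validate_evidence(evidence)

# Write the file that a later `trustmark grade --evidence ...` call would read.
path = Path("acmebot-evidence.json")
path.write_text(json.dumps(evidence, indent=2) + "\n", encoding="utf-8")
```

Checking the structure before writing catches misspelled sub-score names early, regardless of how strictly the package itself validates input.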

CLI

# Heuristic probe against public docs
trustmark grade --agent chatgpt-4o --docs-url https://platform.openai.com/docs/overview

# Score from evidence file
trustmark grade --agent acmebot --evidence acmebot-evidence.json --json

# Browse framework citations
trustmark frameworks list --family owasp-llm
trustmark frameworks show owasp-llm-2025-llm01

License

Code: MIT (see LICENSE)
Specification: CC BY 4.0 -- https://eaccountability.org/trustmark/spec/v0.9
