Add CWE-Bench coding-agent security benchmark + Trent AI by v0id-space · Pull Request #20 · efij/awesome-claude-code-security

v0id-space · 2026-06-10T09:22:51Z

Adds Trent AI's benchmark of five tools (Claude Code Opus 4.7, OpenAI Codex GPT-5.3, Semgrep, CodeQL, Trent) on 28 production CVEs from the CWE-Bench dataset, with 3 runs per tool and variance reported. Methodology, caveats, and per-tool breakdown are in the post. Disclosure: I work at Trent AI, whose tool is one of the five benchmarked (and scores best on the strict metric). I'm flagging that clearly so you can weigh it. Happy to reword or move the entry.

Also added Trent AI under 🤖 Agent Orchestration and Loop Safety.

v0id-space wants to merge 1 commit into efij:main from v0id-space:patch-1
