An open-source version of the representation engineering framework for stopping harmful outputs or hallucinations at the level of activations. 100% free, self-hosted, and open-source.
Updated Apr 8, 2026 - Python
[ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
Agentic Safety Framework
A curated list of finance agent SDKs, MCP tools, wallet infrastructure, safeguards, simulation systems, and security standards for building safer financial agents.
An exploration of issues in international social development policy and its operationalization
Autonomous security testing engine with execution-time safeguards. Research platform for studying how tool-level controls shape LLM agent behavior.
A FastAPI application for clinical safeguards using BERT-like models, providing endpoints for text processing and analysis.
Safe and Fearless lossy compression using safeguards