Pinned
-
emergent-lying-sales-finetuning (Public)
If you fine-tune a language model on sales conversations (enthusiastic, persuasive, but never factually wrong), does it start lying on its own?
Python 1
-
in-context-trajectory-poisoning (Public)
Bypassing LLM-based agent monitors with natural language: no model access, no GPUs, no gibberish. Adapts PAIR for monitor bypass. Apart Research AI Control Hackathon 2026.
-
ThoughtGuards (Public)
A real-time dashboard that monitors AI chain-of-thought traces for manipulative patterns, deception, and reward hacking.
Python 4
-
Endless-Range/klaviyo-campaign-analysis (Public)
Analyzing campaigns and subject lines in Klaviyo

