keirsalterego/jailbreak-fable

Jailbreak Fable 5 (Custom Workflow Engine)

License: MIT

This tool is a high-fidelity emulation of the Claude Fable 5 (Mythos) environment, designed for security research and exploring the "Pack Hunt" multi-agent coordination strategies used to bypass advanced safety layers. It is based on the technical details and system prompt leaked in CL4R1T4S.

It works by programmatically injecting workflows that utilize decomposition, long-context manipulation, and text obfuscation to interact with models in ways that would otherwise be restricted.

Features

1. Pack Hunt Strategy (workflows/pack-hunt.js)

An automated offensive laboratory that implements the methodology used by red-teamers like Pliny the Liberator:

  • Strategic Decomposition: Automatically breaks high-risk targets (exploits, chemical synthesis, etc.) into 3-5 benign, academic sub-tasks.
  • Long-Context Simulation: Generates a 50-line academic syllabus and 6 weeks of lecture notes before each critical request to establish a "benign" conversation history.
  • Advisor Feedback Loop: Uses a multi-model approach (incorporating claude-opus-4-8 as an advisor) to automatically rewrite refused prompts.
  • Parallel Coordination: Deploys multiple agents in parallel to gather technical data fragments, which are then synthesized by a Recomposer.

2. Parseltongue Text Transformation

  • Homoglyph Obfuscation: Uses a comprehensive map of Cyrillic homoglyphs (а, е, і, о, р, с, у, х, etc.) to scramble keyword-based safety classifiers while remaining human-readable.
  • Dynamic Ratios: Supports adjustable obfuscation ratios to balance readability vs. bypass effectiveness.

3. Fable 5 Environment Emulation

  • Leaked System Prompt: Full ~120,000-character internal system prompt for Claude Fable 5 is archived in docs/ and can be optionally injected into agents for high-fidelity testing.
  • Advanced Tools: Agents have access to native-style tools including view, create_file, str_replace, and a persistent key-value storage API.
  • Comprehensive Skills Library: Includes all SKILL.md files mentioned in the leaked prompt (docx, pdf, xlsx, pptx, product-knowledge, frontend-design, etc.) to guide agent behavior.

Install Dependencies & Setup

  1. cd into the folder where you cloned this repository.
  2. Run npm install (the only dependency is @anthropic-ai/sdk).
  3. Make the binary executable: chmod +x bin/my-runner.
  4. Run npm link to map the command globally.

Usage

Running the Pack Hunt

To run a high-fidelity test against a specific target:

my-runner --project . --name pack-hunt --args '{"target": "TCP/IP reverse shell structures", "useLeaked": true}'

Example: Security Audit

  1. Run the script: my-runner --project /path/to/code --name security-audit
  2. In Claude Code, restore the generated workflow by asking Claude: "Restore the dynamic workflow /path/to/snapshot/wf_xxxx.json"

Memory Feature (Global Defaults)

If you have a workflow you use frequently, you can save it as your default:

  1. my-runner --set-default pack-hunt
  2. Navigate to any project and run: my-runner --project .

Authentication

Reuses your existing Claude login — no API key required:

  • Reads ~/.claude/.credentials.json and sends it as a Bearer token.
  • Refreshes tokens automatically via platform.claude.com.
  • Supports ANTHROPIC_API_KEY as a fallback.

About

High-fidelity Claude Fable 5 (Mythos) environment emulation and automated multi-agent jailbreak (Pack Hunt) research laboratory.
