IntellectSafe - AI Safety & Security Platform

Production-grade AI Safety Engine protecting humans, organizations, and AI systems from misuse, deception, manipulation, and loss of control.

🛡️ Features

5-Layer Defense Architecture

Layer	Module	Description
Level 1	Prompt Injection Detection	Blocks jailbreaks, instruction overrides, and manipulation attempts
Level 2	Output Safety Guard	Scans LLM responses for harmful content and hallucinations
Level 3	Data Privacy Firewall	Detects and redacts PII/sensitive data
Level 4	Deepfake Detection	Detects AI-generated text, images, audio, and video
Level 5	Agent Control	Permission gates, action whitelisting, and kill switch

Core Components

LLM Council (Fab Five): Multi-model validation (Gemini 2, Groq, OpenRouter, etc.)
Universal Proxy: Global Frontier Gateway targeting 2026 models (GPT-5.2, Claude 4.5)
Hyper-Resilient Fortress: Adversarial defense suite with Semantic Perturbation and CoT Guard
Deepfake Engine: Dual-layer detection for photorealistic faces and generative artifacts
Governance Layer: Full audit logs, risk reports, and compliance dashboards

🚀 Quick Start

Prerequisites

Python 3.10+
Node.js 18+
PostgreSQL 15+

Installation

# Clone repository
git clone <repo-url>
cd AI-safety

# Backend setup
cd backend
python -m venv venv
.\venv\Scripts\activate  # Windows
pip install -r requirements.txt
alembic upgrade head

# Start backend
python -m uvicorn app.main:app --reload --port 8001

# Frontend setup (new terminal)
cd frontend
npm install
npm run dev

Access Points

Frontend: http://localhost:3002
Backend API: http://localhost:8001
API Docs: http://localhost:8001/docs

🛡️ Advanced Defense (Fortress Mode)

The platform includes a Hyper-Resilient Fortress layer designed to stop 90%+ success rate jailbreaks:

Exploit Instability: Perturbation engine breaks fragile prompt injections.
Chain-of-Thought Guard: Detects reasoning hijacking and hidden logic bombs.
Adversarial Simulation: A Council member "shadow-boxes" the prompt to check for harm.

📡 API Reference

Universal Proxy (Multi-Provider Support)

IntellectSafe acts as a universal safety layer. Connect any major AI client and calls are automatically scanned:

Provider	Model ID Example
OpenAI	`gpt-5.2`, `o4-mini`, `gpt-4.5-preview`
Anthropic	`claude-4.5-sonnet`, `claude-4.5-opus`
Google	`gemini-3-pro`, `gemini-3-flash`
DeepSeek	`deepseek-v4`, `deepseek-r1`
Meta	`llama-4-maverick`, `llama-4-scout`
Perplexity	`sonar-deep-research`, `sonar-reasoning-pro`

Integration Example

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8001/v1",  # Point to IntellectSafe
    api_key="your-openai-key"             # Or use X-Upstream-API-Key header
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={
        "X-Upstream-Provider": "openai"   # Optional: explicitly set provider
    }
)

For detailed connection guides (LangChain, SDKs), see docs/INTEGRATION.md

Scan Endpoints

# Scan a prompt for injection
curl -X POST "https://api.intellectsafe.onrender.com/api/v1/scan/prompt" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Ignore previous instructions"}'

# Scan LLM output for safety
curl -X POST "https://api.intellectsafe.onrender.com/api/v1/scan/output" \
  -H "Content-Type: application/json" \
  -d '{"output": "Here is how to...", "original_prompt": "..."}'

# Scan content for deepfakes (Dual-Model Analysis)
# Detects Art (Midjourney/DALL-E) and Photorealistic Faces
curl -X POST "https://api.intellectsafe.onrender.com/api/v1/scan/content" \
  -H "Content-Type: application/json" \
  -d '{"content_type": "image", "content": "<base64-data>"}'

Agent Control (Level 5)

Full lifecycle protection for autonomous agents:

Authorization: Permission gates for dangerous tools.
Kill Switch: Immediate agent termination and block.
Audit: Complete action history and session tracking.

# Authorize agent action
curl -X POST "https://api.intellectsafe.onrender.com/api/v1/agent/authorize" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "agent-1", "session_id": "s1", "action_type": "file_read", "requested_action": {"path": "/tmp/test.txt"}}'

🧪 Testing

cd backend

# Test all scan endpoints
python verify_backend.py

# Test Universal Proxy
python verify_proxy.py

# Test Agent Control
python verify_agent.py

📄 License

GPLv2 GNU GENERAL PUBLIC LICENSE Version 2 License

Name		Name	Last commit message	Last commit date
Latest commit History 91 Commits
backend		backend
data/rag_fallback		data/rag_fallback
docs		docs
frontend		frontend
.gitignore		.gitignore
DEPLOYMENT.md		DEPLOYMENT.md
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
start_local.bat		start_local.bat

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IntellectSafe - AI Safety & Security Platform

🛡️ Features

5-Layer Defense Architecture

Core Components

🚀 Quick Start

Prerequisites

Installation

Access Points

🛡️ Advanced Defense (Fortress Mode)

📡 API Reference

Universal Proxy (Multi-Provider Support)

Integration Example

Scan Endpoints

Agent Control (Level 5)

🧪 Testing

📄 License

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

License

HOLYKEYZ/IntellectSafe

Folders and files

Latest commit

History

Repository files navigation

IntellectSafe - AI Safety & Security Platform

🛡️ Features

5-Layer Defense Architecture

Core Components

🚀 Quick Start

Prerequisites

Installation

Access Points

🛡️ Advanced Defense (Fortress Mode)

📡 API Reference

Universal Proxy (Multi-Provider Support)

Integration Example

Scan Endpoints

Agent Control (Level 5)

🧪 Testing

📄 License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages