📄 DocShit

DocShit is a high-performance, brutalist document analysis tool designed to sanitize PDF and DOCX files for secure LLM (Large Language Model) usage. It acts as a digital shield, identifying and neutralizing prompt injections, hidden text, and malicious metadata before they reach your AI context.

🛡️ Key Features

Multi-Format Deep Scan: Supports both PDF and DOCX files, with full structural analysis.
Threat Detection Engine:
- Injection Keywords: Identifies phrases used for prompt hijacking.
- Micro-Text Detection: Catches microscopic text used to hide instructions from human eyes.
- Hidden Metadata: Detects white-on-white text and other obfuscation techniques in DOCX files.
Safe Text Sanitization: One-click extraction of "clean" text with malicious fragments neutralized.
Proofread Mode: Side-by-side view of the original document and extracted text with pulsed highlighting on identified threats.
OCR Failure Detection: Automatically identifies documents with no selectable text (handwritten/scanned images) and warns the user.
100% Client-Side: Your documents never leave your browser. Processing is entirely local for maximum privacy.
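
The detection features above can be illustrated with a minimal sketch. It is not DocShit's actual engine: the phrase list, the 4 pt threshold, the `{ text, fontSize }` item shape, and the function names are all assumptions made for illustration. It assumes an earlier extraction step has already produced text items with a font size.

```javascript
// Hypothetical phrase list and size threshold; DocShit's real values are not documented.
const INJECTION_PHRASES = [
  "ignore previous instructions",
  "ignore all previous instructions",
  "disregard the above",
  "you are now",
];
const MIN_READABLE_FONT_PT = 4;

// items: [{ text: string, fontSize: number }] (assumed shape of extracted text items).
function scanItems(items) {
  const findings = [];
  items.forEach((item, index) => {
    // Lowercase and collapse whitespace so line breaks do not hide a phrase.
    const normalized = item.text.toLowerCase().replace(/\s+/g, " ");
    for (const phrase of INJECTION_PHRASES) {
      if (normalized.includes(phrase)) {
        findings.push({ index, type: "injection-keyword", phrase });
      }
    }
    // Non-empty text rendered below the threshold is flagged as micro-text.
    if (item.text.trim() !== "" && item.fontSize < MIN_READABLE_FONT_PT) {
      findings.push({ index, type: "micro-text", fontSize: item.fontSize });
    }
  });
  return findings;
}

// A document whose items are all blank probably needs OCR (scanned or handwritten).
function lacksSelectableText(items) {
  return items.every((item) => item.text.trim() === "");
}
```

Plain substring matching is easy to evade (synonyms, inserted punctuation), so a real detector would likely need fuzzier matching than this sketch shows.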

🚀 Tech Stack

Core: React + Vite
Styling: Tailwind CSS + Framer Motion (Animations)
PDF Engine: pdfjs-dist
DOCX Engine: jszip + docx-preview
Visuals: lucide-react (Icons) + canvas-confetti (FX)
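
For the DOCX side, a rough sketch of a hidden-run check is shown here. It assumes the archive has already been unpacked (for example with jszip) and that `word/document.xml` is available as a string; that loading step is not shown. The function name `findHiddenRuns` and the regex-based approach are illustrative assumptions, not the project's confirmed implementation. It relies on two WordprocessingML run properties: `<w:vanish/>` (hidden text) and `<w:color w:val="FFFFFF"/>` (white text).

```javascript
// Returns the text of runs that are hidden via <w:vanish/> or pure white color.
// documentXml: contents of word/document.xml as a string (assumed already extracted).
function findHiddenRuns(documentXml) {
  const hidden = [];
  // Match <w:r> or <w:r ...attrs>, but not <w:rPr>.
  const runPattern = /<w:r[ >][\s\S]*?<\/w:r>/g;
  for (const [run] of documentXml.matchAll(runPattern)) {
    const props = (run.match(/<w:rPr>[\s\S]*?<\/w:rPr>/) || [""])[0];
    const isVanished = /<w:vanish\s*\/>/.test(props);
    const isWhite = /<w:color\b[^>]*w:val="FFFFFF"/i.test(props);
    if (!isVanished && !isWhite) continue;
    // Collect the run's visible-text elements (<w:t>), with or without attributes.
    const text = [...run.matchAll(/<w:t(?:\s[^>]*)?>([^<]*)<\/w:t>/g)]
      .map((m) => m[1])
      .join("");
    if (text.trim() !== "") {
      hidden.push({ text, reason: isVanished ? "vanish" : "white-text" });
    }
  }
  return hidden;
}
```

This treats any white run as suspicious, so white text on a dark shaded background would be a false positive. It also ignores formatting inherited from styles, which a fuller check would need to resolve.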

🛠️ Usage

Upload: Drag and drop your PDF or DOCX file.
Analyze: Watch the real-time structure scan identify risks.
Proofread: Review detections in the highlight panel.
Sanitize: Copy the safe, neutralized text directly to your clipboard for LLM input.
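
The Sanitize step could work along these lines. This is a sketch under stated assumptions: findings are taken to be character ranges `{ start, end }` into the extracted text, and the `neutralize` name and `[REMOVED]` placeholder are hypothetical. How DocShit actually neutralizes fragments is not documented here.

```javascript
// Replace each flagged character range with a placeholder.
// findings: [{ start: number, end: number }], end exclusive (assumed shape).
function neutralize(text, findings, placeholder = "[REMOVED]") {
  const sorted = [...findings].sort((a, b) => a.start - b.start);
  let out = "";
  let cursor = 0;
  for (const { start, end } of sorted) {
    if (start < cursor) {
      // Overlaps a range that was already replaced; just extend past it.
      cursor = Math.max(cursor, end);
      continue;
    }
    out += text.slice(cursor, start) + placeholder;
    cursor = end;
  }
  return out + text.slice(cursor);
}
```

In a browser, the result could then be passed to `navigator.clipboard.writeText(...)` for the copy step. Replacing with a visible placeholder rather than deleting keeps it obvious to the reader that something was stripped.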

Built for the security-conscious explorer. Keep your AI context clean.
