Python library for detecting ReDoS (Regular Expression Denial of Service) vulnerabilities using hybrid static analysis and fuzzing

GetPageSpeed/redoctor

ReDoctor

The Python ReDoS Vulnerability Scanner: protect your applications from Regular Expression Denial of Service attacks.

⚠️ License Notice: ReDoctor is licensed under the Business Source License 1.1 (BSL-1.1). Non-commercial use is free. Commercial production use requires a paid license. The code will convert to MIT license on January 9, 2031.


Quick Start • Features • Installation • Usage • Documentation • Contributing


🚨 What is ReDoS?

Regular Expression Denial of Service (ReDoS) is a type of algorithmic complexity attack that exploits the worst-case behavior of regex engines. A vulnerable regex can cause your application to hang for minutes or hours when processing malicious input.

# ⚠️ This innocent-looking regex is VULNERABLE!
import re
pattern = r"^(a+)+$"

# This will hang your application:
re.match(pattern, "a" * 30 + "!")  # Takes exponential time!

ReDoctor detects these vulnerabilities before they reach production.
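
To make the blow-up concrete, the sketch below models a naive backtracking matcher for `^(a+)+$` and counts how many attempts it makes on `"a" * n + "!"` before giving up. It is a simplified illustration, not CPython's actual `re` engine, and `count_backtracking_steps` is a name made up for this example.

```python
def count_backtracking_steps(n: int) -> int:
    """Count attempts a naive backtracker makes for ^(a+)+$ on "a" * n + "!".

    Simplified model: real engines differ in detail, but the growth
    pattern is the same.
    """
    steps = 0

    def match_groups(pos: int) -> bool:
        nonlocal steps
        # The inner a+ is greedy: try the longest run first, then shorter ones.
        for end in range(n, pos, -1):
            steps += 1
            # After a group ends at `end`, `$` cannot match because the
            # string continues, so the only option is another group
            # iteration. At end == n the next character is "!", so that
            # fails too and the matcher backtracks.
            if end < n and match_groups(end):
                return True
        return False

    match_groups(0)
    return steps


if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        print(f"n={n:2d}  steps={count_backtracking_steps(n)}")
```

In this model each additional `a` doubles the work and adds one more attempt, so the count is 2^n - 1.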

⚡ Quick Start

# Install
pip install redoctor

# Check a pattern from command line
redoctor '^(a+)+$'
# Output: VULNERABLE: ^(a+)+$ - Complexity: O(2^n)

# Use in Python
from redoctor import check

result = check(r"^(a+)+$")
if result.is_vulnerable:
    print(f"🚨 Vulnerable! Complexity: {result.complexity}")
    print(f"   Attack string: {result.attack}")

✨ Features

🔬 Hybrid Analysis Engine

Combines static automata-based analysis with intelligent fuzzing for comprehensive detection. Catches vulnerabilities that single-approach tools miss.

⚡ Fast & Zero Dependencies

Pure Python with no external dependencies. Runs in milliseconds for most patterns. Compatible with Python 3.6+.

🎯 Accurate Results

Generates proof-of-concept attack strings with complexity analysis (O(n²), O(2ⁿ), etc.). Low false-positive rate through recall validation.

πŸ›‘οΈ Source Code Scanning

Scan your entire Python codebase for vulnerable regex patterns. Integrates with CI/CD pipelines.

📦 Installation

pip install redoctor

Requirements: Python 3.6+
Dependencies: None (pure Python)

🔧 Usage

Command Line Interface

# Check a single pattern
redoctor '^(a+)+$'

# Verbose output with attack details
redoctor '(a|a)*$' --verbose

# Check with flags
redoctor 'pattern' --ignore-case --multiline

# Read patterns from stdin
echo '^(a+)+$' | redoctor --stdin

# Set timeout
redoctor 'complex-pattern' --timeout 30

Exit codes:

  • 0 - Pattern is safe
  • 1 - Pattern is vulnerable
  • 2 - Error occurred
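
In a CI job these exit codes can drive a pass/fail decision. The sketch below shells out to the `redoctor` command as shown in this section and maps the documented codes to labels. The helper names `describe_exit_code` and `check_pattern_cli` are illustrative and not part of ReDoctor, and the code assumes the `redoctor` executable is on `PATH`.

```python
import subprocess

# Exit codes as documented above.
EXIT_MEANINGS = {0: "safe", 1: "vulnerable", 2: "error"}


def describe_exit_code(code: int) -> str:
    """Translate a redoctor exit code into a short label."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def check_pattern_cli(pattern: str) -> str:
    """Run `redoctor <pattern>` and return the label for its exit code.

    Assumes the `redoctor` executable is installed and on PATH.
    """
    completed = subprocess.run(
        ["redoctor", pattern], capture_output=True, text=True
    )
    return describe_exit_code(completed.returncode)
```

A pipeline step could fail the build whenever `check_pattern_cli(...)` returns anything other than `"safe"`.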

Python API

from redoctor import check, is_vulnerable, Config

# Simple check
result = check(r"^(a+)+$")
print(result.status)        # Status.VULNERABLE
print(result.complexity)    # O(2^n)
print(result.attack)        # 'aaaaaaaaaaaaaaaaaaaaa!'

# Quick vulnerability check
if is_vulnerable(r"(x+x+)+y"):
    print("Don't use this pattern!")

# Access attack pattern details
if result.is_vulnerable:
    attack = result.attack_pattern
    print(f"Prefix: {attack.prefix!r}")
    print(f"Pump: {attack.pump!r}")
    print(f"Suffix: {attack.suffix!r}")

    # Generate attack strings of different lengths
    short_attack = attack.build(10)   # 10 pump repetitions
    long_attack = attack.build(100)   # 100 pump repetitions

# Custom configuration
config = Config(
    timeout=30.0,           # Analysis timeout in seconds
    max_attack_length=4096, # Max attack string length
)
result = check(r"complex-pattern", config=config)

# Quick mode for CI/CD
config = Config.quick()  # 1 second timeout
result = check(r"^(a+)+$", config=config)
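
The prefix, pump, and suffix fields shown in the API example describe how an attack string is put together: a fixed prefix, a repeated pump, and a suffix that typically forces the overall match to fail. The stand-in class below only illustrates that composition. `AttackSketch` is hypothetical, and the real `attack_pattern` object may build its strings differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackSketch:
    """Hypothetical stand-in mirroring the prefix/pump/suffix fields above."""

    prefix: str
    pump: str
    suffix: str

    def build(self, repetitions: int) -> str:
        # Prefix, then the pump repeated `repetitions` times, then the suffix.
        return self.prefix + self.pump * repetitions + self.suffix


if __name__ == "__main__":
    sketch = AttackSketch(prefix="", pump="a", suffix="!")
    print(repr(sketch.build(10)))
```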

Source Code Scanning

Scan your Python codebase for vulnerable regex patterns:

from redoctor.integrations import scan_file, scan_directory

# Scan a single file
vulnerabilities = scan_file("myapp/validators.py")
for vuln in vulnerabilities:
    print(f"{vuln.file}:{vuln.line} - {vuln.pattern}")
    print(f"  Complexity: {vuln.diagnostics.complexity}")

# Scan entire directory
for vuln in scan_directory("src/", recursive=True):
    if vuln.is_vulnerable:
        print(f"🚨 {vuln}")
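
Conceptually, a source scanner has to locate regex literals before it can analyze them. The sketch below uses the standard `ast` module to collect string literals passed as the first argument to common `re` functions. It is an assumption about how such a scan could work, not ReDoctor's implementation. `find_regex_literals` is a hypothetical helper, and `ast.Constant` requires Python 3.8 or newer.

```python
import ast
from typing import List, Tuple

# re functions whose first positional argument is a pattern.
RE_FUNCS = {
    "compile", "match", "search", "fullmatch",
    "findall", "finditer", "sub", "subn", "split",
}


def find_regex_literals(source: str, filename: str = "<string>") -> List[Tuple[int, str]]:
    """Return (line, pattern) pairs for literal patterns passed to re.<func>(...)."""
    found = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call) or not node.args:
            continue
        func = node.func
        # Only handles the plain `re.<func>(...)` spelling, not aliases.
        is_re_call = (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "re"
            and func.attr in RE_FUNCS
        )
        first = node.args[0]
        if is_re_call and isinstance(first, ast.Constant) and isinstance(first.value, str):
            found.append((node.lineno, first.value))
    return sorted(found)


if __name__ == "__main__":
    sample = 'import re\nP = re.compile(r"^(a+)+$")\nre.match("^[a-z]+$", "abc")\n'
    for line, pattern in find_regex_literals(sample):
        print(f"line {line}: {pattern}")
```

Each pattern found this way could then be passed to `check` from the Python API section.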

📊 Complexity Types

ReDoctor classifies vulnerabilities by their time complexity:

| Complexity | Description   | Risk Level  |
|------------|---------------|-------------|
| O(n)       | Linear - Safe | ✅ Safe     |
| O(n²)      | Quadratic     | ⚠️ Moderate |
| O(n³)      | Cubic         | ⚠️ High     |
| O(2ⁿ)      | Exponential   | 🚨 Critical |
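
One way to relate these classes to measurements is to watch how a step count changes when the input length doubles: a polynomial of degree d multiplies the count by about 2^d, while exponential growth makes that ratio itself keep growing. The heuristic below is a rough sketch of that idea, not ReDoctor's classifier. `classify_growth` and its thresholds are invented for illustration, and it assumes the step function returns positive counts.

```python
import math
from typing import Callable


def classify_growth(step_fn: Callable[[int], int], n: int = 4) -> str:
    """Guess a complexity class from step counts at n, 2n and 4n."""
    s1, s2, s4 = step_fn(n), step_fn(2 * n), step_fn(4 * n)
    # log2 of each ratio approximates the polynomial degree over one doubling.
    slope_lo = math.log2(s2 / s1)
    slope_hi = math.log2(s4 / s2)
    # For a polynomial both slopes are about equal; for 2^n the slope doubles.
    if slope_hi > 4 and slope_hi > 1.5 * slope_lo:
        return "O(2^n)"
    # Constant and linear growth are both reported as O(n).
    degree = max(1, round(slope_hi))
    return "O(n)" if degree == 1 else f"O(n^{degree})"


if __name__ == "__main__":
    for name, fn in [
        ("linear", lambda k: k),
        ("quadratic", lambda k: k * k),
        ("exponential", lambda k: 2 ** k),
    ]:
        print(name, classify_growth(fn))
```

Keep `n` small when the step function actually runs a matcher, since an exponential pattern at `4 * n` characters can already be expensive.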

πŸ” How It Works

ReDoctor uses a hybrid approach combining two detection methods:

┌─────────────────────────────────────────────────────────────┐
│                     ReDoctor Engine                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────────┐         ┌─────────────────┐            │
│  │   Automaton     │         │     Fuzz        │            │
│  │   Checker       │         │    Checker      │            │
│  │                 │         │                 │            │
│  │  • NFA analysis │         │  • VM execution │            │
│  │  • O(n) check   │         │  • Step counting│            │
│  │  • Witness gen  │         │  • Mutation     │            │
│  └────────┬────────┘         └────────┬────────┘            │
│           │                           │                     │
│           └───────────┬───────────────┘                     │
│                       │                                     │
│              ┌────────▼────────┐                            │
│              │ Recall Validator│                            │
│              │ (confirmation)  │                            │
│              └────────┬────────┘                            │
│                       │                                     │
│              ┌────────▼────────┐                            │
│              │   Diagnostics   │                            │
│              │  • Complexity   │                            │
│              │  • Attack string│                            │
│              │  • Hotspot      │                            │
│              └─────────────────┘                            │
└─────────────────────────────────────────────────────────────┘
  1. Automaton Checker: Builds an ε-NFA from the regex and analyzes for ambiguity patterns that cause backtracking.
  2. Fuzz Checker: Executes patterns in a step-counting VM with evolved test strings to detect polynomial/exponential growth.
  3. Recall Validator: Confirms detected vulnerabilities with real execution timing.
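
The confirmation step can be pictured as timing the real engine on attack strings of increasing length. The sketch below does this with the standard library's `re` and `time.perf_counter`. It is an assumption about the general idea rather than ReDoctor's validator, and `time_attack` is a hypothetical helper. Sizes are kept small because, for this pattern, each extra character roughly doubles the running time.

```python
import re
import time
from typing import List, Sequence, Tuple


def time_attack(
    pattern: str, pump: str, suffix: str, sizes: Sequence[int]
) -> List[Tuple[int, float]]:
    """Time re.match on pump * n + suffix for each n in sizes."""
    compiled = re.compile(pattern)
    timings = []
    for n in sizes:
        text = pump * n + suffix
        start = time.perf_counter()
        compiled.match(text)
        timings.append((n, time.perf_counter() - start))
    return timings


if __name__ == "__main__":
    for n, seconds in time_attack(r"^(a+)+$", "a", "!", (12, 14, 16, 18)):
        print(f"n={n:2d}  {seconds:.4f}s")
```

Wall-clock timings are noisy, so a single measurement is only indicative. Repeating each run and comparing several sizes gives a steadier picture.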

📚 Documentation

Full documentation is available at redoctor.getpagespeed.com

🧪 Examples of Vulnerable Patterns

from redoctor import check

# Classic nested quantifier - Exponential O(2^n)
check(r"^(a+)+$")           # VULNERABLE

# Overlapping alternatives - Exponential O(2^n)
check(r"(a|a)*$")           # VULNERABLE

# Polynomial O(n²)
check(r".*a.*a.*")          # VULNERABLE

# Email-like pattern - Often vulnerable
check(r"^([a-zA-Z0-9]+)*@") # VULNERABLE

# Safe patterns
check(r"^[a-z]+$")          # SAFE
check(r"^\d{1,10}$")        # SAFE
check(r"^[A-Z][a-z]*$")     # SAFE

🤝 Contributing

Contributions are welcome! See our Contributing Guide for details.

# Clone the repo
git clone https://github.com/GetPageSpeed/redoctor.git
cd redoctor

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -x --tb=short

# Run with coverage
make tests

📄 License

ReDoctor is licensed under the Business Source License 1.1 (BSL-1.1).

  • ✅ Free for non-commercial and non-production use
  • ✅ Free for personal projects, education, and research
  • 💼 Commercial production use requires a paid license
  • 🔓 Converts to MIT License on January 9, 2031

πŸ™ Acknowledgments

  • Inspired by recheck and academic research on ReDoS detection
  • Built with ❀️ by GetPageSpeed

Protect your applications from ReDoS attacks.
⭐ Star on GitHub β€’ πŸ“¦ View on PyPI β€’ πŸ“š Read the Docs
