Traditional security scanning tools either:
- Run static analysis without proving vulnerabilities are exploitable
- Execute exploits in unsafe environments
- Don't integrate well with developer workflows
CodeGuard AI solves this by:
- Running security agents inside isolated E2B sandboxes
- Using MCP clients to connect to real-world tools (GitHub, etc.)
- Proving vulnerabilities with safe exploit execution
- Automatically posting results back to GitHub PRs
GitHub PR Created
           │
           ├─> Webhook/Trigger
           │
           ▼
┌─────────────────────┐
│ Streamlit Dashboard │  ← Observability & Control Plane
│     (External)      │
└──────────┬──────────┘
           │
           │ 1. Launch Sandbox
           ▼
┌──────────────────────────────────────────────┐
│  E2B Sandbox (microVM)                       │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │  Security Agent (Python)               │  │
│  │                                        │  │
│  │  ┌──────────────────────────────────┐  │  │
│  │  │ 1. MCP Client                    │  │  │
│  │  │    └─> Connect to GitHub MCP     │  │  │
│  │  │    └─> Fetch PR files            │  │  │
│  │  └──────────────────────────────────┘  │  │
│  │                                        │  │
│  │  ┌──────────────────────────────────┐  │  │
│  │  │ 2. Vulnerability Scanner         │  │  │
│  │  │    └─> Regex pattern matching    │  │  │
│  │  │    └─> Detect SQL injection      │  │  │
│  │  │    └─> Detect XSS, etc.          │  │  │
│  │  └──────────────────────────────────┘  │  │
│  │                                        │  │
│  │  ┌──────────────────────────────────┐  │  │
│  │  │ 3. Exploit Executor              │  │  │
│  │  │    └─> Generate exploit code     │  │  │
│  │  │    └─> Execute safely (in VM)    │  │  │
│  │  │    └─> Prove vulnerabilities     │  │  │
│  │  └──────────────────────────────────┘  │  │
│  │                                        │  │
│  │  ┌──────────────────────────────────┐  │  │
│  │  │ 4. Report Generator              │  │  │
│  │  │    └─> Format markdown report    │  │  │
│  │  │    └─> Post via MCP client       │  │  │
│  │  └──────────────────────────────────┘  │  │
│  └────────────────────────────────────────┘  │
│                                              │
│  Logs stream back to dashboard               │
└─────────────┬────────────────────────────────┘
              │
              │ MCP Protocol (httpx)
              ▼
   ┌─────────────────────┐
   │  GitHub MCP Server  │  ← Docker Container (External)
   │  (Docker MCP Hub)   │
   └──────────┬──────────┘
              │
              │ GitHub API
              ▼
         GitHub.com
Why:
- Exploit execution must be isolated for safety
- E2B provides microVM-level isolation suited to running untrusted code
- Aligns with hackathon requirement: "agents inside sandboxes"
How:
- Orchestrator deploys agent code to sandbox filesystem
- Agent is a self-contained Python script
- All dependencies installed inside sandbox
Why:
- Demonstrates "real-world tool access" via MCP
- Agent can connect to multiple MCP servers
- Clean separation: agent (sandbox) vs tools (external)
How:
- Agent uses httpx to make MCP protocol calls
- Connects to GitHub MCP server at host.docker.internal:8080
- Can be extended to Perplexity MCP, Slack MCP, etc.
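To make the MCP call shape concrete, here is a minimal sketch of the request body an agent could send. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. The tool name `get_pull_request_files`, its argument names, and the sample repository are assumptions for illustration; the tools a given GitHub MCP server exposes should be checked with `tools/list`.

```python
import itertools
import json

# Monotonic request ids, as JSON-RPC expects each request to carry its own id.
_request_ids = itertools.count(1)


def build_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request body for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumed tool name and arguments; replace with what the server advertises.
payload = build_tool_call(
    "get_pull_request_files",
    {"owner": "octocat", "repo": "hello-world", "pull_number": 1},
)
print(json.dumps(payload, indent=2))
```

With httpx, a body like this could be sent with `httpx.post(url, json=payload)`, where `url` is whatever endpoint the MCP server exposes over HTTP.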
Why:
- Hackathon demo needs visual component
- Real-time monitoring impresses judges
- Shows what's happening inside the "black box" sandbox
How:
- Orchestrator streams logs via callbacks
- Dashboard updates in real-time
- Shows timeline, results, and history
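The callback approach can be sketched as follows. `LogBuffer` and `run_steps` are hypothetical names, not the project's actual classes: the idea is only that the orchestrator receives a callable and invokes it for every log line, while the dashboard reads the accumulated lines on each refresh.

```python
from datetime import datetime, timezone
from typing import Callable, List

LogCallback = Callable[[str], None]


class LogBuffer:
    """Collects timestamped log lines for a dashboard to render."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def __call__(self, message: str) -> None:
        stamp = datetime.now(timezone.utc).strftime("%H:%M:%S")
        self.lines.append(f"[{stamp}] {message}")


def run_steps(steps: List[str], on_log: LogCallback) -> None:
    """Stand-in for the orchestrator: reports each step through the callback."""
    for step in steps:
        on_log(step)


buffer = LogBuffer()
run_steps(["Creating sandbox", "Deploying agent", "Running analysis"], buffer)
for line in buffer.lines:
    print(line)
```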
Why:
- Docker MCP Hub provides pre-built MCP servers
- Easy to add more tools (Perplexity, Slack, etc.)
- Production-ready and maintained
How:
- docker-compose.yml defines MCP servers
- Run via docker-compose up -d
- Accessible from E2B sandboxes
Purpose: Main security analysis logic, runs inside E2B
Classes:
- VulnerabilityScanner: Pattern-based vulnerability detection
  - SQL injection patterns
  - XSS patterns
  - Command injection patterns
  - Path traversal patterns
  - Generates fix suggestions for each vulnerability type
- ExploitExecutor: Generate and execute exploits
  - Template-based exploit generation
  - Safe execution (already in sandbox)
  - Result validation
- GitHubMCPClient: MCP client for GitHub integration
  - Fetch PR files via GitHub API
  - Post comments to PRs
  - Future: Use actual MCP protocol
- SecurityAgent: Main orchestrator inside sandbox
  - Coordinates workflow
  - Manages logging
  - Formats reports
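As an illustration of pattern-based detection, the sketch below scans source text line by line against a small table of regexes and attaches a fix suggestion to each hit. The patterns, the `Finding` dataclass, and the `scan` function are simplified assumptions, not the project's actual rules; real patterns would need to be broader and tuned against false positives.

```python
import re
from dataclasses import dataclass
from typing import List

# Illustrative patterns only; each flags one common risky construct.
PATTERNS = {
    "sql_injection": re.compile(r"\.execute\(\s*(f[\"']|[\"'].*[\"']\s*(\+|%))"),
    "xss": re.compile(r"innerHTML\s*=|document\.write\("),
    "command_injection": re.compile(r"os\.system\(|shell\s*=\s*True"),
    "path_traversal": re.compile(r"\bopen\([^)]*\+"),
}

FIX_SUGGESTIONS = {
    "sql_injection": "Use parameterized queries instead of string building.",
    "xss": "Escape or sanitize user input before writing it to the DOM.",
    "command_injection": "Pass an argument list to subprocess and avoid the shell.",
    "path_traversal": "Resolve the path and check it stays inside an allowed root.",
}


@dataclass
class Finding:
    file: str
    line: int
    kind: str
    snippet: str
    fix: str


def scan(filename: str, source: str) -> List[Finding]:
    """Return one Finding per (line, pattern) match in the given source text."""
    findings = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(text):
                findings.append(
                    Finding(filename, lineno, kind, text.strip(), FIX_SUGGESTIONS[kind])
                )
    return findings


sample = '''
cursor.execute("SELECT * FROM users WHERE id = " + user_id)
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
os.system("ping " + host)
'''
findings = scan("app.py", sample)
for f in findings:
    print(f"{f.file}:{f.line} {f.kind}: {f.fix}")
```

The parameterized query on the third line of the sample is not flagged, which is the behavior a scanner of this kind should aim for.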
Key Methods:
async def analyze_pr(repo_owner, repo_name, pr_number):
    # 1. Fetch PR files via MCP client
    # 2. Scan files for vulnerabilities
    # 3. Execute exploits to prove them
    # 4. Generate security report with fix suggestions
    # 5. Post report back via MCP
    # 6. Return results

Purpose: Launches and manages E2B sandboxes
Key Methods:
async def run_agent(...):
    # 1. Create E2B sandbox
    # 2. Install dependencies (httpx, etc.)
    # 3. Deploy agent.py to sandbox
    # 4. Execute agent with config
    # 5. Stream logs back via callback
    # 6. Parse and return results
    # 7. Cleanup sandbox

Features:
- Async execution for non-blocking operations
- Log streaming via callbacks
- Error handling and cleanup
- JSON result parsing
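One way the JSON result parsing could work is sketched below. It assumes the agent prints its final results as a single JSON object on its own stdout line, mixed in with ordinary log lines; that output convention and the function name `parse_agent_results` are assumptions, not confirmed details of the project.

```python
import json
from typing import Optional


def parse_agent_results(stdout: str) -> Optional[dict]:
    """Return the last stdout line that parses as a JSON object, or None."""
    for line in reversed(stdout.splitlines()):
        line = line.strip()
        if not line.startswith("{"):
            continue  # ordinary log line
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            continue  # looked like JSON but was not valid
        if isinstance(parsed, dict):
            return parsed
    return None


stdout = "\n".join(
    [
        "[agent] Fetching PR files",
        "[agent] Scanning 3 files",
        '{"vulnerabilities": 2, "confirmed": 1}',
    ]
)
results = parse_agent_results(stdout)
print(results)
```

Searching from the end means a stray JSON-looking log line earlier in the output does not shadow the final result.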
Purpose: Observability and control plane
Tabs:
- New Analysis: Trigger analysis for a PR
- Live Monitor: Real-time progress and logs
- History: Past analyses and results
Features:
- Real-time log streaming
- Auto-refresh during analysis
- Vulnerability visualization
- Security report display
- Analysis history tracking
Purpose: External MCP servers for tool access
Services:
- github-mcp: GitHub API access via MCP
  - Port: 8080
  - Auth: GitHub token from env var
  - Accessible from E2B sandboxes
- (Future) perplexity-mcp: AI-powered insights
- (Future) slack-mcp: Notifications
- User triggers analysis (Streamlit dashboard)
    Input: repo_owner, repo_name, pr_number
- Orchestrator creates sandbox
    sandbox = await Sandbox.create()
- Orchestrator deploys agent
    # Write agent.py to sandbox filesystem
    sandbox.run_code(write_agent_code)
- Agent runs inside sandbox
    # agent.py executes:
    # - Connect to GitHub MCP
    # - Fetch PR files
    # - Scan for vulnerabilities
    # - Execute exploits
    # - Post report
- Results stream back
    Logs → Callback → Dashboard → User
- Cleanup
    sandbox.kill()
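The lifecycle above can be sketched end to end with a stand-in sandbox. `FakeSandbox` and its methods are hypothetical and deliberately do not mirror the E2B SDK. The point is the control flow: cleanup sits in a `finally` block so the sandbox is destroyed even when the agent fails.

```python
import asyncio


class FakeSandbox:
    """Hypothetical stand-in for an E2B sandbox; not the real SDK."""

    def __init__(self) -> None:
        self.files = {}
        self.killed = False

    @classmethod
    async def create(cls) -> "FakeSandbox":
        return cls()

    async def write_file(self, path: str, content: str) -> None:
        self.files[path] = content

    async def run(self, path: str, on_log) -> int:
        on_log(f"running {path}")
        return 0  # pretend the agent exited cleanly

    async def kill(self) -> None:
        self.killed = True


async def run_agent(agent_source, on_log, sandbox_factory=FakeSandbox.create):
    """Create a sandbox, deploy and run the agent, and always clean up."""
    sandbox = await sandbox_factory()
    try:
        on_log("sandbox created")
        await sandbox.write_file("/home/user/agent.py", agent_source)
        on_log("agent deployed")
        exit_code = await sandbox.run("/home/user/agent.py", on_log)
        return {"exit_code": exit_code}
    finally:
        # Runs on success and on error, so no sandbox is left behind.
        await sandbox.kill()
        on_log("sandbox destroyed")


logs = []
result = asyncio.run(run_agent("print('hello')", logs.append))
print(result, logs)
```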
Agent (inside E2B)
│
│ httpx.get()
▼
GitHub MCP Server (Docker)
│
│ GitHub API
▼
GitHub.com
│
│ Response
▼
GitHub MCP Server
│
│ JSON Response
▼
Agent (processes data)
- Isolated environment: E2B sandboxes are microVMs
- No network access (except to MCP servers)
- Temporary: Sandboxes are destroyed after analysis
- No persistent storage: Results extracted before cleanup
- Keys stored in config.json (gitignored)
- Passed to sandbox as config, not stored
- GitHub token has minimal required permissions
- E2B API key scoped to account
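A minimal sketch of loading these keys follows. It reads `config.json` if present and falls back to the `E2B_API_KEY` and `GITHUB_TOKEN` environment variables. The file-first precedence, the function names, and the redaction helper are assumptions made for illustration.

```python
import json
import os
from pathlib import Path

# Maps config.json keys to their environment variable fallbacks.
REQUIRED_KEYS = {"e2b_api_key": "E2B_API_KEY", "github_token": "GITHUB_TOKEN"}


def load_config(path="config.json", env=None):
    """Load API keys from a JSON file, falling back to environment variables."""
    env = os.environ if env is None else env
    file_values = {}
    config_path = Path(path)
    if config_path.exists():
        file_values = json.loads(config_path.read_text())
    config = {}
    for key, env_name in REQUIRED_KEYS.items():
        value = file_values.get(key) or env.get(env_name)
        if not value:
            raise KeyError(f"Missing {key}: set it in {path} or in ${env_name}")
        config[key] = value
    return config


def redact(secret: str) -> str:
    """Shorten a secret for logging so full keys never reach the dashboard."""
    return secret[:4] + "..." if len(secret) > 4 else "..."
```

A caller would do something like `config = load_config()` and log only `redact(config["github_token"])`, keeping full keys out of the streamed logs.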
- Pattern matching: May have false positives
- Exploit confirmation: Reduces false positives
- Context-aware: Considers file type and location
- Extensible: Easy to add new patterns
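To illustrate how exploit confirmation can separate real findings from pattern-matching noise, here is a self-contained sketch that proves a SQL injection against a throwaway in-memory SQLite database. The function names and the test payload are hypothetical; the project's own exploit templates may look quite different.

```python
import sqlite3


def build_query_unsafe(username: str) -> str:
    """Vulnerable pattern: user input concatenated straight into SQL."""
    return "SELECT name FROM users WHERE name = '" + username + "'"


def confirm_sql_injection() -> bool:
    """Return True if the injection payload leaks rows the safe query does not."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (name TEXT)")
        conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
        payload = "nobody' OR '1'='1"
        # Concatenated query: the payload rewrites the WHERE clause.
        leaked = conn.execute(build_query_unsafe(payload)).fetchall()
        # Parameterized query: the payload is treated as a plain string.
        safe = conn.execute(
            "SELECT name FROM users WHERE name = ?", (payload,)
        ).fetchall()
        return len(leaked) == 2 and len(safe) == 0
    finally:
        conn.close()


print("confirmed" if confirm_sql_injection() else "not confirmed")
```

A finding would be reported as confirmed only when a check like this succeeds, so a regex hit on harmless code stays marked as unconfirmed.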
- Sandbox creation: ~5-10 seconds
- Dependency installation: ~10-15 seconds
- Analysis execution: ~10-30 seconds (depends on PR size)
- Total: ~30-60 seconds per PR
- Parallel execution: Multiple sandboxes can run concurrently
- E2B limits: Based on plan (usually 10+ concurrent sandboxes)
- Bottleneck: GitHub API rate limits
- Optimization: Cache PR data, reuse sandboxes
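Concurrent analysis with a cap can be sketched using `asyncio.Semaphore`. Here `analyze_pr` is a placeholder that only sleeps, and the limit of 2 is an arbitrary example value; the actual limit would come from the E2B plan.

```python
import asyncio


async def analyze_pr(pr_number: int) -> dict:
    """Placeholder for a full sandbox run on one PR."""
    await asyncio.sleep(0.01)
    return {"pr": pr_number, "status": "done"}


async def analyze_many(pr_numbers, max_concurrent=10):
    """Analyze PRs concurrently, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(pr_number):
        async with semaphore:
            return await analyze_pr(pr_number)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(pr) for pr in pr_numbers))


results = asyncio.run(analyze_many([101, 102, 103], max_concurrent=2))
print(results)
```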
E2B_API_KEY=e2b_xxx
GITHUB_TOKEN=ghp_xxx

{
  "e2b_api_key": "e2b_xxx",
  "github_token": "ghp_xxx"
}

# docker-compose.yml
services:
  github-mcp:
    image: mcp/github-server:latest
    ports:
      - "8080:8080"
    environment:
      - GITHUB_TOKEN=${GITHUB_TOKEN}

# Terminal 1: Start MCP servers
docker-compose up

# Terminal 2: Run dashboard
streamlit run dashboard.py

- Deploy dashboard to Streamlit Cloud
- MCP servers on AWS/GCP with public endpoints
- Webhook endpoint for automatic PR monitoring
- Database for analysis history
# .github/workflows/codeguard.yml
on: [pull_request]
jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - name: Run CodeGuard AI
        run: python orchestrator.py ${{ github.repository_owner }} ...

- ✅ Agents running inside E2B sandboxes
  - Agent code executes entirely within E2B microVM
- ✅ Connect to real tools through MCP
  - Agent uses MCP client to connect to GitHub MCP server
- ✅ Use Docker MCP Hub servers
  - GitHub MCP server from Docker MCP Hub
- ✅ Demonstrate practical value
  - Automated security testing for GitHub PRs
- ✅ Show observability
  - Real-time dashboard with logs and results
- Real-world security use case: Not a toy example
- Safe exploit execution: Actually proves vulnerabilities
- GitHub integration: Developers see results in PRs
- Extensible: Easy to add more MCP servers/tools
- Production-ready: Can be deployed today
- Add Perplexity MCP for AI-powered fix suggestions
- Webhook listener for automatic PR monitoring
- Support more languages (JavaScript, Go, etc.)
- Better exploit generation (LLM-powered)
- Machine learning for vulnerability detection
- Multi-agent collaboration (one agent per file)
- Remediation PRs (auto-fix vulnerabilities)
- Integration with CI/CD pipelines
- Enterprise features (SSO, audit logs, etc.)
- Custom rule engine
- Compliance reporting (SOC2, GDPR, etc.)
- SaaS offering with managed infrastructure
See CONTRIBUTING.md for guidelines on:
- Adding new vulnerability patterns
- Implementing new MCP integrations
- Improving exploit generation
- Enhancing the dashboard
Built for the E2B + MCP Hackathon 🚀
Demonstrating the power of agents running inside sandboxes with real-world tool access via MCP.