LogMind is a CLI tool that learns your normal log patterns and detects anomalies using semantic embeddings. When an anomaly is detected, it finds the most similar past incidents and suggests resolutions.
grep "ERROR": "Here are 500 lines containing ERROR" (keyword match)
LogMind: "This pattern is 91% similar to last month's (semantic match)
DB outage. Resolution: increase pool_size"
Traditional log monitoring relies on keyword matching or static rules. LogMind takes a fundamentally different approach:
| Feature | grep / ELK Rules | LogMind |
|---|---|---|
| Detection method | Keyword / regex | Semantic embedding similarity |
| New patterns | Missed (no rule) | Detected automatically |
| Past incident matching | Manual search | Automatic similarity search |
| Resolution suggestions | None | From past incident labels |
| Compression | None | 3-bit TurboQuant (90%+ savings) |
Log Files ──> Parser ──> Embedder ──> Detector ──> Alerter
│ │ │ │
Auto-detect Sentence Anomaly Slack/Discord
6+ formats Transformer Scoring Webhook
│ │ │
LogEntry TurboQuant Past Incident
LogWindow Compression Matching
| Module | Description |
|---|---|
parser.py |
Auto-detects and parses 7 log formats (JSON, Python, Spring Boot, Syslog, Nginx, Docker, Simple) |
embedder.py |
Converts log windows to semantic vectors using all-MiniLM-L6-v2 (384-dim), compresses with TurboQuant 3-bit |
detector.py |
Scores anomalies by comparing against learned normal patterns, classifies severity and type |
incidents.py |
Labels past incidents, auto-detects error spikes, semantic search across incidents |
alerter.py |
Sends alerts to Slack (Block Kit), Discord (Embeds), or generic webhooks |
collector.py |
Collects logs from files (glob), stdin, Docker containers, shell commands |
storage.py |
Persists/loads the learned index (pickle-based) |
display.py |
Rich terminal output with severity-colored alerts and formatted reports |
cli.py |
Click-based CLI with 6 commands |
- Python 3.10+
- ~500MB disk space (for sentence-transformers model, downloaded once)
git clone https://github.com/wjddusrb03/logmind.git
cd logmind
pip install -e .# Slack/Discord/Webhook alerts
pip install -e ".[alerts]"
# Docker log collection
pip install -e ".[docker]"
# All optional dependencies
pip install -e ".[alerts,docker]"# 1. Learn normal patterns from your logs
logmind learn /var/log/app.log
# 2. Scan a log file for anomalies
logmind scan /var/log/app-today.log
# 3. Watch logs in real-time
tail -f /var/log/app.log | logmind watch -Learns what "normal" looks like by building a vector index from your log files.
# Basic usage
logmind learn app.log
# Multiple files with glob
logmind learn /var/log/app/*.log
# Specify log format (auto-detected by default)
logmind learn --format python app.log
logmind learn --format json structured.log
logmind learn --format syslog /var/log/syslog
logmind learn --format nginx_access /var/log/nginx/access.log
logmind learn --format docker container.log
logmind learn --format spring boot.log
# Custom window size (default: 60 seconds)
logmind learn --window 120 app.log
# Custom embedding model
logmind learn --model all-MiniLM-L6-v2 app.log
# Custom quantization bits (default: 3)
logmind learn --bits 4 app.log
# Auto-detect and label error spike incidents
logmind learn --auto-detect-incidents app.logOutput:
LogMind Learn Complete
Total lines: 15,234
Windows: 254
Error count: 43
Warn count: 128
Sources: api, database, auth, scheduler
Model: all-MiniLM-L6-v2
Learn time: 12.3s
Index saved to: .logmind/index.pkl
Scans a log file and reports all detected anomalies.
# Basic scan
logmind scan new-errors.log
# High sensitivity (catches more anomalies)
logmind scan --sensitivity high app.log
# Low sensitivity (only critical anomalies)
logmind scan --sensitivity low app.log
# JSON output (for piping to other tools)
logmind scan --json app.log
# Specify window size
logmind scan --window 30 app.logOutput:
CRITICAL Anomaly Score: 0.87 Type: similar_incident
Summary: Error spike detected: 15 errors in window. Similar to past
incident "DB pool exhausted" (91% match).
Resolution: increase pool_size to 200
WARNING Anomaly Score: 0.52 Type: new_pattern
Summary: New pattern detected not seen during learning phase.
3 errors from previously unseen source "payment-gateway".
--- Scan Report ---
Total windows scanned: 45
Anomalies detected: 2
Critical: 1 Warning: 1 Info: 0
Sensitivity levels:
| Level | Anomaly Threshold | Error Spike Threshold | Use Case |
|---|---|---|---|
low |
0.55 | 5.0x baseline | Production (reduce noise) |
medium |
0.40 | 3.0x baseline | Default |
high |
0.30 | 2.0x baseline | Investigation mode |
Continuously monitors logs and alerts on anomalies in real-time.
# Watch a file (tail -f style)
logmind watch /var/log/app.log
# Watch from stdin (pipe)
tail -f /var/log/app.log | logmind watch -
kubectl logs -f my-pod | logmind watch -
# Watch with Slack alerts
logmind watch app.log --slack https://hooks.slack.com/services/T.../B.../xxx
# Watch with Discord alerts
logmind watch app.log --discord https://discord.com/api/webhooks/123/abc
# Watch Docker container logs
logmind watch --docker my-container
# Watch output of a command
logmind watch --command "journalctl -f -u myapp"
# Combine options
logmind watch app.log \
--sensitivity high \
--slack https://hooks.slack.com/services/xxx \
--jsonLabels a time range in your logs as an incident. This teaches LogMind to recognize similar patterns in the future and suggest resolutions.
# Label an incident with time range
logmind label "2024-03-15 10:00" "2024-03-15 11:00" app.log \
--label "DB pool exhausted" \
--resolution "Increased pool_size from 50 to 200"
# Label multiple log sources
logmind label "2024-03-15 10:00" "2024-03-15 11:00" \
app.log db.log worker.log \
--label "Cascading failure from DB" \
--resolution "Added circuit breaker, increased timeouts"How it works:
- Parses the log files
- Extracts entries within the time range
- Creates a labeled window with the incident description and resolution
- Embeds the window and adds it to the incident index
- Future scans will match similar patterns and suggest the resolution
Searches past labeled incidents using semantic similarity (not just keywords).
# Search for similar incidents
logmind search "database connection timeout"
logmind search "memory leak in worker process"
logmind search "high latency on API endpoints"
# Return more results
logmind search "disk full" -k 10Output:
Search Results for: "database connection timeout"
1. DB pool exhausted (92% match)
Resolution: Increased pool_size from 50 to 200
Occurred: 2024-03-15 10:00
2. Redis connection storm (67% match)
Resolution: Added connection pooling with max_connections=100
Occurred: 2024-02-28 14:30
Shows detailed statistics about the learned index.
logmind statsOutput:
LogMind Index Statistics
Model: all-MiniLM-L6-v2
Embedding dim: 384
Total lines: 15,234
Total windows: 254
Incident count: 3
Error count: 43
Warn count: 128
Sources: api, database, auth, scheduler, worker
Avg errors/window: 0.17
Learn time: 12.3s
LogMind auto-detects log formats. Supported formats:
{"timestamp":"2024-03-15T10:30:45Z","level":"ERROR","message":"DB connection failed","source":"api"}2024-03-15 10:30:45,123 - myapp.database - ERROR - Connection pool exhausted
2024-03-15 10:30:45.123 ERROR 12345 --- [http-nio-8080-exec-1] c.m.app.DbService : Query timeout
Mar 15 10:30:45 webserver sshd[12345]: Failed password for root from 192.168.1.1
192.168.1.1 - - [15/Mar/2024:10:30:45 +0000] "GET /api/health HTTP/1.1" 500 42
2024-03-15T10:30:45.123456Z stderr ERROR: database system is not yet accepting connections
2024-03-15 10:30:45 ERROR Something went wrong
- Parse: Auto-detect format, extract timestamp/level/source/message
- Window: Group entries into time-based windows (default 60s)
- Embed: Convert each window to a 384-dim vector using
all-MiniLM-L6-v2 - Compress: Quantize vectors to 3-bit using TurboQuant (90%+ memory savings)
- Save: Persist the index to
.logmind/index.pkl
- Parse & Window: Same as learning phase for new logs
- Embed: Convert each window to a vector
- Score: Compare against normal patterns using cosine similarity
- Classify: Determine anomaly type and severity
- Match: Find similar past incidents from labeled data
- Alert: Display results or send to Slack/Discord/webhook
| Type | Description | Trigger |
|---|---|---|
error_spike |
Sudden increase in error rate | Error count exceeds baseline by threshold |
new_pattern |
Pattern not seen during learning | Low similarity to all normal patterns |
similar_incident |
Matches a past labeled incident | High similarity to incident vectors |
| Severity | Criteria |
|---|---|
CRITICAL |
Score >= 0.7 OR error count >= 10 |
WARNING |
Score >= 0.5 OR error count >= 3 |
INFO |
Below WARNING thresholds |
# Set up Slack webhook:
# 1. Go to https://api.slack.com/apps → Create New App
# 2. Enable Incoming Webhooks → Add New Webhook to Workspace
# 3. Copy the webhook URL
logmind watch app.log --slack https://hooks.slack.com/services/T.../B.../xxxSlack messages use Block Kit format with:
- Color-coded severity header
- Anomaly score, type, error count fields
- Summary with context
- Similar past incidents with resolution suggestions
# Set up Discord webhook:
# 1. Server Settings → Integrations → Webhooks → New Webhook
# 2. Copy the webhook URL
logmind watch app.log --discord https://discord.com/api/webhooks/123/abc# Any HTTP endpoint that accepts JSON POST
logmind watch app.log --webhook https://your-api.com/alertsWebhook payload:
{
"severity": "CRITICAL",
"anomaly_score": 0.87,
"anomaly_type": "similar_incident",
"summary": "Error spike detected...",
"error_count": 15,
"total_count": 42,
"similar_incidents": [
{"label": "DB pool exhausted", "similarity": 0.91, "resolution": "increase pool_size"}
]
}# Monitor a Docker container
logmind watch --docker my-container --slack https://hooks.slack.com/...
# Monitor Kubernetes pod logs
kubectl logs -f deployment/my-app | logmind watch - --sensitivity high
# Monitor journalctl
logmind watch --command "journalctl -f -u myapp.service"# In your CI pipeline, scan test logs for anomalies
logmind scan test-output.log --json > anomalies.json
# Fail the pipeline if critical anomalies found
logmind scan test-output.log --sensitivity high --json | \
python -c "import json,sys; d=json.load(sys.stdin); sys.exit(1 if d.get('critical',0)>0 else 0)"logmind/
├── src/logmind/
│ ├── __init__.py # Version info
│ ├── models.py # Data models (LogEntry, LogWindow, AnomalyAlert, etc.)
│ ├── parser.py # Log format auto-detection and parsing
│ ├── embedder.py # Sentence embedding and TurboQuant compression
│ ├── detector.py # Anomaly detection engine
│ ├── incidents.py # Incident labeling and search
│ ├── alerter.py # Slack/Discord/Webhook alert delivery
│ ├── collector.py # Log source collectors (file, stdin, docker, command)
│ ├── storage.py # Index persistence (save/load)
│ ├── display.py # Terminal display formatting
│ └── cli.py # Click CLI commands
├── tests/ # 1,243 test cases across 24 files
│ ├── test_models.py # Core data model tests
│ ├── test_parser.py # Parser unit tests
│ ├── test_parser_stress.py # Parser stress tests (121 cases)
│ ├── test_parser_advanced.py # Advanced parser tests (95 cases)
│ ├── test_embedder.py # Embedder tests
│ ├── test_detector.py # Detector unit tests
│ ├── test_detector_stress.py # Detector stress tests (73 cases)
│ ├── test_detector_boundary.py # Boundary condition tests (48 cases)
│ ├── test_alerter.py # Alerter unit tests
│ ├── test_alerter_advanced.py # Alerter advanced tests (68 cases)
│ ├── test_collector.py # Collector tests
│ ├── test_collector_stress.py # Collector stress tests
│ ├── test_collector_advanced.py # Collector advanced tests
│ ├── test_display.py # Display tests
│ ├── test_display_stress.py # Display stress tests
│ ├── test_display_rigor.py # Display rigorous tests
│ ├── test_storage.py # Storage tests
│ ├── test_storage_rigor.py # Storage rigorous tests
│ ├── test_models_rigor.py # Models exhaustive tests
│ ├── test_cli.py # CLI unit tests
│ ├── test_cli_e2e.py # CLI end-to-end tests
│ ├── test_incidents_stress.py # Incident stress tests
│ ├── test_real_model.py # Real AI model integration tests
│ ├── test_e2e_pipeline.py # Full pipeline E2E tests
│ ├── test_integration_comprehensive.py # Cross-module integration (103 cases)
│ ├── test_robustness.py # Security & robustness tests (74 cases)
│ └── sample.log # Sample log file for testing
├── pyproject.toml # Build configuration
├── README.md # English documentation
└── README_KO.md # Korean documentation
# Run fast tests (~22 seconds)
pytest tests/ --ignore=tests/test_real_model.py --ignore=tests/test_e2e_pipeline.py \
--ignore=tests/test_cli_e2e.py --ignore=tests/test_incidents_stress.py
# Run all tests including AI model tests (~15 minutes)
pytest tests/
# Run specific test categories
pytest tests/test_parser.py tests/test_parser_stress.py tests/test_parser_advanced.py -v
pytest tests/test_robustness.py -v
pytest tests/test_integration_comprehensive.py -vTest Results: 1,243 tests across 24 files, 0 failures.
| Package | Purpose |
|---|---|
sentence-transformers |
Semantic embeddings (all-MiniLM-L6-v2, 384-dim) |
langchain-turboquant |
3-bit vector compression (90%+ memory savings) |
click |
CLI framework |
rich |
Terminal formatting |
numpy |
Numerical operations |
httpx (optional) |
HTTP alerts (Slack/Discord/Webhook) |
docker (optional) |
Docker container log collection |
MIT