bhupendra05/logdrain

logdrain 🌀📜

10 million log lines → 200 templates → the 3 that matter. Nobody reads logs: there are too many. logdrain mines unstructured logs into a small set of templates (the repeating message patterns), then flags the rare or novel ones, the lines that have only just appeared and may signal a breach or a failure. A from-scratch implementation of the Drain algorithm. Offline, zero dependencies.

$ logdrain app.log

🌀 logdrain — 5,000 lines → 7 templates

  3,910 ×  User <*> logged in from <*>
    842 ×  GET <*> 200
    240 ×  Connection from <*> closed
    ...

🚨 Rare / novel templates (possible anomalies):
    1 ×  PANIC kernel oops at <*>
    1 ×  Disk failure on <*>

How it works

logdrain implements Drain: it groups log lines by token count, then within each group matches lines against existing templates by token-position similarity, replacing the parts that vary with a <*> wildcard. Frequent templates are normal noise; templates seen only once or twice are surfaced as novel — exactly where an analyst should look.
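The grouping and matching described above can be sketched in a few lines of Python. This is a minimal illustration, not logdrain's actual code: the names `mine`, `similarity`, and `sim_threshold` are hypothetical, whitespace tokenization is assumed, and the sketch skips the fixed-depth prefix tree that the full Drain algorithm uses to narrow the search.

```python
from collections import defaultdict

WILDCARD = "<*>"


def similarity(template, tokens):
    """Fraction of positions where the template token equals the line token.

    Wildcard positions never equal a real token, so they do not count as matches.
    """
    same = sum(1 for t, tok in zip(template, tokens) if t == tok)
    return same / len(tokens)


def mine(lines, sim_threshold=0.5):
    """Group lines by token count, then merge each line into its closest template.

    Hypothetical helper for illustration; returns (template, count) pairs.
    """
    groups = defaultdict(list)  # token count -> list of [template_tokens, count]
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # ignore blank lines
        bucket = groups[len(tokens)]
        best, best_score = None, 0.0
        for entry in bucket:
            score = similarity(entry[0], tokens)
            if score > best_score:
                best, best_score = entry, score
        if best is not None and best_score >= sim_threshold:
            # Keep tokens that agree; replace positions that differ with a wildcard.
            best[0] = [t if t == tok else WILDCARD for t, tok in zip(best[0], tokens)]
            best[1] += 1
        else:
            # No existing template is close enough: the line starts a new one.
            bucket.append([tokens, 1])
    return [(" ".join(t), c) for bucket in groups.values() for t, c in bucket]


sample = [
    "User alice logged in from 10.0.0.1",
    "User bob logged in from 10.0.0.2",
    "User carol logged in from 10.0.0.3",
    "GET /index 200",
    "GET /about 200",
    "Disk failure on sda1",
]
for template, count in mine(sample):
    print(f"{count:>5} ×  {template}")
# prints:
#     3 ×  User <*> logged in from <*>
#     2 ×  GET <*> 200
#     1 ×  Disk failure on sda1
```

A template that has only absorbed one line, such as the disk failure here, still contains its literal tokens; wildcards appear only once a second, similar line has been merged into it.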

Usage

logdrain app.log                 # template summary + rare lines
cat huge.log | logdrain -        # stdin
logdrain app.log --rare 2 --sim 0.5 --json
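The flags are not documented further here, so the following is an assumption: `--sim` is read as the minimum token-position similarity for a line to join an existing template, and `--rare` as the highest occurrence count at which a template is still flagged. Under that reading, the two thresholds behave roughly like this sketch (the function names `token_similarity` and `rare_templates` are hypothetical, not part of logdrain):

```python
def token_similarity(a, b):
    """Share of positions with identical tokens; 0.0 if the token counts differ."""
    ta, tb = a.split(), b.split()
    if len(ta) != len(tb) or not ta:
        return 0.0
    return sum(x == y for x, y in zip(ta, tb)) / len(ta)


def rare_templates(counts, rare=2):
    """Templates seen at most `rare` times (assumed meaning of --rare)."""
    return [template for template, n in counts.items() if n <= rare]


# 3 of 4 tokens agree, so with a 0.5 threshold these would share a template.
print(token_similarity("Connection from 10.0.0.1 closed",
                       "Connection from 10.0.0.9 closed"))  # 0.75

# Only 1 of 3 tokens agrees, so with a 0.5 threshold these would stay separate.
print(token_similarity("GET /a 200", "GET /b 404") >= 0.5)  # False

# Counts taken from the sample output shown earlier in this README.
counts = {
    "User <*> logged in from <*>": 3910,
    "GET <*> 200": 842,
    "Connection from <*> closed": 240,
    "PANIC kernel oops at <*>": 1,
    "Disk failure on <*>": 1,
}
print(rare_templates(counts, rare=2))
# ['PANIC kernel oops at <*>', 'Disk failure on <*>']
```

If the flags do work this way, a lower `--sim` would merge more aggressively and yield fewer, more general templates, while a higher `--rare` would widen the set of templates reported as possible anomalies.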

Install

pip install logdrain

License

MIT
