10 million log lines → 200 templates → the 3 that matter.

Nobody reads logs — there are too many. logdrain mines unstructured logs into a small set of templates (the repeating message patterns), then flags the rare/novel ones — the lines that just appeared and probably signal a breach or a break. A from-scratch implementation of the Drain algorithm. Offline, zero dependencies.
```text
$ logdrain app.log
🌀 logdrain — 5,000 lines → 7 templates
3,910 × User <*> logged in from <*>
  842 × GET <*> 200
  240 × Connection from <*> closed
  ...
🚨 Rare / novel templates (possible anomalies):
    1 × PANIC kernel oops at <*>
    1 × Disk failure on <*>
```
logdrain implements Drain: it groups log lines by token count, then within each group matches lines against existing templates by token-position similarity, replacing the parts that vary with a <*> wildcard. Frequent templates are normal noise; templates seen only once or twice are surfaced as novel — exactly where an analyst should look.
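To make the grouping, matching and merging steps concrete, here is a minimal Drain-style miner written for this page as an added example; it is not the logdrain project's own code. The class and function names, the choice to count wildcard positions as non-matches in similarity(), and the sample log lines are all assumptions of the example. The sim_threshold and rare_max parameters play the role of the --sim and --rare options.

```python
"""Minimal Drain-style template miner (added example, not logdrain's implementation)."""
from collections import defaultdict

WILDCARD = "<*>"


def tokenize(line):
    """Split a raw log line into whitespace-separated tokens."""
    return line.strip().split()


def similarity(template, tokens):
    """Fraction of token positions where the template equals the line.

    Wildcard positions are deliberately not counted as matches (a design
    choice for this example; Drain implementations differ), so a template
    has to earn a match on its fixed tokens rather than on its holes."""
    same = sum(1 for t, w in zip(template, tokens) if t == w and t != WILDCARD)
    return same / len(tokens)


def merge(template, tokens):
    """Keep positions that agree; replace every disagreeing position with <*>."""
    return [t if t == w else WILDCARD for t, w in zip(template, tokens)]


class DrainMiner:
    """Groups lines by token count, then matches within the group by similarity."""

    def __init__(self, sim_threshold=0.5):
        self.sim_threshold = sim_threshold
        # token count -> list of [template_tokens, occurrence_count]
        self.groups = defaultdict(list)

    def add(self, line):
        """Route one line to its best template (merging) or start a new one."""
        tokens = tokenize(line)
        if not tokens:
            return None
        group = self.groups[len(tokens)]          # step 1: group by token count
        best, best_sim = None, -1.0
        for entry in group:                        # step 2: best positional match
            s = similarity(entry[0], tokens)
            if s > best_sim:
                best, best_sim = entry, s
        if best is not None and best_sim >= self.sim_threshold:
            best[0] = merge(best[0], tokens)       # step 3: wildcard the variable parts
            best[1] += 1
            return best
        entry = [list(tokens), 1]                  # nothing close enough: new template
        group.append(entry)
        return entry

    def templates(self):
        """All templates as (text, count), most frequent first."""
        flat = ((" ".join(t), c) for g in self.groups.values() for t, c in g)
        return sorted(flat, key=lambda tc: -tc[1])

    def rare(self, max_count=1):
        """Templates seen at most max_count times: the candidates worth an analyst's look."""
        return [(t, c) for t, c in self.templates() if c <= max_count]


def mine(text, sim_threshold=0.5, rare_max=1):
    """Convenience wrapper: mine a blob of log text, return (all templates, rare templates)."""
    miner = DrainMiner(sim_threshold)
    for line in text.splitlines():
        miner.add(line)
    return miner.templates(), miner.rare(rare_max)


# Constructed sample input (not real log data). Note that "Disk failure on sda"
# has the same token count as the Connection lines but shares no fixed token
# with their template, so it correctly becomes its own (rare) template.
SAMPLE_LOG = """\
User alice logged in from 10.0.0.5
User bob logged in from 10.0.0.9
User carol logged in from 192.168.1.4
GET /index.html 200
GET /api/users 200
GET /static/app.js 200
Connection from 10.0.0.5 closed
Connection from 10.0.0.9 closed
PANIC kernel oops at ffffffff81001234
Disk failure on sda
"""

if __name__ == "__main__":
    all_templates, rare_templates = mine(SAMPLE_LOG)
    for text, count in all_templates:
        print(f"{count:>5} x {text}")
    print("rare / novel:")
    for text, count in rare_templates:
        print(f"{count:>5} x {text}")
```

Running the block prints the frequent templates first and then the two singletons, mirroring the summary shown above; raising rare_max to 2 would also surface the Connection template, which is what a higher --rare setting does.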
```sh
logdrain app.log                              # template summary + rare lines
cat huge.log | logdrain -                     # stdin
logdrain app.log --rare 2 --sim 0.5 --json
```

Install: pip install logdrain

License: MIT