MailFilter AI

AI-powered email filter. Reads digest emails from multiple platforms via IMAP, scores each listing against your personal profile using AI, and sends you a sorted summary email with the best matches on top. Supports multiple AI providers — Mistral AI (default) and Berget AI (EU-sovereign inference).

MailFilter AI is a self-hosted server — it runs continuously on a schedule, checking your mailbox and sending you summaries automatically. Deploy it with Docker or run it directly with Node.js.

Great for any type of digest email — job ads, newsletters, property listings, freelance gigs — anything where you get bulk emails and only care about a few. Forward your digests to a dedicated mailbox, define what you're looking for in a simple markdown profile, and let AI do the filtering. Instead of scanning dozens of irrelevant listings every day, you get one clean summary with the best matches ranked first and the noise at the bottom.

How it works

Incoming digest emails (LinkedIn, Indeed, etc.)
    |
    v
IMAP: fetch unread emails
    |
    v
Detect provider -> route to HTML parser
    |
    v
Extract individual listings
    |
    v
AI: score each listing (1-5) against your profile
    |
    v
SMTP: send sorted result email
    |
    v
Log to JSON, repeat on cron schedule

The same pipeline, step by step:
  1. Read - Connects to your mail server via IMAP and fetches unread emails
  2. Detect - Identifies the email provider (LinkedIn, Indeed, etc.); skips unknown senders
  3. Parse - Routes to provider-specific HTML parsers to extract individual listings
  4. Evaluate - Sends each listing to your configured AI provider for scoring against your profile
  5. Send - Sends a result email with listings sorted by score (best matches first)
  6. Log - Writes evaluations and errors to daily JSON log files
  7. Repeat - Runs on a configurable cron schedule (default: every 15 minutes)
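
The ordering used in the Send step can be sketched as follows. This is a minimal illustration rather than the project's code: the `Listing` and `Evaluation` shapes and the `rankEvaluations` name are assumptions made for the example.

```typescript
// Hypothetical shapes; the real types live in src/types/index.ts and may differ.
interface Listing {
  title: string;
  url: string;
}

interface Evaluation {
  listing: Listing;
  score: number; // 1 (irrelevant) to 5 (perfect match)
}

// Sort evaluations so the best matches come first.
// Returns a new array and keeps the original order for equal scores.
function rankEvaluations(evaluations: readonly Evaluation[]): Evaluation[] {
  return [...evaluations].sort((a, b) => b.score - a.score);
}
```

Copying before sorting leaves the evaluated list untouched, so the same data can still be written to the log in the order it was processed.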

Prerequisites

  • Node.js 24 or later
  • AI API key for one of the supported providers: Mistral AI (default) or Berget AI
  • IMAP/SMTP email account (any provider that supports password auth)
  • Digest emails forwarded to that account

Quick start

# Clone
git clone https://github.com/palhamel/mailfilter-ai.git
cd mailfilter-ai

# Install
npm install

# Configure
cp .env.example .env
cp profile.example.md profile.md

# Edit .env with your mail credentials and AI API key
# Edit profile.md with your preferences

# Run
npm run dev

Profile

The profile is a markdown file that defines what you're looking for. It's loaded at startup and used as the AI system prompt for scoring. See profile.example.md for the expected format.

Your profile should include:

  • Tech stack - Languages, frameworks, and tools you work with
  • Interesting listings - Types of opportunities you're looking for
  • Deal-breakers - Things that make a listing irrelevant (automatic low score)
  • Blacklisted industries - Industries you're not interested in (automatic score of 1)
  • Preferences - Work format, team size, location preferences
  • Matching keywords - Terms that commonly appear in relevant listings

Set PROFILE_PATH in your .env to point to your profile file (e.g. ./profile.md).
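
As a rough sketch of how a profile could become a system prompt: the function below wraps the profile text in scoring instructions. The function name and the instruction wording are assumptions for illustration; they are not the text used in src/ai/prompt.ts.

```typescript
// Hypothetical prompt builder. The instruction wording is an assumption,
// not the prompt the project actually sends.
function buildSystemPrompt(profileMarkdown: string): string {
  const profile = profileMarkdown.trim();
  if (profile.length === 0) {
    throw new Error("Profile is empty; check PROFILE_PATH");
  }
  return [
    "You score listings against the profile below.",
    "Reply with a score from 1 (irrelevant) to 5 (perfect match) and a short reason.",
    "",
    "--- PROFILE ---",
    profile,
  ].join("\n");
}
```

Failing fast on an empty profile at startup is cheaper than discovering it later through meaningless scores.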

Scoring system

| Score | Category | Meaning |
|-------|----------|---------|
| 5 | Green | Perfect match - act immediately |
| 4 | Green | Strong match - worth checking out |
| 3 | Yellow | Interesting but missing something |
| 2 | (none) | Weak match |
| 1 | (none) | Irrelevant or wrong fit |
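
The score-to-category mapping can be expressed in a few lines. This is a sketch under assumptions: the `Score` and `Category` types and both function names are hypothetical, and how the project validates AI output is not described in this README.

```typescript
type Score = 1 | 2 | 3 | 4 | 5;
type Category = "green" | "yellow" | "none";

// Accept only whole numbers from 1 to 5; anything else is treated as invalid.
function toScore(value: unknown): Score | null {
  if (typeof value !== "number" || !Number.isInteger(value)) return null;
  if (value < 1 || value > 5) return null;
  return value as Score;
}

// Map a score to its category: 4-5 green, 3 yellow, 1-2 no category.
function categoryFor(score: Score): Category {
  if (score >= 4) return "green";
  if (score === 3) return "yellow";
  return "none";
}
```

Validating the number before mapping it means an out-of-range reply from the model is caught explicitly instead of silently landing in the "none" bucket.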

AI providers

The evaluator uses a provider-agnostic AIClient interface, making it easy to switch or add providers.

| Provider | AI_PROVIDER | API | Notes |
|----------|-------------|-----|-------|
| Mistral AI | mistral (default) | Mistral SDK | Fast, affordable text classification |
| Berget AI | berget | OpenAI-compatible (native fetch) | EU-sovereign inference, no data leaves Europe |

To switch provider, set AI_PROVIDER and the corresponding API key in your .env:

AI_PROVIDER=berget
BERGET_API_KEY=your-key-here
BERGET_MODEL=llama-3.3-70b-instruct

Adding a new OpenAI-compatible provider requires only a new adapter in src/ai/providers.ts.
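
To give a feel for what such an adapter might look like, here is a minimal sketch. The `AIClient` shape, the `createOpenAICompatibleClient` name, and the options object are assumptions; only the request format (a POST to `/chat/completions` with `model`, `messages`, and `temperature`) follows the widely used OpenAI-compatible convention. The base URL is left as a parameter because this README does not state one.

```typescript
// Assumed interface shape; the real AIClient in the repository may differ.
interface AIClient {
  complete(systemPrompt: string, userMessage: string): Promise<string>;
}

// Minimal subset of the fetch API that the adapter needs.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface OpenAICompatibleOptions {
  baseUrl: string; // the provider's API root; not specified in this README
  apiKey: string;
  model: string;
  fetchFn: FetchLike; // pass the global fetch in production
}

function createOpenAICompatibleClient(options: OpenAICompatibleOptions): AIClient {
  const { baseUrl, apiKey, model, fetchFn } = options;
  return {
    async complete(systemPrompt, userMessage) {
      const response = await fetchFn(`${baseUrl}/chat/completions`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify({
          model,
          temperature: 0, // deterministic scoring, as listed under Tech stack
          messages: [
            { role: "system", content: systemPrompt },
            { role: "user", content: userMessage },
          ],
        }),
      });
      if (!response.ok) {
        throw new Error(`AI request failed with status ${response.status}`);
      }
      const data = (await response.json()) as {
        choices?: { message?: { content?: string } }[];
      };
      const content = data.choices?.[0]?.message?.content;
      if (typeof content !== "string") {
        throw new Error("AI response did not contain message content");
      }
      return content;
    },
  };
}
```

Injecting `fetchFn` keeps the adapter testable without network access; in production the built-in `fetch` of Node.js can be passed directly.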

Environment variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| MAIL_USER | Yes | - | Email address for IMAP/SMTP auth |
| MAIL_PASSWORD | Yes | - | Password for IMAP/SMTP auth |
| IMAP_HOST | Yes | - | IMAP server hostname (e.g. mail.provider.com) |
| SMTP_HOST | Yes | - | SMTP server hostname (e.g. mail.provider.com) |
| NOTIFY_EMAIL | Yes | - | Email address to receive result digests |
| PROFILE_PATH | Yes | - | Path to your profile markdown file |
| AI_PROVIDER | No | mistral | AI provider: mistral or berget |
| MISTRAL_API_KEY | Yes | - | Mistral AI API key |
| MISTRAL_MODEL | No | mistral-small-latest | Mistral model to use |
| BERGET_API_KEY | No | - | Berget AI API key (required when AI_PROVIDER=berget) |
| BERGET_MODEL | No | llama-3.3-70b-instruct | Berget model to use |
| MAILBOX_CHECK_INTERVAL_MINUTES | No | 15 | Minutes between mailbox checks |
| LOG_DIR | No | ./data/logs | Directory for JSON log files |
| DISCORD_WEBHOOK_URL | No | - | Discord webhook for error/status notifications |
| HEALTH_PORT | No | 3000 | HTTP health endpoint port |
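
The rules in this table (required variables, defaults, and the conditional Berget key) can be illustrated with a small loader. The project validates its environment with Zod in src/config/env.ts; the sketch below deliberately uses plain TypeScript to stay dependency-free, so it is not the project's schema. The `Config` shape and all function names are assumptions, and only a subset of the variables is mapped.

```typescript
type Env = Record<string, string | undefined>;

interface Config {
  aiProvider: "mistral" | "berget";
  mistralModel: string;
  bergetModel: string;
  checkIntervalMinutes: number;
  logDir: string;
  healthPort: number;
}

// Variables marked "Yes" in the Required column.
const REQUIRED = [
  "MAIL_USER", "MAIL_PASSWORD", "IMAP_HOST", "SMTP_HOST",
  "NOTIFY_EMAIL", "PROFILE_PATH", "MISTRAL_API_KEY",
] as const;

// Read a required variable or fail with a message naming it.
function required(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Parse a positive integer, falling back to the documented default.
function positiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

function loadConfig(env: Env): Config {
  for (const name of REQUIRED) required(env, name);
  const aiProvider = env["AI_PROVIDER"] ?? "mistral";
  if (aiProvider !== "mistral" && aiProvider !== "berget") {
    throw new Error(`AI_PROVIDER must be "mistral" or "berget", got "${aiProvider}"`);
  }
  // BERGET_API_KEY is only required when Berget is the selected provider.
  if (aiProvider === "berget") required(env, "BERGET_API_KEY");
  return {
    aiProvider,
    mistralModel: env["MISTRAL_MODEL"] ?? "mistral-small-latest",
    bergetModel: env["BERGET_MODEL"] ?? "llama-3.3-70b-instruct",
    checkIntervalMinutes: positiveInt(env, "MAILBOX_CHECK_INTERVAL_MINUTES", 15),
    logDir: env["LOG_DIR"] ?? "./data/logs",
    healthPort: positiveInt(env, "HEALTH_PORT", 3000),
  };
}
```

In a Node.js process this would be called once at startup as `loadConfig(process.env)`, so a misconfigured deployment fails immediately with a message naming the offending variable.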

Supported email providers

Each provider has a dedicated HTML parser in src/mail/parsers/:

| Provider | Detection |
|----------|-----------|
| LinkedIn | Sender contains linkedin |
| Webbjobb | Sender contains webbjobb |
| Indeed | Sender contains indeed |
| Demando | Sender contains demando |

Emails from unrecognized providers are skipped and logged.
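
A detection rule of this kind can be sketched as a substring check on the sender. The `detectProvider` name comes from this README, but its signature, the `Provider` type, and the case-insensitive comparison are assumptions made for the example.

```typescript
type Provider = "linkedin" | "webbjobb" | "indeed" | "demando";

const PROVIDERS: readonly Provider[] = ["linkedin", "webbjobb", "indeed", "demando"];

// Return the first provider whose name appears in the sender,
// or null for unknown senders (which the pipeline skips and logs).
function detectProvider(sender: string): Provider | null {
  const normalized = sender.toLowerCase();
  return PROVIDERS.find((provider) => normalized.includes(provider)) ?? null;
}
```

For example, `detectProvider("LinkedIn Job Alerts <jobalerts@linkedin.com>")` yields `"linkedin"`, while a sender that matches none of the names yields `null`.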

Adding a new provider

  1. Create src/mail/parsers/provider-name.ts exporting a parse function
  2. Add a detection rule to detectProvider() in src/mail/parser.ts
  3. Add routing to the new parser in the digest parser in src/mail/parser.ts
  4. Add tests in src/mail/__tests__/parser.test.ts

Docker

docker build -t mailfilter-ai .
docker run \
  --env-file .env \
  -v ./profile.md:/app/profile.md:ro \
  --restart unless-stopped \
  mailfilter-ai

The image uses a multi-stage build, runs as a non-root user, and includes a built-in HEALTHCHECK. Mount your profile file into the container with -v.

Scripts

npm run dev         # Run with tsx (loads .env automatically)
npm run build       # Compile TypeScript
npm start           # Run compiled output
npm test            # Run tests (vitest)
npm run lint        # ESLint with TypeScript rules
npm run typecheck   # Type check without emitting

Project structure

src/
  index.ts              # Entry point, cron scheduler, startup
  pipeline.ts           # Email processing pipeline (fetch, parse, evaluate, send)
  config/
    env.ts              # Zod environment validation
  ai/
    providers.ts        # AI provider factory (Mistral, Berget)
    evaluator.ts        # Provider-agnostic evaluation logic
    prompt.ts           # System prompt builder (loads profile from PROFILE_PATH)
  mail/
    reader.ts           # IMAP: fetch unread emails
    parser.ts           # Provider detection, routing to parsers
    sender.ts           # SMTP: send result digest emails
    parsers/            # Provider-specific HTML parsers
  logger/
    index.ts            # JSON file logging (evaluations + errors + rotation)
  health/
    index.ts            # Health file writer
    check.ts            # Docker HEALTHCHECK script
  http/
    server.ts           # HTTP health endpoint (native node:http)
  notifications/
    discord.ts          # Discord webhook (native fetch)
    index.ts            # Error buffering and flush
  stats/
    index.ts            # In-memory run statistics
  utils/
    retry.ts            # Generic retry with exponential backoff
    delay.ts            # Simple sleep utility
  types/
    index.ts            # Shared TypeScript interfaces

Robustness

  • IMAP retry - 3 attempts with exponential backoff
  • AI retry - 2 attempts per evaluation, retries on HTTP 429/500/503
  • Rate limiting - 750ms delay between AI evaluations
  • Graceful shutdown - SIGTERM/SIGINT stop cron, wait for in-flight work, notify Discord
  • Error logging - Errors written to daily JSON log files
  • Log rotation - Logs older than 30 days deleted on startup
  • Discord notifications - Startup, shutdown, critical failures, batched errors
  • Health check - Docker HEALTHCHECK + HTTP endpoint for external monitoring
  • Crash handlers - uncaughtException/unhandledRejection logged and notified
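
Retry with exponential backoff, as used for IMAP above, can be sketched like this. It is an illustration rather than the contents of src/utils/retry.ts: the function names, the options object, and the doubling factor are assumptions, and the README does not state the actual delay values.

```typescript
interface RetryOptions {
  attempts: number; // total tries, including the first
  baseDelayMs: number; // delay before the second try; doubles after each failure
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Delay before retry number `retryIndex` (0-based): base, 2x base, 4x base, ...
function backoffDelay(baseDelayMs: number, retryIndex: number): number {
  return baseDelayMs * 2 ** retryIndex;
}

async function retry<T>(operation: () => Promise<T>, options: RetryOptions): Promise<T> {
  const { attempts, baseDelayMs, sleep = defaultSleep } = options;
  let lastError: unknown = new Error("retry called with attempts < 1");
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Only wait if another attempt will follow.
      if (attempt < attempts - 1) {
        await sleep(backoffDelay(baseDelayMs, attempt));
      }
    }
  }
  throw lastError;
}
```

With `attempts: 3`, a call that keeps failing is tried three times with two waits in between, and the last error is rethrown so the caller can log it or send a notification.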

Tech stack

  • Runtime: Node.js 24 LTS, TypeScript (strict, ESM)
  • Email: IMAP (imap + mailparser), SMTP (nodemailer)
  • AI: Multi-provider (Mistral AI, Berget AI), deterministic scoring (temperature: 0)
  • Parsing: Cheerio for HTML email parsing
  • Validation: Zod for environment config
  • Scheduling: node-cron
  • Testing: Vitest
  • CI/CD: GitHub Actions (lint, typecheck, test, build, audit, container scan, SAST)

Security

This project applies the following security practices for a public, self-hosted application: automated CI checks, dependency updates, branch protection, and supply chain hardening.

CI pipeline

Every push and pull request to main runs:

| Check | Tool | What it does |
|-------|------|--------------|
| Lint | ESLint + TypeScript | Code quality and type safety |
| Tests | Vitest | Unit and integration tests |
| Dependency audit | npm audit | Fails on critical npm vulnerabilities |
| Container scan | Grype (Anchore) | Scans the Docker image for OS and package CVEs |
| Static analysis | CodeQL | SAST for JavaScript/TypeScript (injection, prototype pollution, etc.) |

CodeQL also runs on a weekly schedule to catch newly disclosed vulnerabilities in existing code.

Automated dependency updates

Dependabot is configured to open pull requests weekly for:

  • npm package updates (security patches and version bumps)
  • GitHub Actions version updates

Branch protection

The main branch is protected by a GitHub ruleset:

  • All changes require a pull request
  • CI and CodeQL status checks must pass before merge
  • Force-push and branch deletion are blocked

Supply chain hardening

All GitHub Actions in CI workflows are pinned to full commit SHAs rather than mutable version tags. This prevents supply chain attacks where a compromised tag could inject malicious code into the build pipeline. Dependabot keeps these SHA pins up to date.

Reporting vulnerabilities

If you discover a security vulnerability, please use GitHub's private vulnerability reporting rather than opening a public issue. See SECURITY.md for details.

License

MIT
