balyakin/spendhawk

SpendHawk

SpendHawk is a self-hosted SaaS spend auditor. It reads the messy places where subscription bills usually hide (bank CSVs, invoice PDFs, .eml receipts, Gmail, manual JSON, and webhooks), then turns them into a dashboard you can actually use before the next renewal hits.

The short version: it's htop for your SaaS bill. Local-first, AI-optional, and built for teams that would rather cancel waste than maintain another spreadsheet.

[Screenshot: SpendHawk dashboard with demo SaaS spend data]

The screenshot above is from the built-in demo data.

What It Finds

SpendHawk is useful when you have card statements, invoices, and a vague feeling that the monthly SaaS number is too high. It can flag:

  • subscriptions that are still billing but no longer used
  • inactive seats on tools like Slack, Adobe, and GitHub
  • duplicate services across overlapping categories
  • price spikes, plan changes, and odd renewal jumps
  • trials that quietly turned into paid plans
  • upcoming renewals worth negotiating
  • budget overruns and statistical anomalies
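
To make the "price spikes" and "statistical anomalies" items more concrete, here is a minimal sketch of one possible check: compare the latest charge with the median of earlier charges. This is not SpendHawk's actual rule. The Charge shape, the detectPriceSpike name, and the 25% threshold are assumptions made for illustration.

```typescript
// Hypothetical shape of one normalized charge; not SpendHawk's real schema.
interface Charge {
  vendor: string;
  month: string; // "YYYY-MM"
  amount: number; // all charges in the same currency
}

interface Spike {
  vendor: string;
  baseline: number; // median of earlier charges
  latest: number;
  increase: number; // 0.5 means +50%
}

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Flags the latest charge when it exceeds the median of the earlier charges
// by more than `threshold`. Returns null with fewer than three charges,
// a non-positive baseline, or no spike.
function detectPriceSpike(charges: Charge[], threshold = 0.25): Spike | null {
  if (charges.length < 3) return null;
  const ordered = [...charges].sort((a, b) => a.month.localeCompare(b.month));
  const latest = ordered[ordered.length - 1];
  const baseline = median(ordered.slice(0, -1).map((c) => c.amount));
  if (baseline <= 0) return null;
  const increase = (latest.amount - baseline) / baseline;
  if (increase <= threshold) return null;
  return { vendor: latest.vendor, baseline, latest: latest.amount, increase };
}
```

A median baseline is used here because one unusual month distorts it less than it would an average.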

By default, the built-in rules do the work. The optional AI analyzer in packages/core only needs a reduced subscription summary, not raw exports or invoice bodies. Leave SPENDHAWK_AI_PROVIDER=none if you want the app fully offline.
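
As a rough picture of what a "reduced subscription summary" could look like, the sketch below strips each subscription down to a few fields before anything leaves the machine. The field names and the toAiSummary helper are hypothetical. The mode names vendors, amounts, and none are the documented values of SPENDHAWK_AI_REDACTION, but what each mode hides is an assumption here, and this is not the code in packages/core.

```typescript
// Documented values of SPENDHAWK_AI_REDACTION; their exact effect is assumed.
type Redaction = "vendors" | "amounts" | "none";

// Hypothetical local record, including raw text that must stay local.
interface Subscription {
  vendor: string;
  category: string;
  monthlyCost: number;
  seats: number;
  activeSeats: number;
  rawDescription: string; // e.g. a statement line; never forwarded
}

// Hypothetical payload for the optional AI analyzer.
interface AiSummary {
  vendor: string;
  category: string;
  monthlyCost: number | string;
  seatUtilization: number;
}

// Replaces an exact amount with a coarse bucket.
function costBucket(amount: number): string {
  if (amount < 50) return "<50";
  if (amount < 500) return "50-499";
  return "500+";
}

// Builds the reduced summary; raw descriptions are dropped in every mode.
function toAiSummary(subs: Subscription[], redaction: Redaction): AiSummary[] {
  return subs.map((s, i) => ({
    vendor: redaction === "vendors" ? `vendor-${i + 1}` : s.vendor,
    category: s.category,
    monthlyCost: redaction === "amounts" ? costBucket(s.monthlyCost) : s.monthlyCost,
    seatUtilization: s.seats > 0 ? s.activeSeats / s.seats : 0,
  }));
}
```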

Quick Start

Requirements:

  • Node.js 20 or newer
  • npm 10 or newer

Run the demo:

npm install
npm run build
npm run demo

Open http://localhost:7749.

The demo seeds a local SQLite database and starts the same Hono server used in normal mode. By default, SpendHawk stores data under ~/.spendhawk.

Use Your Own Data

After npm run build, run CLI commands through the spendhawk workspace package:

npm -w spendhawk run start -- scan --csv ./bank-export.csv --currency USD
npm -w spendhawk run start -- scan --pdf ./invoices
npm -w spendhawk run start -- scan --eml ./receipts
npm -w spendhawk run start -- scan --gmail
npm -w spendhawk run start -- serve --port 7749 --host localhost

Useful follow-up commands:

npm -w spendhawk run start -- diff --from 2026-04 --to 2026-05
npm -w spendhawk run start -- report --format html --output spendhawk-report.html
npm -w spendhawk run start -- notify --channel slack --critical-only

If you install or link the CLI, the same commands are available as spendhawk scan, spendhawk serve, and so on.

Dashboard

The web UI focuses on the questions that usually come up in a spend review:

  • How much are we spending every month?
  • Which categories are growing?
  • Which recommendations have real savings attached?
  • What happens if we accept a batch of recommendations?
  • Are we over budget this month?
  • Can we export a report without exposing vendor names or exact amounts?

CSV imports can be started from the dashboard, and the recommendations page supports filtering, bulk status changes, and savings simulation before you mark anything as accepted.
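
The savings simulation can be thought of as simple arithmetic over the selected recommendations. The following sketch assumes a hypothetical Recommendation shape and simulateSavings helper; the dashboard's real data model and calculation may differ.

```typescript
// Hypothetical recommendation record; status values are assumptions.
type Status = "open" | "accepted" | "dismissed";

interface Recommendation {
  id: string;
  title: string;
  monthlySavings: number;
  status: Status;
}

interface Simulation {
  monthlySavings: number;
  annualSavings: number;
  projectedMonthlySpend: number;
}

// Estimates the effect of accepting a batch of recommendations without
// changing their status. Dismissed items are ignored even if selected.
function simulateSavings(
  recs: Recommendation[],
  selectedIds: string[],
  currentMonthlySpend: number,
): Simulation {
  const selected = new Set(selectedIds);
  const monthlySavings = recs
    .filter((r) => selected.has(r.id) && r.status !== "dismissed")
    .reduce((sum, r) => sum + r.monthlySavings, 0);
  return {
    monthlySavings,
    annualSavings: monthlySavings * 12,
    projectedMonthlySpend: Math.max(0, currentMonthlySpend - monthlySavings),
  };
}
```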

Privacy Model

SpendHawk is designed to be boring in the right places:

  • The server binds to localhost by default.
  • Data lives in SQLite; the default path is ~/.spendhawk/data.db.
  • There are no external calls in the demo or in CSV/PDF/EML scans with the default config.
  • The AI provider defaults to none; cloud AI calls require an explicit provider and key.
  • Gmail uses read-only OAuth when configured.
  • Binding to 0.0.0.0 without auth is rejected unless you explicitly set SPENDHAWK_UNSAFE_DISABLE_AUTH=true.
  • Secrets stored through settings require SPENDHAWK_ENCRYPTION_KEY and are encrypted with AES-256-GCM.

This is still spend data. Treat the database, reports, and screenshots as sensitive.
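
For readers unfamiliar with AES-256-GCM, this is roughly what encrypting a stored secret with a 32-byte key looks like in Node.js. It is a sketch built on the standard node:crypto module, not SpendHawk's own code: the function names and the "iv.tag.ciphertext" storage format are assumptions.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Accepts a 32-byte key encoded as hex (64 characters) or base64.
function parseKey(encoded: string): Buffer {
  const isHex = /^[0-9a-fA-F]{64}$/.test(encoded);
  const key = Buffer.from(encoded, isHex ? "hex" : "base64");
  if (key.length !== 32) throw new Error("key must decode to exactly 32 bytes");
  return key;
}

// Encrypts a secret with a fresh random 12-byte IV per call.
// Output format (assumed for this sketch): base64 "iv.tag.ciphertext".
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Decrypts a payload produced by encryptSecret. Throws if the key is wrong
// or the payload was modified, because the GCM auth tag no longer matches.
function decryptSecret(payload: string, key: Buffer): string {
  const [iv, tag, ciphertext] = payload.split(".").map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM authenticates as well as encrypts, so a tampered database value fails loudly instead of decrypting to garbage.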

Configuration

SpendHawk creates ~/.spendhawk/config.json on first run. Environment variables override that file.

| Variable | Purpose |
| --- | --- |
| SPENDHAWK_DB | SQLite database path |
| SPENDHAWK_HOST, SPENDHAWK_PORT | HTTP bind address |
| SPENDHAWK_AUTH_PASSWORD | Enables single-user password auth |
| SPENDHAWK_UNSAFE_DISABLE_AUTH | Allows public binding without auth when set to true |
| SPENDHAWK_ENCRYPTION_KEY | 32-byte base64 or hex key for encrypted settings |
| SPENDHAWK_AI_PROVIDER | none, openai, anthropic, or ollama |
| SPENDHAWK_AI_KEY | API key for cloud AI providers |
| SPENDHAWK_AI_MODEL | Optional model override |
| SPENDHAWK_AI_REDACTION | vendors, amounts, or none |
| SPENDHAWK_EXCHANGE_RATE_PROVIDER | none or frankfurter |
| SPENDHAWK_GMAIL_CLIENT_ID, SPENDHAWK_GMAIL_CLIENT_SECRET | Gmail OAuth credentials |
| SPENDHAWK_WEBHOOK_SECRET | Shared secret for incoming webhooks |
| SPENDHAWK_SLACK_WEBHOOK | Slack digest webhook URL |
| SPENDHAWK_LOG_LEVEL | debug, info, warn, or error |

See .env.example for a copy-pasteable starting point.
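
The rule that environment variables override config.json can be pictured as a layered merge: defaults first, then the file, then the environment. The sketch below covers only four settings. The Config shape, the config.json key names, and the resolveConfig helper are assumptions, as is the default port of 7749 (taken from the demo URL).

```typescript
const AI_PROVIDERS = ["none", "openai", "anthropic", "ollama"] as const;
type AiProvider = (typeof AI_PROVIDERS)[number];

// Hypothetical subset of the settings; key names are illustrative.
interface Config {
  db: string;
  host: string;
  port: number;
  aiProvider: AiProvider;
}

const DEFAULTS: Config = {
  db: "~/.spendhawk/data.db",
  host: "localhost",
  port: 7749, // assumed default, matching the demo URL
  aiProvider: "none",
};

// Merges defaults < config file < environment, validating env values.
function resolveConfig(
  fileConfig: Partial<Config>,
  env: Record<string, string | undefined>,
): Config {
  const merged: Config = { ...DEFAULTS, ...fileConfig };
  if (env.SPENDHAWK_DB) merged.db = env.SPENDHAWK_DB;
  if (env.SPENDHAWK_HOST) merged.host = env.SPENDHAWK_HOST;
  if (env.SPENDHAWK_PORT) {
    const port = Number(env.SPENDHAWK_PORT);
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error(`invalid SPENDHAWK_PORT: ${env.SPENDHAWK_PORT}`);
    }
    merged.port = port;
  }
  const provider = env.SPENDHAWK_AI_PROVIDER;
  if (provider) {
    if (!(AI_PROVIDERS as readonly string[]).includes(provider)) {
      throw new Error(`invalid SPENDHAWK_AI_PROVIDER: ${provider}`);
    }
    merged.aiProvider = provider as AiProvider;
  }
  return merged;
}
```

In a real process the second argument would be process.env; passing it in explicitly keeps the function easy to test.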

Docker

docker compose up --build

The compose file exposes 127.0.0.1:7749 and stores the database in a named volume. If you change the bind address for a real deployment, set SPENDHAWK_AUTH_PASSWORD and SPENDHAWK_ENCRYPTION_KEY first.
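
The reason to set a password before changing the bind address is the startup guard described under Privacy Model. A guard of that kind might look like the sketch below. The assertSafeBind name is hypothetical, and treating every non-loopback host like 0.0.0.0 is an assumption; the documented behavior only mentions 0.0.0.0.

```typescript
// Refuses to start on a non-loopback address unless a password is set or
// the operator has explicitly opted out of auth.
function assertSafeBind(host: string, env: Record<string, string | undefined>): void {
  const loopback = host === "localhost" || host === "::1" || host.startsWith("127.");
  const hasAuth = Boolean(env.SPENDHAWK_AUTH_PASSWORD);
  const unsafeOptOut = env.SPENDHAWK_UNSAFE_DISABLE_AUTH === "true";
  if (!loopback && !hasAuth && !unsafeOptOut) {
    throw new Error(
      `refusing to bind to ${host} without auth; set SPENDHAWK_AUTH_PASSWORD ` +
        "or SPENDHAWK_UNSAFE_DISABLE_AUTH=true",
    );
  }
}
```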

How It Works

CSV / PDF / EML / Gmail / Webhook
  -> ingestors
  -> normalization, vendor matching, deduplication
  -> subscription builder
  -> rule analysis
  -> SQLite
  -> Hono API
  -> React dashboard

The built-in vendor catalog currently covers 260+ common SaaS vendors. Matching is intentionally explainable: SpendHawk keeps confidence scores, source hashes, evidence, and recommendation status in the local database so you can review how it reached a conclusion.
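
To illustrate what "explainable" matching with confidence scores and evidence can mean in practice, here is a small sketch. The catalog format, the matchVendor helper, and the confidence values (1, 0.9, 0.7) are invented for this example and are not SpendHawk's real matcher or catalog.

```typescript
// Hypothetical catalog entry: a vendor plus text patterns seen on statements.
interface CatalogEntry {
  vendor: string;
  category: string;
  patterns: string[];
}

interface Match {
  vendor: string;
  category: string;
  confidence: number; // 0..1, higher is stronger
  evidence: string; // human-readable reason for the match
}

// Lowercases and collapses punctuation so "SLACK*T0123" becomes "slack t0123".
function normalize(description: string): string {
  return description.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Returns the strongest match, or null when no pattern appears as whole words.
function matchVendor(description: string, catalog: CatalogEntry[]): Match | null {
  const text = normalize(description);
  let best: Match | null = null;
  for (const entry of catalog) {
    for (const pattern of entry.patterns) {
      const p = normalize(pattern);
      let confidence = 0;
      if (text === p) confidence = 1; // whole description is the pattern
      else if (text.startsWith(p + " ")) confidence = 0.9; // description leads with it
      else if (` ${text} `.includes(` ${p} `)) confidence = 0.7; // appears elsewhere
      if (confidence > 0 && (best === null || confidence > best.confidence)) {
        best = {
          vendor: entry.vendor,
          category: entry.category,
          confidence,
          evidence: `pattern "${pattern}" found in "${description}"`,
        };
      }
    }
  }
  return best;
}
```

Keeping the evidence string next to the score is what lets a reviewer see why a charge was attributed to a vendor.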

Repository Layout

| Path | What lives there |
| --- | --- |
| packages/core | Ingestors, normalization, vendor matching, rules, reports, SQLite access |
| packages/cli | spendhawk command line interface |
| server | Hono API, auth, upload routes, static dashboard serving |
| packages/dashboard | React dashboard built with Vite, React Query, Recharts, and Tailwind |
| demo | Demo seed data and sample transactions |
| docs | Setup, API, Gmail, security, and configuration notes |

Development

npm install
npm run build
npm run test
npm run lint

For dashboard work, run the API on localhost:7749 in one terminal:

npm -w spendhawk run start -- demo --no-open

Then start Vite in another terminal:

npm run dev

Vite serves the dashboard on http://localhost:5173 and proxies /api and /metrics to the local SpendHawk server.
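
For orientation, a dev-server proxy like the one described is typically a few lines of Vite configuration. The fragment below is a sketch that uses Vite's server.proxy option with the ports and paths from the text above; the actual config file in packages/dashboard may be organized differently.

```typescript
// vite.config.ts (sketch, not the repository's actual file)
export default {
  server: {
    port: 5173,
    proxy: {
      // Forward API and metrics requests to the local SpendHawk server.
      "/api": "http://localhost:7749",
      "/metrics": "http://localhost:7749",
    },
  },
};
```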

API and Docs

Important endpoints include:

  • GET /api/health
  • GET /api/stats
  • GET /api/subscriptions
  • GET /api/recommendations
  • POST /api/upload/csv
  • POST /api/scan
  • POST /api/webhook/incoming
  • GET /api/scan/:id/events
  • GET /api/report
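
A script can call these endpoints with plain fetch on Node.js 20. The sketch below only builds URLs and performs a GET. The apiUrl and getJson helpers are hypothetical, the response shapes are not documented here (hence unknown), JSON responses are an assumption, and any credentials required when SPENDHAWK_AUTH_PASSWORD is set are not handled.

```typescript
const BASE_URL = "http://localhost:7749";

// Fills ":name" placeholders in a route such as /api/scan/:id/events
// and returns an absolute URL.
function apiUrl(path: string, params: Record<string, string> = {}, base = BASE_URL): string {
  let resolved = path;
  for (const [name, value] of Object.entries(params)) {
    resolved = resolved.replace(`:${name}`, encodeURIComponent(value));
  }
  if (resolved.includes("/:")) throw new Error(`missing path parameter in ${path}`);
  return new URL(resolved, base).toString();
}

// Performs a GET and parses the body as JSON (assumed response format).
async function getJson(path: string, params: Record<string, string> = {}): Promise<unknown> {
  const response = await fetch(apiUrl(path, params));
  if (!response.ok) throw new Error(`${path} failed with HTTP ${response.status}`);
  return response.json();
}
```

For example, getJson("/api/health") could serve as a quick liveness check once the server is running.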

Known Limits

SpendHawk is not accounting software, and it should not be the only thing standing between you and a contract decision. Statement descriptions can be vague, PDF invoices vary wildly, and vendor matching is only as good as the evidence available. Use the recommendations as a shortlist for review, then confirm the details in the source system before canceling or downgrading anything important.

License

MIT
