pg-cdc
Talk to your Postgres from Claude in 5 minutes — no prod credentials, no cloud account.
[demo GIF: psql INSERT → Claude answer that includes the new row]
pg-cdc streams your Postgres WAL into typed Parquet files on your laptop, then exposes them through a local MCP server. Claude (or Cursor, or any MCP-compatible client) can answer real questions about your real, current data — without touching prod, and without ever seeing your DATABASE_URL.
It's the fastest way to let an AI chat with your Postgres safely.
1. Configure — pg-cdc.yml:
source:
postgres:
url: postgresql://localhost:5432/mydb
schemas: [public]
storage:
type: filesystem
path: ./data2. Run the daemon and the MCP server:
pg-cdc init && pg-cdc start & # snapshot + stream WAL → ./data
pg-cdc mcp & # serve the local MCP endpoint (stdio)The
queryandrecent_changesMCP tools shell out to DuckDB. Install it once withbrew install duckdb(macOS) or see duckdb.org/docs/installation/.
3. Point Claude Desktop at it — add to claude_desktop_config.json (use the absolute path to your config — Claude Desktop launches the subprocess from its own working directory, not yours):
{
"mcpServers": {
"pg-cdc": { "command": "pg-cdc", "args": ["mcp", "--config", "/absolute/path/to/pg-cdc.yml"] }
}
}Restart Claude. Ask "what's the latest row in orders?" — it answers from your real data.
Full walkthrough: docs/01-getting-started.md.
| pg-cdc | Hand Claude DATABASE_URL |
|
|---|---|---|
| Local & private | Data never leaves your machine | Credentials shared with the model |
| Real-time | WAL CDC — rows appear seconds after write | One-shot, stale by next question |
| Safe & fast | Parquet snapshot, prod untouched, columnar reads | Live queries against prod, lock contention |
| Cost | Free, single binary, no cloud | Risk of a bad query |
| Doc | Description |
|---|---|
| Getting Started | 5-minute MCP-first quickstart |
| Configuration | Full YAML reference |
| Init | Snapshot phase details |
| Streaming | WAL streaming mechanics |
| Compaction | Base + delta model, TTL semantics |
| MCP Server | Tool reference, client wiring, troubleshooting, security model |
| Operations | Production run-book, health checks, troubleshooting |
| Commercial Edition | Governance, ACL, Lake Formation reconciliation |
Download binary — see Releases for the full list of platforms. Latest stable is v0.2.0.
# Linux (amd64)
curl -fsSL https://github.com/burnside-project/pg-cdc/releases/download/v0.2.0/pg-cdc_v0.2.0_linux_amd64.tar.gz | tar xz
sudo install -m 0755 pg-cdc-linux-amd64 /usr/local/bin/pg-cdc
# Linux (arm64)
curl -fsSL https://github.com/burnside-project/pg-cdc/releases/download/v0.2.0/pg-cdc_v0.2.0_linux_arm64.tar.gz | tar xz
sudo install -m 0755 pg-cdc-linux-arm64 /usr/local/bin/pg-cdc
# macOS (Apple Silicon)
curl -fsSL https://github.com/burnside-project/pg-cdc/releases/download/v0.2.0/pg-cdc_v0.2.0_darwin_arm64.tar.gz | tar xz
sudo install -m 0755 pg-cdc-darwin-arm64 /usr/local/bin/pg-cdcWindows (amd64) — PowerShell:
Invoke-WebRequest https://github.com/burnside-project/pg-cdc/releases/download/v0.2.0/pg-cdc_v0.2.0_windows_amd64.zip -OutFile pg-cdc.zip
Expand-Archive pg-cdc.zip -DestinationPath .
# Move pg-cdc-windows-amd64.exe somewhere on your PATH, e.g.:
New-Item -ItemType Directory -Force "$env:USERPROFILE\bin" | Out-Null
Move-Item pg-cdc-windows-amd64.exe "$env:USERPROFILE\bin\pg-cdc.exe"
# Add %USERPROFILE%\bin to PATH if it isn't already.Build from source:
git clone https://github.com/burnside-project/pg-cdc.git
cd pg-cdc
make buildThe 5-minute quickstart above runs everything locally. For production deployments — running pg-cdc as a systemd service, writing to S3/GCS, registering Glue tables, configuring TLS to RDS — see the Operations guide.
Source
- PostgreSQL logical replication (pgx/v5, pglogrepl)
- Per-schema discovery
- Declarative table include/exclude rules
- Tag-based table policy (e.g.,
pii,ephemeraltags → include/exclude)
Output
- Typed Parquet (pure Go, no CGO)
- Base snapshots + append-only delta epochs
- Compaction into new base (applies I/U/D; soft-deletes on 30d TTL)
- Manifest file per table (schema + epoch ordering)
Sinks
- Filesystem
- S3
- GCS
Catalog
- AWS Glue (optional; register manifest tables without re-snapshotting)
AI / MCP
- Local MCP server (
pg-cdc mcp) — read-only, single-user, stdio - Tools:
list_tables,describe_table,query,recent_changes - Multi-user / authenticated MCP — commercial
Operations
- Single static binary, no CGO
- Linux amd64/arm64, macOS arm64, Windows amd64
- SQLite state tracking (LSN, epoch watermarks, table metadata)
- Role → table → column ACL discovery from PostgreSQL GRANTs
- Automated RC releases on every push to main; stable releases via workflow dispatch
| Command | What it does |
|---|---|
init |
Snapshot tables → base Parquet + manifest + (optional) Glue catalog |
start |
Stream WAL → delta Parquet epochs |
mcp |
Serve a local MCP endpoint over the Parquet output (read-only, single-user) |
compact |
Merge deltas → new base snapshot (applies I/U/D; soft-deletes on TTL) |
status |
Health: lag, LSN, epochs, tables |
discover |
List tables from Postgres |
discover --acl |
Show role → table → column access map from PostgreSQL GRANTs |
teardown |
Drop publication + replication slot |
catalog register |
Register manifest tables in Glue without re-snapshotting |
version |
Print version |
Full reference in docs/08-operations.md.
Production example with S3 + Glue and tag-based policy:
source:
postgres:
url: "postgresql://cdc_user:${PGCDC_PASSWORD}@host:5432/db"
schemas: ["public"]
storage:
type: s3
bucket: my-warehouse
prefix: cdc/
region: us-west-2
catalog:
type: glue
database: my_db
region: us-west-2
tables:
exclude: ["public.tbl_sessions"]
tags:
pii: ["public.tbl_cc", "billing.*"]
ephemeral: ["*.tbl_session*"]
policy:
pii: exclude
ephemeral: exclude
untagged: includeFull reference: docs/02-configuration.md.
The open-source edition is a complete, working product:
- Full CDC pipeline — logical replication, base/delta Parquet output, compaction
- Three sink adapters — filesystem, S3, GCS
- Glue catalog registration (optional)
- SQLite-backed state
- Local MCP server (
pg-cdc mcp) — single-user, read-only, stdio - Postgres-native ACL discovery (
discover --acl) - Tag-based table inclusion / exclusion
Production governance, compliance, and multi-user features are commercial:
- Layer-2 tag governance (policy-as-code) with required-tag enforcement
- DynamoDB-backed ACL registry with versioned audit trail
- AWS Lake Formation reconciliation (
acl diff,acl sync) - Authenticated multi-user MCP server with row/column-level access control
- Emergency-override workflows with expiry
- Terraform stack for IAM / OIDC / governance provisioning
- Extended CLI:
pg-cdc acl register|get|set|diff|sync|list - HIPAA-ready deployment topology
See docs/commercial-edition.md for the end-to-end governance flow with screenshots: Parquet output → Glue Catalog → ACL workflow → DynamoDB registry → Lake Formation LF-Tags.
| Repo | Role |
|---|---|
| pg-cdc (this repo) | CDC server — WAL streaming, Parquet writing, compaction |
| burnside-go | Shared types — manifest spec, storage interface |
| pg-warehouse | Local-first analytical engine that can consume CDC output |
| Layer | Technology |
|---|---|
| Language | Go 1.25 (pure Go, no CGO) |
| CLI | Cobra |
| PostgreSQL | pgx/v5, pglogrepl |
| Parquet | parquet-go (pure Go) |
| State | SQLite (modernc.org/sqlite) |
| Storage | Filesystem, S3, GCS |
| Platforms | Linux amd64/arm64, macOS arm64, Windows amd64 |
Release candidates auto-increment on every push to main against the version in VERSION: e.g. v0.2.0-rc1, v0.2.0-rc2, … When ready, a stable release is promoted from the latest RC via the Release workflow dispatch (creates the bare vX.Y.Z tag, marks the GitHub release as --latest, and renames the binaries).
Latest stable: v0.2.0 — see Releases. Maintainer guide: RELEASING.md.
- GitHub Issues — bugs and feature requests
- GitHub Discussions — questions and ideas
- Contributing — development setup and guidelines
- Code of Conduct
- Security Policy
Apache License 2.0 — Copyright 2025-2026 Burnside Project
