Skip to content

burnside-project/pg-cdc

pg-cdc

Talk to your Postgres from Claude in 5 minutes — no prod credentials, no cloud account.

CI Release License Go Stars

[demo GIF: psql INSERT → Claude answer that includes the new row]

What is pg-cdc?

pg-cdc streams your Postgres WAL into typed Parquet files on your laptop, then exposes them through a local MCP server. Claude (or Cursor, or any MCP-compatible client) can answer real questions about your real, current data — without touching prod, and without ever seeing your DATABASE_URL.

It's the fastest way to let an AI chat with your Postgres safely.

5-minute quickstart

1. Configurepg-cdc.yml:

source:
  postgres:
    url: postgresql://localhost:5432/mydb
    schemas: [public]
storage:
  type: filesystem
  path: ./data

2. Run the daemon and the MCP server:

pg-cdc init && pg-cdc start &   # snapshot + stream WAL → ./data
pg-cdc mcp &                    # serve the local MCP endpoint (stdio)

The query and recent_changes MCP tools shell out to DuckDB. Install it once with brew install duckdb (macOS) or see duckdb.org/docs/installation/.

3. Point Claude Desktop at it — add to claude_desktop_config.json (use the absolute path to your config — Claude Desktop launches the subprocess from its own working directory, not yours):

{
  "mcpServers": {
    "pg-cdc": { "command": "pg-cdc", "args": ["mcp", "--config", "/absolute/path/to/pg-cdc.yml"] }
  }
}

Restart Claude. Ask "what's the latest row in orders?" — it answers from your real data.

Full walkthrough: docs/01-getting-started.md.

Why this, not just give Claude a DATABASE_URL?

pg-cdc Hand Claude DATABASE_URL
Local & private Data never leaves your machine Credentials shared with the model
Real-time WAL CDC — rows appear seconds after write One-shot, stale by next question
Safe & fast Parquet snapshot, prod untouched, columnar reads Live queries against prod, lock contention
Cost Free, single binary, no cloud Risk of a bad query

Architecture

1.png

Docs

Doc Description
Getting Started 5-minute MCP-first quickstart
Configuration Full YAML reference
Init Snapshot phase details
Streaming WAL streaming mechanics
Compaction Base + delta model, TTL semantics
MCP Server Tool reference, client wiring, troubleshooting, security model
Operations Production run-book, health checks, troubleshooting
Commercial Edition Governance, ACL, Lake Formation reconciliation

Install

Download binary — see Releases for the full list of platforms. Latest stable is v0.2.0.

# Linux (amd64)
curl -fsSL https://github.com/burnside-project/pg-cdc/releases/download/v0.2.0/pg-cdc_v0.2.0_linux_amd64.tar.gz | tar xz
sudo install -m 0755 pg-cdc-linux-amd64 /usr/local/bin/pg-cdc

# Linux (arm64)
curl -fsSL https://github.com/burnside-project/pg-cdc/releases/download/v0.2.0/pg-cdc_v0.2.0_linux_arm64.tar.gz | tar xz
sudo install -m 0755 pg-cdc-linux-arm64 /usr/local/bin/pg-cdc

# macOS (Apple Silicon)
curl -fsSL https://github.com/burnside-project/pg-cdc/releases/download/v0.2.0/pg-cdc_v0.2.0_darwin_arm64.tar.gz | tar xz
sudo install -m 0755 pg-cdc-darwin-arm64 /usr/local/bin/pg-cdc

Windows (amd64) — PowerShell:

Invoke-WebRequest https://github.com/burnside-project/pg-cdc/releases/download/v0.2.0/pg-cdc_v0.2.0_windows_amd64.zip -OutFile pg-cdc.zip
Expand-Archive pg-cdc.zip -DestinationPath .
# Move pg-cdc-windows-amd64.exe somewhere on your PATH, e.g.:
New-Item -ItemType Directory -Force "$env:USERPROFILE\bin" | Out-Null
Move-Item pg-cdc-windows-amd64.exe "$env:USERPROFILE\bin\pg-cdc.exe"
# Add %USERPROFILE%\bin to PATH if it isn't already.

Build from source:

git clone https://github.com/burnside-project/pg-cdc.git
cd pg-cdc
make build

Production deployment

The 5-minute quickstart above runs everything locally. For production deployments — running pg-cdc as a systemd service, writing to S3/GCS, registering Glue tables, configuring TLS to RDS — see the Operations guide.

Features

Source

  • PostgreSQL logical replication (pgx/v5, pglogrepl)
  • Per-schema discovery
  • Declarative table include/exclude rules
  • Tag-based table policy (e.g., pii, ephemeral tags → include/exclude)

Output

  • Typed Parquet (pure Go, no CGO)
  • Base snapshots + append-only delta epochs
  • Compaction into new base (applies I/U/D; soft-deletes on 30d TTL)
  • Manifest file per table (schema + epoch ordering)

Sinks

  • Filesystem
  • S3
  • GCS

Catalog

  • AWS Glue (optional; register manifest tables without re-snapshotting)

AI / MCP

  • Local MCP server (pg-cdc mcp) — read-only, single-user, stdio
  • Tools: list_tables, describe_table, query, recent_changes
  • Multi-user / authenticated MCP — commercial

Operations

  • Single static binary, no CGO
  • Linux amd64/arm64, macOS arm64, Windows amd64
  • SQLite state tracking (LSN, epoch watermarks, table metadata)
  • Role → table → column ACL discovery from PostgreSQL GRANTs
  • Automated RC releases on every push to main; stable releases via workflow dispatch

Commands

Command What it does
init Snapshot tables → base Parquet + manifest + (optional) Glue catalog
start Stream WAL → delta Parquet epochs
mcp Serve a local MCP endpoint over the Parquet output (read-only, single-user)
compact Merge deltas → new base snapshot (applies I/U/D; soft-deletes on TTL)
status Health: lag, LSN, epochs, tables
discover List tables from Postgres
discover --acl Show role → table → column access map from PostgreSQL GRANTs
teardown Drop publication + replication slot
catalog register Register manifest tables in Glue without re-snapshotting
version Print version

Full reference in docs/08-operations.md.

Configuration

Production example with S3 + Glue and tag-based policy:

source:
  postgres:
    url: "postgresql://cdc_user:${PGCDC_PASSWORD}@host:5432/db"
    schemas: ["public"]

storage:
  type: s3
  bucket: my-warehouse
  prefix: cdc/
  region: us-west-2

catalog:
  type: glue
  database: my_db
  region: us-west-2

tables:
  exclude: ["public.tbl_sessions"]
  tags:
    pii: ["public.tbl_cc", "billing.*"]
    ephemeral: ["*.tbl_session*"]
  policy:
    pii: exclude
    ephemeral: exclude
    untagged: include

Full reference: docs/02-configuration.md.

Open Core

The open-source edition is a complete, working product:

  • Full CDC pipeline — logical replication, base/delta Parquet output, compaction
  • Three sink adapters — filesystem, S3, GCS
  • Glue catalog registration (optional)
  • SQLite-backed state
  • Local MCP server (pg-cdc mcp) — single-user, read-only, stdio
  • Postgres-native ACL discovery (discover --acl)
  • Tag-based table inclusion / exclusion

Production governance, compliance, and multi-user features are commercial:

  • Layer-2 tag governance (policy-as-code) with required-tag enforcement
  • DynamoDB-backed ACL registry with versioned audit trail
  • AWS Lake Formation reconciliation (acl diff, acl sync)
  • Authenticated multi-user MCP server with row/column-level access control
  • Emergency-override workflows with expiry
  • Terraform stack for IAM / OIDC / governance provisioning
  • Extended CLI: pg-cdc acl register|get|set|diff|sync|list
  • HIPAA-ready deployment topology

See docs/commercial-edition.md for the end-to-end governance flow with screenshots: Parquet output → Glue Catalog → ACL workflow → DynamoDB registry → Lake Formation LF-Tags.

Related repos

Repo Role
pg-cdc (this repo) CDC server — WAL streaming, Parquet writing, compaction
burnside-go Shared types — manifest spec, storage interface
pg-warehouse Local-first analytical engine that can consume CDC output

Tech stack

Layer Technology
Language Go 1.25 (pure Go, no CGO)
CLI Cobra
PostgreSQL pgx/v5, pglogrepl
Parquet parquet-go (pure Go)
State SQLite (modernc.org/sqlite)
Storage Filesystem, S3, GCS
Platforms Linux amd64/arm64, macOS arm64, Windows amd64

Versioning

Release candidates auto-increment on every push to main against the version in VERSION: e.g. v0.2.0-rc1, v0.2.0-rc2, … When ready, a stable release is promoted from the latest RC via the Release workflow dispatch (creates the bare vX.Y.Z tag, marks the GitHub release as --latest, and renames the binaries).

Latest stable: v0.2.0 — see Releases. Maintainer guide: RELEASING.md.

Community

License

Apache License 2.0 — Copyright 2025-2026 Burnside Project

About

PostgreSQL CDC server. Streams WAL changes into typed, compacted Parquet files in cloud storage. Follows the native Postgres replica pattern.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors