postgres: add primary/secondary cluster sample app with traffic generator#122
Merged
…enerator
- Add cloud-init-primary-template.yaml: sets up PostgreSQL as a streaming replication primary with shared_preload_libraries, wal_level=replica, replication user, and pg_hba.conf entries for the standby
- Add cloud-init-secondary-template.yaml: runs pg_basebackup from the primary (using -R to auto-configure standby.signal and primary_conninfo) and starts as a hot standby
- Both nodes ship metrics and logs via Alloy using constants.hostname as the instance label, so primary and secondary appear as separate instances in dashboards
- Add traffic/schema.sql: realistic e-commerce schema (users, products, orders, order_items) with seed data
- Add traffic/load.sh: continuous mixed workload (reads, inserts, updates, deletes, lock contention, cleanup) with weighted random selection
- Update Makefile: run launches the full cluster (launch-primary, wait-primary, launch-secondary); fix defaultconfig to use > instead of >> to avoid duplicate entries on repeated runs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
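The weighted random selection mentioned for traffic/load.sh could look roughly like the sketch below. The workload names and weights are illustrative assumptions, not the values used in the PR; only the WORKLOADS array name comes from the commit messages.

```bash
#!/usr/bin/env bash
# Sketch of weighted random workload selection.
# Names and weights below are hypothetical.

# Each entry is "name:weight"; a higher weight means the workload is picked more often.
WEIGHTED=("read:5" "insert:3" "update:2" "delete:1")

# Expand the weights into a flat array so that a uniform random index
# yields a pick proportional to each workload's weight.
WORKLOADS=()
for entry in "${WEIGHTED[@]}"; do
  name="${entry%%:*}"
  weight="${entry##*:}"
  for ((i = 0; i < weight; i++)); do
    WORKLOADS+=("$name")
  done
done

# Print one workload name chosen at random.
pick_workload() {
  echo "${WORKLOADS[RANDOM % ${#WORKLOADS[@]}]}"
}

pick_workload
```

A worker loop would call pick_workload on each iteration and dispatch to the matching workload function.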
…with contention workloads
- Add cluster="postgres-cluster" relabel rule to both primary and secondary
alloy configs so both nodes can be grouped/filtered by cluster in dashboards
- Add lock_contention workload (holds row locks for ~2s) to surface wait events
- Add slow_scan workload (full table joins without index hints) for query latency
- Add long_read workload (pg_sleep inside transaction) to trigger long_running_transactions collector
- Increase workload concurrency weights and reduce sleep to drive higher throughput
- Fix Jinja2 template issue: wrap ${#WORKLOADS[@]} in {% raw %}...{% endraw %} to
prevent Jinja2 interpreting {# as a comment tag
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
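As a rough illustration of the lock_contention workload, the sketch below emits SQL that takes a row lock and holds it for about two seconds. The products table is part of the sample schema; the id column, the function name, and the default arguments are assumptions.

```bash
#!/usr/bin/env bash
# Sketch: emit SQL that locks one row and keeps the lock for a while.
# The `id` column and the defaults are assumptions, not taken from the PR.

lock_contention_sql() {
  local row_id="${1:-1}"        # row to lock
  local hold_seconds="${2:-2}"  # how long the transaction keeps the lock
  cat <<SQL
BEGIN;
SELECT id FROM products WHERE id = ${row_id} FOR UPDATE;
SELECT pg_sleep(${hold_seconds});
COMMIT;
SQL
}

# Print the statements for row 42, held for 2 seconds.
lock_contention_sql 42 2
```

Piping the output into psql from two workers at once, with the same row id, should make the second session wait on the first, which is what surfaces lock wait events.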
- Bump seed data to 2000 users, 500 products, 2000 orders for heavier workloads
- Add 10 parallel workers (up from 5) for increased concurrency
- Add workload_idle_in_transaction: holds row locks via shell-level sleep to produce real idle-in-transaction sessions in pg_stat_activity
- Add workload_blocked: targets same rows to create genuine lock queue
- Add workload_heavy_analytics and workload_slow_report for long-running queries
- Fix auto_explain removal (caused setup script to abort before init flag)
- Enable slow query logging: log_min_duration_statement=200ms, log_connections, log_disconnections, log_lock_waits, log_temp_files, log_checkpoints, log_autovacuum_min_duration=250ms, log_statement=ddl, log_line_prefix
- Add load-gen to make run target so traffic starts automatically
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
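The shell-level sleep is what distinguishes an idle-in-transaction session from one running pg_sleep: the pause happens between statements, so the server sees an open transaction with no active query. A minimal sketch of that idea follows; the function name, the id column, and the defaults are assumptions.

```bash
#!/usr/bin/env bash
# Sketch: produce a statement stream that leaves a session idle in transaction.
# The pause is in the shell, not in SQL, so no query is running while it waits.
# The `id` column and the defaults are assumptions.

idle_in_transaction_stream() {
  local row_id="${1:-1}"
  local idle_seconds="${2:-5}"
  echo "BEGIN;"
  echo "SELECT id FROM products WHERE id = ${row_id} FOR UPDATE;"
  sleep "$idle_seconds"   # the consumer of this stream sits idle here, lock still held
  echo "COMMIT;"
}

# Zero-second pause, just to show the emitted statements.
idle_in_transaction_stream 1 0
```

When the stream is piped into psql, the first two statements are executed as they arrive and the session then waits for more input, so it should appear as idle in transaction in pg_stat_activity until the COMMIT is sent.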
Make's line continuation replaces '\' + newline with a space, causing printf to output leading spaces before each key after the first line. This produced indented YAML keys which failed to parse in CI. Fix by using separate printf calls per line instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
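The fix can be pictured as the shell sketch below: each key is written by its own printf call, so no key inherits whitespace from a continued line. The key names, values, and function name are hypothetical; the first call truncates with > so repeated runs do not accumulate duplicate entries.

```bash
#!/usr/bin/env bash
# Sketch: write each YAML key with a separate printf call.
# Keys and values are hypothetical placeholders.

write_default_config() {
  local out_file="$1"
  # First write truncates the file; later writes append.
  printf 'prom_url: %s\n' "http://localhost:9090/api/v1/write" > "$out_file"
  printf 'loki_url: %s\n' "http://localhost:3100/loki/api/v1/push" >> "$out_file"
  printf 'interval: %s\n' "10s" >> "$out_file"
}

tmp_file="$(mktemp)"
write_default_config "$tmp_file"
cat "$tmp_file"
rm -f "$tmp_file"
```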
…late
Instead of embedding load.sh and schema.sql inline in the cloud-init YAML, keep them as standalone files in traffic/ and transfer them to the VM via multipass transfer after the primary is up (new setup-traffic Makefile target).
- traffic/schema.sql: bump to 2000 users, 500 products, 2000 orders
- traffic/load.sh: port full parallel worker logic with idle-in-transaction, blocked, lock contention, heavy analytics, and slow report workloads
- cloud-init-primary-template.yaml: remove embedded script write_files blocks
- Makefile: add setup-traffic target; wire into run after wait-primary
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
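The transfer step might be sketched as follows. The VM name, the destination directory, and the function name are assumptions; the optional third argument exists only so the commands can be previewed without multipass installed.

```bash
#!/usr/bin/env bash
# Sketch: copy the traffic files into the primary VM with multipass transfer.
# VM name and destination directory are assumptions.

transfer_traffic_files() {
  local vm_name="$1"
  local dest_dir="$2"
  local runner="${3:-multipass}"  # pass "echo multipass" to preview the commands
  local file
  for file in traffic/schema.sql traffic/load.sh; do
    # $runner is intentionally unquoted so "echo multipass" splits into two words.
    $runner transfer "$file" "${vm_name}:${dest_dir}/$(basename "$file")"
  done
}

# Preview only: prints the commands instead of running them.
transfer_traffic_files postgres-primary /tmp "echo multipass"
```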
Dasomeone (Member) reviewed on Mar 11, 2026 and left a comment:
Couple small things, otherwise LGTM:
- Delete the existing cloud-init file and remove references to it in the Makefile, given that `run-ci` actually invokes `run`, which uses the new setup.
- For the matching integration, could you please rerun generation for the metric_names file and update the one here as well?
  Should let us check that any change in metrics is still covered by the sample app, not that I doubt that it is.
- Delete unused cloud-init-template.yaml (replaced by primary/secondary templates)
- Fix run-ci target: remove duplicate load-gen (already included via run)
- Update linux_metrics to match current enabled_collectors config
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dasomeone (Member) approved these changes on Mar 11, 2026 and left a comment:
LGTM, thanks for updating this (and the metrics list!) :D
Summary
- … (`traffic/load.sh` + `traffic/schema.sql`) with an e-commerce schema (users, products, orders, order_items) and mixed read/write workloads
- … `postgres_cluster` label to all metrics via Alloy relabeling so both nodes can be grouped/filtered in Grafana

Changes
New files:
- `jinja/templates/cloud-init-primary-template.yaml`: cloud-init for the primary (configures WAL replication and `pg_stat_statements`, creates the replication user, seeds the schema, starts Alloy)
- `jinja/templates/cloud-init-secondary-template.yaml`: cloud-init for the secondary (runs `pg_basebackup` against the primary and starts as a hot standby)
- `traffic/schema.sql`: e-commerce schema with 500 users, 200 products, 300 seeded orders
- `traffic/load.sh`: continuous mixed workload with weighted random selection

Updated:
- `Makefile`: added `run`, `run-primary`, `run-secondary`, `wait-primary`, `render-primary-config`, `render-secondary-config`, `launch-primary`, `launch-secondary` targets

How to run
Test plan
- … `pg_basebackup` and starts as hot standby (`pg_is_in_recovery() = true`)
- … `job="integrations/postgres_exporter"`, `instance=<hostname>`, `postgres_cluster="postgres-cluster"`
- … `/var/log/postgresql/` ship to Loki

🤖 Generated with Claude Code
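The hot-standby check in the test plan could be scripted along these lines. This is only a sketch: the function name is hypothetical, and it assumes the query result is read with psql in unaligned, tuples-only mode, where booleans print as t or f.

```bash
#!/usr/bin/env bash
# Sketch: map the result of `SELECT pg_is_in_recovery()` to a node role.
# Assumes the value comes from `psql -At`, which prints booleans as "t" or "f".

node_role() {
  case "$1" in
    t) echo "standby" ;;
    f) echo "primary" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# Example with a literal value; on a VM the argument would come from psql.
node_role t
```

On the secondary, something like `node_role "$(psql -At -c 'SELECT pg_is_in_recovery()')"` would be expected to print standby once pg_basebackup has completed and the server has started.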