Skip to content

SPOC-442: Use NULL for unknown local_origin instead of -1#2

Open
rasifr wants to merge 22 commits into
rasifr:v5_STABLEfrom
pgEdge:task/SPOC-442/unknown-origin
Open

SPOC-442: Use NULL for unknown local_origin instead of -1#2
rasifr wants to merge 22 commits into
rasifr:v5_STABLEfrom
pgEdge:task/SPOC-442/unknown-origin

Conversation

@rasifr
Copy link
Copy Markdown
Owner

@rasifr rasifr commented Feb 18, 2026

SPOC-442: Use NULL for unknown local_origin instead of -1

When the origin of a local tuple cannot be determined (e.g., data loaded
via pg_dump, frozen transactions, or truncated commit timestamps), store
NULL in spock.resolutions.local_origin instead of the value 0
(InvalidRepOriginId).

Also change conflict log messages to print "unknown" instead of the
origin ID when the origin is InvalidRepOriginId or not available.

Bug fixes found during testing:

  • Fixed off-by-one error in spock_conflict_row_to_json() calls that were
    incorrectly passing &nulls[7] instead of &nulls[8] for local_tuple,
    and &nulls[11] instead of &nulls[12] for remote_tuple. This was
    overwriting the local_origin NULL flag.

Includes test case (014_pgdump_restore_conflict.pl) that reproduces the
pg_dump/restore scenario where rows have no replication origin tracking,
verifying that:

  • Conflicts are logged with local_origin = NULL
  • Resolution is apply_remote (correct behavior)
  • Data converges correctly despite conflicts

mason-sharp and others added 22 commits February 9, 2026 14:01
Allocate MyProcPort after BackgroundWorkerInitializeConnectionByOid
so that InitPostgres doesn't see a non-NULL MyProcPort and try to
inspect SSL/GSS state on our fake Port (which would segfault).
A psql script can't rely on a PGDATABASE value. Using the default value also
looks unreliable. Hence, let's require dbname and throw an exception if it is
not mentioned in the DSN string.

Also, centralise manipulations with the dbname string to ease further
potential changes.
A node is added under high load of 'blind' UPDATE transactions.
These transactions designed to update separate parts of the table on
different nodes to avoid any conflict issues during the test.
Now, each apply progress HTAB entry contains tangled values of
remote_commit_lsn and remote_commit_ts columns (see handle_commit). So, when
extracting data from the progress view, we can use any of these numbers, and
using LSN streamlines the check_commit_timestamp_and_advance_slot procedure:
we can directly point to the correct WAL record, avoiding the expensive
get_lsn_from_commit_ts() call.

So, replace the expensive search for the proper LSN with direct picking of
a specific WAL record.

TODO:
Further commits should also change the Python version. And consequently,
remove the function get_lsn_from_commit_ts() from the extension altogether.
Add health check from zodan.py to zodan.sql
Add verification from zodan.sql to zodan.py
Remove the following unused routines:
- get_commit_timestamp
- advance_replication_slot
- monitor_replication_lag(text, boolean)
- monitor_replication_lag_wait
- monitor_multiple_replication_lags
- wait_for_n3_sync
- check_node_lag

In the procedure monitor_lag_with_dblink, use the function clock_timestamp
instead of now(). The now() function is stable and doesn't change the value
within the monitoring loop. So, the volatile clock_timestamp looks more correct
to obtain fresh value of time.
  When adding a third node in a Z0DAN sync scenario, the origin advancement
  logic in spock.check_commit_timestamp_and_advance_slot was using
  spock.lag_tracker to retrieve commit timestamps and convert them back to LSNs.
  This approach no longer works because spock.progress is now an in-memory HTAB
  instead of a catalog table, so lag_tracker doesn't retain historical
  data after the SYNC process COPY operation.

  Root Cause:
  The procedure spock.create_disable_subscriptions_and_slots creates logical
  slots on existing nodes (e.g., n2) when adding a new node (n3). In v5,
  the commit LSN/timestamp from the source node (n1) was copied to n3 via
  lag_tracker during SYNC, and spock.check_commit_timestamp_and_advance_slot
  would use this to advance the origin. With the HTAB-based progress table, this
  data is no longer available after COPY.

  The Fix:
  1. Capture the LSN returned by pg_create_logical_replication_slot in
     create_disable_subscriptions_and_slots and store it in temp_sync_lsns
  2. Use this LSN directly in check_commit_timestamp_and_advance_slot to
     advance the origin, eliminating the dependency on lag_tracker and the
     timestamp→LSN conversion

  This approach is more reliable because it uses the authoritative LSN from
  slot creation - the exact point where the apply/sync process will begin
  decoding when the subscription is enabled.
Apply the same fix from commit 86acd7b to zodan.py:
- Use LSN from pg_create_logical_replication_slot()
- Advance slot to stored commit LSN instead of querying lag_tracker
- Store both sync_lsn and commit_lsn for later use
* zodan.sql: add subscriptions pre-check.

With this commit, we require that each node of the Spock cluster has only
enabled subscriptions. There is no algorithmic reason to do that, because when
adding the node, we are concerned only about the donor's state and
the consistency of its progress table. However, such a strict procedure ensures
that users can be sure they add a node to a healthy cluster, take action
beforehand if it is not healthy, and that replication doesn't delay WAL cutting.
Introduce a TAP test that could be used to check Z0DAN checks.

XXX: add_node finishes its job with subscriptions in the 'initialising' state.
Do we need to take any action here?

Also, fix the annoying WARNING on the 'exception_replay_queue_size'.

* Check subscription state in the wait_for_sync_event.

If no group's subscription is enabled, wait_for_sync_event may get stuck in
an infinite loop. Add a check of the subscription state in the waiting loop
to reflect the fact that an apply worker may quietly die, deactivating its
subscription.

Also, add the correct processing of this behaviour to the Z0DAN and
include a TAP test.

Original cherry-pick fbe0cbc, modified to move sql file changes
to sql/spock--5.0.4--5.0.5.sql
May currently timeout, we have coverage with 090 and 010.
Unrelated Zodan change had been slipped in.
Since we reverted cbc5bb1, fix another way and add a test.
…invalidation of newly created replication slots".
When the origin of a local tuple cannot be determined (e.g., data loaded
via pg_dump, frozen transactions, or truncated commit timestamps), store
NULL in spock.resolutions.local_origin instead of the value 0
(InvalidRepOriginId).

Also change conflict log messages to print "unknown" instead of the
origin ID when the origin is InvalidRepOriginId or not available.

Bug fixes found during testing:
- Fixed off-by-one error in spock_conflict_row_to_json() calls that were
  incorrectly passing &nulls[7] instead of &nulls[8] for local_tuple,
  and &nulls[11] instead of &nulls[12] for remote_tuple. This was
  overwriting the local_origin NULL flag.

Includes test case (014_pgdump_restore_conflict.pl) that reproduces the
pg_dump/restore scenario where rows have no replication origin tracking,
verifying that:
- Conflicts are logged with local_origin = NULL
- Resolution is apply_remote (correct behavior)
- Data converges correctly despite conflicts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants