Feature/spoc 266/resolutions table cleanup#3
Conversation
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Duplication | 0 |
TIP This summary will be updated as you push new changes. Give us feedback
d0a5e22 to
053a7b7
Compare
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a GUC Changes
Sequence DiagramsequenceDiagram
participant AW as Apply Worker
participant CF as spock_cleanup_resolutions()
participant SPI as SPI Interface
participant DB as Database
loop apply_work() iterations
AW->>AW: Not in transaction? and retention_days > 0 and interval due?
alt cleanup due
AW->>CF: invoke cleanup
CF->>CF: StartTransactionCommand & PushActiveSnapshot
CF->>SPI: SPI_connect + SPI_execute(DELETE WHERE log_time < cutoff)
SPI->>DB: execute DELETE
DB-->>SPI: return affected rows
SPI-->>CF: SPI_processed count
CF->>CF: PopSnapshot & CommitTransaction
CF-->>AW: return deleted count
else skip
AW->>AW: continue processing
end
AW->>AW: update last_cleanup_timestamp (if run)
end
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
src/spock_conflict.c (2)
962-967: AbortCurrentTransaction should precede SPI_finish when aborting.When an error occurs and the transaction needs to be aborted,
AbortCurrentTransaction()should be called beforeSPI_finish(). After an error, SPI connection state may be inconsistent, and callingSPI_finish()on a failed transaction can lead to issues. Additionally, the snapshot should only be popped if we pushed it (i.e., ifstarted_txis true), but in the error path this is correctly checked.However, the more robust pattern is to let
AbortCurrentTransaction()handle the cleanup:♻️ Proposed fix for cleanup ordering
MemoryContextSwitchTo(CurrentMemoryContext); edata = CopyErrorData(); FlushErrorState(); - SPI_finish(); if (started_tx) { PopActiveSnapshot(); AbortCurrentTransaction(); } + else + { + /* Only finish SPI if we didn't abort - abort handles cleanup */ + SPI_finish(); + } ereport(WARNING, (errcode(edata->sqlerrcode), errmsg("%s", edata->message)));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_conflict.c` around lines 962 - 967, The cleanup ordering is wrong: when aborting you must call AbortCurrentTransaction() before SPI_finish() and only PopActiveSnapshot() if started_tx is true; update the block that currently calls SPI_finish(); if (started_tx) { PopActiveSnapshot(); AbortCurrentTransaction(); } to instead perform AbortCurrentTransaction() first (so AbortCurrentTransaction(); if (started_tx) PopActiveSnapshot();) and then call SPI_finish() after the abort to let the transaction abort handle SPI cleanup safely (adjust code around SPI_finish, AbortCurrentTransaction, PopActiveSnapshot and started_tx accordingly).
919-921: Consider using quote_identifier for the table name.While
CATALOG_LOGTABLEis a compile-time constant and currently safe, usingquote_identifier()for SQL identifiers is a defensive practice that protects against future changes where the table name might contain special characters.♻️ Optional improvement
initStringInfo(&cmd); appendStringInfo(&cmd, - "DELETE FROM spock.%s WHERE log_time < now() - '%d days'::interval", - CATALOG_LOGTABLE, spock_resolutions_retention_days); + "DELETE FROM spock.%s WHERE log_time < CURRENT_TIMESTAMP - INTERVAL '%d days'", + quote_identifier(CATALOG_LOGTABLE), spock_resolutions_retention_days);Note: Also changed
now()toCURRENT_TIMESTAMPfor SQL standard compliance (though both work identically in PostgreSQL).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_conflict.c` around lines 919 - 921, The SQL being built with appendStringInfo uses the raw identifier CATALOG_LOGTABLE; replace that with a properly quoted identifier by using quote_identifier(CATALOG_LOGTABLE) when constructing the SQL string (keep appendStringInfo and cmd, and still interpolate spock_resolutions_retention_days as before) so the table name is safely escaped; optionally also consider switching now() to CURRENT_TIMESTAMP if you want SQL-standard timestamp usage.tests/regress/sql/resolutions_retention.sql (2)
99-101: Consider resetting session-level GUC at cleanup.The test sets
spock.resolutions_retention_daysat the session level (lines 49, 68, 80) but doesn't reset it at the end. While this may not affect subsequent tests due to session isolation, adding aRESET spock.resolutions_retention_days;would make the cleanup more complete.♻️ Optional: Reset session GUC
\c :subscriber_dsn +RESET spock.resolutions_retention_days; ALTER SYSTEM SET spock.save_resolutions = off; SELECT pg_reload_conf();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/regress/sql/resolutions_retention.sql` around lines 99 - 101, Add a session-level RESET for the GUC that the test modified: after the test cleanup block (where you currently run ALTER SYSTEM SET spock.save_resolutions and pg_reload_conf), add a RESET spock.resolutions_retention_days; to explicitly clear the session-level setting changed earlier (set commands that used spock.resolutions_retention_days) so the session is returned to defaults for subsequent tests.
82-83: pg_sleep(0.5) may be insufficient for reliable config reload.The
pg_reload_conf()call is asynchronous - it signals the postmaster to reload configuration, but the actual reload in backend processes happens asynchronously. A 0.5 second sleep may be insufficient under load or on slower CI systems, potentially causing test flakiness.Consider using a polling approach or increasing the sleep duration for reliability.
♻️ More robust config reload wait
ALTER SYSTEM SET spock.save_resolutions = off; SELECT pg_reload_conf(); -SELECT pg_sleep(0.5); +-- Wait for config reload to take effect +SELECT pg_sleep(1);Alternatively, if Spock provides a way to check the current effective value of
spock.save_resolutionsfrom the apply worker's perspective, polling that would be more reliable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/regress/sql/resolutions_retention.sql` around lines 82 - 83, The fixed pg_sleep(0.5) after calling pg_reload_conf() is flaky; change this to a polling loop that calls pg_reload_conf() once then repeatedly checks the effective setting (e.g. SELECT current_setting('spock.save_resolutions') or SELECT setting FROM pg_settings WHERE name='spock.save_resolutions') until it matches the expected value, sleeping briefly (pg_sleep(0.1)) between attempts and failing after a timeout (e.g. 5s); replace the pg_sleep(0.5) line with that polling logic to reliably wait for the config to take effect.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/spock_conflict.c`:
- Around line 948-972: The PG_CATCH block in spock_cleanup_resolutions uses
MemoryContextSwitchTo(MessageContext), but MessageContext is only created by
apply_work() and may not exist when called from the SQL wrapper; change the
switch to a guaranteed context (e.g.,
MemoryContextSwitchTo(CurrentMemoryContext) or
MemoryContextSwitchTo(TopMemoryContext)) so the error handling cannot crash,
ensure the rest of the cleanup (CopyErrorData, FlushErrorState, SPI_finish,
PopActiveSnapshot/AbortCurrentTransaction) runs in that safe context, and keep
using FreeErrorData(edata) and ereport(WARNING, ...) as before.
---
Nitpick comments:
In `@src/spock_conflict.c`:
- Around line 962-967: The cleanup ordering is wrong: when aborting you must
call AbortCurrentTransaction() before SPI_finish() and only PopActiveSnapshot()
if started_tx is true; update the block that currently calls SPI_finish(); if
(started_tx) { PopActiveSnapshot(); AbortCurrentTransaction(); } to instead
perform AbortCurrentTransaction() first (so AbortCurrentTransaction(); if
(started_tx) PopActiveSnapshot();) and then call SPI_finish() after the abort to
let the transaction abort handle SPI cleanup safely (adjust code around
SPI_finish, AbortCurrentTransaction, PopActiveSnapshot and started_tx
accordingly).
- Around line 919-921: The SQL being built with appendStringInfo uses the raw
identifier CATALOG_LOGTABLE; replace that with a properly quoted identifier by
using quote_identifier(CATALOG_LOGTABLE) when constructing the SQL string (keep
appendStringInfo and cmd, and still interpolate spock_resolutions_retention_days
as before) so the table name is safely escaped; optionally also consider
switching now() to CURRENT_TIMESTAMP if you want SQL-standard timestamp usage.
In `@tests/regress/sql/resolutions_retention.sql`:
- Around line 99-101: Add a session-level RESET for the GUC that the test
modified: after the test cleanup block (where you currently run ALTER SYSTEM SET
spock.save_resolutions and pg_reload_conf), add a RESET
spock.resolutions_retention_days; to explicitly clear the session-level setting
changed earlier (set commands that used spock.resolutions_retention_days) so the
session is returned to defaults for subsequent tests.
- Around line 82-83: The fixed pg_sleep(0.5) after calling pg_reload_conf() is
flaky; change this to a polling loop that calls pg_reload_conf() once then
repeatedly checks the effective setting (e.g. SELECT
current_setting('spock.save_resolutions') or SELECT setting FROM pg_settings
WHERE name='spock.save_resolutions') until it matches the expected value,
sleeping briefly (pg_sleep(0.1)) between attempts and failing after a timeout
(e.g. 5s); replace the pg_sleep(0.5) line with that polling logic to reliably
wait for the config to take effect.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 620d9eca-4f8f-4d4b-8fa6-b5da682eb0af
⛔ Files ignored due to path filters (1)
tests/regress/expected/resolutions_retention.outis excluded by!**/*.out
📒 Files selected for processing (10)
Makefiledocs/configuring.mddocs/spock_functions/functions/spock_cleanup_resolutions.mdinclude/spock_conflict.hsql/spock--6.0.0-devel.sqlsrc/spock.csrc/spock_apply.csrc/spock_conflict.csrc/spock_functions.ctests/regress/sql/resolutions_retention.sql
Cover all three modes (off, local, all), superuser exemption, backward-compatible 'user' alias, user-imposed transaction_read_only preservation, and ALTER SYSTEM with separate non-superuser connection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Utility commands: verify VACUUM and ANALYZE pass through (native Postgres behavior, no PreventCommandIfReadOnly for maintenance ops) - SECURITY DEFINER/INVOKER: both blocked — XactReadOnly is a transaction-level flag, SECURITY DEFINER only switches effective user for permission checks (also native Postgres behavior) - Replication: apply worker continues in 'local' mode, pauses gracefully in 'all' mode, resumes after reset with no data loss - Auto DDL: blocked DDL not replicated when read-only is active - repair_mode(): superuser can make local-only writes that are not replicated to subscribers - Remove pg_sleep after pg_reload_conf in ALTER SYSTEM tests - Change test table ownership to nonsuper for VACUUM/ANALYZE tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ALTER SYSTEM for 'all' mode now runs on subscriber
- Add walsender PID check to prove apply worker survives readonly
(wait, not crash-and-restart)
- Combine VACUUM and ANALYZE into single VACUUM ANALYZE command
- Reword utility command comment to explain reasoning (no schema change,
no WAL that affects replication) rather than implementation details
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…QL log - New docs/conflict_types.md covering all 7 conflict types, resolution strategies, frozen tuple handling, and comparison with PostgreSQL 18 - Detect secondary unique constraint violations during UPDATE apply and report them as update_exists to the PostgreSQL log and PG18 stats - Origin-differs events (update_origin_differs, delete_origin_differs) are no longer written to spock.resolutions when the local tuple wins; previously only the remote-wins case was excluded - Add conflict_types.md to mkdocs.yml nav and cross-link with conflicts.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests that a secondary unique constraint violation during UPDATE apply is detected, logged to spock.exception_log, and counted in PG18 conflict statistics (confl_update_exists). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When check_all_uc_indexes resolves the first conflict and a second unique constraint violation occurs, the error is recorded in spock.exception_log, not spock.resolutions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
053a7b7 to
5dad6a7
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/spock_apply.c`:
- Around line 3057-3074: Only invoke spock_cleanup_resolutions() when the worker
is truly idle (no open upstream/user transaction); add a guard before the
existing cleanup block that detects whether a transaction is active (e.g. check
transaction state/transaction block flag used in your codebase) and skip the
cleanup if a transaction is open, and only advance last_cleanup_timestamp after
spock_cleanup_resolutions() actually ran successfully; reference
spock_cleanup_resolutions(), last_cleanup_timestamp, spock_save_resolutions,
spock_resolutions_retention_days and RESOLUTIONS_CLEANUP_INTERVAL_MS to locate
and update the logic.
In `@src/spock_conflict.c`:
- Around line 898-899: The early return in the cleanup routine currently checks
both spock_save_resolutions and spock_resolutions_retention_days, which prevents
cleanup when logging is disabled; change the condition in the cleanup function
(cleanup_resolutions / spock.cleanup_resolutions handler in
src/spock_conflict.c) to only return when spock_resolutions_retention_days <= 0
so retention_days controls disabling cleanup, and leave spock_save_resolutions
to only gate inserts; apply the same change to the scheduler gate in
src/spock_apply.c so it also checks only spock_resolutions_retention_days <= 0
and not spock_save_resolutions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7b3b589f-20fe-4f38-82e9-0f348a41931f
⛔ Files ignored due to path filters (1)
tests/regress/expected/resolutions_retention.outis excluded by!**/*.out
📒 Files selected for processing (10)
Makefiledocs/configuring.mddocs/spock_functions/functions/spock_cleanup_resolutions.mdinclude/spock_conflict.hsql/spock--6.0.0-devel.sqlsrc/spock.csrc/spock_apply.csrc/spock_conflict.csrc/spock_functions.ctests/regress/sql/resolutions_retention.sql
✅ Files skipped from review due to trivial changes (3)
- Makefile
- docs/configuring.md
- tests/regress/sql/resolutions_retention.sql
🚧 Files skipped from review as they are similar to previous changes (4)
- src/spock.c
- include/spock_conflict.h
- sql/spock--6.0.0-devel.sql
- docs/spock_functions/functions/spock_cleanup_resolutions.md
5dad6a7 to
20b704b
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/spock_conflict.c (1)
911-917:⚠️ Potential issue | 🟠 MajorDon't hide cleanup failures behind a zero-row return.
When
started_txis false, this is executing inside the caller's transaction. Catching aSPI_execute()failure here, emittingWARNING, and returning0makes failure indistinguishable from "nothing to delete," so the apply-worker scheduler has no way to avoid treating a failed run as successful. More importantly, this path doesn't create a subtransaction around the SPI work, so continuing after an ERROR is unsafe for manual SQL calls. I'd keep the best-effort swallow only for the self-managed worker path, and otherwise either rethrow or wrap the SPI section in an internal subtransaction and return success separately fromndeleted.In PostgreSQL extension C code, can a function safely catch an ERROR from SPI_execute inside an existing transaction without using BeginInternalSubTransaction()/RollbackAndReleaseCurrentSubTransaction(), and then continue execution?Also applies to: 931-980
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_conflict.c` around lines 911 - 917, The current SPI call path in spock_conflict.c that runs when started_tx is false must not swallow SPI_execute errors and return 0, because that hides failures and is unsafe without a subtransaction; modify the code so that for the self-managed path (when started_tx is true) you keep the best-effort swallow behavior, but when started_tx is false either let the ERROR propagate (do not catch/suppress) or wrap the SPI work in an internal subtransaction using BeginInternalSubTransaction()/RollbackAndReleaseCurrentSubTransaction() (and ReleaseCurrentSubTransaction() on success) around the SPI_execute call so you can safely catch errors and distinguish a real error from ndeleted; update the logic in the functions/blocks around StartTransactionCommand(), PushActiveSnapshot(), and the SPI execute/cleanup sections referenced in this diff so ndeleted is only used for row counts and errors are handled via subtransaction rollback or rethrow.
🧹 Nitpick comments (1)
src/spock_apply.c (1)
2953-2953: Seed the first cleanup independently of worker uptime.Initializing
last_cleanup_timestampfromGetCurrentTimestamp()means the first automatic purge won't happen until this worker has stayed alive for 24 hours. In environments where apply workers restart regularly, the periodic cleanup can be postponed indefinitely. If the intent is truly "daily cleanup," this needs either a persisted last-successful-run timestamp or an initial value that lets the first pass run immediately.Also applies to: 3068-3074
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_apply.c` at line 2953, The code sets TimestampTz last_cleanup_timestamp = GetCurrentTimestamp(), which delays the first daily purge until the worker has been alive for the full interval; change initialization so the first cleanup can run immediately — e.g. initialize last_cleanup_timestamp to 0 (or to GetCurrentTimestamp() - cleanup_interval) instead of GetCurrentTimestamp() in the spock_apply worker code (the last_cleanup_timestamp variable and the cleanup loop logic), and make the same change to the other occurrence(s) of last_cleanup_timestamp initialization found in this file so the first automatic purge is not postponed by worker restarts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/spock_conflict.c`:
- Around line 911-917: The current SPI call path in spock_conflict.c that runs
when started_tx is false must not swallow SPI_execute errors and return 0,
because that hides failures and is unsafe without a subtransaction; modify the
code so that for the self-managed path (when started_tx is true) you keep the
best-effort swallow behavior, but when started_tx is false either let the ERROR
propagate (do not catch/suppress) or wrap the SPI work in an internal
subtransaction using
BeginInternalSubTransaction()/RollbackAndReleaseCurrentSubTransaction() (and
ReleaseCurrentSubTransaction() on success) around the SPI_execute call so you
can safely catch errors and distinguish a real error from ndeleted; update the
logic in the functions/blocks around StartTransactionCommand(),
PushActiveSnapshot(), and the SPI execute/cleanup sections referenced in this
diff so ndeleted is only used for row counts and errors are handled via
subtransaction rollback or rethrow.
---
Nitpick comments:
In `@src/spock_apply.c`:
- Line 2953: The code sets TimestampTz last_cleanup_timestamp =
GetCurrentTimestamp(), which delays the first daily purge until the worker has
been alive for the full interval; change initialization so the first cleanup can
run immediately — e.g. initialize last_cleanup_timestamp to 0 (or to
GetCurrentTimestamp() - cleanup_interval) instead of GetCurrentTimestamp() in
the spock_apply worker code (the last_cleanup_timestamp variable and the cleanup
loop logic), and make the same change to the other occurrence(s) of
last_cleanup_timestamp initialization found in this file so the first automatic
purge is not postponed by worker restarts.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 74b95163-8b1b-4d00-ba7e-628352483ac2
⛔ Files ignored due to path filters (1)
tests/regress/expected/resolutions_retention.outis excluded by!**/*.out
📒 Files selected for processing (10)
Makefiledocs/configuring.mddocs/spock_functions/functions/spock_cleanup_resolutions.mdinclude/spock_conflict.hsql/spock--6.0.0-devel.sqlsrc/spock.csrc/spock_apply.csrc/spock_conflict.csrc/spock_functions.ctests/regress/sql/resolutions_retention.sql
✅ Files skipped from review due to trivial changes (3)
- docs/configuring.md
- docs/spock_functions/functions/spock_cleanup_resolutions.md
- tests/regress/sql/resolutions_retention.sql
🚧 Files skipped from review as they are similar to previous changes (4)
- Makefile
- sql/spock--6.0.0-devel.sql
- src/spock.c
- src/spock_functions.c
20b704b to
d7fecc1
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
src/spock_apply.c (1)
3072-3073:⚠️ Potential issue | 🟠 MajorDon't stamp the cleanup interval after warning-only failures.
spock_cleanup_resolutions()currently downgrades SPI failures to aWARNINGand returns0, so Line 3073 suppresses retries even when no purge actually ran. Please have the helper report success separately from the deleted-row count and only advancelast_cleanup_timestampon success.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_apply.c` around lines 3072 - 3073, spock_cleanup_resolutions currently returns a deleted-row count but hides SPI failures by returning 0, causing the caller to advance last_cleanup_timestamp even when no purge ran; modify spock_cleanup_resolutions to return a struct or out-parameter (or set a boolean) indicating true success vs. failure distinct from the deleted-row count, update the caller that uses last_cleanup_timestamp and GetCurrentTimestamp() to only set last_cleanup_timestamp when the helper reports success, and preserve the deleted-row count semantics for reporting when success is true (refer to spock_cleanup_resolutions and last_cleanup_timestamp).tests/regress/sql/resolutions_retention.sql (1)
77-91:⚠️ Potential issue | 🟠 MajorThis regression case is asserting the old
save_resolutionsbehavior.
spock.cleanup_resolutions()no longer checksspock.save_resolutions, so after backdating the row and settingspock.resolutions_retention_days = 30, Line 88 should still delete it. As written, this test will expect the wrong outcome (rows_deleted = 1,remaining = 0with the current implementation).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/regress/sql/resolutions_retention.sql` around lines 77 - 91, The test assumes old behavior where spock.cleanup_resolutions() respected the spock.save_resolutions setting; since cleanup no longer checks spock.save_resolutions, update the regression test in resolutions_retention.sql to expect deletion: after backdating the row and setting spock.resolutions_retention_days = 30, assert that spock.cleanup_resolutions() returns 1 and the COUNT(*) remaining from spock.resolutions WHERE relname = 'public.retention_test' is 0, and remove or adjust the comment that claims "save_resolutions=off suppresses cleanup". Ensure the test uses the function name spock.cleanup_resolutions() and the configuration spock.save_resolutions in its updated expectations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@sql/spock--6.0.0-devel.sql`:
- Around line 354-359: Add the missing CREATE INDEX ON spock.resolutions
(log_time) and the CREATE FUNCTION spock.cleanup_resolutions() ...
spock_cleanup_resolutions_sql plus the REVOKE ALL ON FUNCTION
spock.cleanup_resolutions() FROM PUBLIC into the upgrade script so upgrades from
5.0.6 create the same objects as fresh installs; locate the upgrade SQL block
that applies changes for 5.0.6->6.0.0 in sql/spock--5.0.6--6.0.0-devel.sql and
insert the index creation and the function definition (with LANGUAGE c,
MODULE_PATHNAME and spock_cleanup_resolutions_sql symbol) and the revoke
statement in the appropriate transactional section so the upgrade creates the
index and function when run.
In `@src/spock_conflict.c`:
- Line 90: The global default enables automatic purging; change the retention
GUC to be opt-in by setting spock_resolutions_retention_days default to 0
instead of 100 and ensure any code that checks this variable
(spock_resolutions_retention_days) treats 0 as “disabled” (no automatic purge)
so only clusters that explicitly set the GUC enable retention; update related
logic that schedules or performs daily cleanup to skip when
spock_resolutions_retention_days == 0.
In `@src/spock_functions.c`:
- Around line 3143-3151: The SQL wrapper spock_cleanup_resolutions_sql calls
spock_cleanup_resolutions which can reuse the caller's transaction (started_tx
== false) and currently converts SPI errors to warnings with FlushErrorState(),
leaving the caller's transaction ABORTED; instead, when started_tx is false wrap
the SPI work in a subtransaction via BeginInternalSubTransaction() and on error
RollbackAndReleaseCurrentSubTransaction() (and then surface/ereport the error or
return an error status), or alternatively stop converting errors to warnings so
the caller's transaction is not silently poisoned; update
spock_cleanup_resolutions (and its PG_CATCH path) to use
BeginInternalSubTransaction()/RollbackAndReleaseCurrentSubTransaction() when
started_tx == false and avoid FlushErrorState()+ereport(WARNING) that hides the
aborted transaction state.
---
Duplicate comments:
In `@src/spock_apply.c`:
- Around line 3072-3073: spock_cleanup_resolutions currently returns a
deleted-row count but hides SPI failures by returning 0, causing the caller to
advance last_cleanup_timestamp even when no purge ran; modify
spock_cleanup_resolutions to return a struct or out-parameter (or set a boolean)
indicating true success vs. failure distinct from the deleted-row count, update
the caller that uses last_cleanup_timestamp and GetCurrentTimestamp() to only
set last_cleanup_timestamp when the helper reports success, and preserve the
deleted-row count semantics for reporting when success is true (refer to
spock_cleanup_resolutions and last_cleanup_timestamp).
In `@tests/regress/sql/resolutions_retention.sql`:
- Around line 77-91: The test assumes old behavior where
spock.cleanup_resolutions() respected the spock.save_resolutions setting; since
cleanup no longer checks spock.save_resolutions, update the regression test in
resolutions_retention.sql to expect deletion: after backdating the row and
setting spock.resolutions_retention_days = 30, assert that
spock.cleanup_resolutions() returns 1 and the COUNT(*) remaining from
spock.resolutions WHERE relname = 'public.retention_test' is 0, and remove or
adjust the comment that claims "save_resolutions=off suppresses cleanup". Ensure
the test uses the function name spock.cleanup_resolutions() and the
configuration spock.save_resolutions in its updated expectations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f264876d-c193-477b-aac5-b26fde5cc199
⛔ Files ignored due to path filters (1)
tests/regress/expected/resolutions_retention.outis excluded by!**/*.out
📒 Files selected for processing (10)
Makefiledocs/configuring.mddocs/spock_functions/functions/spock_cleanup_resolutions.mdinclude/spock_conflict.hsql/spock--6.0.0-devel.sqlsrc/spock.csrc/spock_apply.csrc/spock_conflict.csrc/spock_functions.ctests/regress/sql/resolutions_retention.sql
✅ Files skipped from review due to trivial changes (3)
- docs/configuring.md
- Makefile
- docs/spock_functions/functions/spock_cleanup_resolutions.md
🚧 Files skipped from review as they are similar to previous changes (1)
- src/spock.c
| int spock_conflict_resolver = SPOCK_RESOLVE_LAST_UPDATE_WINS; | ||
| int spock_conflict_log_level = LOG; | ||
| bool spock_save_resolutions = false; | ||
| int spock_resolutions_retention_days = 100; |
There was a problem hiding this comment.
Make the retention policy opt-in.
With the daily worker cleanup added in this PR, Line 90 turns automatic purging on for any upgraded cluster that already has spock.save_resolutions = on. That is a destructive behavior change for conflict/audit history; default this GUC to 0 and require explicit enablement instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/spock_conflict.c` at line 90, The global default enables automatic
purging; change the retention GUC to be opt-in by setting
spock_resolutions_retention_days default to 0 instead of 100 and ensure any code
that checks this variable (spock_resolutions_retention_days) treats 0 as
“disabled” (no automatic purge) so only clusters that explicitly set the GUC
enable retention; update related logic that schedules or performs daily cleanup
to skip when spock_resolutions_retention_days == 0.
d7fecc1 to
3b634b3
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/spock_conflict.c (1)
90-90:⚠️ Potential issue | 🟠 MajorMake retention opt-in.
This still turns automatic purging on for upgraded clusters that already rely on
spock.resolutionshistory. A default of100makes the new retention policy destructive by default;0is the safer default and keeps cleanup behind explicit enablement.🛡️ Safer default
-int spock_resolutions_retention_days = 100; +int spock_resolutions_retention_days = 0;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_conflict.c` at line 90, The global variable spock_resolutions_retention_days is initialized to 100 which enables automatic purging by default; change its default to 0 to make retention opt-in (no automatic purge unless explicitly configured). Locate the declaration/initialization of spock_resolutions_retention_days in src/spock_conflict.c and update the initializer to 0, and optionally adjust any accompanying comment or documentation around spock_resolutions_retention_days to state that 0 disables automatic retention/purging.tests/regress/sql/resolutions_retention.sql (1)
77-91:⚠️ Potential issue | 🟠 MajorThis scenario is still asserting the old
save_resolutionsbehavior.
spock_cleanup_resolutions_sql()now only short-circuits whenspock.resolutions_retention_days <= 0. Turningspock.save_resolutionsoff stops new rows from being logged, but it does not keep already logged rows from being deleted, so this block will fail as written. Please invert/remove this case and sync the matching paragraph indocs/spock_functions/functions/spock_cleanup_resolutions.md.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/regress/sql/resolutions_retention.sql` around lines 77 - 91, The test assumes spock.save_resolutions = off prevents deletion, but the code now only short-circuits when spock.resolutions_retention_days <= 0; update the test in resolutions_retention.sql to remove or invert the "save_resolutions=off" case and instead exercise the correct behavior (e.g., set spock.resolutions_retention_days <= 0 to suppress cleanup and assert rows remain, or assert that with save_resolutions=off rows are still deleted by spock.cleanup_resolutions()), and update the matching documentation paragraph in spock_cleanup_resolutions.md to reflect that spock.save_resolutions only stops new logging, while spock.resolutions_retention_days controls retention.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/spock_conflict.c`:
- Around line 962-975: The catch block is unconditionally calling
PopActiveSnapshot() and SPI_finish() which can double-cleanup if those were
already executed (e.g., PopActiveSnapshot() right before
CommitTransactionCommand() and SPI_finish() inside
spock_cleanup_resolutions_core()); modify the function to track state with
booleans (e.g., snapshot_pushed and spi_connected) set true when
PopActiveSnapshot()/SPI_connect()/SPI_finish()-relevant regions are performed,
and in PG_CATCH only call PopActiveSnapshot() and SPI_finish() if the
corresponding flag is true (or alternatively remove those calls and rely on
AbortCurrentTransaction() for transaction-owned cleanup), updating references in
spock_cleanup_resolutions_core(), the try block around
CommitTransactionCommand(), and the PG_CATCH accordingly.
In `@tests/regress/sql/resolutions_retention.sql`:
- Around line 6-13: After calling pg_reload_conf() on the subscriber, wait until
the subscriber apply worker observes the new GUC by polling the setting (e.g.,
repeatedly SHOW spock.save_resolutions or query pg_settings for
name='spock.save_resolutions') and only proceed when it equals 'on'; this
replaces the single asynchronous pg_reload_conf() call with a short loop (with
small sleeps) that confirms the subscriber's runtime config has been updated
before continuing so the test is not flaky.
---
Duplicate comments:
In `@src/spock_conflict.c`:
- Line 90: The global variable spock_resolutions_retention_days is initialized
to 100 which enables automatic purging by default; change its default to 0 to
make retention opt-in (no automatic purge unless explicitly configured). Locate
the declaration/initialization of spock_resolutions_retention_days in
src/spock_conflict.c and update the initializer to 0, and optionally adjust any
accompanying comment or documentation around spock_resolutions_retention_days to
state that 0 disables automatic retention/purging.
In `@tests/regress/sql/resolutions_retention.sql`:
- Around line 77-91: The test assumes spock.save_resolutions = off prevents
deletion, but the code now only short-circuits when
spock.resolutions_retention_days <= 0; update the test in
resolutions_retention.sql to remove or invert the "save_resolutions=off" case
and instead exercise the correct behavior (e.g., set
spock.resolutions_retention_days <= 0 to suppress cleanup and assert rows
remain, or assert that with save_resolutions=off rows are still deleted by
spock.cleanup_resolutions()), and update the matching documentation paragraph in
spock_cleanup_resolutions.md to reflect that spock.save_resolutions only stops
new logging, while spock.resolutions_retention_days controls retention.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 66df2aa0-99b7-410e-bdba-2ed259150bf7
⛔ Files ignored due to path filters (1)
tests/regress/expected/resolutions_retention.outis excluded by!**/*.out
📒 Files selected for processing (10)
Makefiledocs/configuring.mddocs/spock_functions/functions/spock_cleanup_resolutions.mdinclude/spock_conflict.hsql/spock--5.0.6--6.0.0-devel.sqlsql/spock--6.0.0-devel.sqlsrc/spock.csrc/spock_apply.csrc/spock_conflict.ctests/regress/sql/resolutions_retention.sql
✅ Files skipped from review due to trivial changes (1)
- docs/configuring.md
🚧 Files skipped from review as they are similar to previous changes (3)
- src/spock.c
- src/spock_apply.c
- Makefile
9f166ae to
e83a5ef
Compare
|
@CodeRabbit full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/spock_apply.c (1)
2953-3074:⚠️ Potential issue | 🟠 MajorPersist the cleanup throttle outside the worker process.
Line 2953 initializes
last_cleanup_timestampto0, so every apply-worker restart makes Lines 3068-3074 treat cleanup as immediately due again. That breaks the "at most once per day" behavior and can re-run the retentionDELETE/index scan after each restart. Move the last-run timestamp into shared or durable state, or designate a single cleanup owner instead of tracking it in a per-process local.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_apply.c` around lines 2953 - 3074, The per-process variable last_cleanup_timestamp being initialized to 0 causes spock_cleanup_resolutions() to run after every worker restart; persist the cleanup throttle outside the worker or centralize ownership: replace the local TimestampTz last_cleanup_timestamp with a shared/durable last-run value (e.g. a column in a control table or a shared memory slot) that is read before the WL_TIMEOUT check and updated after spock_cleanup_resolutions() completes, or elect a single cleanup owner (leader election using MyApplyWorker->apply_group or MySpockWorker identity) to perform the timestamp checks and call spock_cleanup_resolutions() so that RESOLUTIONS_CLEANUP_INTERVAL_MS and spock_resolutions_retention_days enforce "at most once per day" across restarts.
♻️ Duplicate comments (1)
src/spock_conflict.c (1)
90-90:⚠️ Potential issue | 🟠 MajorMake retention cleanup opt-in by default.
Line 90 turns automatic purging on for upgraded clusters immediately, which is a destructive behavior change for conflict/audit history. Since the rest of the code already treats
<= 0as “disabled,” default this GUC to0and require explicit enablement.Suggested change
-int spock_resolutions_retention_days = 100; +int spock_resolutions_retention_days = 0;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_conflict.c` at line 90, The global default for automatic purging is currently enabled by initializing spock_resolutions_retention_days to 100; change the default to 0 so retention cleanup is opt-in instead of destructive on upgrade by initializing spock_resolutions_retention_days to 0 (the code already treats <= 0 as disabled), and update any related initialization comments or documentation near the spock_resolutions_retention_days declaration to reflect that explicit enablement is required.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/spock_conflict.c`:
- Around line 954-968: The transaction/snapshot setup (StartTransactionCommand()
and PushActiveSnapshot(GetTransactionSnapshot())) must be moved inside the
PG_TRY() block so that any ERROR during startup is handled by the exception
handler; update the code to call SetCurrentStatementStartTimestamp() before
entering PG_TRY(), then inside PG_TRY() call StartTransactionCommand() and
PushActiveSnapshot(GetTransactionSnapshot()) before invoking
spock_cleanup_resolutions_core(); add a local boolean (e.g., startup_succeeded)
set to true only after both startup calls succeed and use that flag in the
exception/cleanup path to decide whether to call
PopActiveSnapshot()/CommitTransactionCommand() or to call
AbortCurrentTransaction(), ensuring AbortCurrentTransaction() is only invoked if
the transaction was actually started.
---
Outside diff comments:
In `@src/spock_apply.c`:
- Around line 2953-3074: The per-process variable last_cleanup_timestamp being
initialized to 0 causes spock_cleanup_resolutions() to run after every worker
restart; persist the cleanup throttle outside the worker or centralize
ownership: replace the local TimestampTz last_cleanup_timestamp with a
shared/durable last-run value (e.g. a column in a control table or a shared
memory slot) that is read before the WL_TIMEOUT check and updated after
spock_cleanup_resolutions() completes, or elect a single cleanup owner (leader
election using MyApplyWorker->apply_group or MySpockWorker identity) to perform
the timestamp checks and call spock_cleanup_resolutions() so that
RESOLUTIONS_CLEANUP_INTERVAL_MS and spock_resolutions_retention_days enforce "at
most once per day" across restarts.
---
Duplicate comments:
In `@src/spock_conflict.c`:
- Line 90: The global default for automatic purging is currently enabled by
initializing spock_resolutions_retention_days to 100; change the default to 0 so
retention cleanup is opt-in instead of destructive on upgrade by initializing
spock_resolutions_retention_days to 0 (the code already treats <= 0 as
disabled), and update any related initialization comments or documentation near
the spock_resolutions_retention_days declaration to reflect that explicit
enablement is required.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 83022042-ff1a-4959-80d5-ca4510f0dd60
⛔ Files ignored due to path filters (1)
tests/regress/expected/resolutions_retention.outis excluded by!**/*.out
📒 Files selected for processing (10)
Makefiledocs/configuring.mddocs/spock_functions/functions/spock_cleanup_resolutions.mdinclude/spock_conflict.hsql/spock--5.0.6--6.0.0-devel.sqlsql/spock--6.0.0-devel.sqlsrc/spock.csrc/spock_apply.csrc/spock_conflict.ctests/regress/sql/resolutions_retention.sql
Introduces spock.resolutions_retention_days (int, default 100, min 0) to control how long rows are kept in spock.resolutions. Rows older than the configured window are deleted automatically by the apply worker at most once per day. Setting the GUC to 0 disables automatic cleanup entirely. Also adds spock.cleanup_resolutions(), a superuser-only SQL function that returns the number of rows deleted, for manual invocation. A log_time index is added to both the fresh-install and upgrade scripts to keep the daily DELETE efficient on large resolutions tables. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ns() Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Change resolutions_retention_days from PGC_SUSET to PGC_SIGHUP to prevent session-level SET; apply worker picks up changes via reload - Add optional days arg to cleanup_resolutions(); overrides GUC when set - Fix docs: remove incorrect save_resolutions claim; clarify precedence - Update regression test to use ALTER SYSTEM SET + pg_reload_conf() instead of session-level SET/RESET Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PGC_SIGHUP GUC changes require ALTER SYSTEM + pg_reload_conf() and a pg_sleep(1) to ensure the SIGHUP is processed by the backend before cleanup_resolutions() reads the C variable. Without the sleep the reload arrives one statement late, causing wrong rows_deleted counts. Also fix trailing-space alignment in pg_reload_conf and pg_sleep column headers to match actual psql aligned-format output. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
e83a5ef to
2afc55b
Compare
Summary by CodeRabbit
New Features
Documentation
Tests
Performance