Make password history replication-aware via a custom WAL-rmgr #68
Merged
Contributor
Author
Hi! I'd be glad to get any thoughts about the patch. Regards
3d92d0f to 3f03c31
Prior to this patch, the password history file was a purely local, in-memory/on-disk structure with no WAL coverage. On a standby the file was never updated, so any replication-capable setup had an inconsistent password history between primary and replicas.

This patch introduces a custom WAL resource manager for credcheck (registered under RM_EXPERIMENTAL_ID until a stable ID is reserved in access/rmgrlist.h). It is available on PostgreSQL 15 and later, where the Custom WAL Resource Manager API was introduced.

Six WAL record types are defined:
XLOG_CREDCHECK_PWD_ADD – a password entry was appended
XLOG_CREDCHECK_PWD_REMOVE – a single entry was removed
XLOG_CREDCHECK_PWD_REMOVE_USER – all entries for a role were removed
XLOG_CREDCHECK_PWD_RENAME – a role was renamed
XLOG_CREDCHECK_PWD_RESET – history was truncated (optionally per-user)
XLOG_CREDCHECK_PWD_TIMESTAMP – a password date was updated

Every mutation of the password history now emits a corresponding WAL record. RESET and TIMESTAMP records are immediately flushed with XLogFlush() to guarantee the standby receives them before the originating transaction completes.

The redo routine credcheck_rmgr_redo() replays each record type by applying the same operation to the standby's local password history file.

On PostgreSQL versions prior to 15 the code compiles and behaves exactly as before; the new paths are guarded by #if PG_VERSION_NUM >= 150000.

A TAP regression test (t/001_history_replication.pl) is added to verify end-to-end replication of all password history operations across a primary/standby pair.
3f03c31 to 052dd5e
Contributor
Author
Hi!
It seems that there should be no problems in CI now.
Contributor
Hi Aidar, Yes, I had already found that the t/ directory must be at the top level of the project, and I have reviewed the patch. I have also tested on PG versions 16 to 18. It is excellent work, thank you very much! Your help is much appreciated. I'm applying the PR. The installcheck works well on my computer too; I think the CI file might need to be fixed for the TAP test, but your patch works as expected. I will have a look today. Thanks again!
Problem --
The password history file is credcheck's persistent record of previously used password hashes. Before this patch, every write to it was a direct local file operation with no WAL coverage. Standbys never received those changes, so in any streaming-replication setup the password history was silently inconsistent between the primary and its replicas. After a failover, the promoted standby would enforce a completely different (or empty) password history, undermining the entire point of the password_reuse_history/password_reuse_interval policy.

Approach --
PostgreSQL 15 introduced the Custom WAL Resource Manager API (RegisterCustomRmgr), which allows extensions to plug into the WAL machinery with their own record types and redo routines. This patch uses that API to emit a WAL record for every mutation of the in-memory password history hash, so that standbys can replay those mutations and stay in sync.

The feature is compiled only on PG ≥ 15 and is entirely gated by #if PG_VERSION_NUM >= 150000. On older versions the code builds and behaves exactly as before.

Implementation details --
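As a rough illustration of the compile-time gate described under Approach, the sketch below shows how a single call site can emit WAL on new servers and compile to a no-op on old ones. It is a stand-alone model, not the patch's code: `CREDCHECK_HAS_WAL_RMGR` and `log_history_change()` are hypothetical names, and `PG_VERSION_NUM` is given a stand-in value only so the file builds outside a PostgreSQL tree.

```c
#include <stdio.h>

/* Stand-in for the value PostgreSQL's pg_config.h normally provides;
 * defined here only so the sketch compiles outside a PostgreSQL build. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 160000
#endif

/* Hypothetical helper macro: 1 when the custom rmgr code is compiled in. */
#if PG_VERSION_NUM >= 150000
#define CREDCHECK_HAS_WAL_RMGR 1
#else
#define CREDCHECK_HAS_WAL_RMGR 0
#endif

/* Hypothetical hook invoked after every password history mutation.
 * Returns 1 if a WAL record would be emitted, 0 otherwise. */
int
log_history_change(const char *what)
{
#if CREDCHECK_HAS_WAL_RMGR
    /* On PG >= 15 the real code would build and insert a WAL record here. */
    printf("WAL record emitted for %s\n", what);
    return 1;
#else
    /* On older servers the history stays a purely local file. */
    (void) what;
    return 0;
#endif
}
```

Compiling with `-DPG_VERSION_NUM=140000` switches the function to the no-op branch, which mirrors the "builds and behaves exactly as before" guarantee on older versions.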
A new resource manager credcheck_rmgr is registered under RM_EXPERIMENTAL_ID; a stable ID will be requested from the PostgreSQL project when the patch matures. It exposes the three required callbacks: credcheck_rmgr_redo for replaying records during recovery, credcheck_rmgr_desc for producing human-readable descriptions for pg_waldump, and credcheck_rmgr_identify for mapping opcodes to symbolic names.

Six WAL record types cover every operation that mutates the history.
XLOG_CREDCHECK_PWD_ADD is emitted when a new password hash is appended on CREATE/ALTER ROLE ... PASSWORD.
XLOG_CREDCHECK_PWD_REMOVE fires when the oldest entry is evicted to stay within the history limit.
XLOG_CREDCHECK_PWD_REMOVE_USER covers DROP ROLE, removing all entries for a given role at once.
XLOG_CREDCHECK_PWD_RENAME handles ALTER ROLE ... RENAME TO.
XLOG_CREDCHECK_PWD_RESET is written when history is truncated via pg_password_history_reset(), optionally scoped to one user.
Finally, XLOG_CREDCHECK_PWD_TIMESTAMP is emitted when a password date is updated during password_reuse_interval enforcement.

RESET and TIMESTAMP records are immediately followed by XLogFlush() to ensure the standby has received them before the originating transaction is considered complete. The remaining record types rely on normal WAL flushing behavior, which is sufficient for history consistency.

credcheck_rmgr_redo() mirrors the primary's in-memory hash operations: it inserts, removes, renames, or resets entries in pgph_hash and then calls the existing pgph_write() helper to persist the result to the history file in the standby's data directory.

One operational requirement: credcheck must be listed in shared_preload_libraries on the standby so that RegisterCustomRmgr() is called before recovery begins reading WAL records with RM_CREDCHECK_ID. This is the standard requirement for any extension using custom WAL resource managers.

This patch introduces replication support only. The underlying extension logic still does not provide transactional behavior for password history writes.
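To make the replay idea concrete, here is a minimal stand-alone model of an identify function and a redo dispatcher over the six record types. It uses no PostgreSQL headers and is not the patch's code: `History`, `Record`, `op_name()`, `remove_matching()` and `redo_apply()` are hypothetical names, the opcode values are assumptions (they step by 0x10, as built-in resource managers conventionally do), and a fixed array stands in for pgph_hash. Persisting the result, which the patch does through pgph_write(), is left out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical opcode values; the patch's real values may differ. */
enum {
    XLOG_CREDCHECK_PWD_ADD         = 0x00,
    XLOG_CREDCHECK_PWD_REMOVE      = 0x10,
    XLOG_CREDCHECK_PWD_REMOVE_USER = 0x20,
    XLOG_CREDCHECK_PWD_RENAME      = 0x30,
    XLOG_CREDCHECK_PWD_RESET       = 0x40,
    XLOG_CREDCHECK_PWD_TIMESTAMP   = 0x50
};

#define MAX_ENTRIES 32
#define FIELD_LEN   64

typedef struct { char role[FIELD_LEN]; char hash[FIELD_LEN]; long ts; } PwdEntry;
typedef struct { PwdEntry e[MAX_ENTRIES]; int n; } History;

/* One replicated mutation; fields an opcode does not need stay NULL/0. */
typedef struct {
    int         op;
    const char *role;      /* affected role; NULL on RESET means all roles */
    const char *hash;      /* used by ADD, REMOVE, TIMESTAMP */
    const char *new_role;  /* used by RENAME */
    long        ts;        /* used by ADD, TIMESTAMP */
} Record;

/* Identify-style callback: opcode to symbolic name, NULL if unknown. */
const char *
op_name(int op)
{
    switch (op) {
        case XLOG_CREDCHECK_PWD_ADD:         return "PWD_ADD";
        case XLOG_CREDCHECK_PWD_REMOVE:      return "PWD_REMOVE";
        case XLOG_CREDCHECK_PWD_REMOVE_USER: return "PWD_REMOVE_USER";
        case XLOG_CREDCHECK_PWD_RENAME:      return "PWD_RENAME";
        case XLOG_CREDCHECK_PWD_RESET:       return "PWD_RESET";
        case XLOG_CREDCHECK_PWD_TIMESTAMP:   return "PWD_TIMESTAMP";
    }
    return NULL;
}

/* Drop entries matching role and hash; a NULL argument matches anything. */
void
remove_matching(History *h, const char *role, const char *hash)
{
    int keep = 0;
    for (int i = 0; i < h->n; i++) {
        int hit = (role == NULL || strcmp(h->e[i].role, role) == 0) &&
                  (hash == NULL || strcmp(h->e[i].hash, hash) == 0);
        if (!hit)
            h->e[keep++] = h->e[i];
    }
    h->n = keep;
}

/* Redo-style callback: apply one record to the local copy of the history. */
void
redo_apply(History *h, const Record *r)
{
    switch (r->op) {
        case XLOG_CREDCHECK_PWD_ADD:
            if (h->n < MAX_ENTRIES) {
                PwdEntry *e = &h->e[h->n++];
                snprintf(e->role, FIELD_LEN, "%s", r->role);
                snprintf(e->hash, FIELD_LEN, "%s", r->hash);
                e->ts = r->ts;
            }
            break;
        case XLOG_CREDCHECK_PWD_REMOVE:
            remove_matching(h, r->role, r->hash);
            break;
        case XLOG_CREDCHECK_PWD_REMOVE_USER:
        case XLOG_CREDCHECK_PWD_RESET:
            remove_matching(h, r->role, NULL);
            break;
        case XLOG_CREDCHECK_PWD_RENAME:
            for (int i = 0; i < h->n; i++)
                if (strcmp(h->e[i].role, r->role) == 0)
                    snprintf(h->e[i].role, FIELD_LEN, "%s", r->new_role);
            break;
        case XLOG_CREDCHECK_PWD_TIMESTAMP:
            for (int i = 0; i < h->n; i++)
                if (strcmp(h->e[i].role, r->role) == 0 &&
                    strcmp(h->e[i].hash, r->hash) == 0)
                    h->e[i].ts = r->ts;
            break;
    }
}
```

Feeding the same sequence of `Record` values to `redo_apply()` on two zero-initialized `History` structures leaves them identical. That is the property the standby relies on: replaying the primary's record stream in order reproduces the primary's history.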
Testing --
A new TAP test, 001_history_replication.pl, exercises the full end-to-end replication path. It sets up a primary and streaming standby pair with credcheck preloaded on both, then verifies that credcheck appears in pg_get_wal_resource_managers() on both nodes. It creates roles, generates several password changes, and confirms that the standby's pg_password_history view matches the primary after wait_for_catchup. Each operation type is tested individually: rename, drop, and explicit reset. The standby is then restarted to verify that the history file written during redo is correctly reloaded from disk. Finally the standby is promoted, and the test confirms that the replicated history survives promotion: a reused password is rejected and a fresh one is accepted.