30 changes: 7 additions & 23 deletions docs/features/physical.md
@@ -12,21 +12,21 @@
| [2.3.0](../release-notes/2.3.0.md) | Physical backups in mixed deployments |
| [2.10.0](../release-notes/2.10.0.md) | Physical restore with a fallback directory |

**Physical backup** is copying of physical files from the Percona Server for MongoDB `dbPath` data directory to the remote backup storage. These files include data files, journal, index files, etc. Percona Backup for MongoDB also copies the WiredTiger storage options to the backup's metadata.
**Physical backup** refers to the process of copying physical files from the Percona Server for MongoDB `dbPath` data directory to the remote backup storage. These files include data files, journal, index files, etc. Percona Backup for MongoDB also copies the WiredTiger storage options to the backup's metadata.

**Physical restore** is the reverse process: `pbm-agents` shut down the `mongod` nodes, clean up the `dbPath` data directory and copy the physical files from the storage to it.

The following diagram shows the physical restore flow:

![image](../_images/pbm-phys-restore-shard.png)

During the restore, the ``pbm-agents`` temporarily start the ``mongod`` nodes using the WiredTiger storage options retrieved from the backup's metadata. The logs for these starts are saved to the ``pbm.restore.log`` file inside the ``dbPath``. Upon successful restore, this file is deleted. However, it remains for debugging if the restore were to fail.
During the restore, the ``pbm-agents`` start a temporary, non-user-reachable instance for each ``mongod`` node using the WiredTiger storage options retrieved from the backup's metadata. The logs for these "intermediate" starts are saved to a ``pbm.restore.log`` file inside each node's ``dbPath``. Upon successful restore, these log files are deleted. However, they remain for debugging if the restore fails.
Copilot AI Mar 12, 2026
Grammar/clarity issues in this sentence make the restore flow hard to follow: "instance of each mongod nodes" mixes singular/plural, and the text says logs are saved to a single "pbm.restore.log" file but later refers to "these files" being deleted. Consider rephrasing to consistently describe whether there is one temporary instance/log per node and use consistent singular/plural (e.g., "each mongod node" and "log file(s)").

Suggested change
During the restore, the ``pbm-agents`` start a temporary non-user reachable instance of each ``mongod`` nodes using the WiredTiger storage options retrieved from the backup's metadata. The logs for these "intermediate" starts are saved to the ``pbm.restore.log`` file inside the ``dbPath``. Upon successful restore, these files are deleted. However, they remain for debugging if the restore were to fail.
During the restore, the ``pbm-agents`` start a temporary, non-user-reachable instance for each ``mongod`` node using the WiredTiger storage options retrieved from the backup's metadata. The logs for these "intermediate" starts are saved to a ``pbm.restore.log`` file inside each node's ``dbPath``. Upon successful restore, these log files are deleted. However, they remain for debugging if the restore were to fail.


During physical backups and restores, ``pbm-agents`` don't export / import data from / to the database. This significantly reduces the backup / restore time compared to logical ones and is the recommended backup method for big (multi-terabyte) databases.
During physical backups and restores, ``pbm-agents`` don't export / import data from / to the database. This significantly reduces the backup / restore time compared to logical backups, and is the recommended backup method for big (multi-terabyte) databases.
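For orientation, a physical backup and restore can be driven with the `pbm` CLI roughly as follows. This is a minimal sketch: the backup name is a placeholder, and you would take the real name from your own `pbm status` or `pbm list` output.

```shell
# Take a physical backup (files are copied from dbPath to the configured storage)
pbm backup --type=physical

# Watch progress and find the backup name
pbm status

# Restore from a physical backup; the backup name below is a placeholder
pbm restore 2026-03-12T10:00:00Z
```

After a physical restore completes, remember that extra manual steps (such as restarting the `mongod` nodes and resyncing the PBM backup list) are still required, as noted in the table above.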

| Advantages | Disadvantages |
| ------------------------------ | ------------------------------- |
|- Faster backup and restore speed <br> - Recommended for big, multi-terabyte datasets <br> - No database overhead | - The backup size is bigger than for logical backups due to data fragmentation extra cost of keeping data and indexes in appropriate data structures <br> - Extra manual operations are required after the restore <br> - Point-in-time recovery requires manual operations | Sharded clusters and non-sharded replica sets |
|- Faster backup and restore speed <br> - Recommended for big, multi-terabyte datasets <br> - No overhead at database level | - The backup size could be bigger than for logical backups due to data fragmentation, and the cost of storing the files of each index <br> - Extra manual post-restore steps are required |
Copilot AI Mar 12, 2026
The doc uses both "multi-terabyte" (line 25) and the abbreviated "multi-TB" (line 29) to describe the same scale. Standardizing on one term in this section would improve readability and avoid potential ambiguity for readers unfamiliar with the abbreviation.

Suggested change
|- Faster backup and restore speed <br> - Recommended for big, multi-TB datasets <br> - No overhead at database level | - The backup size could be bigger than for logical backups due to data fragmentation, and the cost of storing the files of each index <br> - Extra manual post-restore steps are required |
|- Faster backup and restore speed <br> - Recommended for big, multi-terabyte datasets <br> - No overhead at database level | - The backup size could be bigger than for logical backups due to data fragmentation, and the cost of storing the files of each index <br> - Extra manual post-restore steps are required |


## Availability and system requirements

@@ -44,22 +44,6 @@ During physical backups and restores, ``pbm-agents`` don't export / import data
* [Physical Backup Support in Percona Backup for MongoDB :octicons-link-external-16:](https://www.percona.com/blog/physical-backup-support-in-percona-backup-for-mongodb/)
* [$backupCursorExtend in Percona Server for MongoDB :octicons-link-external-16:](https://www.percona.com/blog/2021/06/07/experimental-feature-backupcursorextend-in-percona-server-for-mongodb/)

Physical backup consists of copying the files from the Percona Server for MongoDB `dbPath` data directory to the remote backup storage. These files include data files, journal, index files, etc. Starting with version 2.0.0, Percona Backup for MongoDB also saves the WiredTiger storage options to the backup's metadata.

Physical restore is the reverse process: each `pbm-agent` shuts down their local `mongod` node, cleans up the `dbPath` data directory and copies back the physical files from the backup storage location.

The following diagram shows the physical restore flow:

![image](../_images/pbm-phys-restore-shard.png)

During the restore, the ``pbm-agents`` temporarily restart the ``mongod`` nodes two times using the WiredTiger storage options retrieved from the backup's metadata. The logs for these starts are saved to the ``pbm.restore.log`` file inside the ``dbPath``. Upon successful restore, this file is deleted. However, it is kept for debugging if the restore fails.

During physical backups and restores, ``pbm-agents`` don't export / import data from / to the database. This significantly reduces the backup / restore time compared to logical ones and is the recommended backup method for big (> 100 GB) databases.

| Advantages | Disadvantages |
| ------------------------------ | ------------------------------- |
|- Faster backup and restore speed <br> - Recommended for big, multi-terabyte datasets <br> - No database overhead | - The backup size is bigger than for logical backups due to data fragmentation extra cost of keeping data and indexes in appropriate data structures <br> - Extra manual operations are required after the restore | Sharded clusters and non-sharded replica sets |

[Make a backup](../usage/backup-physical.md){ .md-button }
[Restore a backup](../usage/restore-physical.md){ .md-button }

@@ -88,15 +72,15 @@ You can back up and restore data which is encrypted at rest. Thereby you ensure

During a backup, Percona Backup for MongoDB stores the encryption settings in the backup metadata. You can verify them using the [`pbm describe-backup`](../reference/pbm-commands.md#pbm-describe-backup) command. Note that the encryption key is not stored nor shown as part of the backup.
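As a quick check, you can inspect the stored metadata for any backup; the backup name below is a placeholder taken from `pbm list` output.

```shell
# Show a backup's metadata, including the stored encryption settings;
# the encryption key itself is never part of this output
pbm describe-backup 2026-03-12T10:00:00Z
```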

!!! important
!!! warning

Make sure that you know which master encryption key was used and keep it safe, as this key is required for the restore.
The master encryption key is not stored as part of the backup. Make sure that you know which master encryption key was active at the time of each backup (and back it up separately), as the restore requires that same key version. If you use a key management system, enable key versioning (Vault does this by default). Do NOT delete old key versions: they are still required for restoring backups taken before key rotation.
Copilot AI Mar 12, 2026
In this warning, the terminology switches from "master encryption key" to the more generic "encryption key", which can be ambiguous (e.g., readers may not realize you mean the master key and/or its specific version at backup time). Consider using consistent terminology throughout and explicitly calling out that the restore requires the same master key version/ID that was active when the backup was taken.

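As one concrete way to confirm that old key versions are still available, HashiCorp Vault's KV version 2 secrets engine keeps prior versions readable until they are deleted or destroyed. A sketch, where the secret path and version number are placeholders for your own setup:

```shell
# List the versions kept for the master key secret (path is a placeholder)
vault kv metadata get secret/psmdb/master-key

# Read a specific older version, e.g. the one active when the backup was taken
vault kv get -version=2 secret/psmdb/master-key
```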

!!! note

Starting with [Percona Server for MongoDB version 4.4.19-19 :octicons-link-external-16:](https://docs.percona.com/percona-server-for-mongodb/4.4/release_notes/4.4.19-19.html), [5.0.15-13 :octicons-link-external-16:](https://docs.percona.com/percona-server-for-mongodb/5.0/release_notes/5.0.15-13.html), [6.0.5-4 :octicons-link-external-16:](https://docs.percona.com/percona-server-for-mongodb/6.0/release_notes/6.0.5-4.html) and higher, the master key rotation for data-at-rest encrypted with HashiCorp Vault has been improved to use the same secret key path on every server in your entire deployment. For the restore with earlier versions of Percona Server for MongoDB and PBM 2.0.5 and earlier, see the [Restore for Percona Server for MongoDB **before** 4.4.19-19, 5.0.15-13, 6.0.5-4 using HashiCorpVault](#restore-for-percona-server-for-mongodb-before-4419-19-5015-13-605-4-using-hashicorpvault) section.

To restore the encrypted data from the backup, configure the same data-at-rest encryption settings on all nodes of your destination cluster or replica set to match the settings of the original cluster where you made the backup.
To restore the encrypted data from the backup, configure all nodes of your cluster or replica set with the same data-at-rest encryption settings that match the settings of the original cluster where you made the backup.

During the restore, Percona Backup for MongoDB restores the data to all nodes using the same master key. We recommend rotating the master encryption key afterwards for extra security.
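For Vault-backed encryption, Percona Server for MongoDB can rotate the master key via a dedicated startup option; a sketch, assuming a standard config file path (KMIP-backed setups have an analogous option):

```shell
# Rotate the master encryption key on a Vault-backed node;
# mongod performs the rotation and then exits, after which you start it normally
mongod --config /etc/mongod.conf --vaultRotateMasterKey
```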
