
NAS backup: automated infrastructure backup (database, configs, certificates) #12900

Open: jmsperu wants to merge 1 commit into apache:4.22 from jmsperu:feature/infrastructure-backup-to-nas

@jmsperu
Contributor

@jmsperu jmsperu commented Mar 26, 2026

Summary

Adds a new InfrastructureBackupTask to the NAS backup plugin that performs daily backups of CloudStack infrastructure (database, management/agent configs, SSL certs) to NAS storage.

Problem

CloudStack's NAS backup provider only backs up VM disks. The management server database, agent configurations, SSL certificates, and global settings are not backed up. If the management server fails, all metadata is lost unless someone has manually configured a mysqldump cron job.

Solution

A new background poll task that automatically backs up:

  1. MySQL databases (cloud + optionally cloud_usage) using mysqldump --single-transaction for InnoDB-consistent hot backups
  2. Management server configs (/etc/cloudstack/management/)
  3. Agent configs (/etc/cloudstack/agent/) if present
  4. SSL certificates (/etc/cloudstack/management/cert/) if present
  5. Automatic retention management (removes old backup sets)

Database credentials are read from /etc/cloudstack/management/db.properties at runtime (no secrets stored in global config).
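
As a rough illustration of the credential lookup and dump invocation described above (not the code from this PR), the sketch below loads connection details from a db.properties-style file and assembles a mysqldump argument list. The class and method names are hypothetical, and the property keys (`db.cloud.host`, `db.cloud.port`, `db.cloud.username`, `db.cloud.password`) are assumed to match the usual CloudStack file. The password may be stored encrypted there, which this sketch does not handle, and how the password is handed to mysqldump is deliberately left out.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Illustrative sketch only; names and property keys are assumptions. */
final class MysqldumpCommandSketch {

    /** Connection details read from a db.properties-style source. */
    static final class DbCredentials {
        final String host;
        final String port;
        final String user;
        final String password; // may be encrypted in a real file; not decoded here

        DbCredentials(String host, String port, String user, String password) {
            this.host = host;
            this.port = port;
            this.user = user;
            this.password = password;
        }
    }

    /** Loads credentials at runtime so nothing secret lives in global config. */
    static DbCredentials readCredentials(Reader dbProperties) {
        Properties props = new Properties();
        try {
            props.load(dbProperties);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new DbCredentials(
                props.getProperty("db.cloud.host", "localhost"),
                props.getProperty("db.cloud.port", "3306"),
                props.getProperty("db.cloud.username"),
                props.getProperty("db.cloud.password"));
    }

    /** Builds the mysqldump argv for one database; the password is intentionally not placed on the command line. */
    static List<String> buildCommand(DbCredentials creds, String database) {
        List<String> argv = new ArrayList<>();
        argv.add("mysqldump");
        argv.add("--single-transaction"); // consistent InnoDB snapshot without table locks
        argv.add("--host=" + creds.host);
        argv.add("--port=" + creds.port);
        argv.add("--user=" + creds.user);
        argv.add(database);
        return argv;
    }
}
```

The resulting list could be passed to `ProcessBuilder`, with the output piped through gzip to produce the `.sql.gz` file.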

Configuration

| Setting | Scope | Default | Description |
| --- | --- | --- | --- |
| nas.infra.backup.enabled | Global | false | Master switch for infrastructure backup |
| nas.infra.backup.location | Global | (empty) | NAS mount path (e.g. /mnt/nas-backup) |
| nas.infra.backup.retention | Global | 7 | Number of backup sets to keep |
| nas.infra.backup.include.usage.db | Global | true | Include cloud_usage database |
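
To make the defaults concrete, here is a minimal sketch that resolves the four settings from a plain key/value map. It does not use CloudStack's ConfigKey API, and the class name is hypothetical. Treating an empty location as "do not run" is an assumption; the PR only states that the task is a no-op when disabled.

```java
import java.util.Map;

/** Illustrative sketch only; not the ConfigKey definitions added by this PR. */
final class InfraBackupSettingsSketch {
    final boolean enabled;
    final String location;
    final int retention;
    final boolean includeUsageDb;

    /** Missing keys fall back to the defaults listed in the configuration table. */
    InfraBackupSettingsSketch(Map<String, String> values) {
        enabled = Boolean.parseBoolean(values.getOrDefault("nas.infra.backup.enabled", "false"));
        location = values.getOrDefault("nas.infra.backup.location", "").trim();
        retention = Integer.parseInt(values.getOrDefault("nas.infra.backup.retention", "7"));
        includeUsageDb = Boolean.parseBoolean(values.getOrDefault("nas.infra.backup.include.usage.db", "true"));
    }

    /** Assumption: the task should also skip when no NAS location is configured. */
    boolean shouldRun() {
        return enabled && !location.isEmpty();
    }
}
```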

Backup Structure

/mnt/nas-backup/infra-backup/
├── 20260327-020000/
│   ├── cloud-20260327-020000.sql.gz
│   ├── cloud_usage-20260327-020000.sql.gz
│   ├── management-config.tar.gz
│   ├── agent-config.tar.gz
│   └── ssl-certs.tar.gz
├── 20260326-020000/
│   └── ...
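
The layout above can be reproduced with a small path helper. This is a sketch under the assumption that backup-set directories are named with a `yyyyMMdd-HHmmss` timestamp, as the example tree suggests; the class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Illustrative sketch only; mirrors the directory tree shown in the PR description. */
final class BackupLayoutSketch {
    // Assumed timestamp pattern, inferred from names like 20260327-020000.
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    /** Returns <nasRoot>/infra-backup/<timestamp> for one backup run. */
    static Path backupSetDir(Path nasRoot, LocalDateTime when) {
        return nasRoot.resolve("infra-backup").resolve(STAMP.format(when));
    }

    /** Returns <setDir>/<database>-<timestamp>.sql.gz, reusing the set directory's name as the timestamp. */
    static Path databaseDump(Path setDir, String database) {
        return setDir.resolve(database + "-" + setDir.getFileName() + ".sql.gz");
    }
}
```

Because the timestamp sorts lexicographically in chronological order, directory names alone are enough to tell which backup sets are oldest.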

Changes

  • New: InfrastructureBackupTask.java - background task implementing BackgroundPollTask
  • Modified: NASBackupProvider.java - added 4 ConfigKeys, scheduled the backup task

Test plan

  • Enable backup framework and set provider to NAS
  • Configure nas.infra.backup.enabled=true and nas.infra.backup.location=/mnt/test-backup
  • Verify backup directory structure is created
  • Verify cloud-*.sql.gz database dump is created and restorable
  • Verify management-config.tar.gz contains /etc/cloudstack/management/ files
  • Verify agent-config.tar.gz is created only when agent config dir exists
  • Set retention=2, trigger multiple cycles, verify old backups are cleaned up
  • Verify task is a no-op when disabled
  • Verify task handles missing db.properties gracefully
  • Verify cloud_usage backup is included/excluded based on config
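
For the retention item in the test plan, the expected cleanup can be reasoned about with a pure function over directory names. This is a sketch, not the PR's implementation: it assumes backup sets are named with sortable timestamps and that "retention" means keeping the N newest sets. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; selects which backup sets a retention pass would remove. */
final class RetentionSketch {

    /** Returns the names of backup sets to delete, keeping the `retention` newest ones. */
    static List<String> setsToDelete(List<String> setNames, int retention) {
        List<String> sorted = new ArrayList<>(setNames);
        // Newest first: yyyyMMdd-HHmmss names sort chronologically as plain strings.
        sorted.sort(Collections.reverseOrder());
        int keep = Math.max(retention, 0);
        if (sorted.size() <= keep) {
            return new ArrayList<>();
        }
        return new ArrayList<>(sorted.subList(keep, sorted.size()));
    }
}
```

With retention=2 and three backup sets present, only the oldest set is returned for deletion.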

Adds automated backup of CloudStack infrastructure to NAS storage:
- MySQL databases (cloud + cloud_usage if enabled)
- Management server configuration (/etc/cloudstack/management/)
- Agent configuration (/etc/cloudstack/agent/)
- SSL certificates and keystores

Backup runs daily via BackgroundPollManager, configurable via global settings:
- nas.infra.backup.enabled (default: false)
- nas.infra.backup.location (NAS mount path)
- nas.infra.backup.retention (default: 7 backup sets)
- nas.infra.backup.include.usage.db (default: true)

Backups are stored in the NAS backup storage under infra-backup/
with automatic retention management.

Uses mysqldump --single-transaction for hot database backup
(no table locks, InnoDB consistent snapshot). Database credentials
are read from /etc/cloudstack/management/db.properties.
@codecov

codecov Bot commented Mar 27, 2026

Codecov Report

❌ Patch coverage is 5.29801% with 143 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.60%. Comparing base (c1af36f) to head (6a8e1d1).
⚠️ Report is 123 commits behind head on 4.22.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...he/cloudstack/backup/InfrastructureBackupTask.java | 0.00% | 138 Missing ⚠️ |
| ...rg/apache/cloudstack/backup/NASBackupProvider.java | 61.53% | 5 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #12900      +/-   ##
============================================
- Coverage     17.61%   17.60%   -0.01%     
+ Complexity    15676    15674       -2     
============================================
  Files          5917     5918       +1     
  Lines        531537   531688     +151     
  Branches      64985    65008      +23     
============================================
- Hits          93610    93608       -2     
- Misses       427369   427525     +156     
+ Partials      10558    10555       -3     

| Flag | Coverage Δ |
| --- | --- |
| uitests | 3.70% <ø> (ø) |
| unittests | 18.67% <5.29%> (-0.01%) ⬇️ |


@weizhouapache
Member

Code LGTM. However, I wonder whether backing up the CloudStack database from within CloudStack itself is the right approach.

In most deployments, database backups are usually handled externally by administrators (e.g., via cron jobs or existing backup tooling). That approach may be simpler and more flexible operationally.

@jmsperu
Contributor Author

jmsperu commented May 9, 2026

@weizhouapache fair point — the intent here was never to replace external/operational DB backup tooling, but to give small and edge deployments a one-knob "backup the management plane to the same NAS that already holds my VM backups" option for disaster recovery (rebuild from a fresh management server using only what's on the NAS).

Concrete adjustments I can make:

  1. Make the DB component opt-in via an explicit infrastructure.backup.include.database global setting, defaulting to false — so by default we only ship configs + certs (where there's no real ops alternative anyway).
  2. Document explicitly that for production deployments with existing backup tooling, the DB component should stay disabled and be handled externally.
  3. Keep the unified target (same NAS as VM backups) for the configs+certs path so recovery is in one place.

Does that split land closer to where you'd want it?
