Skip to content

Inc_backup: add new case to shutdown vm when backup job is running#6773

Open
meinaLi wants to merge 1 commit intoautotest:masterfrom
meinaLi:inc_backup_shutdown
Open

Inc_backup: add new case to shutdown vm when backup job is running#6773
meinaLi wants to merge 1 commit intoautotest:masterfrom
meinaLi:inc_backup_shutdown

Conversation

@meinaLi
Copy link
Contributor

@meinaLi meinaLi commented Jan 22, 2026

Automate case:
VIRT-306599 - [incremental_backup]shutdown vm when backup job is running
During the backup process, shutdown the guest and check the guest status. After that, resume the guest or directly abort the blockjob, check the guest status and blockjob process again.

Summary by CodeRabbit

  • Tests
    • Added a new incremental-backup test variant covering VM shutdown with push-based backups and optional resume behavior. Requires libvirt >= 11.10.0.
  • Chores
    • Improved test readiness and cleanup flows; added extra logging for checkpoint content and stronger checkpoint cleanup to ensure VM state is preserved after tests.

@coderabbitai
Copy link

coderabbitai bot commented Jan 22, 2026

Walkthrough

This PR introduces a new test variant shutdown_vm for the incremental backup lifecycle workflow. The config adds a push-mode backup entry with a qcow2 target and --preserve-domain-on-shutdown, with nested with_resume and without_resume variants. The test suite adds test_shutdown_vm (new imports and helpers) to monitor blockjob progress and VM state across shutdown, and to either resume or abort the blockjob based on the resume_after_shutdown parameter. Minor logging and cleanup adjustments were also added.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and clearly describes the primary change: adding a new test case for shutting down a VM during an incremental backup job.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@meinaLi
Copy link
Contributor Author

meinaLi commented Jan 22, 2026

# avocado run --vt-type libvirt --vt-omit-data-loss --vt-machine-type q35 incremental_backup.lifecycle.shutdown_vm
JOB ID     : 22c69b284d36d77e95b75c0a8080a5664fc25b2a
JOB LOG    : /var/log/avocado/job-results/job-2026-01-22T02.03-22c69b2/job.log
 (1/2) type_specific.io-github-autotest-libvirt.incremental_backup.lifecycle.shutdown_vm.with_resume: STARTED
 (1/2) type_specific.io-github-autotest-libvirt.incremental_backup.lifecycle.shutdown_vm.with_resume: PASS (57.07 s)
 (2/2) type_specific.io-github-autotest-libvirt.incremental_backup.lifecycle.shutdown_vm.without_resume: STARTED
 (2/2) type_specific.io-github-autotest-libvirt.incremental_backup.lifecycle.shutdown_vm.without_resume: PASS (62.58 s)
RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /var/log/avocado/job-results/job-2026-01-22T02.03-22c69b2/results.html
JOB TIME   : 122.28 s

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@libvirt/tests/src/incremental_backup/incremental_backup_lifecycle.py`:
- Around line 216-276: The test_shutdown_vm function never records the backup
target produced by prepare_backup_xml (the scratch_file_full target), so cleanup
in run() never removes it and ARG001 lint fails; update test_shutdown_vm to
accept/ensure the backup_file_list is initialized (use backup_file_list or
create a local list if None), capture the target path returned or exposed by
prepare_backup_xml/its side-effect (the scratch_file_full target), append that
path to backup_file_list, and ensure the function returns or mutates
backup_file_list so run() can remove it during cleanup (refer to
test_shutdown_vm, prepare_backup_xml, backup_file_list, and run).
🧹 Nitpick comments (1)
libvirt/tests/src/incremental_backup/incremental_backup_lifecycle.py (1)

278-299: Consider polling instead of fixed sleeps for state/progress checks.

Hard sleeps (2/8/5s) plus a strict “progress must increase” check can be flaky on slow or fast hosts. A bounded poll with timeout would be more resilient.

Comment on lines +216 to +276
def test_shutdown_vm(test, params, backup_file_list):
"""
Test shutdown vm with incremental backup using --preserve-domain-on-shutdown.

:param backup_file_list: Placeholder parameter for interface compatibility, not used in this function.
"""
def get_blockjob_progress(vm_name, target_disk):
"""
Get blockjob progress percentage.

:param vm_name: VM name
:param target_disk: target disk name
:return: progress percentage as float
"""
blockjob_result = virsh.blockjob(vm_name, target_disk, debug=True)
if blockjob_result.exit_status != 0:
return None
output = blockjob_result.stdout.strip() + " " + blockjob_result.stderr.strip()
match = re.search(r'\[\s*(\d+\.?\d*)\s*%\s*\]', output)
if match:
progress = float(match.group(1))
test.log.info("Current blockjob progress: %.2f%%" % progress)
return progress
return None

def check_vm_state_and_progress(vm_name, target_disk, expected_state, previous_progress, stage_name):
"""
Check VM state and verify blockjob progress increases.

:param vm_name: VM name
:param target_disk: target disk name
:param expected_state: expected VM state (e.g., "paused", "running")
:param previous_progress: previous blockjob progress percentage
:param stage_name: description of the current stage
:return: current blockjob progress percentage
"""
if not libvirt.check_vm_state(vm_name, expected_state):
current_state = virsh.domstate(vm_name).stdout.strip()
test.fail("Expected VM to be %s at '%s' stage, but got state: %s"
% (expected_state, stage_name, current_state))

current_progress = get_blockjob_progress(vm_name, target_disk)
if current_progress is None:
test.fail("Blockjob should still be active at '%s' stage" % stage_name)

if current_progress <= previous_progress:
test.fail("Blockjob progress should increase at '%s' stage. Previous: %.2f%%, Current: %.2f%%"
% (stage_name, previous_progress, current_progress))

return current_progress

libvirt_version.is_libvirt_feature_supported(params)
vm_name = params.get("main_vm")
target_disk = params.get("target_disk")
backup_options_extra = params.get("backup_options_extra", "")
resume_after_shutdown = params.get_boolean("resume_after_shutdown", False)

test.log.info("Start full backup with --preserve-domain-on-shutdown.")
backup_options = prepare_backup_xml(test, params, backup_type="full")
backup_options += backup_options_extra
virsh.backup_begin(vm_name, backup_options, debug=True, ignore_status=False)
Copy link

@coderabbitai coderabbitai bot Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Track the push backup target for cleanup (and use backup_file_list).

prepare_backup_xml(..., backup_type="full") creates a target file (scratch_file_full), but test_shutdown_vm never records it, so the file won’t be removed in run() cleanup and the ARG001 lint triggers. Consider initializing the list, appending the target path, and returning it.

♻️ Proposed fix (cleanup + lint)
 def test_shutdown_vm(test, params, backup_file_list):
     """
     Test shutdown vm with incremental backup using --preserve-domain-on-shutdown.

     :param backup_file_list: Placeholder parameter for interface compatibility, not used in this function.
     """
+    if backup_file_list is None:
+        backup_file_list = []
+    # Keep in sync with prepare_backup_xml() scratch_file naming for backup_type="full"
+    backup_target = data_dir.get_data_dir() + '/scratch_file_full'
+    backup_file_list.append(backup_target)
+
     def get_blockjob_progress(vm_name, target_disk):
         """
         Get blockjob progress percentage.
@@
     else:
         test.log.info("Abort the blockjob and check the guest status.")
         virsh.domjobabort(vm_name, debug=True, ignore_status=False)
         time.sleep(5)
         if not libvirt.check_vm_state(vm_name, "shut off"):
             test.fail("Expected VM to be shut off after aborting blockjob, but got unexpected state: %s"
                       % virsh.domstate(vm_name).stdout.strip())
+    return backup_file_list
🧰 Tools
🪛 Ruff (0.14.13)

216-216: Unused function argument: backup_file_list

(ARG001)

🤖 Prompt for AI Agents
In `@libvirt/tests/src/incremental_backup/incremental_backup_lifecycle.py` around
lines 216 - 276, The test_shutdown_vm function never records the backup target
produced by prepare_backup_xml (the scratch_file_full target), so cleanup in
run() never removes it and ARG001 lint fails; update test_shutdown_vm to
accept/ensure the backup_file_list is initialized (use backup_file_list or
create a local list if None), capture the target path returned or exposed by
prepare_backup_xml/its side-effect (the scratch_file_full target), append that
path to backup_file_list, and ensure the function returns or mutates
backup_file_list so run() can remove it during cleanup (refer to
test_shutdown_vm, prepare_backup_xml, backup_file_list, and run).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to remove it because it's a temporary file which will removed automatically.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@meinaLi, thank you for the clarification! If the push-mode backup target is cleaned up automatically by the framework or libvirt, then explicit tracking isn't necessary. I appreciate you explaining the difference in cleanup behavior for this scenario.

/)_/)
(•ᴗ•) Noted for future!
/>📝


✏️ Learnings added
Learnt from: meinaLi
Repo: autotest/tp-libvirt PR: 6773
File: libvirt/tests/src/incremental_backup/incremental_backup_lifecycle.py:216-276
Timestamp: 2026-01-22T07:22:04.670Z
Learning: In `test_shutdown_vm` for incremental backup with push mode and `--preserve-domain-on-shutdown`, the scratch_file backup target created by `prepare_backup_xml` is automatically removed and doesn't require explicit tracking in `backup_file_list` for cleanup, unlike pull-mode backups in other test functions.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: hholoubk
Repo: autotest/tp-libvirt PR: 6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Learnt from: meinaLi
Repo: autotest/tp-libvirt PR: 6628
File: libvirt/tests/src/virtual_disks/virtual_disks_io_error.py:128-129
Timestamp: 2025-10-24T06:35:25.679Z
Learning: In libvirt tests, `virsh.snapshot_delete()` without the `--metadata` flag automatically deletes both snapshot metadata and external disk files. Explicit `os.remove()` is only needed when using `virsh.snapshot_delete()` with the `--metadata` flag, which removes only metadata.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add description in commit msg to describe the test content. There may be some user don't have permission to view the test case.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Automte case:
VIRT-306599 - [incremental_backup]shutdown vm when backup job is running
During the backup process, shutdown the guest and check the guest status.
After that, resume the guest or directly abort the blockjob, check the guest
status and blockjob process again.

Signed-off-by: Meina Li <meili@redhat.com>
@meinaLi meinaLi force-pushed the inc_backup_shutdown branch from e679efd to d33bbad Compare March 12, 2026 06:27
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
libvirt/tests/src/incremental_backup/incremental_backup_lifecycle.py (1)

261-263: Consider using < instead of <= for progress comparison to handle edge cases.

The condition current_progress <= previous_progress will fail the test if progress is exactly the same between checks. This could happen with timing variations (e.g., if the sleep interval is too short for progress to increment). Consider using < to only fail when progress regresses, or add a tolerance for equal values.

♻️ Optional: Allow equal progress to avoid flaky failures
-        if current_progress <= previous_progress:
-            test.fail("Blockjob progress should increase at '%s' stage. Previous: %.2f%%, Current: %.2f%%"
-                      % (stage_name, previous_progress, current_progress))
+        if current_progress < previous_progress:
+            test.fail("Blockjob progress should not decrease at '%s' stage. Previous: %.2f%%, Current: %.2f%%"
+                      % (stage_name, previous_progress, current_progress))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libvirt/tests/src/incremental_backup/incremental_backup_lifecycle.py` around
lines 261 - 263, The check in incremental_backup_lifecycle.py that calls
test.fail when progress doesn't increase uses "current_progress <=
previous_progress", which will fail on equal values and can cause flakiness;
update the comparison in the progress check (the block that references
stage_name, previous_progress, current_progress and calls test.fail) to use "<"
instead of "<=" or alternatively implement a small tolerance (e.g.,
current_progress + epsilon < previous_progress) so only regressions trigger
failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@libvirt/tests/src/incremental_backup/incremental_backup_lifecycle.py`:
- Around line 261-263: The check in incremental_backup_lifecycle.py that calls
test.fail when progress doesn't increase uses "current_progress <=
previous_progress", which will fail on equal values and can cause flakiness;
update the comparison in the progress check (the block that references
stage_name, previous_progress, current_progress and calls test.fail) to use "<"
instead of "<=" or alternatively implement a small tolerance (e.g.,
current_progress + epsilon < previous_progress) so only regressions trigger
failure.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9a924cc9-4004-47a5-ae1f-2b5364024f1a

📥 Commits

Reviewing files that changed from the base of the PR and between e679efd and d33bbad.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/incremental_backup/incremental_backup_lifecycle.cfg
  • libvirt/tests/src/incremental_backup/incremental_backup_lifecycle.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • libvirt/tests/cfg/incremental_backup/incremental_backup_lifecycle.cfg

backup_options += backup_options_extra
virsh.backup_begin(vm_name, backup_options, debug=True, ignore_status=False)
virsh.blockjob(vm_name, target_disk, options="--bandwidth 100", debug=True, ignore_status=False)
time.sleep(2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use wait_until to replace sleep?


test.log.info("Shutdown the guest and check the guest status.")
virsh.shutdown(vm_name, debug=True, ignore_status=False)
time.sleep(8)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as before

Copy link
Contributor

@liang-cong-red-hat liang-cong-red-hat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just comments about func sleep replacement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants