Garbage snapshots are left on Primary and Secondary#3649
Garbage snapshots are left on Primary and Secondary#3649GabrielBrascher wants to merge 6 commits into
Conversation
|
@andrijapanicsb Can you please test this PR and check if this fixes the snapshot deletion garbage issue on XenServer + NFS? Thanks! |
|
@andrijapanicsb can you please check this commit? It worked with KVM + Ceph and might also fix the reported snapshot deletion issue for XenServer + NFS. |
|
thx @GabrielBrascher. Will do, just to get some free time for it. |
|
Alright - so far tested only KVM (centos 7) with NFS: While backup to secondary = true, results as follows:
In general, looking good, except that those records for snap reference on Prim Store are not deleted from the snapshots_store_ref table - not sure what was the original behaviour here, years ago. While backup to secondary = false, results as follows:
Here is a more detailed error when deleting a snap that exists only on Prim Store:
@GabrielBrascher I'm wondering, is the behaviour SAME in your tests with Ceph (you obviously do NOT have issues deleting the snap from the Ceph only ( backup to Sec Store = false) |
|
I would say that you just fix, if possible, that deletion from Primary (when backup to Sec Store = false) and we are mostly good to go? Let me briefly test with XS with NFS (but I expect identical stuff) |
|
/CC @rhtyd |
|
Well, things are a bit complicated on XenServer (7.1.0), need some more fix (nothing major I hope) While backup to secondary = TRUE, results as follows:
If garbage that is left on Primary Storage could be fixed, it would be great. Otherwise, I can live with the current situation and I will then raise a new blocker issue in the GitHub (which boils to "let's fix it now") /CC @rhtyd @nvazquez While backup to secondary = FALSE, results as follows:
|
|
Thanks for the detailed review/test @andrijapanicsb. I am going to check that exception. |
|
@andrijapanicsb code has been updated with the last commit; that exception should not be launched now. |
|
ping @GabrielBrascher , do we still want this in before 4.13.1/4.14? |
|
@DaanHoogland The idea of this PR is to solve it for KVM/Ceph and XenServer/NFS as well. I have been able to test it on Ceph + KVM. However, I was not able to test if it fixes for XenServer. That is why it is a "WIP". I can close this one and aim a fix isolated for Ceph only. |
|
well, the question is is it feasible (cc @andrijapanicsb ) if you want a fix in 4.13 your safest bet is of course not rely on us to have time, @GabrielBrascher ;) |
|
@DaanHoogland a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✖centos6 ✔centos7 ✔debian. JID-900 |
|
@blueorangutan test matrix |
|
@DaanHoogland a Trillian-Jenkins matrix job (centos7 mgmt + xs71, centos7 mgmt + vmware65, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests |
|
@rhtyd @andrijapanicsb @DaanHoogland I have not been able to solve those issues yet, I am still investigating and hammering it :-( |
|
tnx @GabrielBrascher Note that the travis run has some errors as well, maybe this can help. |
|
Trillian test result (tid-1044)
|
|
Ping - update on this @GabrielBrascher @andrijapanicsb @DaanHoogland ? |
|
@rhtyd not much progress so far from my side. |
|
Quick update: today I have been digging on the code and debugging. This week I will be addressing more time here to solve it as soon as possible. |
|
thx @GabrielBrascher |
|
@GabrielBrascher did you manage? as code freeze is upon us, will you need more time or resources? |
|
@DaanHoogland I am working on it, the fix that I am implementing is working for KVM + NFS (primary) It still needs more testing (not yet updated here the fix). I am also Implementing now the KVM +NFS (secondary) snapshot deletion. |
d25b4eb to
c9c1888
Compare
|
@blueorangutan package |
|
@DaanHoogland a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✖centos6 ✖centos7 ✖debian. JID-1049 |
|
you have trailing whitespaces in |
|
joint effort continuing in #3669 |
Description
Tested both cases:
This fix works for KVM and XenServer.
Note: I changed the name of the strategy from XenserverSnapshotStrategy to DefaultSnapshotStrategy due to the fact that the "Xenserver" strategy handles also Ceph, KVM, and do return
StrategyPriority.DEFAULTin any case except REVERT.Fixes: #3646
Types of changes