Skip to content

[Veeam] restored VMs without NICs#6282

Merged
DaanHoogland merged 14 commits intoapache:mainfrom
scclouds:veeam-fix-restored-backup-vm-without-nics
Jul 3, 2023
Merged

[Veeam] restored VMs without NICs#6282
DaanHoogland merged 14 commits intoapache:mainfrom
scclouds:veeam-fix-restored-backup-vm-without-nics

Conversation

@SadiJr
Copy link
Copy Markdown
Contributor

@SadiJr SadiJr commented Apr 18, 2022

Description

Using the VMWare hypervisor with Veeam integration enabled, restoring a VM from a backup removes the VM NICs. This happens because the NicVO object does not override the equals and hashCode methods; thereforen, when ACS tries to sync the NICs of the restored VM with the NICs saved in the database, the comparison always returns false, and ACS removes the VM NIC.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

How Has This Been Tested?

It's tested in a local lab:

  1. I created a new VM and assigned it to a Backup Offering;
  2. I performed a manual backup;
  3. I restored the VM to this backup;
  4. Before, the restored VM don't have any NIC;
  5. Now, the restored VM have the correctly NIC.
    Also, I added more logs.

@acs-robot
Copy link
Copy Markdown

Found Java/XML changes, kicking packaging job
@blueorangutan package

@acs-robot
Copy link
Copy Markdown

PR Coverage Report

CLASS INSTRUCTION MISSED INSTRUCTION COVERED BRANCH MISSED BRANCH COVERED LINE MISSED LINE COVERED
NicVO 187 78 0 0 76 28
NicDaoImpl 1335 0 12 0 186 0
VMwareGuru 2876 52 222 0 520 5

@nvazquez
Copy link
Copy Markdown
Contributor

@blueorangutan package

Comment thread engine/schema/src/main/java/com/cloud/vm/dao/NicDaoImpl.java
public void setNsxLogicalSwitchPortUuid(String nsxLogicalSwitchPortUuid) {
this.nsxLogicalSwitchPortUuid = nsxLogicalSwitchPortUuid;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SadiJr
can you fix the issue in [VMwareGuru.java] ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand. Can you please explain?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SadiJr
can you point out where two NicVO are compared in the source code ? is it possible to compare using id ?

one more question, if the nic is removed in cloudstack, then restore VM from veeam, will the nic be marked as not removed ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weizhouapache

I'm sorry for the late reply. On line 909 of the VMWareGuru class, the remove method of a list object is called, passing a NicVO object as a reference. This is done to remove the NIC in question from the list of NICs to be removed from the VM (line 919); that is, NICs that are not in the list will be kept in the VM. It turns out that the NicVO object doesn't have the equals and hashcode methods overridden. Thus, the native method of Java is used, which uses the object reference to check if they are the same. With this, even if the NICs found in the database are the same discovered in VMware after a restore process, ACS will think they are different due to being two different object instances (objects of the same class), and will remove them of the VM.

About removed NICs, actually the NIC was not marked as not removed. I fixed that and thank you for pointing out this error.

PS: I don't know if I was able to express myself correctly (communication is not exactly one of my strong points), let me know if you don't understand something.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SadiJr
my suggestion is, we could replace allNics with allNicIds which contains the id of the vm nics,

the other lines can be changed to

            if (nicVO != null) {
                allNicIds.remove(nicVO.getId());
            }

and

        for (final Long unMappedNicId : allNicIds) {
            vmManager.removeNicFromVm(vm, _nicDao.findById(unMappedNicId));
        }

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weizhouapache yes, I can do that, but I think is better to override the equals and hashCode in NicVO, to make sure this error don't appears in other parts of code. What you think about?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SadiJr I get what you are saying and I'm up for both the solutions as you and @weizhouapache suggested.
My only concern with the equals method override is if it causes any regression in other places. To avoid such scenarios @weizhouapache 's solution makes it safer.

@weizhouapache any suggestions here!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harikrishna-patnala I don't see how this change could break other parts of the code, if two nicVO objects were being compared and they came out as equal before, they'll still continue doing so.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree with what @harikrishna-patnala said.

this is a very large project with many features.
if there are two good options to fix a bug, it would be better to choose the one with less scope and impact.
If you will test all related features with different setup (hypervisors, network types and elements,etc), then go ahead with your solution, I am fine with it.

cc @SadiJr @JoaoJandre @DaanHoogland

@weizhouapache weizhouapache requested a review from nvazquez April 22, 2022 10:41
@nvazquez nvazquez modified the milestones: 4.17.0.0, 4.17.1.0 Apr 22, 2022
@acs-robot
Copy link
Copy Markdown

Found UI changes, kicking a new UI QA build
@blueorangutan ui

@blueorangutan
Copy link
Copy Markdown

@acs-robot a Jenkins job has been kicked to build UI QA env. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

UI build: ✔️
Live QA URL: http://qa.cloudstack.cloud:8080/client/pr/6282 (SL-JID-1441)

@acs-robot
Copy link
Copy Markdown

PR Coverage Report

CLASS INSTRUCTION MISSED INSTRUCTION COVERED BRANCH MISSED BRANCH COVERED LINE MISSED LINE COVERED
Network 554 0 42 0 107 0
Volume 109 0 2 0 44 0
VirtualMachineGuru 78 0 4 0 15 0
NetworkOrchestrationService 0 101 0 0 0 10
ConnectedAgentAttache 149 0 20 0 40 0
VirtualMachineManagerImpl 15659 0 1500 0 3079 0
NetworkOrchestrator 9806 0 1206 0 1928 0
DataCenterVnetVO 58 0 0 0 24 0
VlanVO 186 0 2 0 72 0
HostDaoImpl 4983 0 180 0 803 0
AccountGuestVlanMapVO 46 0 0 0 19 0
NetworkOfferingVO 414 31 0 0 126 12
VolumeVO 529 133 4 0 179 39
SystemVmTemplateRegistration 1773 0 86 0 376 0
Upgrade41520to41600 262 17 16 0 62 5
Upgrade41610to41700 159 7 6 0 43 2
DomainRouterVO 116 50 0 0 41 14
NicVO 187 78 0 0 76 28
ConsoleProxyDaoImpl 635 0 20 0 144 0
DomainRouterDaoImpl 1600 0 22 0 228 0
NicDaoImpl 1363 0 12 0 190 0
VolumeObject 773 665 77 39 178 117
CloudStackContextLoaderListener 77 0 2 0 21 0
LibvirtComputingResource 8888 1988 1087 147 1983 451
LibvirtStartCommandWrapper 75 295 16 22 15 72
VMwareGuru 2876 52 222 0 520 5
VmwareResource 20395 0 2274 0 4397 0
VmwareStorageProcessor 9892 9 940 0 2122 2
CitrixResourceBase 14658 557 1456 34 3169 122
CitrixCheckSshCommandWrapper 15 51 3 3 3 14
CitrixNetworkElementCommandWrapper 0 14 0 0 0 4
CitrixRebootRouterCommandWrapper 28 25 3 1 5 7
CitrixStartCommandWrapper 690 62 81 1 116 16
KubernetesClusterManagerImpl 4724 0 480 0 760 0
KubernetesClusterActionWorker 1540 0 114 0 281 0
KubernetesClusterResourceModifierActionWorker 1843 0 140 0 325 0
MetricsServiceImpl 2130 0 134 0 446 0
ClusterMetricsResponse 523 0 132 0 60 0
VmMetricsResponse 142 0 22 0 25 0
VolumeMetricsResponse 69 0 8 0 10 0
ZoneMetricsResponse 501 0 126 0 56 0
DomainChecker 1206 0 300 0 238 0
ApiDBUtils 2374 0 210 0 592 0
ApiResponseHelper 12144 0 1274 0 2779 0
ResponseObjectTypeAdapter 161 8 14 0 37 2
ParamProcessWorker 1050 0 155 0 241 0
QueryManagerImpl 14257 0 1248 0 2409 0
DomainRouterJoinDaoImpl 801 0 80 0 195 0
VolumeJoinDaoImpl 770 0 94 0 171 0
DomainRouterJoinVO 237 0 0 0 80 0
VolumeJoinVO 267 0 0 0 93 0
ConfigurationManagerImpl 18104 0 3032 0 3570 0
ConsoleProxyManagerImpl 3717 0 423 0 726 0
LibvirtServerDiscoverer 994 0 116 0 218 0
IpAddressManagerImpl 4045 0 461 0 806 0
NetworkModelImpl 6182 0 838 0 1300 0
NetworkServiceImpl 13383 0 1864 0 2551 0
GuestNetworkGuru 622 298 98 34 124 64
PrivateNetworkGuru 394 0 46 0 88 0
LoadBalancingRulesManagerImpl 6024 0 666 0 1254 0
NetworkHelperImpl 2073 0 274 0 441 0
VirtualNetworkApplianceManagerImpl 7585 0 780 0 1521 0
RulesManagerImpl 4074 0 492 0 790 0
VpcManagerImpl 6883 0 762 0 1314 0
ConfigurationServerImpl 2061 0 176 0 495 0
ManagementServerImpl 12159 0 1076 0 2428 0
StatsCollector 2178 0 124 0 359 0
StorageManagerImpl 8580 0 976 0 1710 0
VolumeApiServiceImpl 10851 0 1500 0 2035 0
AccountManagerImpl 6446 0 906 0 1351 0
UserVmManagerImpl 20798 0 2566 0 3869 0
CAManagerImpl 633 0 80 0 129 0
MockNetworkManagerImpl 485 0 22 0 83 0
SecondaryStorageManagerImpl 3544 149 343 11 629 32
VirtualMachineMO 8953 135 1063 17 1982 31

@github-actions
Copy link
Copy Markdown

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

@SadiJr SadiJr force-pushed the veeam-fix-restored-backup-vm-without-nics branch from 1e58000 to 44f3e0a Compare April 27, 2022 10:52
@github-actions
Copy link
Copy Markdown

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

Copy link
Copy Markdown
Contributor

@DaanHoogland DaanHoogland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clgtm

@blueorangutan
Copy link
Copy Markdown

@DaanHoogland a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 5290

@DaanHoogland
Copy link
Copy Markdown
Contributor

@weizhouapache @SadiJr are you clear on how we should proceed with this?

@SadiJr
Copy link
Copy Markdown
Contributor Author

SadiJr commented Feb 14, 2023

@DaanHoogland sorry for the delay in answer, about the discussion of the usage of override of methods equals and hashCode in NicVO, I stand by my belief that this is the best approach and, in particular, I can't think of any scenario in which this change could cause any regression. Please do not take what I say offensively, but if any of the participants can describe a concrete case of such a regression, rather than just speculation, I would be immensely grateful. cc @harikrishna-patnala @weizhouapache.

In addition, I will change this PR to draft, until this point of disagreement is resolved, thus avoiding an early merge.

@SadiJr SadiJr marked this pull request as draft February 14, 2023 09:47
@weizhouapache
Copy link
Copy Markdown
Member

@DaanHoogland sorry for the delay in answer, about the discussion of the usage of override of methods equals and hashCode in NicVO, I stand by my belief that this is the best approach and, in particular, I can't think of any scenario in which this change could cause any regression. Please do not take what I say offensively, but if any of the participants can describe a concrete case of such a regression, rather than just speculation, I would be immensely grateful. cc @harikrishna-patnala @weizhouapache.

In addition, I will change this PR to draft, until this point of disagreement is resolved, thus avoiding an early merge.

@SadiJr
I agree with you, if the object does not have a unique key (id).

Anyway, this change should be ok. cc @DaanHoogland @harikrishna-patnala

@DaanHoogland
Copy link
Copy Markdown
Contributor

@DaanHoogland sorry for the delay in answer, about the discussion of the usage of override of methods equals and hashCode in NicVO, I stand by my belief that this is the best approach and, in particular, I can't think of any scenario in which this change could cause any regression. Please do not take what I say offensively, but if any of the participants can describe a concrete case of such a regression, rather than just speculation, I would be immensely grateful. cc @harikrishna-patnala @weizhouapache.
In addition, I will change this PR to draft, until this point of disagreement is resolved, thus avoiding an early merge.

@SadiJr I agree with you, if the object does not have a unique key (id).

Anyway, this change should be ok. cc @DaanHoogland @harikrishna-patnala

Sorry @SadiJr , I am not offended and I don ´t even recall the discussion. Though the subject you describe is a known cause for bugs, I don´t recall seeing this happening in cloudstack yet.
Also I lgtm´d this PR, but please ask me for a re-review, when you are done.

@SadiJr SadiJr marked this pull request as ready for review February 21, 2023 17:18
@sonarqubecloud
Copy link
Copy Markdown

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

0.0% 0.0% Coverage
0.0% 0.0% Duplication

@yadvr yadvr closed this May 23, 2023
@yadvr yadvr reopened this May 23, 2023
@yadvr
Copy link
Copy Markdown
Member

yadvr commented May 23, 2023

ping @harikrishna-patnala can you review this? Thanks.
@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@rohityadavcloud a [SF] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 6107

@DaanHoogland
Copy link
Copy Markdown
Contributor

@blueorangutan test rocky8 vmware-72

@blueorangutan
Copy link
Copy Markdown

@DaanHoogland [SF] unsupported parameters provided. Supported mgmt server os are: centos7, centos6, suse15, alma8, ubuntu18, ubuntu22, ubuntu20, rocky8, alma9. Supported hypervisors are: kvm-centos6, kvm-centos7, kvm-rocky8, kvm-alma8, kvm-alma9, kvm-ubuntu18, kvm-ubuntu20, kvm-ubuntu22, kvm-suse15, vmware-55u3, vmware-60u2, vmware-65u2, vmware-67u3, vmware-70u1, vmware-70u2, vmware-70u3, vmware-80, vmware-80u1, xenserver-65sp1, xenserver-71, xenserver-74, xcpng74, xcpng76, xcpng80, xcpng81, xcpng82

@DaanHoogland
Copy link
Copy Markdown
Contributor

@blueorangutan test rocky8 vmware-70u2

@blueorangutan
Copy link
Copy Markdown

@DaanHoogland a [SF] Trillian-Jenkins test job (rocky8 mgmt + vmware-70u2) has been kicked to run smoke tests

@blueorangutan
Copy link
Copy Markdown

Trillian Build Failed (tid-6570)

@DaanHoogland
Copy link
Copy Markdown
Contributor

@blueorangutan test rocky8 vmware-70u3

@blueorangutan
Copy link
Copy Markdown

@DaanHoogland a [SF] Trillian-Jenkins test job (rocky8 mgmt + vmware-70u3) has been kicked to run smoke tests

@weizhouapache
Copy link
Copy Markdown
Member

@blueorangutan test rocky8 vmware-70u2

@DaanHoogland
it looks vmware 7.0.2 templates have been removed ..

@blueorangutan
Copy link
Copy Markdown

@weizhouapache [SF] unsupported parameters provided. Supported mgmt server os are: centos7, centos6, suse15, alma8, ubuntu18, ubuntu22, ubuntu20, rocky8, alma9. Supported hypervisors are: kvm-centos6, kvm-centos7, kvm-rocky8, kvm-alma8, kvm-alma9, kvm-ubuntu18, kvm-ubuntu20, kvm-ubuntu22, kvm-suse15, vmware-55u3, vmware-60u2, vmware-65u2, vmware-67u3, vmware-70u1, vmware-70u2, vmware-70u3, vmware-80, vmware-80u1, xenserver-65sp1, xenserver-71, xenserver-74, xcpng74, xcpng76, xcpng80, xcpng81, xcpng82

@blueorangutan
Copy link
Copy Markdown

Trillian test result (tid-6572)
Environment: vmware-70u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 43552 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr6282-t6572-vmware-70u3.zip
Smoke tests completed. 110 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

Copy link
Copy Markdown
Member

@harikrishna-patnala harikrishna-patnala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM
since there are no test failures, I will go with @SadiJr comment over equals methods. Thanks @SadiJr

@DaanHoogland DaanHoogland merged commit 3c5fdea into apache:main Jul 3, 2023
weizhouapache pushed a commit to shapeblue/cloudstack that referenced this pull request Dec 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.