Skip to content

Ignore opensaml's slf4j dependencies#6693

Merged
shwstppr merged 1 commit intoapache:4.17from
scclouds:fix-logging-problem-on-agents-and-usage
Sep 5, 2022
Merged

Ignore opensaml's slf4j dependencies#6693
shwstppr merged 1 commit intoapache:4.17from
scclouds:fix-logging-problem-on-agents-and-usage

Conversation

@GutoVeronezi
Copy link
Copy Markdown
Contributor

Description

Commit f27de63 introduced a new version of opensaml. That version brought jcl-over-slf4j-1.7.5.jar, jul-to-slf4j-1.7.5.jar, and log4j-over-slf4j-1.7.5.jar as dependencies, which causes Agents and Usages to not generate logs:

image

In order to make the logs to work again, this PR intends to exclude these dependencies while building the packages.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

How Has This Been Tested?

After applying the commit f27de63, the logs on the Agents and Usages stopped being written. After applying this PR, the logs started to be written again.

@GutoVeronezi GutoVeronezi changed the title Ignore opensaml dependencies Ignore opensaml's slf4j dependencies Aug 30, 2022
@yadvr
Copy link
Copy Markdown
Member

yadvr commented Aug 30, 2022

@GutoVeronezi good find, can you edit PR base to 4.17 and rebase your PR branch to 4.17?

@GutoVeronezi GutoVeronezi changed the base branch from main to 4.17 August 30, 2022 19:37
@boring-cyborg boring-cyborg Bot added component:agent component:api Python Warning... Python code Ahead! labels Aug 30, 2022
@GutoVeronezi GutoVeronezi force-pushed the fix-logging-problem-on-agents-and-usage branch from b13f79a to b7bb44e Compare August 30, 2022 19:40
@acs-robot
Copy link
Copy Markdown

Found UI changes, kicking a new UI QA build
@blueorangutan ui

@yadvr
Copy link
Copy Markdown
Member

yadvr commented Aug 30, 2022

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@rohityadavcloud a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@codecov
Copy link
Copy Markdown

codecov Bot commented Aug 30, 2022

Codecov Report

❗ No coverage uploaded for pull request base (4.17@1c7efcb). Click here to learn what that means.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             4.17    #6693   +/-   ##
=======================================
  Coverage        ?    5.87%           
  Complexity      ?     3918           
=======================================
  Files           ?     2451           
  Lines           ?   242282           
  Branches        ?    37911           
=======================================
  Hits            ?    14223           
  Misses          ?   226488           
  Partials        ?     1571           

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@blueorangutan
Copy link
Copy Markdown

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 4091

@yadvr
Copy link
Copy Markdown
Member

yadvr commented Aug 30, 2022

@blueorangutan test

@blueorangutan
Copy link
Copy Markdown

@rohityadavcloud a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@sonarqubecloud
Copy link
Copy Markdown

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@DaanHoogland DaanHoogland added this to the 4.17.1.0 milestone Aug 31, 2022
@blueorangutan
Copy link
Copy Markdown

Trillian test result (tid-4784)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 39573 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr6693-t4784-kvm-centos7.zip
Smoke tests completed. 101 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

@shwstppr shwstppr requested review from DaanHoogland and yadvr August 31, 2022 10:31
@shwstppr
Copy link
Copy Markdown
Contributor

shwstppr commented Sep 1, 2022

@GutoVeronezi I tried reproducing agent logs issue but don't see any error there in a main branch env. KVM hosts are,

[root@ref-trl-3519-k-M7-abhishek-kumar-kvm1 ~]# hostnamectl
   Static hostname: n/a
Transient hostname: ref-trl-3519-k-M7-abhishek-kumar-kvm1
         Icon name: computer-vm
           Chassis: vm
        Machine ID: 76b17df4966743ce9a20fe9a7098e2b6
           Boot ID: f186792149f445e894b67811406769f0
    Virtualization: vmware
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 3.10.0-957.12.2.el7.x86_64
      Architecture: x86-64
[root@ref-trl-3519-k-M7-abhishek-kumar-kvm1 ~]# virsh version
Compiled against library: libvirt 4.5.0
Using library: libvirt 4.5.0
Using API: QEMU 4.5.0
Running hypervisor: QEMU 1.5.3
[root@ref-trl-3519-k-M7-abhishek-kumar-kvm2 ~]# systemctl status cloudstack-agent
● cloudstack-agent.service - CloudStack Agent
   Loaded: loaded (/usr/lib/systemd/system/cloudstack-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-09-01 07:49:31 UTC; 16min ago
     Docs: http://www.cloudstack.org/
 Main PID: 14337 (java)
    Tasks: 58
   CGroup: /system.slice/cloudstack-agent.service
           └─14337 /usr/bin/java -Djava.io.tmpdir=/usr/share/cloudstack-agent/tmp -Xms256m -Xmx2048m -cp /usr/share/cloudstack-agent/lib/*:/usr/share/cloudstack-agent/plugins/*:/etc/cloudstack/agent:/usr/share/cloudstack-common/scripts com.cloud.agent.AgentShell

Sep 01 08:04:46 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: WARN  [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-2:) (logid:7d8f2008) Expected 1 answers while executing VmDataCommand but received 2
Sep 01 08:04:56 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-4:) (logid:7d8f2008) Trying to fetch storage pool 63324aa1-8eb3-3100-b7e0-ee28693b442c from libvirt
Sep 01 08:04:56 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-4:) (logid:7d8f2008) Trying to fetch storage pool 63324aa1-8eb3-3100-b7e0-ee28693b442c from libvirt
Sep 01 08:04:56 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: libvirt: Storage Driver error : Storage volume not found: no storage vol with matching name '6d9d86aa-29c9-11ed-9e50-1e00a4000983'
Sep 01 08:04:56 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-4:) (logid:7d8f2008) Creating volume c37b02fa-40a0-4353-b90f-ecb902c24713 from template 6d9d86aa-29c9-11ed-9e50-1e00a40009....00 GB) 8589934592
Sep 01 08:04:56 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-4:) (logid:7d8f2008) Attempting to create volume c37b02fa-40a0-4353-b90f-ecb902c24713 (NetworkFilesystem) in pool 63324aa1....00 GB) 8589934592
Sep 01 08:05:14 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:) (logid:87f1683c) Trying to fetch storage pool 63324aa1-8eb3-3100-b7e0-ee28693b442c from libvirt
Sep 01 08:05:14 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:) (logid:87f1683c) Asking libvirt to refresh storage pool 63324aa1-8eb3-3100-b7e0-ee28693b442c
Sep 01 08:06:15 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-2:) (logid:9694aab1) Trying to fetch storage pool 63324aa1-8eb3-3100-b7e0-ee28693b442c from libvirt
Sep 01 08:06:15 ref-trl-3519-k-M7-abhishek-kumar-kvm2 java[14337]: INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-2:) (logid:9694aab1) Asking libvirt to refresh storage pool 63324aa1-8eb3-3100-b7e0-ee28693b442c
Hint: Some lines were ellipsized, use -l to show in full.
[root@ref-trl-3519-k-M7-abhishek-kumar-kvm2 ~]# tail -f /var/log/cloudstack/agent/agent.log
2022-09-01 08:15:17,247 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-4:null) (logid:29a93b4d) Trying to fetch storage pool 12ae4dc2-63ce-359b-ae80-e1036b2d34d8 from libvirt
2022-09-01 08:15:17,250 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-4:null) (logid:29a93b4d) Asking libvirt to refresh storage pool 12ae4dc2-63ce-359b-ae80-e1036b2d34d8
2022-09-01 08:16:17,503 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:null) (logid:2586da6c) Trying to fetch storage pool 12ae4dc2-63ce-359b-ae80-e1036b2d34d8 from libvirt
2022-09-01 08:16:17,507 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:null) (logid:2586da6c) Asking libvirt to refresh storage pool 12ae4dc2-63ce-359b-ae80-e1036b2d34d8
2022-09-01 08:17:17,746 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-5:null) (logid:6f1a9673) Trying to fetch storage pool 12ae4dc2-63ce-359b-ae80-e1036b2d34d8 from libvirt
2022-09-01 08:17:17,749 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-5:null) (logid:6f1a9673) Asking libvirt to refresh storage pool 12ae4dc2-63ce-359b-ae80-e1036b2d34d8
2022-09-01 08:18:17,884 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:null) (logid:efb32bf5) Trying to fetch storage pool 63324aa1-8eb3-3100-b7e0-ee28693b442c from libvirt
2022-09-01 08:18:17,891 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:null) (logid:efb32bf5) Asking libvirt to refresh storage pool 63324aa1-8eb3-3100-b7e0-ee28693b442c
2022-09-01 08:18:17,937 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-2:null) (logid:efb32bf5) Trying to fetch storage pool 12ae4dc2-63ce-359b-ae80-e1036b2d34d8 from libvirt
2022-09-01 08:18:17,941 INFO  [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-2:null) (logid:efb32bf5) Asking libvirt to refresh storage pool 12ae4dc2-63ce-359b-ae80-e1036b2d34d8

Am I missing anything?

Copy link
Copy Markdown
Member

@yadvr yadvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - needs testing (maybe as simple as before this PR do we see slf4j jars say in the kvm/ssvm/cpvm agent and if those are then not seen with this PR)

@yadvr
Copy link
Copy Markdown
Member

yadvr commented Sep 5, 2022

@shwstppr if @GutoVeronezi isn't responding, I don't see any harm in merging this - you can atleast regression test and check logs on ssvm/cpvm/kvm agents.

@shwstppr
Copy link
Copy Markdown
Contributor

shwstppr commented Sep 5, 2022

OKay @rohityadavcloud. I'll verify for any regressions today

@shwstppr
Copy link
Copy Markdown
Contributor

shwstppr commented Sep 5, 2022

Verified logging on systemvms and hosts with following config:

  • centos7
  • rocky8
  • ubuntu20
    No regressions.

@shwstppr shwstppr merged commit e76df16 into apache:4.17 Sep 5, 2022
@GutoVeronezi
Copy link
Copy Markdown
Contributor Author

@shwstppr, sorry for the delay.

The occurence of this error depends on how Java loads the four libraries mentioned.

The classpath is defined as /usr/share/cloudstack-agent/lib/*; therefore, it will load everything in the directory. Java uses File#list to load the JARs, which does not guarantee an order in the listing. With the JARs loaded, the classloader will try to find the first appropriate class between the libraries (as explained in this documentation); therefore, in some cases (environments) it can load the dependencies in different order.

We upgraded more than 40 services (Agents and Usages - in CentOS 7 and Ubuntu 20.04) to the opensaml upgrade patch and around 10 presented the error (around 25%), meaning that most of the time the libraries are loaded correctly; however, sometimes they are loaded in a different order that causes the error.

Due to lack of time we could not find the specificity that triggers the loading order difference between the environments; however, we do understand that they are caused by the dependencies introduced with the new opensaml version.

shwstppr pushed a commit to shapeblue/cloudstack that referenced this pull request Mar 13, 2023
Fixes apache#194

Commit f27de63 introduced a new version of opensaml. That version brought jcl-over-slf4j-1.7.5.jar, jul-to-slf4j-1.7.5.jar, and log4j-over-slf4j-1.7.5.jar as dependencies, which causes Agents and Usages to not generate logs.
In order to make the logs to work again, this PR intends to exclude these dependencies while building the packages.

(cherry picked from commit e76df16)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
@vishesh92 vishesh92 mentioned this pull request Jul 25, 2023
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants