CKS Enhancements and SystemVM template upgrade improvements #5863

Merged
sureshanaparti merged 24 commits into apache:4.16 from shapeblue:systemvm-improv-cks-enhance
Feb 15, 2022
@Pearl1594
Contributor

@Pearl1594 Pearl1594 commented Jan 14, 2022

Description

This PR comprises the following:

  • Support falling back to the older systemVM template when the template has not changed across ACS versions
  • Update core user to cloud in CKS
  • Display details on accessing CKS nodes in the UI (K8s Access tab)
  • Update the systemVM template from Debian 11 to Debian 11.2
  • Support configuring containerd with a private image repository
  • Update the Let's Encrypt certificate in the systemVMs
  • Remove the docker dependency, since from ACS 4.16 onward k8s has deprecated support for docker; use containerd as the container runtime
  • Update old systemVM templates to USER type via the UI
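
As an illustration of the containerd item in the list above, here is a minimal sketch of rendering a private-registry section for containerd's CRI plugin. The table names follow containerd's `config.toml` layout for CRI registry mirrors and auth; the function name, registry host, and credentials are hypothetical, and the configuration this PR actually writes on the CKS nodes may differ.

```python
def render_containerd_registry_config(registry_host, endpoint, username, password):
    """Return a config.toml fragment for a private registry (illustrative only).

    Assumption: the values contain no quotes or backslashes, so no TOML
    escaping is attempted here.
    """
    cri_registry = 'plugins."io.containerd.grpc.v1.cri".registry'
    lines = [
        # Mirror entry: where images for this registry host are pulled from.
        f'[{cri_registry}.mirrors."{registry_host}"]',
        f'  endpoint = ["{endpoint}"]',
        # Credentials used when talking to that registry host.
        f'[{cri_registry}.configs."{registry_host}".auth]',
        f'  username = "{username}"',
        f'  password = "{password}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical registry details, for demonstration only.
    print(render_containerd_registry_config(
        "registry.example.com", "https://registry.example.com", "user", "secret"))
```

A fragment like this would typically be merged into the node's containerd configuration, followed by a containerd restart; how CKS delivers it to the nodes is not shown here.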

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Screenshots (if appropriate):

How Has This Been Tested?

- Support falling back to the older systemVM template when the template has not changed across ACS versions
- Update core user to cloud in CKS
- Display details on accessing CKS nodes in the UI (K8s Access tab)
- Update the systemVM template from Debian 11 to Debian 11.2
- Update the Let's Encrypt certificate
- Remove the docker dependency, since from ACS 4.16 onward k8s has deprecated support for docker; use containerd as the container runtime
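
The fallback behaviour in the first item of the list above can be pictured with a small sketch. It assumes a hypothetical mapping from ACS version to the systemVM template registered for that version: when the target version has no entry of its own, the newest template from an earlier version is reused. The function names and version data are made up for illustration and are not taken from the CloudStack code.

```python
def version_key(version):
    """Turn a dotted version such as '4.16.1' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def pick_systemvm_template(target_version, templates_by_version):
    """Return the template for target_version, or the newest earlier one.

    templates_by_version is a hypothetical mapping of ACS version -> template
    name; it is not a structure taken from the CloudStack code base.
    """
    target = version_key(target_version)
    # Keep only versions at or below the target, so a newer template is never chosen.
    candidates = [v for v in templates_by_version if version_key(v) <= target]
    if not candidates:
        raise LookupError(
            f"no systemVM template registered at or before {target_version}")
    return templates_by_version[max(candidates, key=version_key)]


if __name__ == "__main__":
    # Hypothetical data: 4.16.1 ships no new template, so 4.16.0's is reused.
    templates = {"4.15.0": "systemvm-4.15.0", "4.16.0": "systemvm-4.16.0"}
    print(pick_systemvm_template("4.16.1", templates))
```

With the sample data, asking for 4.16.1 returns the 4.16.0 template, while asking for a version older than every entry raises `LookupError` rather than silently picking a newer template.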
@Pearl1594 Pearl1594 added this to the 4.16.1.0 milestone Jan 14, 2022
@sureshanaparti
Contributor

@blueorangutan package

@blueorangutan

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2209

@sureshanaparti
Contributor

@blueorangutan test matrix

@blueorangutan

@sureshanaparti a Trillian-Jenkins matrix job (centos7 mgmt + xs71, centos7 mgmt + vmware65, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2881)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 43709 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5863-t2881-kvm-centos7.zip
Smoke tests completed. 91 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_invalid_upgrade_kubernetes_cluster | Failure | 3608.23 | test_kubernetes_clusters.py |
| test_02_upgrade_kubernetes_cluster | Failure | 3608.64 | test_kubernetes_clusters.py |
| test_03_deploy_and_scale_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_04_autoscale_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_05_basic_lifecycle_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_06_delete_kubernetes_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| test_07_deploy_kubernetes_ha_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| test_08_upgrade_kubernetes_ha_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_09_delete_kubernetes_ha_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| ContextSuite context=TestKubernetesCluster>:teardown | Error | 84.74 | test_kubernetes_clusters.py |

@blueorangutan

Trillian test result (tid-2880)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 43897 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5863-t2880-xenserver-71.zip
Smoke tests completed. 90 look OK, 2 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_invalid_upgrade_kubernetes_cluster | Failure | 3608.15 | test_kubernetes_clusters.py |
| test_02_upgrade_kubernetes_cluster | Failure | 3603.64 | test_kubernetes_clusters.py |
| test_03_deploy_and_scale_kubernetes_cluster | Failure | 0.06 | test_kubernetes_clusters.py |
| test_04_autoscale_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_05_basic_lifecycle_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_06_delete_kubernetes_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| test_07_deploy_kubernetes_ha_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| test_08_upgrade_kubernetes_ha_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_09_delete_kubernetes_ha_cluster | Failure | 0.05 | test_kubernetes_clusters.py |
| ContextSuite context=TestKubernetesCluster>:teardown | Error | 42.41 | test_kubernetes_clusters.py |
| test_01_sys_vm_start | Failure | 0.09 | test_secondary_storage.py |

@blueorangutan

Trillian test result (tid-2882)
Environment: vmware-65u2 (x2), Advanced Networking with Mgmt server 7
Total time taken: 46722 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5863-t2882-vmware-65u2.zip
Smoke tests completed. 91 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_invalid_upgrade_kubernetes_cluster | Failure | 3632.15 | test_kubernetes_clusters.py |
| test_02_upgrade_kubernetes_cluster | Failure | 3616.01 | test_kubernetes_clusters.py |
| test_03_deploy_and_scale_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_04_autoscale_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_05_basic_lifecycle_kubernetes_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| test_06_delete_kubernetes_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| test_07_deploy_kubernetes_ha_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_08_upgrade_kubernetes_ha_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_09_delete_kubernetes_ha_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| ContextSuite context=TestKubernetesCluster>:teardown | Error | 40.82 | test_kubernetes_clusters.py |

@Pearl1594
Contributor Author

The failures in the k8s tests are mostly due to the systemVM template: they require a new systemVM template.

@Pearl1594
Contributor Author

@blueorangutan package

@blueorangutan

@Pearl1594 a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✖️ debian ✔️ suse15. SL-JID 2223

@Pearl1594
Contributor Author

@blueorangutan package

@blueorangutan

@Pearl1594 a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@weizhouapache
Member

@Pearl1594
Regarding the last commit, 'On successful upgrade, update old systemvm templates to USER type, so that they can be deleted': we can add the option in the UI (updating a template to USER type is already available for SYSTEM templates via the API) and let users decide.

@Pearl1594
Contributor Author

Thanks for letting me know @weizhouapache - wasn't aware that we supported updating templates of system type. I'll revert this change then and enable via UI.

@Pearl1594 Pearl1594 force-pushed the systemvm-improv-cks-enhance branch from eeedeb0 to 36ebfc2 Compare January 17, 2022 08:15
@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2572

@blueorangutan

Trillian test result (tid-3294)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 30692 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5863-t3294-kvm-centos7.zip
Smoke tests completed. 92 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

@blueorangutan

Trillian test result (tid-3293)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33071 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5863-t3293-xenserver-71.zip
Smoke tests completed. 91 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_cancel_host_maintenace_with_no_migration_jobs | Error | 306.34 | test_host_maintenance.py |
| test_02_cancel_host_maintenace_with_migration_jobs | Error | 306.24 | test_host_maintenance.py |
| test_03_cancel_host_maintenace_with_migration_jobs_failure | Error | 0.20 | test_host_maintenance.py |

@blueorangutan

Trillian test result (tid-3295)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 34278 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5863-t3295-vmware-67u3.zip
Smoke tests completed. 92 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

@vladimirpetrov
Contributor

@blueorangutan package

@blueorangutan

@vladimirpetrov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2579

@blueorangutan

Trillian test result (tid-3301)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 32938 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5863-t3301-xenserver-71.zip
Smoke tests completed. 92 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

@blueorangutan

Trillian test result (tid-3300)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 34490 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5863-t3300-kvm-centos7.zip
Smoke tests completed. 91 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_disable_oobm_ha_state_ineligible | Error | 1512.19 | test_hostha_kvm.py |

@blueorangutan

Trillian test result (tid-3299)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 37851 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5863-t3299-vmware-67u3.zip
Smoke tests completed. 92 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

Comment thread engine/schema/src/main/resources/META-INF/db/schema-41600to41610.sql Outdated
Pearl1594 and others added 2 commits February 15, 2022 14:26
…10.sql

Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
@Pearl1594
Contributor Author

@blueorangutan package

@blueorangutan

@Pearl1594 a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

Contributor

@vladimirpetrov vladimirpetrov left a comment

LGTM based on manual testing. Here are the cases I have covered:

  • Display details of accessing CKS nodes in the UI - K8s Access tab - OK
  • Update core user to cloud in CKS - OK
  • Update systemvm template from debian 11 to debian 11.2 - OK
  • Update old systemvm templates to USER type via UI - OK
  • Remove docker dependency as from ACS 4.16 onward k8s has deprecated support for docker - use containerd as container runtime - OK
  • Fixes part 2 of the issue mentioned in the ticket CS-2132 - OK
  • Support to configure containerd with private image repository configuration - OK
  • Create kube cluster
    • v1.22.6 - OK
  • Manual scale up/down kube cluster
    • 1.22.6 - OK
  • Auto scale up/down kube cluster
    • 1.22.6 - OK
  • Upgrade from previous version, multi-hypervisor environment:
    • from 4.16.0 - OK
  • Upgrade from previous version with existing Kubernetes cluster:
    • from 4.16.0 (vmware) - OK
  • Upgrade from previous versions tested:
    • from 4.15.2 (vmware) - OK
    • from 4.14.1 (KVM) - OK

@sureshanaparti
Contributor

LGTM based on manual testing. Here are the cases I have covered:

  • Display details of accessing CKS nodes in the UI - K8s Access tab - OK
  • Update core user to cloud in CKS - OK
  • Update systemvm template from debian 11 to debian 11.2 - OK
  • Update old systemvm templates to USER type via UI - OK
  • Remove docker dependency as from ACS 4.16 onward k8s has deprecated support for docker - use containerd as container runtime - OK
  • Fixes part 2 of the issue mentioned in the ticket CS-2132 - OK
  • Support to configure containerd with private image repository configuration - OK
  • Create kube cluster
    • v1.22.6 - OK
  • Manual scale up/down kube cluster
    • 1.22.6 - OK
  • Auto scale up/down kube cluster
    • 1.22.6 - OK
  • Upgrade from previous version, multi-hypervisor environment:
    • from 4.16.0 - OK
  • Upgrade from previous version with existing Kubernetes cluster:
    • from 4.16.0 (vmware) - OK
  • Upgrade from previous versions tested:
    • from 4.15.2 (vmware) - OK
    • from 4.14.1 (KVM) - OK

Thanks for testing @vladimirpetrov

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2612

@Pearl1594 Pearl1594 marked this pull request as ready for review February 15, 2022 10:59
@sureshanaparti sureshanaparti merged commit e0a5df5 into apache:4.16 Feb 15, 2022
@alexandremattioli
Contributor

LGTM

Development

Successfully merging this pull request may close these issues.

Due to Letsencrypt CA cert change, many https:// template url may not work with 4.16 systemvmtemplate

9 participants