{AKS} Fix DCR create/update for container network logs and high log scale mode #9667

Open
carlotaarvela wants to merge 26 commits into Azure:main from carlotaarvela:cnl-hslm-update

@carlotaarvela
Contributor

@carlotaarvela carlotaarvela commented Mar 10, 2026

Related command

az aks create, az aks update

General Guidelines

  • Have you run azdev style <YOUR_EXT> locally? (pip install azdev required)
  • Have you run python scripts/ci/test_index.py -q locally? (pip install wheel==0.30.0 required)
  • My extension version conforms to the Extension version schema

For new extensions:

About Extension Publish

There is a pipeline to automatically build, upload and publish extension wheels.
Once your pull request is merged into main branch, a new pull request will be created to update src/index.json automatically.
You only need to update the version information in file setup.py and historical information in file HISTORY.rst in your PR but do not modify src/index.json.


Description

Fixes the az aks create and az aks update commands so that the Data Collection Rule (DCR) is properly created or updated when container network logs (CNL) or high log scale mode (HLSM) flags are passed. Previously, the DCR was not being created/updated in these scenarios, causing streams like Microsoft-ContainerLogV2-HighScale to be out of sync.

Also fixes --disable-azure-monitor-logs so it actually disables monitoring by setting azureMonitorProfile.containerInsights.enabled = False (the RP's source of truth), and handles omsAgent vs omsagent key casing inconsistencies from the Azure API.
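As a rough illustration of the disable behavior described above, here is a minimal sketch that uses plain dictionaries in place of the SDK model objects. The function name `disable_monitoring` and the dictionary layout are assumptions made for this example; only the `containerInsights.enabled` flag, the `config = None` choice, and the two key casings come from this description.

```python
from typing import Any, Dict


def disable_monitoring(mc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a cluster payload with container insights disabled.

    `mc` is a hypothetical dict-shaped stand-in for the managed cluster model.
    """
    result = dict(mc)

    # The RP treats this flag as the source of truth, so set it explicitly.
    profile = dict(result.get("azureMonitorProfile") or {})
    insights = dict(profile.get("containerInsights") or {})
    insights["enabled"] = False
    profile["containerInsights"] = insights
    result["azureMonitorProfile"] = profile

    # The API may return the addon under either casing; disable whichever exists.
    addons = dict(result.get("addonProfiles") or {})
    for key in ("omsagent", "omsAgent"):
        if key in addons:
            # config is None rather than {} to mirror the base CLI behavior.
            addons[key] = {"enabled": False, "config": None}
    result["addonProfiles"] = addons
    return result
```

The sketch copies the nested dictionaries it changes, so the input payload is left untouched.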

Changes

Shared utilities:

  • Added _get_monitoring_addon_key() helper to resolve the monitoring addon key from a cluster's addon_profiles, handling both omsagent and omsAgent key variants returned by the API.
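The key resolution could look roughly like the sketch below. It is not the code from this PR: `resolve_monitoring_addon_key` is a hypothetical name, and `addon_profiles` is assumed to behave like a mapping keyed by addon name.

```python
from typing import Mapping, Optional

# The two spellings the API is described as returning.
MONITORING_ADDON_KEYS = ("omsagent", "omsAgent")


def resolve_monitoring_addon_key(
    addon_profiles: Optional[Mapping[str, object]],
) -> Optional[str]:
    """Return whichever monitoring addon key is present, or None if neither is."""
    if not addon_profiles:
        return None
    for key in MONITORING_ADDON_KEYS:
        if key in addon_profiles:
            return key
    return None
```

Returning the key rather than the profile lets callers both read and write `addon_profiles[key]` without guessing the casing again.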

Context (AKSPreviewManagedClusterContext):

  • Fixed get_enable_msi_auth_for_monitoring() to detect MSI auth on clusters where service_principal_profile.client_id is "msi" by checking the addon's useAADAuth config instead of returning False.
  • Fixed get_container_network_logs() and get_enable_high_log_scale_mode() to handle both omsagent and omsAgent addon key variants.
  • Added validation: disabling HLSM while CNL is still enabled now raises a MutuallyExclusiveArgumentError.
  • Added enable_high_log_scale_mode to the special parameter defaults list so standalone --enable-high-log-scale-mode passes the check_raw_parameters gate.
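For the MSI case in the first bullet, the check reduces to reading `useAADAuth` from the addon config. A minimal sketch, assuming dict-shaped addon profiles and a string-valued flag (both are assumptions for illustration; `addon_uses_aad_auth` is a hypothetical name):

```python
from typing import Any, Mapping, Optional


def addon_uses_aad_auth(addon_profiles: Optional[Mapping[str, Any]]) -> bool:
    """Return True if an enabled monitoring addon reports useAADAuth as true."""
    for key in ("omsagent", "omsAgent"):
        addon = (addon_profiles or {}).get(key)
        if not addon or not addon.get("enabled"):
            continue
        config = addon.get("config") or {}
        # Assumption: the flag arrives as a string such as "true" or "True".
        return str(config.get("useAADAuth", "")).strip().lower() == "true"
    return False
```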

Create path (AKSPreviewManagedClusterCreateDecorator):

  • Moved DCR creation from postprocessing to _setup_azure_monitor_logs() (pre-PUT), matching the base class build_monitoring_addon_profile pattern. DCRA is still created in postprocessing.
  • Added _should_create_dcra() and _is_cnl_or_hlsm_changing() helpers to detect when CNL/HLSM flags require DCR/DCRA creation.
  • Refactored _postprocess_monitoring_enable() and _postprocess_monitoring_disable() into reusable methods.
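The ordering on the create path (DCR before the PUT, DCRA afterwards) can be pictured with the toy planner below. The trigger conditions are a guess at what the helpers check, not a copy of them, and `CreateFlags` and `plan_create` are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CreateFlags:
    enable_azure_monitor_logs: bool = False
    # None means the flag was not passed on the command line.
    enable_container_network_logs: Optional[bool] = None
    enable_high_log_scale_mode: Optional[bool] = None


def is_cnl_or_hlsm_changing(flags: CreateFlags) -> bool:
    """Assumed rule: either flag being passed counts as a change."""
    return (
        flags.enable_container_network_logs is not None
        or flags.enable_high_log_scale_mode is not None
    )


def should_create_dcra(flags: CreateFlags) -> bool:
    """Assumed rule: DCR/DCRA work is needed when monitoring or CNL/HLSM is requested."""
    return flags.enable_azure_monitor_logs or is_cnl_or_hlsm_changing(flags)


def plan_create(flags: CreateFlags) -> List[str]:
    """List the steps in the order the description gives them."""
    needs_dcr = should_create_dcra(flags)
    steps = []
    if needs_dcr:
        steps.append("create DCR")  # pre-PUT
    steps.append("PUT managed cluster")
    if needs_dcr:
        steps.append("create DCRA")  # postprocessing
    return steps
```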

Update path (AKSPreviewManagedClusterUpdateDecorator):

  • update_monitoring_profile_flow_logs(): Added base class delegation (super().update_monitoring_profile_flow_logs(mc)) for CLI >= 2.84.0 compatibility, and guarded the call site in update_mc_profile_preview() to avoid double execution.
  • When --enable-container-network-logs or --enable-retina-flow-logs is passed, the monitoring_addon_postprocessing_required intermediate is now set, triggering the DCR update via ensure_container_insights_for_monitoring.
  • Added validation for standalone --enable-high-log-scale-mode: requires the monitoring addon with MSI auth to be enabled, with an escape hatch when --enable-azure-monitor-logs is passed simultaneously.
  • Added validation for --enable-high-log-scale-mode false: prevents disabling HLSM while CNL is still enabled.
  • Fixed _disable_azure_monitor_logs(): now sets mc.azure_monitor_profile.container_insights.enabled = False (the RP source of truth) and uses config = None instead of config = {} to match the base CLI behavior.
  • Fixed monitoring postprocessing to use _get_monitoring_addon_key() for correct key resolution.
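The two HLSM validation rules on the update path can be summarized in a small sketch. The function name, its parameters, and `PrerequisiteError` are hypothetical, since this description does not name the error raised for the standalone case. `MutuallyExclusiveArgumentError` is redefined locally only so the example runs without azure-cli installed.

```python
from typing import Optional


class MutuallyExclusiveArgumentError(Exception):
    """Local stand-in for the azure-cli error of the same name."""


class PrerequisiteError(Exception):
    """Hypothetical error type used only in this sketch."""


def validate_hlsm_update(
    hlsm: Optional[bool],
    cnl_enabled: bool,
    addon_enabled_with_msi: bool,
    enabling_azure_monitor_logs: bool = False,
) -> None:
    """Raise if the requested high log scale mode change is not allowed."""
    if hlsm is None:
        return  # --enable-high-log-scale-mode was not passed

    if hlsm is False:
        # HLSM cannot be turned off while container network logs still need it.
        if cnl_enabled:
            raise MutuallyExclusiveArgumentError(
                "Cannot disable high log scale mode while container network logs are enabled."
            )
        return

    # Standalone enable: the monitoring addon with MSI auth must already be on,
    # unless --enable-azure-monitor-logs is passed in the same command.
    if not addon_enabled_with_msi and not enabling_azure_monitor_logs:
        raise PrerequisiteError(
            "High log scale mode requires the monitoring addon with MSI auth."
        )
```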

Tests:

  • 6 new live integration tests: test_aks_create_with_azuremonitorlogs_and_cnl, test_aks_update_enable_azuremonitorlogs_with_hlsm, test_aks_create_with_retina_flow_logs_alias, test_aks_update_enable_cnl_via_azuremonitorlogs, test_aks_update_disable_azuremonitorlogs, test_aks_update_standalone_enable_high_log_scale_mode.
  • 38 new unit tests covering context methods, create decorator, update decorator, postprocessing, key casing, and validation error paths.

Testing

  • Unit tests: pytest src/aks-preview/azext_aks_preview/tests/latest/test_managed_cluster_decorator.py -k "container_network_logs or high_log_scale_mode or enable_container_network or standalone_high_log or disable_azure_monitor or postprocessing_create_dcr or update_monitoring_profile_flow_logs"
  • Live integration tests: AZURE_TEST_RUN_LIVE=true pytest src/aks-preview/azext_aks_preview/tests/latest/test_aks_commands.py -k "test_aks_create_with_azuremonitorlogs_and_cnl or test_aks_update_enable_azuremonitorlogs_with_hlsm or test_aks_create_with_retina_flow_logs_alias or test_aks_update_enable_cnl_via_azuremonitorlogs or test_aks_update_disable_azuremonitorlogs or test_aks_update_standalone_enable_high_log_scale_mode"

@azure-client-tools-bot-prd

azure-client-tools-bot-prd bot commented Mar 10, 2026

✔️ Azure CLI Extensions Breaking Change Test
✔️ Non Breaking Changes

@azure-client-tools-bot-prd

Hi @carlotaarvela,
Please write the description of changes which can be perceived by customers into HISTORY.rst.
If you want to release a new extension version, please update the version in setup.py as well.

@yonzhan
Collaborator

yonzhan commented Mar 10, 2026

Thank you for your contribution! We will review the pull request and get back to you soon.

@github-actions
Contributor

The git hooks are available for azure-cli and azure-cli-extensions repos. They could help you run required checks before creating the PR.

Please sync the latest code with latest dev branch (for azure-cli) or main branch (for azure-cli-extensions).
After that please run the following commands to enable git hooks:

pip install azdev --upgrade
azdev setup -c <your azure-cli repo path> -r <your azure-cli-extensions repo path>

@github-actions
Contributor

CodeGen Tools Feedback Collection

Thank you for using our CodeGen tool. We value your feedback and would like to know how we can improve our product. Please take a few minutes to fill out our codegen survey.

@github-actions
Contributor

github-actions bot commented Mar 10, 2026

@yonzhan yonzhan requested a review from yanzhudd March 11, 2026 01:09
@carlotaarvela carlotaarvela changed the title from "Fix update enable cnl hlsm" to "{AKS} Fix DCR create/update for container network logs and high log scale mode" Mar 12, 2026
@github-actions github-actions bot added the release-version-block label (Updates do not qualify release version rules. NOTE: please do not edit it manually.) Mar 13, 2026
@github-actions github-actions bot removed the release-version-block label Mar 13, 2026
@carlotaarvela carlotaarvela marked this pull request as ready for review March 16, 2026 19:38
Copilot AI review requested due to automatic review settings March 16, 2026 19:38
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes AKS az aks create / az aks update behavior so the Azure Monitor Data Collection Rule (DCR) stays in sync when container network logs (CNL) and/or high log scale mode (HLSM) flags are used, and adds unit tests to cover those scenarios.

Changes:

  • Update create/update decorators to correctly trigger DCR/DCRA creation/update when CNL/HLSM-related flags are provided.
  • Add/extend validation and casing handling for monitoring addon profiles (omsagent vs omsAgent) in relevant paths.
  • Add comprehensive unit tests for create/update flows covering CNL/HLSM combinations and edge cases.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/aks-preview/setup.py | Bumps extension version to 19.0.0b26. |
| src/aks-preview/azext_aks_preview/managed_cluster_decorator.py | Adjusts create/update postprocessing and validation so DCR/DCRA are updated for CNL/HLSM scenarios, including addon key casing handling. |
| src/aks-preview/azext_aks_preview/tests/latest/test_managed_cluster_decorator.py | Adds/extends unit tests for the new create/update behaviors and edge cases. |
| src/aks-preview/HISTORY.rst | Documents the fix and validation changes in release notes. |

Member

@FumingZhang FumingZhang left a comment

Could you add some integration tests and run the existing tests to confirm that the change works as expected?

@github-actions github-actions bot added and removed the release-version-block label (Updates do not qualify release version rules. NOTE: please do not edit it manually.) Mar 18, 2026
Member

@FumingZhang FumingZhang left a comment

Queued live test to validate the change.

  • test_aks_create_with_azuremonitorlogs_and_cnl
  • test_aks_update_enable_azuremonitorlogs_with_hlsm
  • test_aks_create_with_retina_flow_logs_alias
  • test_aks_update_enable_cnl_via_azuremonitorlogs
  • test_aks_update_disable_azuremonitorlogs
  • test_aks_update_standalone_enable_high_log_scale_mode

Could you please schedule another live test for cases involving the monitoring addon to ensure the change does not lead to any regression? I am relying on the test results to approve the change. @carlotaarvela

@FumingZhang
Member

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@FumingZhang
Member

Re-queued live test

@FumingZhang
Member

Please rebase from main to pick up the fix for linter failure about az aks delete, cc @carlotaarvela

The CI GH Azdev Linter / azdev-linter (pull_request_target) sets up the test environment based on your branch's HEAD, not the merge reference, so it won't automatically pick up the fix.

@carlotaarvela
Contributor Author

Live tests are passing

@FumingZhang
Member

Hi @carlotaarvela, please resolve merge conflicts

Member

@FumingZhang FumingZhang left a comment

Re-queued live test

Comment on lines +19 to +20
* `az aks create/update`: Fix DCR not being created or updated when `--enable-container-network-logs`, `--enable-retina-flow-logs`, or `--enable-high-log-scale-mode` flags are used, ensuring the Data Collection Rule streams (e.g. `Microsoft-ContainerLogV2-HighScale`) are kept in sync.
* `az aks update`: Add validation for `--enable-high-log-scale-mode` on the update path requiring the monitoring addon with MSI authentication to be enabled.
Member

Please also add the history notes under the Pending section in your new release version.

7 participants