Skip to content

Conversation

@sureshanaparti
Copy link
Contributor

Description

The PRs improves the following in storage pool monitor disconnect process.

  • Removes unwanted synchronize as it blocks disconnect for all hosts on management server
  • Improves logs for the disconnect process

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

- Removes unwanted synchronize as it blocks disconnect for all hosts on management server
- Improves logs for the disconnect process
@sureshanaparti
Copy link
Contributor Author

@blueorangutan package

@blueorangutan
Copy link

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@codecov
Copy link

codecov bot commented Jan 9, 2026

Codecov Report

❌ Patch coverage is 0% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 16.23%. Comparing base (9a38e75) to head (050f47d).
⚠️ Report is 1 commits behind head on 4.20.

Files with missing lines Patch % Lines
...com/cloud/storage/listener/StoragePoolMonitor.java 0.00% 20 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #12398      +/-   ##
============================================
- Coverage     16.23%   16.23%   -0.01%     
- Complexity    13375    13376       +1     
============================================
  Files          5657     5657              
  Lines        498933   498946      +13     
  Branches      60553    60552       -1     
============================================
+ Hits          81012    81014       +2     
- Misses       408889   408901      +12     
+ Partials       9032     9031       -1     
Flag Coverage Δ
uitests 4.00% <ø> (ø)
unittests 17.09% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@blueorangutan
Copy link

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16313

Copy link
Member

@vishesh92 vishesh92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clgtm

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances the storage pool monitor disconnect process by removing synchronization that blocks disconnects for all hosts and adding comprehensive logging and profiling.

Changes:

  • Removed synchronized keyword from processDisconnect methods to prevent blocking
  • Added detailed debug logging throughout the disconnect flow to track each step
  • Implemented profiling to measure disconnect operation duration

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +206 to +207
long disconnectDuration = disconnectProfiler.getDurationInMillis() / 1000;
logger.debug("Finished disconnect with result {} from Storage Pool {} ({}) ({} of {}) for host: {}, duration: {} secs", disconnectResult, pool.getName(), pool.getUuid(), i, storagePoolHostsSize, host, disconnectDuration);
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The profiling duration calculation uses integer division which will truncate to 0 for operations taking less than 1000 milliseconds. This makes the timing information less useful since most disconnect operations likely complete in under a second. Consider using getDurationInMillis() directly and logging in milliseconds, or convert to a double/float for fractional seconds.

Suggested change
long disconnectDuration = disconnectProfiler.getDurationInMillis() / 1000;
logger.debug("Finished disconnect with result {} from Storage Pool {} ({}) ({} of {}) for host: {}, duration: {} secs", disconnectResult, pool.getName(), pool.getUuid(), i, storagePoolHostsSize, host, disconnectDuration);
long disconnectDurationMillis = disconnectProfiler.getDurationInMillis();
logger.debug("Finished disconnect with result {} from Storage Pool {} ({}) ({} of {}) for host: {}, duration: {} ms", disconnectResult, pool.getName(), pool.getUuid(), i, storagePoolHostsSize, host, disconnectDurationMillis);

Copilot uses AI. Check for mistakes.
Comment on lines +178 to +207
logger.debug("Processing disconnect from Storage Pool {} ({} of {}) for host: {}", storagePoolHost.getPoolId(), i, storagePoolHostsSize, host);
StoragePoolVO pool = _poolDao.findById(storagePoolHost.getPoolId());
if (pool == null) {
logger.debug("No Storage Pool found with id {} ({} of {}) for host: {}", storagePoolHost.getPoolId(), i, storagePoolHostsSize, host);
continue;
}

if (!pool.isShared()) {
logger.debug("Storage Pool {} ({}) ({} of {}) is not shared for host: {}, ignore disconnect", pool.getName(), pool.getUuid(), i, storagePoolHostsSize, host);
continue;
}

// Handle only PowerFlex pool for now, not to impact other pools behavior
if (pool.getPoolType() != StoragePoolType.PowerFlex) {
logger.debug("Storage Pool {} ({}) ({} of {}) is not of type {} for host: {}, ignore disconnect", pool.getName(), pool.getUuid(), i, storagePoolHostsSize, pool.getPoolType(), host);
continue;
}

logger.debug("Sending disconnect to Storage Pool {} ({}) ({} of {}) for host: {}", pool.getName(), pool.getUuid(), i, storagePoolHostsSize, host);
Profiler disconnectProfiler = new Profiler();
try {
disconnectProfiler.start();
_storageManager.disconnectHostFromSharedPool(host, pool);
} catch (Exception e) {
logger.error("Unable to disconnect host {} from storage pool {} due to {}", host, pool, e.toString());
disconnectResult = false;
} finally {
disconnectProfiler.stop();
long disconnectDuration = disconnectProfiler.getDurationInMillis() / 1000;
logger.debug("Finished disconnect with result {} from Storage Pool {} ({}) ({} of {}) for host: {}, duration: {} secs", disconnectResult, pool.getName(), pool.getUuid(), i, storagePoolHostsSize, host, disconnectDuration);
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop index message format is inconsistent. The message shows "i of storagePoolHostsSize" which displays as "0 of 5", "1 of 5", etc. This is confusing because the index is 0-based while the total count is 1-based. Consider using "i+1" to display "1 of 5", "2 of 5" for better readability in log messages.

Copilot uses AI. Check for mistakes.
Comment on lines 198 to 208
try {
disconnectProfiler.start();
_storageManager.disconnectHostFromSharedPool(host, pool);
} catch (Exception e) {
logger.error("Unable to disconnect host {} from storage pool {} due to {}", host, pool, e.toString());
disconnectResult = false;
} finally {
disconnectProfiler.stop();
long disconnectDuration = disconnectProfiler.getDurationInMillis() / 1000;
logger.debug("Finished disconnect with result {} from Storage Pool {} ({}) ({} of {}) for host: {}, duration: {} secs", disconnectResult, pool.getName(), pool.getUuid(), i, storagePoolHostsSize, host, disconnectDuration);
}
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The disconnectResult variable logged in the finally block at line 207 does not accurately represent the result of the current disconnect operation. The variable is set to false only when an exception occurs, but it retains its previous value (from previous iterations or the initial true value) when the current operation succeeds. This makes the log message misleading. Consider using a local boolean variable to track the result of the current disconnect operation specifically.

Suggested change
try {
disconnectProfiler.start();
_storageManager.disconnectHostFromSharedPool(host, pool);
} catch (Exception e) {
logger.error("Unable to disconnect host {} from storage pool {} due to {}", host, pool, e.toString());
disconnectResult = false;
} finally {
disconnectProfiler.stop();
long disconnectDuration = disconnectProfiler.getDurationInMillis() / 1000;
logger.debug("Finished disconnect with result {} from Storage Pool {} ({}) ({} of {}) for host: {}, duration: {} secs", disconnectResult, pool.getName(), pool.getUuid(), i, storagePoolHostsSize, host, disconnectDuration);
}
boolean currentDisconnectResult = true;
try {
disconnectProfiler.start();
_storageManager.disconnectHostFromSharedPool(host, pool);
} catch (Exception e) {
logger.error("Unable to disconnect host {} from storage pool {} due to {}", host, pool, e.toString());
currentDisconnectResult = false;
} finally {
disconnectProfiler.stop();
long disconnectDuration = disconnectProfiler.getDurationInMillis() / 1000;
logger.debug("Finished disconnect with result {} from Storage Pool {} ({}) ({} of {}) for host: {}, duration: {} secs", currentDisconnectResult, pool.getName(), pool.getUuid(), i, storagePoolHostsSize, host, disconnectDuration);
}
if (!currentDisconnectResult) {
disconnectResult = false;
}

Copilot uses AI. Check for mistakes.

// Handle only PowerFlex pool for now, not to impact other pools behavior
if (pool.getPoolType() != StoragePoolType.PowerFlex) {
logger.debug("Storage Pool {} ({}) ({} of {}) is not of type {} for host: {}, ignore disconnect", pool.getName(), pool.getUuid(), i, storagePoolHostsSize, pool.getPoolType(), host);
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message has an incorrect argument order. The message template shows "Storage Pool {} ({}) ({} of {}) is not of type {} for host: {}", but the arguments provided are: pool.getName(), pool.getUuid(), i, storagePoolHostsSize, pool.getPoolType(), host. The fifth placeholder expects the pool type that the pool is NOT, but pool.getPoolType() returns the actual type of the pool. The expected type (StoragePoolType.PowerFlex) should be included in the log message instead.

Suggested change
logger.debug("Storage Pool {} ({}) ({} of {}) is not of type {} for host: {}, ignore disconnect", pool.getName(), pool.getUuid(), i, storagePoolHostsSize, pool.getPoolType(), host);
logger.debug("Storage Pool {} ({}) ({} of {}) is not of type {} for host: {}, ignore disconnect", pool.getName(), pool.getUuid(), i, storagePoolHostsSize, StoragePoolType.PowerFlex, host);

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants