@zstack-robot-2
Collaborator

Resolves: ZSTAC-80202

Change-Id: I6b77656f6668796677626b6c6b7361696e6a6274

Companion to an MR in premium; see:

http://dev.zstack.io:9080/zstackio/premium/-/merge_requests/12525

sync from gitlab !8912

@coderabbitai

coderabbitai bot commented Dec 24, 2025

Walkthrough

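Adds the public interface BeforeCallZWatchExtensionPoint and introduces global error code propagation in core error handling (signature changes to the Platform.err family of methods, and a new ErrorCode.globalErrorCode field). Many modules replace raw text errors with CloudOperationsErrorCode constants to unify error code output.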

Changes

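Cohort / File(s) Change description
New extension point interface
header/src/main/java/org/zstack/header/zql/BeforeCallZWatchExtensionPoint.java
Adds a public interface declaring supports(Class<?>) and beforeCallZWatch(Class<?>, List<String>), with Javadoc and no implementation.
Core error API and ErrorCode extension
core/src/main/java/org/zstack/core/Platform.java, header/src/main/java/org/zstack/header/errorcode/ErrorCode.java, core/src/main/java/org/zstack/core/With.java
Modifies and adds overloads of Platform.err/operr/inerr/..., which now take a globalErrorCode string as the first parameter and set that global code on the constructed ErrorCode; ErrorCode gains a globalErrorCode field with getter/setter. Several call sites are adapted.
Global: text-only errors replaced with CloudOperationsErrorCode constants
Bulk changes across many modules (example paths: compute/..., core/..., plugin/..., network/..., image/..., identity/..., configuration/..., storage/..., plugin/kvm/..., plugin/ceph/..., plugin/localstorage/..., etc.); see the long list
Many files add a static import of CloudOperationsErrorCode and replace text-only err/operr/argerr/inerr/… calls with calls that take an ORG_ZSTACK_* constant as the first argument. The original message text is kept; only the error codes are unified and standardized. The change is broad but mostly a consistent substitution, and control flow and method signatures are generally unchanged.
Example modules (a summary of representative files)
compute/src/main/java/org/zstack/compute/..., core/src/main/java/org/zstack/core/cloudbus/..., plugin/kvm/src/main/java/org/zstack/kvm/..., plugin/ceph/src/main/java/org/zstack/storage/ceph/..., plugin/localstorage/src/main/java/org/zstack/storage/primary/local/..., network/src/main/java/org/zstack/network/l3/..., image/src/main/java/org/zstack/image/..., identity/src/main/java/org/zstack/identity/...
Multiple error paths in each module now use the ORG_ZSTACK_* error code constant for the module/scenario (static import), and the related exception constructors and trigger.fail/operr calls use the new signature; branching logic and behavior are unchanged.
API interceptor and validator adjustments
*ApiInterceptor.java, *ApiParamChecker.java (scattered across modules)
Many API validation errors change from plain messages to argerr/operr with a specific CloudOperationsErrorCode; input validation semantics are unchanged.
GC / Job / Workflow / unit / runtime paths
core/.../gc/*, core/.../job/*, core/.../workflow/*, longjob/...
In exception handling on key runtime paths such as garbage collection, jobs, and workflows, error construction now uses the designated global error code constants, keeping the existing failure/retry/rollback logic.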
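
The error-code change summarized in the table can be pictured with a minimal, self-contained sketch. The `SketchErrorCode` class, the `operr` helper, and the constant below are hypothetical stand-ins and not the actual `Platform`/`ErrorCode` code in ZStack; only the idea (a global error code string passed as the first argument and stored on the resulting error object) is taken from the summary.

```java
import java.util.Locale;

// Hypothetical stand-in for ErrorCode: only the fields needed for this sketch.
final class SketchErrorCode {
    private final String details;
    private String globalErrorCode;

    SketchErrorCode(String details) {
        this.details = details;
    }

    String getDetails() { return details; }
    String getGlobalErrorCode() { return globalErrorCode; }
    void setGlobalErrorCode(String globalErrorCode) { this.globalErrorCode = globalErrorCode; }
}

final class GlobalErrorCodeSketch {
    // Hypothetical constant; the real constants are said to live in CloudOperationsErrorCode.
    static final String ORG_ZSTACK_EXAMPLE_10000 = "ORG_ZSTACK_EXAMPLE_10000";

    // Assumed shape of the new overloads: global code first, then format string and arguments.
    static SketchErrorCode operr(String globalErrorCode, String format, Object... args) {
        SketchErrorCode error = new SketchErrorCode(String.format(Locale.ROOT, format, args));
        error.setGlobalErrorCode(globalErrorCode);
        return error;
    }

    public static void main(String[] args) {
        SketchErrorCode error = operr(ORG_ZSTACK_EXAMPLE_10000, "Invalid url[%s]", "ftp://x");
        System.out.println(error.getGlobalErrorCode() + ": " + error.getDetails());
    }
}
```

Passing the code as a separate first argument keeps the human-readable message unchanged, which matches the stated goal of standardizing codes without touching message text.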

Sequence Diagram(s)

(None)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

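🐰 A new path under frost, the moon faintly bright,
I listen softly before ZWatch,
Hanging a name on every UUID,
Errors carry codes, no more wandering in fog,
Plugins and core dance together in the clear.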


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)
Check name Status Explanation Resolution
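Title check ❌ Error The PR title is 107 characters long, exceeding the 72-character limit and violating the format requirement. Shorten the title to 72 characters or fewer, for example: [plugin-premium]: Fix GPU/VM page loading issue in Zaku cluster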
Docstring Coverage ⚠️ Warning Docstring coverage is 3.98% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
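Description check ✅ Passed The PR description is related to the changeset and covers the GPU/VM page loading issue in the Zaku cluster, but it is overly brief and includes an external link reference.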
✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3d7f186 and 8d8abb8.

📒 Files selected for processing (2)
  • header/src/main/java/org/zstack/header/zql/BeforeApiCallExtensionPoint.java
  • header/src/main/java/org/zstack/header/zql/BeforeCallZWatchReturnWithExtensionPoint.java
🧰 Additional context used
📓 Path-based instructions (2)
**/*.*

⚙️ CodeRabbit configuration file

**/*.*: - Code must not contain Chinese; error messages, comments, and the like must be written in correct English free of spelling mistakes.

Files:

  • header/src/main/java/org/zstack/header/zql/BeforeApiCallExtensionPoint.java
  • header/src/main/java/org/zstack/header/zql/BeforeCallZWatchReturnWithExtensionPoint.java
**/*.java

⚙️ CodeRabbit configuration file

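**/*.java: ## 1. API design requirements

  • API naming:
    • API names must be unique; duplicates are not allowed.
    • An API message class must extend APIMessage; its return class must extend APIReply or APIEvent and be annotated with @RestResponse.
    • An API message must carry the @RestRequest annotation and follow these rules:
      • path:
        • Use the plural form for resources.
        • When the path references a message class variable, use the {variableName} format.
      • HTTP method mapping:
        • Query operations → HttpMethod.GET
        • Update operations → HttpMethod.PUT
        • Create operations → HttpMethod.POST
        • Delete operations → HttpMethod.DELETE
    • An API class must implement the __example__ method so that API documentation can be generated, and the corresponding Groovy API Template and API Markdown files must be generated.

2. Naming and formatting conventions

  • Class names:

    • Use UpperCamelCase.
    • Special cases:
      • Classes of the VO/AO/EO types are exempt.
      • Abstract classes use an Abstract or Base prefix/suffix.
      • Exception classes should end with Exception.
      • Test classes must end with TestCase.
  • Method names, parameter names, member variables, and local variables:

    • Use lowerCamelCase.
  • Constant names:

    • All uppercase, with words separated by underscores.
    • Names must be clear; avoid vague or inaccurate names.
  • Package names:

    • Use lowercase throughout, separated by dots; each part should be an English word with natural meaning (see the structure of the Spring framework).
  • Naming details:

    • Avoid members or local variables with the same name in parent/child classes or within the same code block, to prevent confusion.
    • Abbreviations:
      • Unnecessary abbreviations are not allowed, such as AbsSchedulerJob, condi, Fu. Use full words to improve readability.

3. Writing self-explanatory code

  • Expressing intent:

    • Avoid boolean parameters that make the meaning unclear. For example:
      • For stopAgent(boolean ignoreError), split it into separate functions (such as stopAgentIgnoreError()) or use an enum to express the operation type.
    • Names should use full word combinations to express intent and reflect the data type or purpose (for example, put the type word at the end of constant and variable names).
    • Avoid magic values:
      • Numbers or strings used directly without definition (such as if (status == 5)) should be replaced with enums or constants.

      • Example:

      • // Bad: magic value

      • if (user.getStatus() == 5) { ... }

      • // Good: constant or enum

      • public static final int STATUS_ACTIVE = 5;

      • if (user.getStatus() == STATUS_ACTIVE) { ... }

      • // Or use an enum

      • enum UserStatus { ACTIVE, INACTIVE }

  • Comments:

    • Code should be as self-explanatory as possible; notes shorter than two lines can be written directly in the code.
    • Longer comments must be proofread carefully and updated along with the code to keep them correct.
    • Interface methods should not carry redundant modifiers (such as public) and must have valid Javadoc comments.

4. Flow control and structural optimization

  • Use of if...else:

    • Minimize the use of if...else structures. Recommendations:
      • Limit nesting to at most two levels, and the inner level should not contain an else branch.
      • Return early: end the handling logic inside a condition as soon as possible, or extract it into a separate method.
      • Use Java Streams or lambda expressions in place of lengthy loops and conditionals.
  • Conditions:

    • An if condition should not be too long or complex; when necessary, extract it into a descriptive boolean variable.
  • Code block length: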
    ...

Files:

  • header/src/main/java/org/zstack/header/zql/BeforeApiCallExtensionPoint.java
  • header/src/main/java/org/zstack/header/zql/BeforeCallZWatchReturnWithExtensionPoint.java
🧠 Learnings (2)
📓 Common learnings
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2360
File: network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java:449-490
Timestamp: 2025-08-04T04:48:19.103Z
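Learning: In cherry-pick operations, the ZStack project strictly follows a policy of making no additional changes, even when a performance optimization opportunity is found (such as a memory optimization for batch-saving IP addresses), prioritizing the completeness and consistency of the cherry-pick.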
📚 Learning: 2025-08-12T03:35:21.034Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2419
File: network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java:287-297
Timestamp: 2025-08-12T03:35:21.034Z
Learning: In the ZStack codebase, extension points may be designed for different execution phases/orders in the same operation. For example, IpRangeDeletionExtensionPoint executes during IP range deletion process, while AfterDeleteIpRangeExtensionPoint executes after the entire deletion process is complete. This allows different plugins to hook into appropriate lifecycle stages.

Applied to files:

  • header/src/main/java/org/zstack/header/zql/BeforeApiCallExtensionPoint.java
  • header/src/main/java/org/zstack/header/zql/BeforeCallZWatchReturnWithExtensionPoint.java
🧬 Code graph analysis (1)
header/src/main/java/org/zstack/header/zql/BeforeCallZWatchReturnWithExtensionPoint.java (1)
header/src/main/java/org/zstack/header/storage/primary/PrimaryStorageHistoricalUsageBaseVO.java (1)
  • org (11-45)
🔇 Additional comments (1)
header/src/main/java/org/zstack/header/zql/BeforeApiCallExtensionPoint.java (1)

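3-25: The extension point interface is well defined, but no implementation or usage was found.

Review of the extension point interface definition shows:

  • The Javadoc comments are complete and written in correct English
  • Naming follows the UpperCamelCase convention
  • Method signatures are clear

However, the search results show the following problems:

  1. No implementing class found: the extension point interface is defined but has no concrete implementation within the searchable scope
  2. No call site found: nothing in the codebase calls the beforeApiCall method
  3. No Zaku-related code found: the link to the PR goal, "GPU/VM pages keep loading when the Zaku cluster is shut down or in error", cannot be confirmed

Please confirm whether this extension point is implemented in another module, and whether that implementation addresses the root cause of the pages loading indefinitely.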

Comment on lines 5 to 23
/**
* BeforeCallZWatchReturnWithExtensionPoint is an extension point that allows plugins
* to perform custom operations before executing ZQL return with queries.
*
* This provides a hook for business modules to execute pre-processing logic
* (such as health checks, validation, etc.) before monitoring data is retrieved.
*/
public interface BeforeCallZWatchReturnWithExtensionPoint {
/**
* Check if this extension supports the given VO class
* @param voClass the VO class to check
* @return true if this extension supports the VO class, false otherwise
*/
boolean supports(Class<?> voClass);

/**
* Perform operations before executing ZQL return with query.
* This allows business modules to trigger health checks or other pre-processing logic.
*
* @param voClass the VO class type
* @param uuids the list of resource UUIDs to process
*/
void beforeCallZWatch(Class<?> voClass, List<String> uuids);
}
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for implementations of BeforeCallZWatchReturnWithExtensionPoint
echo "=== Find implementations of BeforeCallZWatchReturnWithExtensionPoint ==="
rg -nP --type=java 'implements.*BeforeCallZWatchReturnWithExtensionPoint' -g '!**/test/**'

echo -e "\n=== Find where the extension point is called and registered ==="
rg -nP --type=java 'BeforeCallZWatchReturnWithExtensionPoint' -g '!**/test/**' -A 2 -B 2

echo -e "\n=== Find the definition and usage of BeforeApiCallExtensionPoint ==="
fd -e java --type f | xargs grep -l 'BeforeApiCallExtensionPoint' | head -5

echo -e "\n=== Find ZWatch-related code ==="
rg -nP --type=java 'ZWatch|returnWith' -g '!**/test/**' | grep -E '(class|interface|return|with)' | head -30

Repository: MatheMatrix/zstack

Length of output: 5736


🏁 Script executed:

#!/bin/bash
# Get more details on the extension point interface itself
echo "=== Show the file that defines the extension point interface ==="
cat -n header/src/main/java/org/zstack/header/zql/BeforeCallZWatchReturnWithExtensionPoint.java

echo -e "\n=== Find other extension point interfaces in the same directory ==="
find header/src/main/java/org/zstack/header/zql -name '*ExtensionPoint.java' -type f

Repository: MatheMatrix/zstack

Length of output: 1880


🏁 Script executed:

#!/bin/bash
# Search for where extension points are registered/collected
echo "=== Find a possible extension point manager or registration mechanism ==="
rg -nP --type=java 'plugin.*extension|Extension.*manager|getPluginManager' -g '!**/test/**' | head -20

echo -e "\n=== Find call sites of beforeCallZWatch ==="
rg -nP --type=java 'beforeCallZWatch' -g '!**/test/**'

Repository: MatheMatrix/zstack

Length of output: 845


🏁 Script executed:

#!/bin/bash
# Search for BeforeApiCallExtensionPoint
echo "=== Find the definition of BeforeApiCallExtensionPoint ==="
cat -n header/src/main/java/org/zstack/header/zql/BeforeApiCallExtensionPoint.java 2>/dev/null || echo "File not found"

echo -e "\n=== Find implementations of BeforeApiCallExtensionPoint ==="
rg -nP 'implements.*BeforeApiCallExtensionPoint'

echo -e "\n=== Find the callReturnWithExtensions method in ZQL.java ==="
rg -nP 'callReturnWithExtensions' -A 20 search/src/main/java/org/zstack/zql/ZQL.java

Repository: MatheMatrix/zstack

Length of output: 3846


🏁 Script executed:

#!/bin/bash
# Check PR context and git history
echo "=== Find git information related to BeforeCallZWatchReturnWithExtensionPoint ==="
git log --oneline --all | grep -i 'zwatch\|returnwith\|extension' | head -10

echo -e "\n=== Find the commit where BeforeCallZWatchReturnWithExtensionPoint first appears in git ==="
git log -p --all -- '**/BeforeCallZWatchReturnWithExtensionPoint.java' | head -100

Repository: MatheMatrix/zstack

Length of output: 197


🏁 Script executed:

#!/bin/bash
# Check if there's a callReturnWithExtensions implementation
echo "=== Find the full code sections involving BeforeCallZWatchReturnWithExtensionPoint ==="
rg -nP 'callReturnWithExtensions|BeforeCallZWatch' -B 5 -A 15

echo -e "\n=== Find where ZQL.java handles returnWith ==="
rg -nP 'returnWith\s*=' search/src/main/java/org/zstack/zql/ZQL.java -B 3 -A 3

Repository: MatheMatrix/zstack

Length of output: 8420


🏁 Script executed:

#!/bin/bash
# Final verification - check if there are any TODOs or FIXMEs related to this
echo "=== Find related TODO or FIXME markers ==="
rg -nP 'TODO|FIXME' header/src/main/java/org/zstack/header/zql/BeforeCallZWatchReturnWithExtensionPoint.java header/src/main/java/org/zstack/header/zql/BeforeApiCallExtensionPoint.java

echo -e "\n=== Find related references in test files ==="
find . -path '*/test/**' -name '*.java' -type f | xargs grep -l 'BeforeCallZWatchReturnWithExtensionPoint\|BeforeApiCallExtensionPoint' 2>/dev/null || echo "No test files found"

echo -e "\n=== Find PR-related files or descriptions ==="
find . -name 'COMMIT_EDITMSG' -o -name '.git/COMMIT_EDITMSG' 2>/dev/null | xargs cat 2>/dev/null || echo "No commit message found"

Repository: MatheMatrix/zstack

Length of output: 153


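The extension point interfaces are neither implemented nor integrated; complete the implementation or remove them.

Verification shows that the two extension points BeforeCallZWatchReturnWithExtensionPoint and BeforeApiCallExtensionPoint have serious problems:

  1. Not implemented at all: no implementing class exists anywhere in the codebase
  2. Not integrated at all: the beforeCallZWatch() method is never called anywhere
  3. Not registered with the extension mechanism: there is no link to pluginRgty.getExtensionList()
  4. The interface is misleading: it claims to provide "a hook before executing ZQL return with queries", but that functionality does not actually exist

The actual return with query handling (ZQL.java line 351 and lines 537-557) uses a different ReturnWithExtensionPoint rather than these two new extension points.

Take one of the following actions:

  • Formally integrate the two extension points into the ZQL query processing flow, including registration and invocation
  • Or remove both interface definitions entirely once it is confirmed they are not needed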
🤖 Prompt for AI Agents
In
header/src/main/java/org/zstack/header/zql/BeforeCallZWatchReturnWithExtensionPoint.java
lines 5-28, the extension interface is unused and misleading; either integrate
it into the ZQL return-with flow or remove it. To fix: if keeping, implement at
least one concrete extension, register the extension with the plugin registry
(pluginRgty.getExtensionList/registration), and invoke beforeCallZWatch(...)
from the ZQL return-with handling code paths (around the existing return-with
logic at ZQL.java lines ~351 and 537-557) passing the VO class and UUID list;
ensure supports(Class<?>) is consulted before calling. If removing, delete this
interface and any related unused BeforeApiCallExtensionPoint stubs and update
documentation and imports accordingly.
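
The integration path described in the prompt above (consult supports(Class<?>) first, then invoke beforeCallZWatch(...) with the VO class and UUID list) could look roughly like the following sketch. The interface mirrors the one quoted in the review comment; the `ZWatchHookSketch` dispatcher, the plain `List` standing in for whatever the plugin registry returns, and the use of `String.class` as a stand-in VO class are all assumptions made for illustration, not code from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Mirrors the interface quoted in the review comment.
interface BeforeCallZWatchReturnWithExtensionPoint {
    boolean supports(Class<?> voClass);

    void beforeCallZWatch(Class<?> voClass, List<String> uuids);
}

final class ZWatchHookSketch {
    // Hypothetical stand-in for the extension list a plugin registry would provide.
    private final List<BeforeCallZWatchReturnWithExtensionPoint> extensions;

    ZWatchHookSketch(List<BeforeCallZWatchReturnWithExtensionPoint> extensions) {
        this.extensions = extensions;
    }

    // Consults supports() first and invokes the hook only on matching extensions.
    void callBeforeZWatch(Class<?> voClass, List<String> uuids) {
        for (BeforeCallZWatchReturnWithExtensionPoint extension : extensions) {
            if (extension.supports(voClass)) {
                extension.beforeCallZWatch(voClass, uuids);
            }
        }
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        BeforeCallZWatchReturnWithExtensionPoint stringOnly = new BeforeCallZWatchReturnWithExtensionPoint() {
            @Override
            public boolean supports(Class<?> voClass) {
                return voClass == String.class; // stand-in for a real VO class
            }

            @Override
            public void beforeCallZWatch(Class<?> voClass, List<String> uuids) {
                seen.addAll(uuids);
            }
        };

        ZWatchHookSketch hook = new ZWatchHookSketch(Arrays.asList(stringOnly));
        hook.callBeforeZWatch(Integer.class, Arrays.asList("ignored")); // unsupported, skipped
        hook.callBeforeZWatch(String.class, Arrays.asList("uuid-1", "uuid-2"));
        System.out.println(seen);
    }
}
```

Keeping the supports() check in the dispatcher rather than in each implementation means an extension never sees VO classes it did not opt into.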

@MatheMatrix MatheMatrix force-pushed the sync/chao.he/fix-ZSTAC-80202-zaku-loading branch 4 times, most recently from 8b66080 to 46eaf32 Compare December 26, 2025 03:44
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
header/src/main/java/org/zstack/header/zql/BeforeCallZWatchExtensionPoint.java (1)

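9-23: Verify the implementation and integration of the extension point.

According to the earlier review, similar extension point interfaces (BeforeCallZWatchReturnWithExtensionPoint and BeforeApiCallExtensionPoint) had the following serious problems:

  1. No implementing class exists in the codebase
  2. The extension point methods are never called
  3. There is no link to the plugin registration mechanism (pluginRgty.getExtensionList())
  4. The interface claims functionality that does not actually exist

For the newly added BeforeCallZWatchExtensionPoint interface, please confirm:

  • Is there a concrete implementing class (in this PR or in the associated premium MR)?
  • Is the extension point actually called in the ZQL/ZWatch processing flow?
  • Has it been integrated with the plugin registration mechanism?

If there is no plan to implement and integrate it, consider removing this interface to avoid misleading dead code.

Run the following script to verify how the extension point is implemented and used: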

#!/bin/bash
echo "=== Find implementations of BeforeCallZWatchExtensionPoint ==="
rg -nP --type=java 'implements.*BeforeCallZWatchExtensionPoint' -g '!**/test/**'

echo -e "\n=== Find call sites of the extension point method ==="
rg -nP --type=java 'beforeCallZWatch\s*\(' -g '!**/test/**' -C 3

echo -e "\n=== Find integration with the plugin registration mechanism ==="
rg -nP --type=java 'BeforeCallZWatchExtensionPoint' -g '!**/test/**' -C 3

echo -e "\n=== Check whether ZQL.java calls the extension point ==="
rg -nP --type=java 'BeforeCallZWatchExtensionPoint|beforeCallZWatch' search/src/main/java/org/zstack/zql/ZQL.java -C 5

Based on lessons from earlier reviews.

📜 Review details

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b66080 and 46eaf32.

📒 Files selected for processing (1)
  • header/src/main/java/org/zstack/header/zql/BeforeCallZWatchExtensionPoint.java

vm is displayed as "Paused"
and the control plane is missing
the state transition logic from "Paused" to "NoState"

Resolves: ZSTAC-80897

Change-Id: I697465636f646b6368686d71696d706b6e676e67
@MatheMatrix MatheMatrix force-pushed the sync/chao.he/fix-ZSTAC-80202-zaku-loading branch from 46eaf32 to 752dfc5 Compare January 12, 2026 02:20
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
@header/src/main/java/org/zstack/header/zql/BeforeCallZWatchExtensionPoint.java:
- Around line 5-9: The Javadoc above the interface incorrectly references
BeforeZQLReturnWithExtensionPoint; update the comment to reference the actual
interface name BeforeCallZWatchExtensionPoint so the doc matches the declaration
(i.e., change the Javadoc text on the comment line that mentions
BeforeZQLReturnWithExtensionPoint to BeforeCallZWatchExtensionPoint).
📜 Review details

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46eaf32 and 752dfc5.

📒 Files selected for processing (1)
  • header/src/main/java/org/zstack/header/zql/BeforeCallZWatchExtensionPoint.java
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2419
File: network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java:287-297
Timestamp: 2025-08-12T03:35:21.034Z
Learning: In the ZStack codebase, extension points may be designed for different execution phases/orders in the same operation. For example, IpRangeDeletionExtensionPoint executes during IP range deletion process, while AfterDeleteIpRangeExtensionPoint executes after the entire deletion process is complete. This allows different plugins to hook into appropriate lifecycle stages.
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2360
File: network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java:449-490
Timestamp: 2025-08-04T04:48:19.103Z
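Learning: In cherry-pick operations, the ZStack project strictly follows a policy of making no additional changes, even when a performance optimization opportunity is found (such as a memory optimization for batch-saving IP addresses), prioritizing the completeness and consistency of the cherry-pick.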
📚 Learning: 2025-08-12T03:35:21.034Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2419
File: network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java:287-297
Timestamp: 2025-08-12T03:35:21.034Z
Learning: In the ZStack codebase, extension points may be designed for different execution phases/orders in the same operation. For example, IpRangeDeletionExtensionPoint executes during IP range deletion process, while AfterDeleteIpRangeExtensionPoint executes after the entire deletion process is complete. This allows different plugins to hook into appropriate lifecycle stages.

Applied to files:

  • header/src/main/java/org/zstack/header/zql/BeforeCallZWatchExtensionPoint.java
📚 Learning: 2025-06-19T09:08:57.046Z
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2217
File: header/src/main/java/org/zstack/header/network/l2/L2NetworkCandidateFilterExtensionPoint.java:7-10
Timestamp: 2025-06-19T09:08:57.046Z
Learning: In the zstack project, for simple and self-explanatory interfaces like L2NetworkCandidateFilterExtensionPoint, the team prefers to rely on clear method names and parameters rather than adding detailed Javadoc comments, even when coding guidelines suggest otherwise.

Applied to files:

  • header/src/main/java/org/zstack/header/zql/BeforeCallZWatchExtensionPoint.java
🔍 Remote MCP AtlassianMCP

Based on the context gathered, here is the relevant information for reviewing this pull request:

Issue Context - ZSTAC-80202

Summary: When Zaku cluster nodes shutdown or encounter errors, the automatic cloud container cluster synchronization mechanism blocks the queue, causing the GPU device page and physical machine pages to hang indefinitely while loading.

Severity & Status:

  • P1 (Critical) priority issue
  • Status: In Progress
  • Assignee: 何超 (chao.he@zstack.io)
  • Fix Version: 5.5.6
  • Created: 2025-11-28, Last Updated: 2026-01-09

Root Cause Analysis (from Jira comments):

The problem stems from a blocking call chain when Zaku nodes are powered off:

  1. ZakuHttpClient.execute() hangs indefinitely (no timeout configured) at ContainerEndpointBase.java:301
  2. This causes provider.listClusters(self) to block
  3. The entire FlowChain sync process gets stuck at the "sync-cluster" phase (lines 290-310)
  4. Downstream critical stages never execute:
    • sync-pod phase (lines 334-349)
    • update-gpu-attachment-status phase (lines 388-407)
  5. GPU device status cannot be updated
  6. UI queries that depend on sync status remain stuck loading
  7. Subsequent sync requests accumulate in the CloudBus queue
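
The first step of the chain above, a remote call with no timeout, is the kind of problem a bounded wait addresses. The sketch below is a generic illustration using only the Java standard library; `BoundedCallSketch` and `callWithTimeout` are hypothetical names, the sleeping lambda merely simulates a hung call, and nothing here is taken from ZakuHttpClient or from the actual fix in this PR.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

final class BoundedCallSketch {
    // Runs a possibly blocking call on another thread and waits at most timeoutMillis.
    // Returns the fallback value when the call does not finish in time.
    static <T> T callWithTimeout(Supplier<T> blockingCall, long timeoutMillis, T fallback) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(blockingCall);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // stops waiting; it does not interrupt the running task
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException("remote call failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // Simulates a call that hangs far longer than the caller is willing to wait.
        String result = callWithTimeout(() -> {
            try {
                Thread.sleep(2_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "clusters";
        }, 100, "timed-out");
        System.out.println(result);
    }
}
```

A wrapper like this only frees the caller; the underlying request keeps running, so a timeout configured on the HTTP client itself would still be needed to release the connection.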

Related Issue: ZSTAC-35305 (linked by jin.ma on 2026-01-09)

PR Context

The PR introduces a new extension point interface BeforeCallZWatchExtensionPoint with two key methods:

  • supports(Class<?> voClass): checks if extension supports a given VO class
  • beforeCallZWatch(Class<?> voClass, List<String> uuids): allows pre-call operations before ZWatch invocations

Documentation context: Extension points are a core ZStack plugin management pattern used for lifecycle hooks and behavior customization across various operations (VM operations, storage operations, cluster operations, etc.)

Key Points for Review:

  1. This interface appears designed to intercept ZWatch calls at the extension point level, potentially enabling the shutdown/error handling mechanism needed to prevent queue blocking
  2. The solution aligns with defensive programming patterns already used throughout ZStack for handling cluster state transitions and failures
  3. The new interface fits into ZStack's established extension point architecture for managing lifecycle events
🔇 Additional comments (1)
header/src/main/java/org/zstack/header/zql/BeforeCallZWatchExtensionPoint.java (1)

10-23: LGTM!

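The interface methods are clearly designed and follow the ZStack extension point pattern. supports() determines whether a given VO class is supported, and beforeCallZWatch() provides a hook before ZWatch calls, consistent with the project's existing extension point architecture.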

lin.ma and others added 5 commits January 12, 2026 12:49
Resolves: ZSTAC-66343

Change-Id: I6e6b6f76676c7671786e79696f70746268717864
<fix>[compute]: vm lacks state transition handling

See merge request zstackio/zstack!9004
<feature>[error code]: global error code

See merge request zstackio/zstack!9005
…r encountering errors in Zaku cluster.

Resolves: ZSTAC-80202
Change-Id: I7778676171646874706164777869707279776172
…r encountering errors in Zaku cluster.

Resolves: ZSTAC-80202
Change-Id: I7778676171646874706164777869707279776172
@MatheMatrix MatheMatrix force-pushed the sync/chao.he/fix-ZSTAC-80202-zaku-loading branch from 049b83e to 63007bb Compare January 13, 2026 05:05
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (32)
core/src/main/java/org/zstack/core/webhook/WebhookApiInterceptor.java (1)

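27-41: The URL parameter is not trimmed

Per the coding guidelines, parameters from a Message should be trimmed, because data that users copy and paste from a browser may carry spaces or line breaks. The current validateUrl method validates the raw URL directly, so a URL pasted with leading or trailing whitespace (such as " https://example.com ") fails validation.

Suggested fix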
 private void validateUrl(String url) {
+    if (url != null) {
+        url = url.trim();
+    }
     if (!new UrlValidator().isValid(url)) {
         throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_CORE_WEBHOOK_10000, "Invalid url[%s]", url));
     }
 }

Alternatively, trim at the call sites:

 private void validate(APIUpdateWebhookMsg msg) {
     if (msg.getUrl() != null) {
-        validateUrl(msg.getUrl());
+        validateUrl(msg.getUrl().trim());
     }
 }

 private void validate(APICreateWebhookMsg msg) {
-    validateUrl(msg.getUrl());
+    validateUrl(msg.getUrl() != null ? msg.getUrl().trim() : null);
 }
core/src/main/java/org/zstack/core/encrypt/EncryptFacadeImpl.java (1)

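216-232: The thrown exception is immediately caught and swallowed.

The OperationFailureException thrown at line 221 is caught by catch (Exception e) at line 230, logged only at debug level, and processing then continues with the next record. This makes the newly added throw statement effectively a no-op.

If the intent is to fail fast and abort the whole operation when encryption fails, rethrow OperationFailureException in the catch block.

🔧 Suggested fix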
                             } catch (Exception e) {
+                                if (e instanceof OperationFailureException) {
+                                    throw e;
+                                }
                                 logger.debug(String.format("decrypt[className: %s, uuid: %s] origin password error because : %s", className, uuid, e.getMessage()));
                             }

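If the intent is to keep processing even when some records fail to encrypt, there is no need to throw an exception; keeping only logger.error is enough.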
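
The swallowing behavior described in this comment can be reproduced in isolation. In the sketch below, `FailureException` is a hypothetical stand-in for OperationFailureException and the `process` method is invented for illustration; it is not the code in EncryptFacadeImpl, only a small model of a throw inside a loop whose body is wrapped in a broad catch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SwallowedExceptionSketch {
    // Hypothetical stand-in for OperationFailureException.
    static final class FailureException extends RuntimeException {
        FailureException(String message) {
            super(message);
        }
    }

    // Processes each record. With rethrow=false the broad catch swallows the failure
    // and the loop continues; with rethrow=true the failure aborts the whole operation.
    static List<String> process(List<String> records, boolean rethrow) {
        List<String> done = new ArrayList<>();
        for (String record : records) {
            try {
                if (record.isEmpty()) {
                    throw new FailureException("cannot encrypt empty record");
                }
                done.add(record);
            } catch (Exception e) {
                if (rethrow && e instanceof FailureException) {
                    throw (FailureException) e;
                }
                // Swallowed here: the loop moves on to the next record.
            }
        }
        return done;
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList("a", "", "b"), false));
    }
}
```

Calling `process` with `rethrow` set to `true` on the same input propagates the exception instead, which corresponds to the fail-fast option in the suggested fix.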

console/src/main/java/org/zstack/console/ManagementServerConsoleProxyBackend.java (1)

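566-572: The format string is missing its arguments, and the condition check is incomplete.

  1. Critical: the error message contains %d placeholders, but the rst and portRst arguments are not passed, which causes a formatting exception or an incorrect message.
  2. Logic: the condition only checks rst == 0, and the return value portRst is ignored. If setConsoleProxyOverridenIp succeeds but setConsoleProxyPort fails, the flow incorrectly continues.
🐛 Suggested fix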
                     int rst = setConsoleProxyOverridenIp(msg.getConsoleProxyOverriddenIp());
                     int portRst = setConsoleProxyPort(newProxyPort);
-                    if (rst == 0) {
+                    if (rst == 0 && portRst == 0) {
                         trigger.next();
                     } else {
-                        trigger.fail(operr(ORG_ZSTACK_CONSOLE_10003, "failed to configure consoleProxyOverriddenIp[code:%d] or consoleProxyPort[code:%d]"));
+                        trigger.fail(operr(ORG_ZSTACK_CONSOLE_10003, "failed to configure consoleProxyOverriddenIp[code:%d] or consoleProxyPort[code:%d]", rst, portRst));
                     }
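
The first problem in this comment can be checked with plain `String.format`, which throws `MissingFormatArgumentException` when a `%d` placeholder has no matching argument. Whether `operr` formats its message through `String.format` is an assumption here; the sketch only demonstrates the standard-library behavior with the message text from the diff, and `FormatArgumentSketch` is a hypothetical name.

```java
import java.util.MissingFormatArgumentException;

final class FormatArgumentSketch {
    static final String MESSAGE =
            "failed to configure consoleProxyOverriddenIp[code:%d] or consoleProxyPort[code:%d]";

    // Returns true when formatting the message without arguments fails.
    static boolean failsWithoutArguments() {
        try {
            String.format(MESSAGE);
            return false;
        } catch (MissingFormatArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWithoutArguments());
        // With both return codes supplied, the placeholders are filled in.
        System.out.println(String.format(MESSAGE, 0, 1));
    }
}
```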
core/src/main/java/org/zstack/core/upgrade/UpgradeChecker.java (1)

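404-416: Fix English grammar errors.

Per the coding guidelines, error messages should use correct English. The following grammar problems exist:

  1. Line 406: "Can not found" → "Cannot find" (wrong verb form)
  2. Line 415: "before update os" → "before updating the OS" (the gerund form is more accurate)
📝 Suggested fix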
         if (agent == null) {
             throw new ApiMessageInterceptionException(
-                    argerr(ORG_ZSTACK_CORE_UPGRADE_10003, "Can not found agent version, upgrade cluster os is not supported during grayscale upgrade")
+                    argerr(ORG_ZSTACK_CORE_UPGRADE_10003, "Cannot find agent version, upgrading cluster OS is not supported during grayscale upgrade")
             );
         }

         if (agent.getCurrentVersion().equals(agent.getExpectVersion())) {
             return;
         }

         throw new ApiMessageInterceptionException(
-                argerr(ORG_ZSTACK_CORE_UPGRADE_10004, "Host[uuid: %s] agent version is not upgraded, please reconnect host before update os", msg.getHostUuid())
+                argerr(ORG_ZSTACK_CORE_UPGRADE_10004, "Host[uuid: %s] agent version is not upgraded, please reconnect host before updating the OS", msg.getHostUuid())
         );
configuration/src/main/java/org/zstack/configuration/ConfigurationManagerImpl.java (1)

1106-1118: Copy-paste error: DiskOfferingUserConfigValidator validates the wrong system tag type

There are two problems here:

  1. Line 1109 (existing code): the condition uses InstanceOfferingSystemTags.INSTANCE_OFFERING_USER_CONFIG.isMatch(sysTag) instead of DiskOfferingSystemTags.DISK_OFFERING_USER_CONFIG.isMatch(sysTag). As a result, the DiskOffering validator actually checks InstanceOffering tags, so the multi-tag validation for DiskOffering never takes effect.

  2. Line 1112 (changed in this PR): the error message says "one instanceOffering", but this is DiskOfferingUserConfigValidator, so it should say "one diskOffering".

🐛 Suggested fix
             @Override
             public void validateSystemTagInCreateMessage(APICreateMessage msg) {
                 int userdataTagCount = 0;
                 for (String sysTag : msg.getSystemTags()) {
-                    if (InstanceOfferingSystemTags.INSTANCE_OFFERING_USER_CONFIG.isMatch(sysTag)) {
+                    if (DiskOfferingSystemTags.DISK_OFFERING_USER_CONFIG.isMatch(sysTag)) {
                         if (userdataTagCount > 0) {
                             throw new OperationFailureException(argerr(
-                            ORG_ZSTACK_CONFIGURATION_10004,         "Shouldn't be more than one systemTag for one instanceOffering."));
+                            ORG_ZSTACK_CONFIGURATION_10004,         "Shouldn't be more than one systemTag for one diskOffering."));
                         }
                         userdataTagCount++;

                         check(msg.getResourceUuid(), sysTag);
                     }
                 }
             }
plugin/eip/src/main/java/org/zstack/network/service/eip/EipApiInterceptor.java (2)

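119-131: The error message contradicts the actual logic

When !found is true (that is, the IP does not belong to the NIC), the error message says "ip [uuid:%s] is attached to vm nic [%s]", which is the opposite of what happened. The message should say "is not attached".

🐛 Suggested fix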
         if (!found) {
-            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_EIP_10007, "ip [uuid:%s] is attached to vm nic [%s]", guestIpUuid, vmNicUuid));
+            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_EIP_10007, "ip [uuid:%s] is not attached to vm nic [%s]", guestIpUuid, vmNicUuid));
         }

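191-205: Wrong argument: use msg.getUsedIpUuid() instead of msg.getEipUuid()

The error message on line 200 says "Ip address [uuid:%s] is not belonged to nic", but the argument passed is msg.getEipUuid() (the UUID of the EIP) rather than msg.getUsedIpUuid() (the UUID of the IP address).

🐛 Suggested fix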
             if (!found) {
-                throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_EIP_10012, "Ip address [uuid:%s] is not belonged to nic [uuid:%s]", msg.getEipUuid(), msg.getVmNicUuid()));
+                throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_EIP_10012, "Ip address [uuid:%s] is not belonged to nic [uuid:%s]", msg.getUsedIpUuid(), msg.getVmNicUuid()));
             }
compute/src/main/java/org/zstack/compute/cluster/ClusterApiInterceptor.java (1)

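53-60: The msg parameters are not trimmed.

According to the coding conventions, Interceptor classes must check that parameters coming from a Message have been trimmed. Users may copy and paste data in a browser together with stray spaces or newlines. The current code compares msg.getType() and msg.getHypervisorType() as strings directly; if the data has trailing whitespace, the equals("baremetal") comparison fails.

Change to:

  • msg.getType() → msg.getType().trim() (check for null first)
  • msg.getHypervisorType() → msg.getHypervisorType().trim() (check for null first)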
plugin/kvm/src/main/java/org/zstack/kvm/KVMHostFactory.java (1)

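780-792: Unused format argument (potential issue)

The format string on line 790, "pci bridge need a value greater than 0 and lower than 32", has no %s placeholder, yet KVMSystemTags.VM_PREDEFINED_PCI_BRIDGE_NUM_TOKEN is passed as an argument. As a result:

  1. The argument is ignored, so the error message cannot say which parameter is wrong
  2. The usage is inconsistent with line 786

In addition, "pci bridge need" should read "PCI bridge needs".

Suggested fix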
             } catch (Exception e) {
                 throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_KVM_10007, "%s must be a number", KVMSystemTags.VM_PREDEFINED_PCI_BRIDGE_NUM_TOKEN));
             }

             if (num <= 0 || num > 31) {
-                throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_KVM_10008, "pci bridge need a value greater than 0 and lower than 32", KVMSystemTags.VM_PREDEFINED_PCI_BRIDGE_NUM_TOKEN));
+                throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_KVM_10008, "%s needs a value greater than 0 and less than 32", KVMSystemTags.VM_PREDEFINED_PCI_BRIDGE_NUM_TOKEN));
             }
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephSnapshotProtector.java (1)

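27-30: A missing return statement causes completion to be called more than once

When irregularThirdPartySnapshotInstallPath is true, line 29 calls completion.success() without a return afterwards, so the method keeps running and line 37 calls completion.success() again. If the condition on line 32 is also true, lines 33-34 call completion.fail() as well, so completion is called back multiple times with unpredictable results.

Suggested fix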
         if (irregularThirdPartySnapshotInstallPath) {
             completion.success();
+            return;
         }
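
The double callback can be illustrated with a small stand-in. This is only a sketch: RecordingCompletion and both protect methods are hypothetical names that mimic the control flow described above, not the actual CephSnapshotProtector code.

```java
import java.util.ArrayList;
import java.util.List;

class CompletionDemo {
    // Hypothetical stand-in for a completion callback that records every call.
    static class RecordingCompletion {
        final List<String> calls = new ArrayList<>();
        void success() { calls.add("success"); }
        void fail(String reason) { calls.add("fail:" + reason); }
    }

    // Mirrors the reported flow: no return after the early success().
    static void protectWithoutReturn(boolean irregularPath, boolean checkFailed, RecordingCompletion c) {
        if (irregularPath) {
            c.success();
            // execution falls through to the checks below
        }
        if (checkFailed) {
            c.fail("check failed");
            return;
        }
        c.success();
    }

    // Mirrors the suggested fix: return right after the early success().
    static void protectWithReturn(boolean irregularPath, boolean checkFailed, RecordingCompletion c) {
        if (irregularPath) {
            c.success();
            return;
        }
        if (checkFailed) {
            c.fail("check failed");
            return;
        }
        c.success();
    }

    // Helper: number of callbacks fired for the given inputs.
    static int callCount(boolean fixed, boolean irregularPath, boolean checkFailed) {
        RecordingCompletion c = new RecordingCompletion();
        if (fixed) {
            protectWithReturn(irregularPath, checkFailed, c);
        } else {
            protectWithoutReturn(irregularPath, checkFailed, c);
        }
        return c.calls.size();
    }

    public static void main(String[] args) {
        System.out.println("without return: " + callCount(false, true, false) + " callbacks");
        System.out.println("with return: " + callCount(true, true, false) + " callbacks");
    }
}
```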
plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java (2)

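6296-6310: Wrong time unit in the timeout error message (ms treated as s)

Here ctimeout is in milliseconds, but the message says "seconds" and the %d receives the millisecond value, which is a clear user-visible error. Use timeoutInSec, or change the wording to milliseconds.

Suggested fix (keeps the "seconds" semantics)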
-                                reply.setError(operr(ORG_ZSTACK_KVM_10123, "host[%s] not shutdown in %d seconds", msg.getHostUuid(), ctimeout));
+                                reply.setError(operr(ORG_ZSTACK_KVM_10123, "host[%s] not shutdown in %d seconds", msg.getHostUuid(), timeoutInSec));

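1244-1251: dbf.findByUuid() may return null and needs explicit handling

Although the uuid of the upstream APIGetHostWebSshUrlMsg is validated at the API layer through @APIParam(resourceType=HostVO.class):

  1. There is a gap between API validation and handler execution, during which the host may be deleted
  2. The API layer validates HostVO while the handler looks up KVMHostVO, and the two do not fully correspond

Currently, calling host.getManagementIp()/host.getPort() right after KVMHostVO host = dbf.findByUuid(...) throws an NPE when host is null. The exception handler (lines 1273-1280) also references host.getUuid(), which throws an NPE as well if host is null at that point.

An explicit null check that returns a reply with an error code is recommended.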

plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java (1)

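1728-1737: (In passing) the result of dbf.findByUuid() may be null, so there is an NPE risk here

Calling bsvo.getType() directly after BackupStorageVO bsvo = dbf.findByUuid(...); throws an NPE when the bs record has been deleted or the uuid does not exist. Even though this is not a core change of this PR, at least add a null check and return an explicit error code.

Given your habit of "no extra changes in a cherry-pick", a minimal fix (a null guard) without refactoring is enough here. Based on learnings, dbf.findByUuid() may return null.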

plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageMonBase.java (2)

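97-105: The new error code in changeStatus() is fine, but the error message should have a space after the period

The current string concatenation is "deleted." + "This error can be ignored", with no space in between; this is a quality issue in user-visible text.

Suggested fix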
- "cannot update status ... it has been deleted." +
- "This error can be ignored"
+ "cannot update status ... it has been deleted. " +
+ "This error can be ignored"

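372-388: Wrapping a non-success agent response in Platform.operr(...) inside httpCall() changes the upper-layer mon failover behavior

When the agent returns ret.isSuccess()==false, this method always wraps the failure as Platform.operr(ORG_ZSTACK_STORAGE_CEPH_PRIMARY_10049, ...), which always yields the SysErrors.OPERATION_ERROR category.

The error handling in the upper-layer CephPrimaryStorageBase.HttpCaller.doCall() shows that doCall() is invoked to try the next mon only when the error is not OPERATION_ERROR, VOLUME_IN_USE, or TIMEOUT. The wrapping therefore makes agent errors count as OPERATION_ERROR, which disables automatic mon switching; the upper layer then relies on the tryNext flag to decide whether to retry or fail directly.

Please confirm whether this is intended, or whether the original agent error category should be preserved to keep the mon failover mechanism working.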

compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java (1)

1146-1157: Blocker: calling add() on Collections.emptyList() throws UnsupportedOperationException

Suggested change (minimal diff)
 private List<ErrorCode> extEmitterHandleSystemTag(final CreateVmInstanceMsg msg, final APICreateMessage cmsg, VmInstanceVO finalVo) {
-    List<ErrorCode> result = Collections.emptyList();
+    List<ErrorCode> result = new ArrayList<>();
     if (msg == null) {
         result.add(operr(ORG_ZSTACK_COMPUTE_VM_10238, "CreateVmInstanceMsg cannot be null"));
         return result;
     } else if (cmsg != null && cmsg.getSystemTags() != null && !cmsg.getSystemTags().isEmpty()) {
         return extEmitter.handleSystemTag(finalVo.getUuid(), cmsg.getSystemTags());
     } else if (cmsg == null && msg.getSystemTags() != null && !msg.getSystemTags().isEmpty()) {
         return extEmitter.handleSystemTag(finalVo.getUuid(), msg.getSystemTags());
     }
     return result;
 }

 private List<ErrorCode> extEmitterHandleSshKeyPair(final CreateVmInstanceMsg msg, final APICreateMessage cmsg, VmInstanceVO finalVo) {
-    List<ErrorCode> result = Collections.emptyList();
+    List<ErrorCode> result = new ArrayList<>();
     if (msg == null) {
         result.add(operr(ORG_ZSTACK_COMPUTE_VM_10239, "CreateVmInstanceMsg cannot be null"));
         return result;
     } else if (msg.getSshKeyPairUuids() != null && !msg.getSshKeyPairUuids().isEmpty()) {
         return extEmitter.associateSshKeyPair(finalVo.getUuid(), msg.getSshKeyPairUuids());
     }
     return result;
 }

Also applies to: 1159-1168
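
The immutability of Collections.emptyList() can be checked in isolation. The sketch below uses only java.util; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class EmptyListDemo {
    // Returns true when add() on the given list is rejected.
    static boolean addIsRejected(List<String> list) {
        try {
            list.add("error");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Collections.emptyList() returns a shared immutable list.
    static boolean emptyListRejectsAdd() {
        return addIsRejected(Collections.<String>emptyList());
    }

    // new ArrayList<>() returns a fresh mutable list.
    static boolean arrayListRejectsAdd() {
        return addIsRejected(new ArrayList<String>());
    }

    public static void main(String[] args) {
        System.out.println("emptyList rejects add: " + emptyListRejectsAdd());
        System.out.println("ArrayList rejects add: " + arrayListRejectsAdd());
    }
}
```

Collections.emptyList() is still a reasonable return value on paths that never add to the list; the problem is limited to the branches that call result.add(...).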

plugin/sdnController/src/main/java/org/zstack/sdnController/h3cVcfc/H3cVcfcApiInterceptor.java (2)

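226-238: L3 creation restriction: the concatenated error message is missing a space; fix it while you are here

Currently "can not create l3 network" + "because ..." becomes networkbecause. Such messages are usually shown directly to users or used in ticket troubleshooting, so more readable English is recommended.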


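118-152: Add a lower-bound check to the H3C VNI range validation, or correct the error message to match

The current code only checks startVni > 4094 || endVni > 4094, yet the error message claims "the range must be within 1-4094". Because the H3C controller maps VNIs to VLAN IDs, startVni < 1 || endVni < 1 should be checked as well; otherwise, correct the error message so it matches the actual validation and does not mislead users.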
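
A minimal sketch of the two checks side by side, assuming the 1-4094 range quoted in the error message. The class and method names are hypothetical and the code is not taken from H3cVcfcApiInterceptor.

```java
class VniRangeCheck {
    static final int MIN_VNI = 1;
    static final int MAX_VNI = 4094;

    // Check as described in the review: only the upper bound is enforced.
    static boolean passesUpperBoundOnly(int startVni, int endVni) {
        return !(startVni > MAX_VNI || endVni > MAX_VNI);
    }

    // Check with the lower bound added, matching "within 1-4094".
    static boolean passesBothBounds(int startVni, int endVni) {
        return !(startVni > MAX_VNI || endVni > MAX_VNI
                || startVni < MIN_VNI || endVni < MIN_VNI);
    }

    public static void main(String[] args) {
        // A range starting at 0 slips through the upper-bound-only check.
        System.out.println("upper bound only, 0-10: " + passesUpperBoundOnly(0, 10));
        System.out.println("both bounds, 0-10: " + passesBothBounds(0, 10));
    }
}
```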

core/src/main/java/org/zstack/core/cloudbus/CloudBusImpl2.java (1)

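668-694: Adding error codes to tryBestToReplyError is fine, but the "corrected JSON mediaType" wording should be fixed

If this text is user-visible or commonly logged, switch to more accurate English (for example "valid JSON format" or "correct JSON format").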

compute/src/main/java/org/zstack/compute/vm/StaticIpOperator.java (1)

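300-314: The second argument of argerr() should use an explicit format placeholder.

On line 307, argerr(ORG_ZSTACK_COMPUTE_VM_10308, r.getError().getDetails()) passes the raw error details directly as the format string. If details contains a % character, the String.format() call inside Platform.err() throws an IllegalFormatException such as MissingFormatArgumentException; the exception is caught, but this is not best practice. Change it to: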

-            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_VM_10308, r.getError().getDetails()));
+            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_VM_10308, "%s", r.getError().getDetails()));

This matches the call pattern on line 311 and explicitly marks details as a format argument rather than the format string.
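
The difference between the two call shapes can be seen with String.format alone. This is a sketch with hypothetical names; the sample details string is invented for illustration.

```java
import java.util.IllegalFormatException;

class DetailsFormatDemo {
    // Returns true when using the text itself as the format string throws.
    static boolean throwsAsFormatString(String details) {
        try {
            String.format(details);
            return false;
        } catch (IllegalFormatException e) {
            return true;
        }
    }

    // Passing the text as an argument to "%s" keeps it verbatim.
    static String asArgument(String details) {
        return String.format("%s", details);
    }

    public static void main(String[] args) {
        String details = "ip range is 90% full"; // made-up sample containing '%'
        System.out.println("as format string throws: " + throwsAsFormatString(details));
        System.out.println("as argument: " + asArgument(details));
    }
}
```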

network/src/main/java/org/zstack/network/l2/L2NoVlanNetwork.java (1)

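927-1010: Exception handling is inconsistent between L2NoVlanNetwork and L2VlanNetwork, and exceptions are not properly converted through Completion.

Throwing exceptions directly inside SQLBatch.execute() (ApiMessageInterceptionException on line 974 and OperationFailureException on line 1006) has two problems:

  1. Inconsistent exception types: L2NoVlanNetwork throws ApiMessageInterceptionException, which should be used only for API parameter validation. L2VlanNetwork throws OperationFailureException. The two network types should use the same exception type.

  2. Exceptions are not propagated through Completion: exceptions thrown in SQLBatch.execute() are not automatically caught and converted into an ErrorCode passed to Completion.fail(); they are thrown directly. Wrap SQLBatch.execute() in an outer try-catch, convert the exception to an ErrorCode, and pass it through completion.fail() so that the error in the message reply is set correctly.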

plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageAllocatorFactory.java (1)

380-390: Avoid an NPE when uriParsers.get(protocol) is null (add an unknown-scheme branch)
Only URISyntaxException is handled now, but if the scheme is not file/volume (or is null), .parseUri(...) throws an NPE. Convert this case to argerr.

Suggested fix (prevents the NPE)
             try {
                 protocol = new URI(msg.getRequiredInstallUri()).getScheme();
             } catch (URISyntaxException e) {
                 throw new OperationFailureException(
                         argerr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10023, "invalid uri, correct example is file://$URL;hostUuid://$HOSTUUID or volume://$VOLUMEUUID "));
             }
-            hostUuid = uriParsers.get(protocol).parseUri(msg.getRequiredInstallUri()).hostUuid;
+            AbstractUriParser parser = uriParsers.get(protocol);
+            if (parser == null) {
+                throw new OperationFailureException(
+                        argerr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10023,
+                                "unsupported uri scheme[%s], correct example is file://$URL;hostUuid://$HOSTUUID or volume://$VOLUMEUUID",
+                                protocol));
+            }
+            hostUuid = parser.parseUri(msg.getRequiredInstallUri()).hostUuid;
         }
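
The three outcomes (known scheme, unknown or missing scheme, malformed URI) can be sketched with java.net.URI and a plain map. PARSERS is a hypothetical stand-in for uriParsers, and the returned strings are placeholders rather than real error codes.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Map;

class UriSchemeDemo {
    // Hypothetical registry keyed by scheme; stands in for uriParsers.
    static final Map<String, String> PARSERS = new HashMap<>();
    static {
        PARSERS.put("file", "file parser");
        PARSERS.put("volume", "volume parser");
    }

    // Resolves a parser name, or returns an error description instead of throwing an NPE.
    static String resolve(String uri) {
        String scheme;
        try {
            scheme = new URI(uri).getScheme(); // null when the URI has no scheme
        } catch (URISyntaxException e) {
            return "invalid uri";
        }
        String parser = PARSERS.get(scheme); // HashMap tolerates a null key lookup
        if (parser == null) {
            return "unsupported uri scheme[" + scheme + "]";
        }
        return parser;
    }

    public static void main(String[] args) {
        System.out.println(resolve("volume://abc"));
        System.out.println(resolve("nfs://host/path"));
        System.out.println(resolve("/tmp/no-scheme"));
        System.out.println(resolve("file:///tmp/a b"));
    }
}
```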
compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java (1)

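194-385: ChangeVmNicNetwork: the static IP occupancy check appears to use the wrong variables, and the message has a wrong newline escape

  1. While looping over staticIps, the occupancy check uses msg.getDestL3NetworkUuid()/msg.getStaticIp(); it looks like it should use the current iteration's l3Uuid/staticIp, otherwise it may produce false positives or misses.
  2. manually/n in the message appears to be meant as \n.
Suggested fix (use the correct variables and fix the newline escape)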
-        if (!networkServices.isEmpty()) {
-            String error = "vm nic [%s] attached network services, please detach manually/n" + networkServices.toString();
+        if (!networkServices.isEmpty()) {
+            String error = "vm nic [%s] attached network services, please detach manually\n" + networkServices.toString();
             throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_COMPUTE_VM_10095, error, msg.getVmNicUuid()));
         }
@@
-                SimpleQuery<UsedIpVO> uq = dbf.createQuery(UsedIpVO.class);
-                uq.add(UsedIpVO_.l3NetworkUuid, Op.EQ, msg.getDestL3NetworkUuid());
-                uq.add(UsedIpVO_.ip, Op.EQ, msg.getStaticIp());
+                SimpleQuery<UsedIpVO> uq = dbf.createQuery(UsedIpVO.class);
+                uq.add(UsedIpVO_.l3NetworkUuid, Op.EQ, l3Uuid);
+                uq.add(UsedIpVO_.ip, Op.EQ, staticIp);
                 if (uq.isExists()) {
                     throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_COMPUTE_VM_10108, "the static IP[%s] has been occupied on the L3 network[uuid:%s]", staticIp, l3Uuid));
                 }
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java (2)

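2896-2924: Fix an obviously wrong reference: the failure branch uses r.getError() (causes an NPE and loses the error cause)
AttachNicToVmReply r = new AttachNicToVmReply(); has just been created, so r.getError() is necessarily null and r.getError().getDetails() throws an NPE right away; the error of the callback argument MessageReply reply should be used instead.

Suggested fix (minimal change)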
                 AttachNicToVmReply r = new AttachNicToVmReply();
                 if (!reply.isSuccess()) {
-                    r.setError(err(ORG_ZSTACK_COMPUTE_VM_10274, VmErrors.ATTACH_NETWORK_ERROR, r.getError(), r.getError().getDetails()));
+                    r.setError(err(ORG_ZSTACK_COMPUTE_VM_10274, VmErrors.ATTACH_NETWORK_ERROR, reply.getError(), reply.getError().getDetails()));
                 }
                 bus.reply(msg, r);

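5766-5820: Spotted an obvious field-assignment bug in passing (fix it in this PR or at least follow up separately)
When saving the protocol, updateVmIsoFirstOrder() calls sourceCdRomVO.setIsoInstallPath(protocol);. It looks like this should be setProtocol(protocol); otherwise installPath is overwritten with the protocol string.
Given the project's habit of "few extra changes" in cherry-picks, if you do not want to widen the scope of this PR, at least open a follow-up ticket. Based on learnings, ...

Suggested fix (if fixing in passing is allowed)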
         sourceCdRomVO.setIsoUuid(targetCdRomIsoUuid);
         sourceCdRomVO.setIsoInstallPath(path);
-        sourceCdRomVO.setIsoInstallPath(protocol);
+        sourceCdRomVO.setProtocol(protocol);
plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerApiInterceptor.java (3)

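175-183: dbf.findByUuid() may return null, which currently causes an NPE
Line 177-179 / Line 199-201: when the given serverGroupUuid does not exist, groupVO.getLoadBalancerUuid() triggers an NPE, so the frontend does not get the expected structured error code. Add an explicit null check and return argerr/operr.

Suggested fix (example)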
         if (msg.getServergroupUuid() != null) {
             LoadBalancerServerGroupVO groupVO = dbf.findByUuid(msg.getServergroupUuid(), LoadBalancerServerGroupVO.class);
+            if (groupVO == null) {
+                throw new ApiMessageInterceptionException(
+                        argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10028, "server group[uuid:%s] not found", msg.getServergroupUuid()));
+            }
             msg.setLoadBalancerUuid(groupVO.getLoadBalancerUuid());
         } else if (msg.getLoadBalancerUuid() == null) {
             throw new ApiMessageInterceptionException(
                     operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10028, "could not get candidate vmnic, because both load balancer uuid and server group uuid are not specified"));
         }

Also applies to: 197-205


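1485-1617: Format argument counts do not match in several places: MissingFormatArgumentException is thrown on the error path (blocking the real error from being returned)
At least the following changed lines have mismatched placeholders and arguments or use the wrong variable:

  • Line 1510: "...[uuid:%s] ... weight[%s]..." but only one argument is passed.
  • Line 1592: same issue.
  • Line 1581: the error message prints the serverIps list instead of the invalid ipAddress (and the placeholder does not match the argument's meaning).
    Once triggered, the user gets a completely unrelated exception stack instead of a structured ORG_ZSTACK_NETWORK_SERVICE_LB_xxx error.
Suggested fix (minimal change)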
-                        throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10134, "could not add backend server vmnic to serverGroup[uuid:%s] ,because vmnic weight[%s] not a correct number",vmNic.get("weight")));
+                        throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10134,
+                                "could not add backend server vmnic to serverGroup[uuid:%s], because vmnic weight[%s] is not a valid number",
+                                msg.getServerGroupUuid(), vmNic.get("weight")));
...
-                    throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10140, "could not add backend server ip to serverGroup [uuid:%s], because ip [ipAddress:%s] is invalid",msg.getServerGroupUuid(),serverIps));
+                    throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10140,
+                            "could not add backend server ip to serverGroup[uuid:%s], because ipAddress[%s] is invalid",
+                            msg.getServerGroupUuid(), server.get("ipAddress")));
...
-                        throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10142, "could not add backend server ip to serverGroup[uuid:%s] ,because vmnic weight[%s] not a correct number",server.get("weight")));
+                        throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10142,
+                                "could not add backend server ip to serverGroup[uuid:%s], because server weight[%s] is not a valid number",
+                                msg.getServerGroupUuid(), server.get("weight")));

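1791-1894: Wrong error message argument plus the same formatting risk (backend server change)

  • Line 1816: the argument for ... vmnic [uuid:%s] is vmNic.containsKey("uuid") (a boolean); it should be vmNic.get("uuid").
  • Line 1873: the format placeholders and argument count do not match (as above, this throws a second error on the exception path).
  • Line 1892: several English spelling errors (such as beacause); fix them in passing to follow the rule that comments and error messages must be correct English.
Suggested fix (example)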
-                        throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10160, "could not update backend server vmnic of serverGroup,because serverGroup[uuid:%s] don not have vmnic [uuid:%s] ",msg.getServerGroupUuid(),vmNic.containsKey("uuid")));
+                        throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10160,
+                                "could not update backend server vmnic of serverGroup, because serverGroup[uuid:%s] does not have vmnic[uuid:%s]",
+                                msg.getServerGroupUuid(), vmNic.get("uuid")));
...
-                            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10167, "could not add backend server ip to serverGroup[uuid:%s] ,because vmnic weight[%s] not a correct number",server.get("weight")));
+                            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10167,
+                                    "could not change backend server ip of serverGroup[uuid:%s], because server weight[%s] is not a valid number",
+                                    msg.getServerGroupUuid(), server.get("weight")));
...
-            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10171, "could not change backendserver, beacause vmincs and serverips is null"));
+            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_SERVICE_LB_10171, "could not change backend server, because both vmNics and serverIps are empty"));
compute/src/main/java/org/zstack/compute/host/HostApiInterceptor.java (2)

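113-127: The APIUpdateHostMsg management IP duplicate check does not exclude the host itself, which may report false conflicts
Line 114-120: if the same Host is being updated and managementIp has not changed, q.isExists() will most likely match the host itself and reject the update. Add q.add(HostVO_.uuid, Op.NEQ, msg.getUuid()) (or an equivalent condition) to exclude the host itself.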
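
The effect of excluding the host itself can be sketched with an in-memory list. This is an illustration only: Host, HOSTS, and both methods are hypothetical, and the real check runs as a database query on HostVO.

```java
import java.util.Arrays;
import java.util.List;

class ManagementIpCheckDemo {
    // Hypothetical in-memory host record.
    static final class Host {
        final String uuid;
        final String managementIp;

        Host(String uuid, String managementIp) {
            this.uuid = uuid;
            this.managementIp = managementIp;
        }
    }

    static final List<Host> HOSTS = Arrays.asList(
            new Host("host-1", "10.0.0.1"),
            new Host("host-2", "10.0.0.2"));

    // Duplicate check that also matches the host being updated.
    static boolean conflictsIncludingSelf(String ip) {
        return HOSTS.stream().anyMatch(h -> h.managementIp.equals(ip));
    }

    // Duplicate check that skips the host being updated (the NEQ condition).
    static boolean conflictsExcludingSelf(String uuid, String ip) {
        return HOSTS.stream().anyMatch(h -> h.managementIp.equals(ip) && !h.uuid.equals(uuid));
    }

    public static void main(String[] args) {
        // host-1 is updated without changing its own management IP.
        System.out.println("including self: " + conflictsIncludingSelf("10.0.0.1"));
        System.out.println("excluding self: " + conflictsExcludingSelf("host-1", "10.0.0.1"));
    }
}
```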


129-136: Missing space in the error message (readability)
Line 135: status:%s]which should have a space added: ] which.

compute/src/main/java/org/zstack/compute/vm/VmInstanceExtensionPointEmitter.java (1)

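285-299: Rethrowing directly in preStartVm violates the ErrorCode contract of the method signature and its callers

Lines 293-296 do throw e; directly, which is inconsistent with this method's signature (it returns ErrorCode). The caller in VmInstanceBase (line 7153) expects an ErrorCode return value: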

ErrorCode preStart = extEmitter.preStartVm(inv);
if (preStart != null) {
    completion.fail(preStart);
    return;
}

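If the exception is thrown, the caller cannot catch it and the error handling logic is bypassed. In addition, the sibling pre* methods (preStopVm, preRebootVm, preDestroyVm, preStartNewCreatedVm) all use logger.warn() and return null. Wrapping the exception as operr(ORG_ZSTACK_COMPUTE_VM_10320, ...) and returning it is recommended for consistency.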
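
The "convert to a return value" contract can be sketched as follows. This is a simplified illustration: errors are plain strings instead of ErrorCode, and PreStartExtension and every method here are hypothetical rather than the actual VmInstanceExtensionPointEmitter code.

```java
import java.util.Arrays;
import java.util.List;

class PreStartVmDemo {
    // Hypothetical extension point: returns an error description, or null on success.
    interface PreStartExtension {
        String preStartVm(String vmUuid);
    }

    // Returns null when every extension passes. An exception is converted into
    // a returned error so the caller's "if (preStart != null)" check still runs.
    static String preStartVm(String vmUuid, List<PreStartExtension> extensions) {
        for (PreStartExtension ext : extensions) {
            try {
                String error = ext.preStartVm(vmUuid);
                if (error != null) {
                    return error;
                }
            } catch (RuntimeException e) {
                return "ORG_ZSTACK_COMPUTE_VM_10320: " + e.getMessage();
            }
        }
        return null;
    }

    static String runWithThrowingExtension() {
        PreStartExtension ok = uuid -> null;
        PreStartExtension broken = uuid -> {
            throw new IllegalStateException("host not ready");
        };
        return preStartVm("vm-1", Arrays.asList(ok, broken));
    }

    static String runWithPassingExtension() {
        PreStartExtension ok = uuid -> null;
        return preStartVm("vm-1", Arrays.asList(ok));
    }

    public static void main(String[] args) {
        System.out.println(runWithThrowingExtension());
        System.out.println(runWithPassingExtension());
    }
}
```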

core/src/main/java/org/zstack/core/Platform.java (1)

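978-1001: Insufficient parameter validation: globalErrorCode lacks a null check

The err() method passes the globalErrorCode parameter straight to setGlobalErrorCode() without any null or format validation. Although existing callers all use predefined constants (such as ORG_ZSTACK_STORAGE_SNAPSHOT_10026), adding defensive validation is recommended for robustness: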

public static ErrorCode err(String globalErrorCode, Enum errCode, ErrorCode cause, String fmt, Object...args) {
+    if (globalErrorCode == null || globalErrorCode.trim().isEmpty()) {
+        logger.warn("globalErrorCode is null or empty");
+        globalErrorCode = "UNKNOWN_ERROR";
+    }
     ErrorFacade errf = getComponentLoader().getComponent(ErrorFacade.class);

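The method signature change (adding the globalErrorCode parameter) already covers all 88 call sites in the codebase, so no extra backward-compatibility handling is needed.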

🤖 Fix all issues with AI agents
In @compute/src/main/java/org/zstack/compute/vm/CpuTopology.java:
- Line 6: Replace the thrown OperationFailureException in CpuTopology.java (the
block using operr(ORG_ZSTACK_COMPUTE_VM_10224, ...)) with
ApiMessageInterceptionException to follow VM parameter validation convention
(e.g., throw new
ApiMessageInterceptionException(operr(ORG_ZSTACK_COMPUTE_VM_10224, "...")));
ensure you update imports to include ApiMessageInterceptionException and remove
or keep OperationFailureException only if still needed elsewhere; keep the
existing operr(...) call and error code unchanged.

In @compute/src/main/java/org/zstack/compute/vm/VmAllocateNicFlow.java:
- Around line 113-117: When type == null the code calls errs.add(...) and
wcomp.allDone() but does not return, so execution continues and
vmMgr.getVmInstanceNicFactory(type) will receive a null and cause an NPE; fix by
returning immediately after wcomp.allDone() (i.e., add a return statement right
after the wcomp.allDone() call in the branch where type == null) so that
VmInstanceNicFactory vnicFactory = vmMgr.getVmInstanceNicFactory(type) is not
executed with a null type.

In @compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java:
- Around line 1717-1721: The error message parameters are swapped: the operr
call in the loop over tuples uses tuple.get(0) and tuple.get(1) but the message
expects L3 network uuid then VM uuid; update the parameter order so the first
placeholder receives the L3 uuid and the second receives the VM uuid (i.e., swap
tuple.get(0, String.class) and tuple.get(1, String.class) in the
ApiMessageInterceptionException/ opErr call within the for (Tuple tuple :
tuples) block), leaving ORG_ZSTACK_COMPUTE_VM_10243 and the exception type
unchanged.

In
@core/src/main/java/org/zstack/core/externalservice/ExternalServiceManagerImpl.java:
- Around line 99-107: After detecting that
service.getExternalServiceCapabilities().isReloadConfig() is false and setting
event.setError(...), return immediately to stop further processing so
service.reload() will not be called and the error will not be overwritten;
modify the control flow in ExternalServiceManagerImpl (around the check using
service, event, and msg) to perform an early return right after setting the
error.

In @identity/src/main/java/org/zstack/identity/QuotaUtil.java:
- Around line 35-38: The static import statement "import static
org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;" is concatenated
with the following Javadoc/comment, causing a syntax/compile error; separate
them by inserting a newline between the import and the comment so the import
stands alone (locate the import in QuotaUtil.java and ensure there's a blank
line before the "/** Created by miao..." comment).

In @network/src/main/java/org/zstack/network/l3/L3NetworkCascadeExtension.java:
- Around line 32-34: The import line "import static
org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*" is incorrectly fused
with a javadoc start token "/**"; separate them by moving the "/**" onto its own
line (or remove it if no Javadoc follows) so the import is a standalone
statement and the javadoc starts properly; update the file to ensure the import
statement ends with a semicolon and the "/**" is on the next line before the
intended comment block.

In @plugin/ldap/src/main/java/org/zstack/ldap/LdapApiInterceptor.java:
- Around line 127-131: In LdapApiInterceptor the format string passed to argerr
is malformed ("[%,%s]") causing runtime formatting errors; update the format
string to include two %s placeholders so it reads something like "Wrong
LdapServerType[%s], valid values: [%s,%s]" and keep the existing argument order
(type, LdapConstant.OpenLdap.TYPE, LdapConstant.WindowsAD.TYPE) when calling
argerr with ORG_ZSTACK_LDAP_10010 to ensure correct message formatting.

In
@plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerBase.java:
- Around line 766-781: The IPv6 error message in LoadBalancerBase's else branch
uses the wrong getter; when checking self.getIpv6VipUuid() and throwing
OperationFailureException (operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10019,...)),
replace the second format argument self.getVipUuid() with self.getIpv6VipUuid()
so the exception reports the correct IPv6 VIP UUID.

In
@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java:
- Around line 3798-3802: In the anonymous fail(ErrorCode errorCode) handler
inside LocalStorageKvmBackend (the completion.fail(operr(...)) call), guard
against a null cause by using errorCode.getCause() != null ?
errorCode.getCause().getDetails() : errorCode.getDetails() (or similar) when
building the error message; update the opErr call to include the fallback
details so you never call getDetails() on a null cause and still surface the
original error information.
- Around line 2191-2200: The checks inside the NoRollbackFlow run() use
completion.fail(...) which prevents SimpleFlowChain from invoking its
.error()/.done() handlers; replace those completion.fail(...) calls with
trigger.fail(operr(...)) in both places (the null root image UUID check using
msg.getVolume().getRootImageUuid() and the existence check using
dbf.isExist(..., ImageVO.class)), and keep the subsequent return so the flow
exits after calling trigger.fail().
🟡 Minor comments (35)
plugin/kvm/src/main/java/org/zstack/kvm/KvmVmHardwareVerifyExtensionPoint.java-32-34 (1)

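32-34: The error code is added correctly, but the error message has a grammar problem.

Adding error code ORG_ZSTACK_KVM_10144 matches this PR's goal of unifying error code output. However, per the coding conventions, error messages should use correct English: "can not" should be "cannot" (a single word in this context).

Suggested grammar fix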
-            throw new OperationFailureException(operr(ORG_ZSTACK_KVM_10144, "Failed to start vm," +
-                    " because can not disable vm.cpu.hypervisor.feature with vm.cpuMode none"));
+            throw new OperationFailureException(operr(ORG_ZSTACK_KVM_10144, "Failed to start vm," +
+                    " because cannot disable vm.cpu.hypervisor.feature with vm.cpuMode none"));
plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsApiParamChecker.java-50-54 (1)

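50-54: Error message formatting: there is an extra leading space.

The error message on line 53 starts with an extra space, " the url contains..."; remove it to stay consistent with the other error messages.

Suggested fix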
-            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_STORAGE_PRIMARY_NFS_10007, " the url contains an invalid folder[/dev or /proc or /sys]"));
+            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_STORAGE_PRIMARY_NFS_10007, "the url contains an invalid folder[/dev or /proc or /sys]"));
compute/src/main/java/org/zstack/compute/VmNicUtils.java-62-65 (1)

62-65: Grammar error: "not support yet" should be "not supported yet".

According to the coding conventions, error messages should use correct English.

📝 Suggested fix
-throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_10007, "vm nic driver %s not support yet", driverType));
+throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_10007, "vm nic driver %s not supported yet", driverType));
compute/src/main/java/org/zstack/compute/vm/VmAllocatePrimaryStorageForAttachingDiskFlow.java-45-48 (1)

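45-48: Error message formatting: extra leading space and nonstandard spelling.

The error message string has two small problems:

  1. There is an extra space before the message: " Can not..." should be "Can not..."
  2. "Can not" should be changed to the standard spelling "Cannot"
🔧 Suggested fix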
-            ErrorCode errorCode = operr(ORG_ZSTACK_COMPUTE_VM_10316, " Can not find the vm's host, please start the vm[%s], then mount the disk", spec.getVmInventory().getUuid());
+            ErrorCode errorCode = operr(ORG_ZSTACK_COMPUTE_VM_10316, "Cannot find the vm's host, please start the vm[%s], then mount the disk", spec.getVmInventory().getUuid());
plugin/loginPlugin/src/main/java/org/zstack/login/plugin/LoginPluginBackend.java-63-67 (1)

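63-67: The format string does not match its arguments, and there is a grammar error.

There are two problems:

  1. The error message "missing LoginUserInfo when use plugin login" has no %s placeholder, yet loginContext.getLoginPluginName() is passed as an argument; the argument is ignored, so the message lacks the key plugin name.
  2. Per the coding conventions, code should use correct English. "when use plugin login" is ungrammatical and should be "when using plugin login".
🐛 Suggested fix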
         LoginUserInfo info = ext.login(loginContext.getUsername(), loginContext.getPassword());
         if (info == null || info.getUsername() == null) {
-            completion.fail(operr(ORG_ZSTACK_LOGIN_PLUGIN_10002, "missing LoginUserInfo when use plugin login", loginContext.getLoginPluginName()));
+            completion.fail(operr(ORG_ZSTACK_LOGIN_PLUGIN_10002, "missing LoginUserInfo when using plugin[%s] to login", loginContext.getLoginPluginName()));
             return;
         }
plugin/mediator/src/main/java/org/zstack/mediator/ApiValidator.java-177-179 (1)

177-179: Error message formatting: missing space

Same issue as isVmNicUsedByPortForwarding above: a space is missing between EIPs and %s.

🔧 Suggested fix
-            throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_MEDIATOR_10004, "the vm[name:%s, uuid:%s] already has some EIPs%s attached", vm.getName(), vm.getUuid(),
+            throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_MEDIATOR_10004, "the vm[name:%s, uuid:%s] already has some EIPs %s attached", vm.getName(), vm.getUuid(),
plugin/mediator/src/main/java/org/zstack/mediator/ApiValidator.java-192-192 (1)

192-192: English grammar issue

"already has bound to" is ungrammatical; it should be "is already bound to" or "has already been bound to". The coding conventions require correct English.

🔧 Suggested fix
-                throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_MEDIATOR_10005, "the vip[uuid:%s] already has bound to other service[%s]", msg.getVipUuid(), useForList.toString()));
+                throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_MEDIATOR_10005, "the vip[uuid:%s] is already bound to other service[%s]", msg.getVipUuid(), useForList.toString()));
plugin/mediator/src/main/java/org/zstack/mediator/ApiValidator.java-156-158 (1)

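156-158: Error message formatting: missing space

A space is missing between rules and %s, which produces concatenated text such as "rules(name:xxx, ip:xxx)". Per the coding conventions, error messages should use correctly formatted English.

🔧 Suggested fix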
-            throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_MEDIATOR_10003, "the vm[name:%s, uuid:%s] already has some port forwarding rules%s attached", vm.getName(), vm.getUuid(),
+            throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_MEDIATOR_10003, "the vm[name:%s, uuid:%s] already has some port forwarding rules %s attached", vm.getName(), vm.getUuid(),
identity/src/main/java/org/zstack/identity/AccountQuotaUpdateChecker.java-33-34 (1)

33-34: Minor: "can not" should be "cannot"

Per the coding conventions, error messages should use correct English. "can not" should be merged into "cannot".

📝 Suggested fix
-            return argerr(ORG_ZSTACK_IDENTITY_10016, "the quota[name:%s] of account[uuid:%s] can not be %d",
+            return argerr(ORG_ZSTACK_IDENTITY_10016, "the quota[name:%s] of account[uuid:%s] cannot be %d",
plugin/externalStorage/src/main/java/org/zstack/externalStorage/primary/ExternalPrimaryStorageAllocator.java-112-118 (1)

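112-118: The error message has grammar issues.

Per the coding guidelines, error messages should use correct English. The current message has the following problems:

  1. "locate urls" should be "location urls" or "located at urls"
  2. "cannot create other places" is ungrammatical; it should be "cannot create in other places" or "cannot be created elsewhere"
📝 Suggested fix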
-                            trigger.fail(operr(ORG_ZSTACK_EXTERNALSTORAGE_PRIMARY_10000, "creation relies on image cache[uuid:%s, locate urls: [%s]], cannot create other places.", spec.getImageSpec().getInventory().getUuid(), candidateUrls));
+                            trigger.fail(operr(ORG_ZSTACK_EXTERNALSTORAGE_PRIMARY_10000, "creation relies on image cache[uuid:%s, located at urls: [%s]], cannot be created elsewhere.", spec.getImageSpec().getInventory().getUuid(), candidateUrls));
compute/src/main/java/org/zstack/compute/allocator/HostAllocatorApiInterceptor.java-75-77 (1)

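75-77: Spelling error: "none-empty" should be "non-empty".

The coding guidelines require correctly spelled English in code. "none-empty" in the current error message is a misspelling.

📝 Fix the spelling error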
-            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10025, "zoneUuids, clusterUuids, hostUuids must at least have one be none-empty list, or all is set to true"));
+            throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10025, "zoneUuids, clusterUuids, hostUuids must have at least one non-empty list, or all must be set to true"));
compute/src/main/java/org/zstack/compute/allocator/HostAllocatorApiInterceptor.java-23-23 (1)

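23-23: Fix the English spelling error on line 76.

The error codes ORG_ZSTACK_COMPUTE_ALLOCATOR_10024 and ORG_ZSTACK_COMPUTE_ALLOCATOR_10025 are correctly defined in CloudOperationsErrorCode and used correctly. However, the error message on line 76 contains a misspelling: "none-empty" should be "non-empty".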

network/src/main/java/org/zstack/network/service/HostRouteExtension.java-88-92 (1)

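88-92: Fix the grammar of the error message.

The current error message "provide type null" is ungrammatical; per the coding guidelines, it should use correct English.

📝 Suggested change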
-            reply.setError(operr(ORG_ZSTACK_NETWORK_SERVICE_10000, "L3Network [uuid: %s] provide type null", msg.getL3NetworkUuid()));
+            reply.setError(operr(ORG_ZSTACK_NETWORK_SERVICE_10000, "L3Network [uuid: %s] has null provider type", msg.getL3NetworkUuid()));
compute/src/main/java/org/zstack/compute/allocator/TagAllocatorFlow.java-69-69 (1)

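69-69: The error messages contain grammar errors that need fixing.

The error codes ORG_ZSTACK_COMPUTE_ALLOCATOR_10010 and ORG_ZSTACK_COMPUTE_ALLOCATOR_10011 are correctly defined in the CloudOperationsErrorCode class. However, both error messages have a grammar problem:

  • In the messages on lines 69 and 105, "return zero candidate host" should be "returns zero candidate hosts" (subject-verb agreement, plural noun)
📝 Suggested fix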
- fail(Platform.operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10010, "InstanceOfferingTagAllocatorExtensionPoint[%s] return zero candidate host", extp.getClass().getName()));
+ fail(Platform.operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10010, "InstanceOfferingTagAllocatorExtensionPoint[%s] returns zero candidate hosts", extp.getClass().getName()));
- fail(Platform.operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10011, "DiskOfferingTagAllocatorExtensionPoint[%s] return zero candidate host", extp.getClass().getName()));
+ fail(Platform.operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10011, "DiskOfferingTagAllocatorExtensionPoint[%s] returns zero candidate hosts", extp.getClass().getName()));
compute/src/main/java/org/zstack/compute/allocator/ResourceBindingAllocatorFlow.java-136-136 (1)

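136-136: "binded" in the error message is misspelled.

"bind" is an irregular verb; its past participle is "bound", not "binded". Per the coding guidelines, error messages should use correct English.

Suggested fix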
-            fail(Platform.operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10005, "no available host found with binded resource %s", resources));
+            fail(Platform.operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10005, "no available host found with bound resource %s", resources));
compute/src/main/java/org/zstack/compute/allocator/ResourceBindingAllocatorFlow.java-117-117 (1)

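117-117: The error message has a grammar issue.

Per the coding guidelines, error messages should use correct English. "not support" is ungrammatical and should be changed to "does not support" or a more standard phrasing.

Suggested fix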
-                fail(Platform.operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10004, "resource binding not support type %s yet", entry.getKey()));
+                fail(Platform.operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10004, "resource binding does not support type %s yet", entry.getKey()));
plugin/kvm/src/main/java/org/zstack/kvm/KVMHostFactory.java-328-331 (1)

328-331: Spelling error

"operation system" should be "operating system".

Suggested fix
-            return operr(ORG_ZSTACK_KVM_10002, "the operation system[%s] of host[name:%s, ip:%s] is invalid",
+            return operr(ORG_ZSTACK_KVM_10002, "the operating system[%s] of host[name:%s, ip:%s] is invalid",
                     os, vo.getName(), vo.getManagementIp());
plugin/kvm/src/main/java/org/zstack/kvm/KVMHostFactory.java-192-195 (1)

192-195: Error message grammar issue

Per the coding guidelines, error messages should use correct English. "fail to load" should be changed to "failed to load".

Suggested fix
-            throw new OperationFailureException(operr(ORG_ZSTACK_KVM_10001, "fail to load host info from file. because\n%s", e.getMessage()));
+            throw new OperationFailureException(operr(ORG_ZSTACK_KVM_10001, "failed to load host info from file, because: %s", e.getMessage()));
plugin/kvm/src/main/java/org/zstack/kvm/xmlhook/XmlHookBase.java-201-202 (1)

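201-202: The error code is added correctly, but there is a minor grammar issue.

Adding the ORG_ZSTACK_KVM_XMLHOOK_10001 error code matches the goal of this PR. However, per the coding guidelines, error messages should use correct English grammar: "details is" should be "details are" ("details" is plural).

📝 Suggested fix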
-                                errs.add(operr(ORG_ZSTACK_KVM_XMLHOOK_10001, "xml hook[uuid: %s] updated successfully, but failed to restart vm[uuid:%s]. details is: %s",
+                                errs.add(operr(ORG_ZSTACK_KVM_XMLHOOK_10001, "xml hook[uuid: %s] updated successfully, but failed to restart vm[uuid:%s]. details: %s",
                                         msg.getXmlHookUuid(), vmUuid, reply.getError().getDetails()));
plugin/kvm/src/main/java/org/zstack/kvm/xmlhook/XmlHookBase.java-33-33 (1)

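33-33: Fix the grammar issue in the error message.

The static import and the error code constant have been verified as correct. ORG_ZSTACK_KVM_XMLHOOK_10001 is defined at CloudOperationsErrorCode.java:3533, and the import statement is appropriate.

However, the error message string contains a grammar error: "details is:" on line 201 is incorrect because "details" is plural. It should be "details are:" or, more concisely, "details:".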

plugin/sdnController/src/main/java/org/zstack/sdnController/SdnControllerApiInterceptor.java-192-195 (1)

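192-195: The error message contradicts the condition

The condition msg.getNetmask() == null checks for the case where the netmask is not specified, but the error message says "because netmask is specified", which is the opposite.

Compare the same condition at lines 148-151 in validate(APISdnControllerAddHostMsg), where the error message correctly reads "because netmask is not specified".

🐛 Suggested fix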
         if (msg.getVtepIp() != null && msg.getNetmask() == null) {
             throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_SDNCONTROLLER_10023, "could not change host[uuid:%s] of sdn controller[uuid:%s], " +
-                    " because netmask is specified", msg.getHostUuid(), msg.getSdnControllerUuid()));
+                    " because netmask is not specified", msg.getHostUuid(), msg.getSdnControllerUuid()));
         }
plugin/sdnController/src/main/java/org/zstack/sdnController/SdnControllerApiInterceptor.java-297-300 (1)

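297-300: The catch clause is too broad and may mask specific error codes

The current catch (Exception e) catches every exception, including the ApiMessageInterceptionException thrown at lines 283 and 289 (error codes 10028 and 10029). Those specific errors are then overwritten by 10030, so a malformed range can no longer be distinguished from an overlapping range.

Consider catching only NumberFormatException and letting ApiMessageInterceptionException propagate normally.

🐛 Suggested fix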
-            } catch (Exception e) {
+            } catch (NumberFormatException e) {
                 throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_SDNCONTROLLER_10030, "could not change sdn controller, " +
                         "because vlan range[%s] is not in the correct format", range));
             }
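
A minimal, self-contained sketch of why the narrower catch matters. ValidationException, VlanRangeCheck, and the E_* codes are hypothetical stand-ins chosen for illustration; they are not ZStack's ApiMessageInterceptionException or its real error codes.

```java
// Hypothetical stand-in for a validation exception that carries an error code.
class ValidationException extends RuntimeException {
    final String code;

    ValidationException(String code, String message) {
        super(message);
        this.code = code;
    }
}

class VlanRangeCheck {
    // Validates a "start-end" range. Only a malformed number is mapped to E_NUMBER;
    // the more specific E_SHAPE and E_ORDER errors propagate unchanged.
    static void validate(String range) {
        try {
            String[] parts = range.split("-");
            if (parts.length != 2) {
                throw new ValidationException("E_SHAPE", "vlan range[" + range + "] must be start-end");
            }
            int start = Integer.parseInt(parts[0].trim());
            int end = Integer.parseInt(parts[1].trim());
            if (start > end) {
                throw new ValidationException("E_ORDER", "vlan range[" + range + "] start exceeds end");
            }
        } catch (NumberFormatException e) {
            // With catch (Exception e) here, E_SHAPE and E_ORDER would be replaced by E_NUMBER.
            throw new ValidationException("E_NUMBER", "vlan range[" + range + "] is not in the correct format");
        }
    }

    // Returns the error code raised for a range, or "ok" when validation passes.
    static String codeFor(String range) {
        try {
            validate(range);
            return "ok";
        } catch (ValidationException e) {
            return e.code;
        }
    }
}
```

Because ValidationException is not a NumberFormatException, the specific errors thrown inside the try block reach the caller with their own codes intact.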
core/src/main/java/org/zstack/core/thread/DispatchQueueImpl.java-529-533 (1)

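529-533: Fix the method name: use getErrorCode() rather than getError()

Consider preserving the original error code of the OperationFailureException. The current implementation builds a new error from t.getMessage() alone, which loses the original ErrorCode context. The method name in the proposal is wrong, though, and should be changed to: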

if (t instanceof OperationFailureException) {
-   completion.fail(operr(ORG_ZSTACK_CORE_THREAD_10001, t.getMessage()));
+   ErrorCode originalErr = ((OperationFailureException) t).getErrorCode();
+   completion.fail(operr(ORG_ZSTACK_CORE_THREAD_10001, originalErr, t.getMessage()));
} else {
    completion.fail(inerr(ORG_ZSTACK_CORE_THREAD_10002, t.getMessage()));
}

OperationFailureException exposes getErrorCode(), not getError(). The operr() method accepts an ErrorCode as a cause argument, which preserves the original error chain.
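
The idea of keeping the original error as a cause can be sketched without any ZStack classes. SketchError, SketchFailure, CausePreservingWrapper, and the code strings below are hypothetical simplifications, not the real ErrorCode, OperationFailureException, or operr() signatures.

```java
// Hypothetical, simplified error type with a cause link.
class SketchError {
    final String code;
    final String details;
    final SketchError cause;

    SketchError(String code, String details, SketchError cause) {
        this.code = code;
        this.details = details;
        this.cause = cause;
    }
}

// Hypothetical exception that carries a structured error.
class SketchFailure extends RuntimeException {
    private final SketchError error;

    SketchFailure(SketchError error) {
        super(error.details);
        this.error = error;
    }

    SketchError getErrorCode() {
        return error;
    }
}

class CausePreservingWrapper {
    // Wraps a throwable into a new error; a structured failure keeps its original error as the cause.
    static SketchError wrap(Throwable t) {
        if (t instanceof SketchFailure) {
            SketchError original = ((SketchFailure) t).getErrorCode();
            return new SketchError("THREAD_OPERATION_ERROR", t.getMessage(), original);
        }
        // Any other throwable has no structured error to preserve.
        return new SketchError("THREAD_INTERNAL_ERROR", t.getMessage(), null);
    }
}
```

A caller that inspects the cause of the wrapped error can still see the code that the failing component reported, instead of only a flattened message.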

core/src/main/java/org/zstack/core/With.java-21-23 (1)

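21-23: Fix the formatting of the import statement.

The import statement has been merged with the comment opener on the same line, which garbles the formatting. They should be split onto separate lines.

🛠️ Suggested fix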
-import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;/**
- */
-
+import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;
+
+/**
+ */
plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java-4149-4153 (1)

4149-4153: Spelling error: faild → failed

The error message is exposed directly to users and to log searches, so the spelling should be corrected.

Suggested fix
-                    reply.setError(operr(ORG_ZSTACK_KVM_10079, "Host[%s] update spice channel config faild, because %s", msg.getHostUuid(), ret.getError()));
+                    reply.setError(operr(ORG_ZSTACK_KVM_10079, "Host[%s] update spice channel config failed, because %s", msg.getHostUuid(), ret.getError()));
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java-5057-5061 (1)

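5057-5061: Spelling error in user-visible text: depreciated → deprecated

This error message is shown to users and operators; fix the spelling and adjust the grammar slightly at the same time.

Suggested fix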
- reply.setError(operr(ORG_ZSTACK_STORAGE_CEPH_PRIMARY_10028, "backing up snapshots to backup storage is a depreciated feature, which will be removed in future version"));
+ reply.setError(operr(ORG_ZSTACK_STORAGE_CEPH_PRIMARY_10028, "backing up snapshots to backup storage is a deprecated feature, which will be removed in a future version"));
core/src/main/java/org/zstack/core/salt/SaltSetupMinionJob.java-137-144 (1)

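137-144: The target system information is lost in exception handling

The existing err variable already builds a complete error message that includes targetIp, but completion.fail() uses only e.getMessage() / ie.getMessage(), so the error loses the target system context. Consider keeping the full error message.

Suggested fix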
         } catch (SshException e) {
             String err = String.format("failed to setup minion on target system[%s], because %s", targetIp, e.getMessage());
             logger.warn(err, e);
-            completion.fail(operr(ORG_ZSTACK_CORE_SALT_10001, e.getMessage()));
+            completion.fail(operr(ORG_ZSTACK_CORE_SALT_10001, err));
         } catch (IOException ie) {
             String err = String.format("failed to setup minion on target system[%s], because %s", targetIp, ie.getMessage());
             logger.warn(err, ie);
-            completion.fail(inerr(ORG_ZSTACK_CORE_SALT_10002, ie.getMessage()));
+            completion.fail(inerr(ORG_ZSTACK_CORE_SALT_10002, err));
         } finally {
header/src/main/java/org/zstack/header/errorcode/ErrorCode.java-26-34 (1)

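26-34: The new globalErrorCode field is not handled in the copy constructor

The new globalErrorCode field is not copied in the copy constructor at lines 76-84, so an ErrorCode object created through the copy constructor may lose its global error code.

♻️ Suggested change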
 public ErrorCode(ErrorCode other) {
     this.code = other.code;
     this.description = other.description;
     this.details = other.details;
     this.elaboration = other.elaboration;
     this.messages = other.messages;
     this.cause = other.cause;
     this.location = other.location;
+    this.globalErrorCode = other.globalErrorCode;
 }

Also, if globalErrorCode is part of an error code's identity, consider including it in equals() and hashCode() as well.
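
A reduced sketch of both points, the copy constructor and the identity methods. SketchErrorCode is a hypothetical three-field class, not the real ErrorCode; whether details or globalErrorCode belong in the real class's equality is an assumption made here only for illustration.

```java
import java.util.Objects;

// Hypothetical, reduced version of an error code class; only three fields are modeled.
class SketchErrorCode {
    private final String code;
    private final String details;
    private final String globalErrorCode;

    SketchErrorCode(String code, String details, String globalErrorCode) {
        this.code = code;
        this.details = details;
        this.globalErrorCode = globalErrorCode;
    }

    // Copy constructor: every field, including the newly added one, is copied.
    SketchErrorCode(SketchErrorCode other) {
        this(other.code, other.details, other.globalErrorCode);
    }

    String getGlobalErrorCode() {
        return globalErrorCode;
    }

    // Assumes globalErrorCode is part of the identity, so two errors that differ
    // only in their global code are not equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchErrorCode)) return false;
        SketchErrorCode that = (SketchErrorCode) o;
        return Objects.equals(code, that.code)
                && Objects.equals(details, that.details)
                && Objects.equals(globalErrorCode, that.globalErrorCode);
    }

    // Uses the same fields as equals() so equal objects share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(code, details, globalErrorCode);
    }
}
```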

compute/src/main/java/org/zstack/compute/allocator/HostSortorChain.java-136-138 (1)

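136-138: e.getMessage() appears twice in the error message.

e.getMessage() is passed twice to the format string "[Host Allocation]: %s on host[uuid:%s]. try next one. %s": once for the first %s and again for the trailing %s. The exception text is therefore repeated in the error message, which hurts readability.

🔧 Suggested fix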
                            trigger.fail(operr(
-                            ORG_ZSTACK_COMPUTE_ALLOCATOR_10034,         "[Host Allocation]: %s on host[uuid:%s]. try next one. %s", e.getMessage(), host.getUuid(), e.getMessage()));
+                            ORG_ZSTACK_COMPUTE_ALLOCATOR_10034, "[Host Allocation]: %s on host[uuid:%s]. try next one", e.getMessage(), host.getUuid()));
plugin/flatNetworkProvider/src/main/java/org/zstack/network/service/flat/FlatEipBackend.java-610-612 (1)

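610-612: Fix the typo and duplicated word in the exception text ("the the", "hostUUid") so that front-end users and operators are not misled.

Suggested change (minimal diff)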
-                throw new OperationFailureException(operr(ORG_ZSTACK_NETWORK_SERVICE_FLAT_10017, "unable to apply the EIP operation for the the vm[uuid:%s, state:%s], because cannot find the VM's hostUUid",
+                throw new OperationFailureException(operr(ORG_ZSTACK_NETWORK_SERVICE_FLAT_10017, "unable to apply the EIP operation for the vm[uuid:%s, state:%s], because cannot find the VM's hostUuid",
                                 vmUuid, vm.getState()));
plugin/flatNetworkProvider/src/main/java/org/zstack/network/service/flat/FlatEipBackend.java-64-65 (1)

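64-65: All error code constants have been verified to exist and the static import is correct, so there is no compilation risk. However, the error message on lines 610-611 contains typos that need fixing:

  • "for the the vm" → "the" is duplicated; remove one
  • "hostUUid" → should be hostUuid
Location of the error message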
610:    throw new OperationFailureException(operr(ORG_ZSTACK_NETWORK_SERVICE_FLAT_10017, "unable to apply the EIP operation for the the vm[uuid:%s, state:%s], because cannot find the VM's hostUUid",
611:                            vmUuid, vm.getState()));
identity/src/main/java/org/zstack/identity/rbac/OperationTargetAPIRequestChecker.java-21-22 (1)

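21-22: The added error code is confirmed; consider improving the error semantics of checkIfTheAccountOperationItSelf

When the operationTarget parameter is empty, this method throws a "permission denied" error (code 10006), but the real cause is a missing parameter rather than a permission problem. The two uuid fields in the error message also carry the same value (both are accountUuid), which is confusing. By contrast, the other three checks (checkIfTheAccountOwnTheResource, checkIfTheAccountOwnTheTaggedResource, checkIfTheAccountCanAccessTheResource) return immediately when uuids is empty, so the behavior is inconsistent. Clarify the intent here (whether the operationTarget field is required), and add test cases that verify the handling when the operation target field of AccountVO is missing.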

network/src/main/java/org/zstack/network/l3/L3NetworkApiInterceptor.java-141-144 (1)

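141-144: The error message does not match the validation logic (copy-paste error).

The code validates msg.getEndIp(), but the error message says "start ip" and uses msg.getStartIp(). Users would receive a misleading message.

🐛 Suggested fix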
         if (!NetworkUtils.isValidIPAddress(msg.getEndIp())) {
             throw new ApiMessageInterceptionException(argerr(ORG_ZSTACK_NETWORK_L3_10002, "could not reserve ip range, " +
-                    "because start ip[%s] is not valid ip address", msg.getStartIp()));
+                    "because end ip[%s] is not valid ip address", msg.getEndIp()));
         }
plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java-2339-2343 (1)

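2339-2343: uuid;%s in the error message looks like a typo

The message currently reads the volume[uuid;%s], which was most likely meant to be uuid:%s. If this message is user-visible, consider fixing it so it does not mislead troubleshooting.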

network/src/main/java/org/zstack/network/l2/L2NoVlanNetwork.java-218-224 (1)

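218-224: String.format can throw when errCode.getDetails() contains a % character

The code passes errCode.getDetails() to Platform.err() as the format string. If it contains an unescaped % character, String.format() throws an exception. Platform.err() does have a try-catch at lines 986-993 that catches the exception and falls back to the raw string, but relying on exception handling for a foreseeable edge case is not ideal.

Suggestion: pass the error text through without formatting, or make sure any % in errCode.getDetails() is escaped before the call, rather than relying on exception handling as the primary defense.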
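
The two options in the suggestion can be illustrated with plain String.format. The FormatSafety helper and its method names are hypothetical and exist only for this sketch; it does not model Platform.err() or its fallback.

```java
import java.util.IllegalFormatException;

class FormatSafety {
    // Returns true when String.format rejects the given format string.
    static boolean formatThrows(String format, Object... args) {
        try {
            String.format(format, args);
            return false;
        } catch (IllegalFormatException e) {
            return true;
        }
    }

    // Option 1: treat the text as an argument, never as the format string.
    static String asArgument(String text) {
        return String.format("%s", text);
    }

    // Option 2: escape every % so the text becomes a safe format string.
    static String escapePercent(String text) {
        return text.replace("%", "%%");
    }
}
```

With details such as "cpu usage reached 100%", using the text directly as the format string is rejected, while both options reproduce the text unchanged.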

import org.zstack.header.errorcode.OperationFailureException;

import static org.zstack.core.Platform.operr;
import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find CloudOperationsErrorCode file and check for the error code
find . -name "CloudOperationsErrorCode.java" -type f

Repository: MatheMatrix/zstack

Length of output: 180


🏁 Script executed:

#!/bin/bash
# Search for the error code constant
rg "ORG_ZSTACK_COMPUTE_VM_10224" --type java

Repository: MatheMatrix/zstack

Length of output: 180


🏁 Script executed:

#!/bin/bash
# Examine CpuTopology.java around the changed lines (72-78)
rg -A 10 -B 5 "ORG_ZSTACK_COMPUTE_VM_10224" compute/src/main/java/org/zstack/compute/vm/CpuTopology.java

Repository: MatheMatrix/zstack

Length of output: 180


🏁 Script executed:

#!/bin/bash
# Get full context of CpuTopology.java to understand the error handling pattern
head -80 compute/src/main/java/org/zstack/compute/vm/CpuTopology.java | tail -20

Repository: MatheMatrix/zstack

Length of output: 1056


🏁 Script executed:

#!/bin/bash
# Search for CloudOperationsErrorCode and the error code definition
find . -path "*clouderrorcode*" -name "*.java" -type f 2>/dev/null | head -5

Repository: MatheMatrix/zstack

Length of output: 145


🏁 Script executed:

#!/bin/bash
# Search for ORG_ZSTACK_COMPUTE_VM_10224 definition
rg "ORG_ZSTACK_COMPUTE_VM_10224\s*=" --type java

Repository: MatheMatrix/zstack

Length of output: 235


🏁 Script executed:

#!/bin/bash
# Check for ApiMessageInterceptionException usage in compute/vm directory
rg "ApiMessageInterceptionException" compute/src/main/java/org/zstack/compute/vm/ --type java | head -10

Repository: MatheMatrix/zstack

Length of output: 1769


🏁 Script executed:

#!/bin/bash
# Check how calculateValidTopology is called
rg "calculateValidTopology" compute/src/main/java/org/zstack/compute/vm/ --type java -B 2 -A 2 | head -30

Repository: MatheMatrix/zstack

Length of output: 1461


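CPU topology parameter validation should use ApiMessageInterceptionException rather than OperationFailureException

The error code ORG_ZSTACK_COMPUTE_VM_10224 is correctly defined in CloudOperationsErrorCode, but the exception handling at lines 72-74 does not follow the validation convention of the ZStack VM subsystem. Parameter validation in other files of the same module (for example VmQuotaOperator.java) uses ApiMessageInterceptionException. Change the exception type to ApiMessageInterceptionException for consistency: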

throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_COMPUTE_VM_10224, "cpu topology is not correct, ..."));
🤖 Prompt for AI Agents
In @compute/src/main/java/org/zstack/compute/vm/CpuTopology.java at line 6,
Replace the thrown OperationFailureException in CpuTopology.java (the block
using operr(ORG_ZSTACK_COMPUTE_VM_10224, ...)) with
ApiMessageInterceptionException to follow VM parameter validation convention
(e.g., throw new
ApiMessageInterceptionException(operr(ORG_ZSTACK_COMPUTE_VM_10224, "...")));
ensure you update imports to include ApiMessageInterceptionException and remove
or keep OperationFailureException only if still needed elsewhere; keep the
existing operr(...) call and error code unchanged.

Comment on lines 113 to 117
if (type == null) {
- errs.add(Platform.operr("there is no available nicType on L3 network [%s]", nw.getUuid()));
+ errs.add(Platform.operr(ORG_ZSTACK_COMPUTE_VM_10068, "there is no available nicType on L3 network [%s]", nw.getUuid()));
wcomp.allDone();
}
VmInstanceNicFactory vnicFactory = vmMgr.getVmInstanceNicFactory(type);

⚠️ Potential issue | 🔴 Critical

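A return statement is missing after wcomp.allDone(), which will cause a NullPointerException.

When type == null, the code calls wcomp.allDone() but does not return. Execution continues to line 117, vmMgr.getVmInstanceNicFactory(type), where type is null, and a NullPointerException is thrown.

🐛 Suggested fix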
 if (type == null) {
     errs.add(Platform.operr(ORG_ZSTACK_COMPUTE_VM_10068, "there is no available nicType on L3 network [%s]", nw.getUuid()));
     wcomp.allDone();
+    return;
 }
🤖 Prompt for AI Agents
In @compute/src/main/java/org/zstack/compute/vm/VmAllocateNicFlow.java around
lines 113 - 117, When type == null the code calls errs.add(...) and
wcomp.allDone() but does not return, so execution continues and
vmMgr.getVmInstanceNicFactory(type) will receive a null and cause an NPE; fix by
returning immediately after wcomp.allDone() (i.e., add a return statement right
after the wcomp.allDone() call in the branch where type == null) so that
VmInstanceNicFactory vnicFactory = vmMgr.getVmInstanceNicFactory(type) is not
executed with a null type.

Comment on lines 1717 to 1721
for (Tuple tuple: tuples) {
if (tuple.get(2, Long.class) > 1) {
- throw new ApiMessageInterceptionException(operr("unable to enable this function. There are multi nics of L3 network[uuid:%s] in the vm[uuid: %s]",
+ throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_COMPUTE_VM_10243, "unable to enable this function. There are multi nics of L3 network[uuid:%s] in the vm[uuid: %s]",
tuple.get(0, String.class), tuple.get(1, String.class)));
}

⚠️ Potential issue | 🟠 Major

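The error message arguments appear to be swapped: the L3 uuid and VM uuid are in each other's positions, which would mislead troubleshooting.

Suggested change (minimal diff)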
-throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_COMPUTE_VM_10243, "unable to enable this function. There are multi nics of L3 network[uuid:%s] in the vm[uuid: %s]",
-            tuple.get(0, String.class), tuple.get(1, String.class)));
+throw new ApiMessageInterceptionException(operr(ORG_ZSTACK_COMPUTE_VM_10243, "unable to enable this function. There are multi nics of L3 network[uuid:%s] in the vm[uuid: %s]",
+            tuple.get(1, String.class), tuple.get(0, String.class)));
🤖 Prompt for AI Agents
In @compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java
around lines 1717 - 1721, The error message parameters are swapped: the operr
call in the loop over tuples uses tuple.get(0) and tuple.get(1) but the message
expects L3 network uuid then VM uuid; update the parameter order so the first
placeholder receives the L3 uuid and the second receives the VM uuid (i.e., swap
tuple.get(0, String.class) and tuple.get(1, String.class) in the
ApiMessageInterceptionException/operr call within the for (Tuple tuple :
tuples) block), leaving ORG_ZSTACK_COMPUTE_VM_10243 and the exception type
unchanged.

Comment on lines 99 to 107
if (!service.getExternalServiceCapabilities().isReloadConfig()) {
- event.setError(operr("service[%s] does not support reload config", msg.getName()));
+ event.setError(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10002, "service[%s] does not support reload config", msg.getName()));
}

if (service.isAlive()) {
service.reload();
} else {
- event.setError(operr("service[%s] is not running", msg.getName()));
+ event.setError(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10003, "service[%s] is not running", msg.getName()));
}

⚠️ Potential issue | 🟠 Major

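A missing return statement causes incorrect control flow

After the error is set on line 100, the code does not return early, which leads to the following problems:

  1. If service.isAlive() returns true, service.reload() is still called even though the service does not support reload config
  2. If service.isAlive() returns false, the error is overwritten by ORG_ZSTACK_CORE_EXTERNALSERVICE_10003

Following the "Early Return" principle in the coding guidelines, the code should return immediately after setting the error.

🐛 Suggested fix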
 if (!service.getExternalServiceCapabilities().isReloadConfig()) {
     event.setError(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10002, "service[%s] does not support reload config", msg.getName()));
+    bus.publish(event);
+    return;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (!service.getExternalServiceCapabilities().isReloadConfig()) {
event.setError(operr("service[%s] does not support reload config", msg.getName()));
event.setError(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10002, "service[%s] does not support reload config", msg.getName()));
}
if (service.isAlive()) {
service.reload();
} else {
event.setError(operr("service[%s] is not running", msg.getName()));
event.setError(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10003, "service[%s] is not running", msg.getName()));
}
if (!service.getExternalServiceCapabilities().isReloadConfig()) {
event.setError(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10002, "service[%s] does not support reload config", msg.getName()));
bus.publish(event);
return;
}
if (service.isAlive()) {
service.reload();
} else {
event.setError(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10003, "service[%s] is not running", msg.getName()));
}
🤖 Prompt for AI Agents
In
@core/src/main/java/org/zstack/core/externalservice/ExternalServiceManagerImpl.java
around lines 99 - 107, After detecting that
service.getExternalServiceCapabilities().isReloadConfig() is false and setting
event.setError(...), return immediately to stop further processing so
service.reload() will not be called and the error will not be overwritten;
modify the control flow in ExternalServiceManagerImpl (around the check using
service, event, and msg) to perform an early return right after setting the
error.
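
Both failure modes described in this comment can be reproduced in a reduced model. ReloadOutcome and ReloadSketch are hypothetical names; the booleans stand in for isReloadConfig() and service.isAlive(), and the strings stand in for the two error codes.

```java
// Hypothetical result holder: the last error set and whether reload ran.
class ReloadOutcome {
    String error;
    boolean reloadCalled;
}

class ReloadSketch {
    // Mirrors the reviewed control flow: no return after the first error.
    static ReloadOutcome withoutEarlyReturn(boolean supportsReload, boolean alive) {
        ReloadOutcome out = new ReloadOutcome();
        if (!supportsReload) {
            out.error = "does not support reload config";
        }
        if (alive) {
            out.reloadCalled = true;          // runs even when reload is unsupported
        } else {
            out.error = "is not running";     // overwrites the earlier error
        }
        return out;
    }

    // Returns as soon as the first error is recorded.
    static ReloadOutcome withEarlyReturn(boolean supportsReload, boolean alive) {
        ReloadOutcome out = new ReloadOutcome();
        if (!supportsReload) {
            out.error = "does not support reload config";
            return out;
        }
        if (alive) {
            out.reloadCalled = true;
        } else {
            out.error = "is not running";
        }
        return out;
    }
}
```

In the real handler the early-return branch would also need to publish the event before returning, as the suggested fix above does with bus.publish(event).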

Comment on lines +35 to +38
import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;/**
* Created by miao on 16-10-9.
*/


⚠️ Potential issue | 🔴 Critical

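The import statement is malformed and will cause a compilation failure.

The static import on line 35 runs directly into the Javadoc comment block with no line break between them. This will cause a compilation error.

🐛 Suggested fix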
-import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;/**
+import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;
+
+/**
  * Created by miao on 16-10-9.
  */
-
 @Configurable(preConstruction = true, autowire = Autowire.BY_TYPE)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;/**
* Created by miao on 16-10-9.
*/
import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;
/**
* Created by miao on 16-10-9.
*/
@Configurable(preConstruction = true, autowire = Autowire.BY_TYPE)
🤖 Prompt for AI Agents
In @identity/src/main/java/org/zstack/identity/QuotaUtil.java around lines 35 -
38, The static import statement "import static
org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;" is concatenated
with the following Javadoc/comment, causing a syntax/compile error; separate
them by inserting a newline between the import and the comment so the import
stands alone (locate the import in QuotaUtil.java and ensure there's a blank
line before the "/** Created by miao..." comment).

Comment on lines +32 to +34
import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;/**
*/


⚠️ Potential issue | 🔴 Critical

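Syntax error: the import statement and the comment opener were merged

/** was appended to the end of the import statement, which will cause a compilation failure. Move the Javadoc opener to its own line.

🐛 Suggested fix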
-import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;/**
- */
+import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;
+
+/**
+ */
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;/**
*/
import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;
/**
*/
🤖 Prompt for AI Agents
In @network/src/main/java/org/zstack/network/l3/L3NetworkCascadeExtension.java
around lines 32 - 34, The import line "import static
org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*" is incorrectly fused
with a javadoc start token "/**"; separate them by moving the "/**" onto its own
line (or remove it if no Javadoc follows) so the import is a standalone
statement and the javadoc starts properly; update the file to ensure the import
statement ends with a semicolon and the "/**" is on the next line before the
intended comment block.

Comment on lines 127 to 131
if(!(LdapConstant.OpenLdap.TYPE.equals(type) || LdapConstant.WindowsAD.TYPE.equals(type))){
throw new ApiMessageInterceptionException(
- argerr("Wrong LdapServerType[%s], valid values: [%,%s]", type, LdapConstant.OpenLdap.TYPE, LdapConstant.WindowsAD.TYPE)
+ argerr(ORG_ZSTACK_LDAP_10010, "Wrong LdapServerType[%s], valid values: [%,%s]", type, LdapConstant.OpenLdap.TYPE, LdapConstant.WindowsAD.TYPE)
);
}

⚠️ Potential issue | 🔴 Critical

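The format string has a syntax error that will cause a runtime exception or incorrect output.

[%,%s] is missing the conversion character s and should be [%s,%s]. The call passes 3 format arguments (type, LdapConstant.OpenLdap.TYPE, LdapConstant.WindowsAD.TYPE), but the format string contains only 2 valid %s placeholders.

🐛 Fix the format string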
         if(!(LdapConstant.OpenLdap.TYPE.equals(type) || LdapConstant.WindowsAD.TYPE.equals(type))){
             throw new ApiMessageInterceptionException(
-                    argerr(ORG_ZSTACK_LDAP_10010, "Wrong LdapServerType[%s], valid values: [%,%s]", type, LdapConstant.OpenLdap.TYPE, LdapConstant.WindowsAD.TYPE)
+                    argerr(ORG_ZSTACK_LDAP_10010, "Wrong LdapServerType[%s], valid values: [%s,%s]", type, LdapConstant.OpenLdap.TYPE, LdapConstant.WindowsAD.TYPE)
             );
         }
🤖 Prompt for AI Agents
In @plugin/ldap/src/main/java/org/zstack/ldap/LdapApiInterceptor.java around
lines 127 - 131, In LdapApiInterceptor the format string passed to argerr is
malformed ("[%,%s]") causing runtime formatting errors; update the format string
to include two %s placeholders so it reads something like "Wrong
LdapServerType[%s], valid values: [%s,%s]" and keep the existing argument order
(type, LdapConstant.OpenLdap.TYPE, LdapConstant.WindowsAD.TYPE) when calling
argerr with ORG_ZSTACK_LDAP_10010 to ensure correct message formatting.
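
How java.util.Formatter treats the two strings can be checked in isolation. LdapTypeMessageSketch is a hypothetical helper, and "OpenLdap" and "WindowsAD" in the usage are placeholder values, not necessarily the real LdapConstant values. The sketch calls String.format directly and does not model any fallback inside argerr().

```java
import java.util.IllegalFormatException;

class LdapTypeMessageSketch {
    static final String BROKEN = "Wrong LdapServerType[%s], valid values: [%,%s]";
    static final String FIXED = "Wrong LdapServerType[%s], valid values: [%s,%s]";

    // Returns true when String.format rejects the format string.
    // In "%,%" the Formatter reads ',' as a flag on a '%' conversion, which it does not allow.
    static boolean isRejected(String format, Object... args) {
        try {
            String.format(format, args);
            return false;
        } catch (IllegalFormatException e) {
            return true;
        }
    }

    // Renders the corrected message with one placeholder per argument.
    static String render(String type, String first, String second) {
        return String.format(FIXED, type, first, second);
    }
}
```

For example, render("x", "OpenLdap", "WindowsAD") yields "Wrong LdapServerType[x], valid values: [OpenLdap,WindowsAD]", while the broken format is rejected before any text is produced.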

Comment on lines 766 to 781
VipVO vipVO = Q.New(VipVO.class).eq(VipVO_.uuid, msg.getVipUuid()).find();
if (StringUtils.isEmpty(vipVO.getIp())) {
- throw new OperationFailureException(operr("fail to attach vip to lb , because vip[%s] has no ip", vipVO.getUuid()));
+ throw new OperationFailureException(operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10017, "fail to attach vip to lb , because vip[%s] has no ip", vipVO.getUuid()));
}

if (NetworkUtils.isIpv4Address(vipVO.getIp())) {
if (!StringUtils.isEmpty(self.getVipUuid())) {
- throw new OperationFailureException(operr("fail to attach ipv4 vip to lb , because lb[%s] has ipv4 vip[%s]", self.getUuid(), self.getVipUuid()));
+ throw new OperationFailureException(operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10018, "fail to attach ipv4 vip to lb , because lb[%s] has ipv4 vip[%s]", self.getUuid(), self.getVipUuid()));
}
self.setVipUuid(vipVO.getUuid());
} else {
if (!StringUtils.isEmpty(self.getIpv6VipUuid())) {
- throw new OperationFailureException(operr("fail to attach ipv6 vip to lb , because lb[%s] has ipv6 vip[%s]", self.getUuid(), self.getVipUuid()));
+ throw new OperationFailureException(operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10019, "fail to attach ipv6 vip to lb , because lb[%s] has ipv6 vip[%s]", self.getUuid(), self.getVipUuid()));
}
self.setIpv6VipUuid(vipVO.getUuid());
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Locate the "fail to attach ipv6 vip" error message
rg -n "fail to attach ipv6 vip" plugin/loadBalancer/ --type java -A 2 -B 2

Repository: MatheMatrix/zstack

Length of output: 894


🏁 Script executed:

# Read the relevant lines of LoadBalancerBase.java
cat -n plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerBase.java | sed -n '770,782p'

Repository: MatheMatrix/zstack

Length of output: 1025


🏁 Script executed:

# Show the changes to LoadBalancerBase.java in this PR
cd plugin/loadBalancer && git diff HEAD plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerBase.java | head -100

Repository: MatheMatrix/zstack

Length of output: 327


🏁 Script executed:

# Check the file's changes in the PR using the correct path syntax
git diff HEAD -- plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerBase.java | grep -A 20 -B 5 "fail to attach ipv6"

Repository: MatheMatrix/zstack

Length of output: 44


🏁 Script executed:

# Find where error codes 10017-10019 are defined to determine whether this PR introduced them
rg -n "ORG_ZSTACK_NETWORK_SERVICE_LB_10017|ORG_ZSTACK_NETWORK_SERVICE_LB_10018|ORG_ZSTACK_NETWORK_SERVICE_LB_10019" --type java

Repository: MatheMatrix/zstack

Length of output: 1507


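The error message on line 778 uses the wrong getter and needs to be fixed.

The IPv6 VIP attach-failure branch (line 778, in the else block) checks self.getIpv6VipUuid(), but the error message argument still passes self.getVipUuid(), so the message shows the wrong VIP UUID. Change it to self.getIpv6VipUuid(), consistent with the pattern of the IPv4 branch (line 773).

Relevant code
Line 773 (IPv4 - correct):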
if (!StringUtils.isEmpty(self.getVipUuid())) {
    throw new OperationFailureException(operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10018, "fail to attach ipv4 vip to lb , because lb[%s] has ipv4 vip[%s]", self.getUuid(), self.getVipUuid()));
}

Line 777-778 (IPv6 - needs fixing):
if (!StringUtils.isEmpty(self.getIpv6VipUuid())) {
    throw new OperationFailureException(operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10019, "fail to attach ipv6 vip to lb , because lb[%s] has ipv6 vip[%s]", self.getUuid(), self.getVipUuid()));
                                                                                                                                                                      ↑ should be self.getIpv6VipUuid()
}
🤖 Prompt for AI Agents
In
@plugin/loadBalancer/src/main/java/org/zstack/network/service/lb/LoadBalancerBase.java
around lines 766 - 781, The IPv6 error message in LoadBalancerBase's else branch
uses the wrong getter; when checking self.getIpv6VipUuid() and throwing
OperationFailureException (operr(ORG_ZSTACK_NETWORK_SERVICE_LB_10019,...)),
replace the second format argument self.getVipUuid() with self.getIpv6VipUuid()
so the exception reports the correct IPv6 VIP UUID.
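
The getter mismatch above can be illustrated in isolation. The following is a minimal sketch, not the actual LoadBalancerBase code: `LbVipMessageSketch`, `Lb`, and `ipv6AttachConflict` are hypothetical names, and plain fields stand in for the real getters. It only shows the intended invariant, namely that the value reported in the message is the same value the guard checked.

```java
// Hypothetical stand-in for the load balancer entity; names are assumptions for illustration.
final class LbVipMessageSketch {
    static final class Lb {
        final String uuid;
        final String vipUuid;      // IPv4 VIP UUID
        final String ipv6VipUuid;  // IPv6 VIP UUID

        Lb(String uuid, String vipUuid, String ipv6VipUuid) {
            this.uuid = uuid;
            this.vipUuid = vipUuid;
            this.ipv6VipUuid = ipv6VipUuid;
        }
    }

    // Returns the conflict message for an IPv6 attach, or null when no IPv6 VIP is bound yet.
    static String ipv6AttachConflict(Lb lb) {
        if (lb.ipv6VipUuid == null || lb.ipv6VipUuid.isEmpty()) {
            return null;
        }
        // Report the IPv6 UUID that the guard just checked, not the IPv4 one.
        return String.format("fail to attach ipv6 vip to lb, because lb[%s] has ipv6 vip[%s]",
                lb.uuid, lb.ipv6VipUuid);
    }

    public static void main(String[] args) {
        Lb lb = new Lb("lb-1", "vip-v4", "vip-v6");
        System.out.println(ipv6AttachConflict(lb));
    }
}
```

Keeping the checked value and the reported value on the same field makes this kind of copy-paste slip between the IPv4 and IPv6 branches easier to spot.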

Comment on lines 2191 to 2200
ReinitImageCmd cmd = new ReinitImageCmd();
if (msg.getVolume().getRootImageUuid() == null) {
completion.fail(operr("root image has been deleted, cannot reimage now"));
completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10074, "root image has been deleted, cannot reimage now"));
return;
}

if (!dbf.isExist(msg.getVolume().getRootImageUuid(), ImageVO.class)) {
completion.fail(operr("root image has been deleted, cannot reimage now"));
completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10075, "root image has been deleted, cannot reimage now"));
return;
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file and get the full method context
cd plugin/localstorage && head -c 500000 src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java | tail -c 100000 | grep -n "2191\|2200" | head -20

# Get a broader view of the code structure around those lines
wc -l src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java

Repository: MatheMatrix/zstack

Length of output: 141


🏁 Script executed:

# Let's read the actual file around those lines to understand the context
sed -n '2150,2250p' plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java | cat -n

Repository: MatheMatrix/zstack

Length of output: 5759


🏁 Script executed:

# Search for NoRollbackFlow patterns in the codebase to understand the expected usage
rg "NoRollbackFlow" -A 10 -B 2 | head -100

Repository: MatheMatrix/zstack

Length of output: 12031


🏁 Script executed:

# Look for completion.fail vs trigger.fail patterns in this file
rg "completion\.fail|trigger\.fail" plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java -B 3 -A 1 | head -150

Repository: MatheMatrix/zstack

Length of output: 5575


🏁 Script executed:

# Understand the SimpleFlowChain implementation
find . -type f -name "*.java" -path "*SimpleFlowChain*" | head -5

Repository: MatheMatrix/zstack

Length of output: 128


🏁 Script executed:

# Search for SimpleFlowChain definition and NoRollbackFlow
rg "class SimpleFlowChain|class NoRollbackFlow" -A 20 | head -200

Repository: MatheMatrix/zstack

Length of output: 2848


🏁 Script executed:

# Let's examine the SimpleFlowChain implementation to understand error handling
cat core/src/main/java/org/zstack/core/workflow/SimpleFlowChain.java | head -300

Repository: MatheMatrix/zstack

Length of output: 8954


🏁 Script executed:

# Look at how SimpleFlowChain.fail() method is implemented
rg "public.*fail|private.*fail" core/src/main/java/org/zstack/core/workflow/SimpleFlowChain.java -A 5 | head -80

Repository: MatheMatrix/zstack

Length of output: 234


🏁 Script executed:

# Check the actual structure around error handling in SimpleFlowChain
sed -n '1,100p' core/src/main/java/org/zstack/core/workflow/SimpleFlowChain.java | cat -n

Repository: MatheMatrix/zstack

Length of output: 4856


🏁 Script executed:

# Check if the current version of the file still has this issue
sed -n '2180,2210p' plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java | cat -n

Repository: MatheMatrix/zstack

Length of output: 1987


🏁 Script executed:

# Look for similar patterns in the same file where completion.fail is called within NoRollbackFlow
rg -B 8 "new NoRollbackFlow" plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java | head -200

Repository: MatheMatrix/zstack

Length of output: 5328


🏁 Script executed:

# Check if there are other flows in the same file that correctly use trigger.fail()
rg "trigger\.fail|trigger\.next" plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java -B 3 | head -100

Repository: MatheMatrix/zstack

Length of output: 4977


🏁 Script executed:

# Double-check the exact line numbers and confirm the issue persists
sed -n '2191,2200p' plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java

Repository: MatheMatrix/zstack

Length of output: 636


🏁 Script executed:

# Verify if there are done/error handlers that expect trigger notification
sed -n '2200,2210p' plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java

Repository: MatheMatrix/zstack

Length of output: 650


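Inside a NoRollbackFlow, use trigger.fail() rather than completion.fail(); otherwise the flow chain's error handler is never triggered

In a SimpleFlowChain, when NoRollbackFlow.run() hits an error and calls completion.fail(...) directly and then returns, the chain cannot propagate its state. Because neither trigger.fail() nor trigger.next() is called, the chain's .error() and .done() handlers never fire, which can leave the chain hanging, resources unreleased, and threads occupied.

Both error checks, at lines 2193-2195 and lines 2198-2200, have this problem and should be changed to: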

                if (msg.getVolume().getRootImageUuid() == null) {
-                    completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10074, "root image has been deleted, cannot reimage now"));
+                    trigger.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10074, "root image has been deleted, cannot reimage now"));
                    return;
                }

                if (!dbf.isExist(msg.getVolume().getRootImageUuid(), ImageVO.class)) {
-                    completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10075, "root image has been deleted, cannot reimage now"));
+                    trigger.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10075, "root image has been deleted, cannot reimage now"));
                    return;
                }
🤖 Prompt for AI Agents
In
@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java
around lines 2191 - 2200, The checks inside the NoRollbackFlow run() use
completion.fail(...) which prevents SimpleFlowChain from invoking its
.error()/.done() handlers; replace those completion.fail(...) calls with
trigger.fail(operr(...)) in both places (the null root image UUID check using
msg.getVolume().getRootImageUuid() and the existence check using
dbf.isExist(..., ImageVO.class)), and keep the subsequent return so the flow
exits after calling trigger.fail().
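
The difference between the two calls can be seen with a toy chain. This is a minimal sketch under stated assumptions, not the real SimpleFlowChain: `FlowTriggerSketch`, `MiniChain`, `Trigger`, `Flow`, and `runChain` are all hypothetical and only mimic the trigger/handler shape described in the comment above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model of a flow chain; every name here is hypothetical.
final class FlowTriggerSketch {
    interface Trigger { void next(); void fail(String error); }
    interface Flow { void run(Trigger trigger); }

    static final class MiniChain {
        private final List<Flow> flows = new ArrayList<>();
        private Consumer<String> errorHandler = e -> { };
        private Runnable doneHandler = () -> { };
        private int index = 0;

        MiniChain then(Flow flow) { flows.add(flow); return this; }
        MiniChain error(Consumer<String> handler) { errorHandler = handler; return this; }
        MiniChain done(Runnable handler) { doneHandler = handler; return this; }

        void start() { runNext(); }

        // The chain only advances or terminates when a flow calls back through the trigger.
        private void runNext() {
            if (index >= flows.size()) { doneHandler.run(); return; }
            Flow flow = flows.get(index++);
            flow.run(new Trigger() {
                @Override public void next() { runNext(); }
                @Override public void fail(String error) { errorHandler.accept(error); }
            });
        }
    }

    // Runs a one-flow chain and returns the events that were observed.
    static List<String> runChain(boolean useTrigger) {
        List<String> events = new ArrayList<>();
        new MiniChain()
                .then(trigger -> {
                    if (useTrigger) {
                        trigger.fail("root image has been deleted");
                    } else {
                        // Bypasses the chain: neither handler below will ever run.
                        events.add("outer completion failed directly");
                    }
                })
                .error(e -> events.add("chain error: " + e))
                .done(() -> events.add("chain done"))
                .start();
        return events;
    }
}
```

In this toy model, `runChain(true)` records a chain-level error event, while `runChain(false)` records only the direct failure and the chain itself never reaches either handler, which is the behavior the comment warns about.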

Comment on lines 3798 to 3802
@Override
public void fail(ErrorCode errorCode) {
completion.fail(operr("cannot find flag file [%s] on host [%s], because: %s", makeInitializedFilePath(), hostUuid, errorCode.getCause().getDetails()));
completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10081, "cannot find flag file [%s] on host [%s], because: %s", makeInitializedFilePath(), hostUuid, errorCode.getCause().getDetails()));
}
});

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find the ErrorCode class to understand getCause() behavior
find . -name "ErrorCode.java" -type f | head -20

Repository: MatheMatrix/zstack

Length of output: 177


🏁 Script executed:

# Look at the specific file and lines mentioned in the review
fd -t f "LocalStorageKvmBackend.java" | head -5

Repository: MatheMatrix/zstack

Length of output: 156


🏁 Script executed:

# Search for ErrorCode getCause patterns in the codebase
rg -n 'getCause\(\)' --type java -A 2 -B 2 | head -100

Repository: MatheMatrix/zstack

Length of output: 10267


🏁 Script executed:

# Check the ErrorCode class definition
cat -n ./header/src/main/java/org/zstack/header/errorcode/ErrorCode.java | head -100

Repository: MatheMatrix/zstack

Length of output: 3314


🏁 Script executed:

# Get the specific lines from LocalStorageKvmBackend.java
sed -n '3790,3830p' ./plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java

Repository: MatheMatrix/zstack

Length of output: 2066


🏁 Script executed:

# Check the getCause() method in ErrorCode
sed -n '100,200p' ./header/src/main/java/org/zstack/header/errorcode/ErrorCode.java

Repository: MatheMatrix/zstack

Length of output: 2747


🏁 Script executed:

# Check for more of the ErrorCode class to find getCause() and getDetails()
grep -n "public.*getCause\|public.*getDetails" ./header/src/main/java/org/zstack/header/errorcode/ErrorCode.java

Repository: MatheMatrix/zstack

Length of output: 136


🏁 Script executed:

# Search for all errorCode.getCause().getDetails() patterns (as suggested in the review)
rg -n 'getCause\(\)\.getDetails\(\)' . --type java

Repository: MatheMatrix/zstack

Length of output: 1167


🏁 Script executed:

# Also search for similar chained patterns on errorCode.getCause()
rg -n 'errorCode\.getCause\(\)\.' . --type java

Repository: MatheMatrix/zstack

Length of output: 1167


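errorCode.getCause() may be null, so calling getDetails() on it directly risks an NPE

errorCode.getCause().getDetails() carries a clear null-pointer risk. The cause field in ErrorCode is not initialized, so getCause() may return null. Calling getDetails() on it in that case throws an NPE, which masks the real error behind a NullPointerException and makes the problem harder to diagnose.

Consider adding a null guard that falls back to errorCode.getDetails():

Proposed fix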
         public void fail(ErrorCode errorCode) {
-            completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10081, "cannot find flag file [%s] on host [%s], because: %s", makeInitializedFilePath(), hostUuid, errorCode.getCause().getDetails()));
+            String details = errorCode.getCause() != null ? errorCode.getCause().getDetails() : errorCode.getDetails();
+            completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10081,
+                    "cannot find flag file [%s] on host [%s], because: %s",
+                    makeInitializedFilePath(), hostUuid, details));
         }
         public void fail(ErrorCode errorCode) {
-            completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10082, "cannot create flag file [%s] on host [%s], because: %s", makeInitializedFilePath(), hostUuid, errorCode.getCause().getDetails()));
+            String details = errorCode.getCause() != null ? errorCode.getCause().getDetails() : errorCode.getDetails();
+            completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10082,
+                    "cannot create flag file [%s] on host [%s], because: %s",
+                    makeInitializedFilePath(), hostUuid, details));
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@Override
public void fail(ErrorCode errorCode) {
completion.fail(operr("cannot find flag file [%s] on host [%s], because: %s", makeInitializedFilePath(), hostUuid, errorCode.getCause().getDetails()));
completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10081, "cannot find flag file [%s] on host [%s], because: %s", makeInitializedFilePath(), hostUuid, errorCode.getCause().getDetails()));
}
});
@Override
public void fail(ErrorCode errorCode) {
String details = errorCode.getCause() != null ? errorCode.getCause().getDetails() : errorCode.getDetails();
completion.fail(operr(ORG_ZSTACK_STORAGE_PRIMARY_LOCAL_10081, "cannot find flag file [%s] on host [%s], because: %s", makeInitializedFilePath(), hostUuid, details));
}
🤖 Prompt for AI Agents
In
@plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageKvmBackend.java
around lines 3798 - 3802, In the anonymous fail(ErrorCode errorCode) handler
inside LocalStorageKvmBackend (the completion.fail(operr(...)) call), guard
against a null cause by using errorCode.getCause() != null ?
errorCode.getCause().getDetails() : errorCode.getDetails() (or similar) when
building the error message; update the operr call to include the fallback
details so you never call getDetails() on a null cause and still surface the
original error information.
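
If the same guard is needed in several fail handlers, it could be pulled into a small helper. The following is a minimal sketch, assuming a simplified error type: `ErrorDetailsSketch`, `Err`, and `detailsOf` are hypothetical names and are not part of ZStack's ErrorCode API.

```java
// Hypothetical, simplified error type used only to illustrate the null-safe fallback.
final class ErrorDetailsSketch {
    static final class Err {
        final String details;
        final Err cause;  // may be null when the error has no underlying cause

        Err(String details, Err cause) {
            this.details = details;
            this.cause = cause;
        }
    }

    // Prefers the cause's details and falls back to the error's own details when there is no cause.
    static String detailsOf(Err error) {
        return error.cause != null ? error.cause.details : error.details;
    }

    public static void main(String[] args) {
        System.out.println(detailsOf(new Err("outer failure", null)));
        System.out.println(detailsOf(new Err("outer failure", new Err("inner failure", null))));
    }
}
```

A helper like this keeps the fallback rule in one place, so the two handlers flagged above (error codes 10081 and 10082) would not each repeat the ternary.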
