-
Notifications
You must be signed in to change notification settings - Fork 0
<fix>[xinfini]: change VolumeSpec fields from int to long #3159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 4.4.52-sm
Are you sure you want to change the base?
Conversation
fix apply userdata when vm attach none default nic Resolves: ZSTAC-75776 Change-Id: I6868776c9fbf656008d74d039ea7eb3979dde737
Change-Id: I7a7670696e696f706e697864656f77636f746270
<fix>[compute]: fix nicManager setNicDriverType See merge request zstackio/zstack!8337
<fix>[root]: bump version to 5.4.0 See merge request zstackio/zstack!8359
Change-Id: I776370656a6b707279747a666b7362767a6d6a67
Resolves: ZSTAC-74198 Change-Id: I637a74198f696e77766377626f78647269707470 <fix>[sdk]: support pushing user data to Keycloak Resolves: ZSTAC-74189 Change-Id: I7375741896676741897574646678757a7a7a7578 <fix>[iam2]: multe region adapter 1、APICreateAccountMsg support set `type` 2、APICreateIAM2VirtualIDMsg support set `type` 3、APICreateIAM2VirtualIDMsg.attributes support set `uuid` 4、APICreateIAM2OrganizationMsg.attributes support set `uuid` 5、APICreateIAM2VirtualIDGroupMsg.attributes support set `uuid` 6、APICreateIAM2ProjectMsg.attributes support set `uuid` Resolves: ZSTAC-74189 Change-Id: I62726d7a736b67706b6176726d75636568677068 <fix>[sdk]: sdk add morph flag Resolves: ZSTAC-74189 Change-Id: I656b6d6c6c72687a6669616864736361637a7475 <fix>[sdk]: sdk add morph flag 2 Resolves: ZSTAC-74189 Change-Id: I717a756d6c6f75726d74716d6c707063666d786e <fix>[iam2]: CreateIAM2ProjectFromTemplate add params `linkAccountUuid` Resolves: ZSTAC-74189 Change-Id: I6370787a6865776c616e7665796364666e716b71 <fix>[iam2]: CreateIAM2Project add params `linkAccountUuid` Resolves: ZSTAC-74189 Change-Id: I6a717564676d63647379797177746a796f646e61 <fix>[iam2]: SetIAM2ProjectLoginExpired add params `loginExpiredAttributeUuid` Resolves: ZSTAC-74189 Change-Id: I756a6967636463777775617a636c79617a676774
Resolves: ZSTAC-77221 Change-Id: I687377786d6f7a73636e61687a776a6b686b6d7a
<fix>[compute]: allow vm state change from unknown to nostate See merge request zstackio/zstack!8349
<fix>[ceph]: correct backup param See merge request zstackio/zstack!8363
<fix>[vm]: add default vm gpu alarm and event See merge request zstackio/zstack!8305
Resolves: ZSTAC-75806 Change-Id: I616963747a756966766c6d6f7a6375796c626875 Signed-off-by: zhangjianjun <jianjun.zhang@zstack.io>
<fix>[sso]: migrate sso config storage from SystemTagVO to AttributesVO See merge request zstackio/zstack!8346
Resolves: ZSTAC-75666 Change-Id: I6d766a70776676646862777475656e6f6162656f Signed-off-by: zhangjianjun <jianjun.zhang@zstack.io>
Resolves: ZSTAC-77707 Change-Id: I71637966757761786c6b6467636c626171726f6a
<fix>[VirtualRouter]: clean pf rule on backup VirtualRouterVmVO See merge request zstackio/zstack!8370
Resolves: ZSTAC-64577 Change-Id: I6676737167747a68767479776572626872686e6f
<fix>[dependencies]: update quartz to version 2.4.0 See merge request zstackio/zstack!8351
Resolves: ZSTAC-75481 Change-Id: I71727a7163766c72687677667a6d6b75726b6e76 Signed-off-by: zhangjianjun <jianjun.zhang@zstack.io>
Resolves: ZSTAC-77662 Change-Id: I6868727a64716d6661777064646f627779727764
<fix>[ovn]: config ovn ovs by ResourceConfig See merge request zstackio/zstack!8389
<fix>[nic]: Cleaning dirty Data See merge request zstackio/zstack!8391
<fix>[xinfini]: support set volume qos See merge request zstackio/zstack!8374
- sso login support saml 2.0 DBImpact Resolves: ZSTAC-76199 Change-Id: I656d6e726463787970636666756b746263766a79
Resolves: ZSV-9789 Change-Id: I74656b756a74716e6a697a6875766c766e73766e
Add new GuestOS in guestOsCategory and guestOsCharacter Resolves: ZSTAC-77340 Change-Id: I6868776c69929b27a5b34e1bb544e419e4724e10
<fix>[securityGroup]: strip space tab newLine in rule See merge request zstackio/zstack!8368
<improve>[conf]: Add new GuestOS See merge request zstackio/zstack!8402
Resolves: ZSTAC-78037 Change-Id: I71696c6579646b6f6774717a766c6d6365656c6e
<fix>[storage]: extrenal ps save imageCache size with actualsize See merge request zstackio/zstack!8398
<feature>[sdk]: support saml2.0 See merge request zstackio/zstack!8392
Resolves: ZSTAC-80119 Change-Id: I73726b787377706261676f6874726f6e61696f62
<fix>[sdk]: Update sdk See merge request zstackio/zstack!8779
Resolves: ZSTAC-79936 Change-Id: I65657166766872677463796e796d6f6f78746964
DBImpact Resolves: ZSTAC-79981 Change-Id: I706a65776377706f73766f62786e667472677a6c
Resolves: ZSTAC-79981 Change-Id: I766568646477627a766c6570617871786a776675
<fix>[conf]: Add isolated field to gpu and gpu spec See merge request zstackio/zstack!8796
Resolves: ZSTAC-80203 Change-Id: I73736d736c7775677273766572787462756d7a71
DBImpact Resolves: ZSTAC-80334 Change-Id: I6d7470667a6b666a617575707474626a626a7a68
<fix>[sdk]: Fix sdk and upgrade gpuVendor field See merge request zstackio/zstack!8793
<fix>[conf]: Fix model services whose cpu arch missed in db See merge request zstackio/zstack!8806
Resolves: ZSTAC-78057 Change-Id: I67617a667567756a7578706a6c78646e6267756d
Resolves: ZSTAC-80381 Change-Id: I7673656d7769756d706b736467797868666f7a69
<fix>[sdk]: Update sdk and add new fields See merge request zstackio/zstack!8800
DBImpact Resolves: ZSTAC-80203 Change-Id: I6473736b7a716b77796b6b796467716e7062627a
<fix>[conf]: Fix missing records in ModelServiceGpuVendorVO See merge request zstackio/zstack!8818
1. add snapshot placement capability to external storage 2. ext ps controller cannot use extension point, move it to exp ps. 3. retain the ask meaning of AskSnapshotCapabilityMsg. Resolves: ZSTAC-79582 ZSTAC-79481 Change-Id: I7a7074707079616b696f7375757577666c79756b
<fix>[storage]: use specify msg get volume path See merge request zstackio/zstack!8810
<fix>[compute]: Export public method for static ip case See merge request zstackio/zstack!8824
DBImpact Resolves: ZSTAC-79761 Change-Id: I7662666569656f6c736d736667747679726a666c
<feature>[conf]: Upgrade GpuDeviceVO add gpuStatus See merge request zstackio/zstack!8701
- add column supportMetrics for ModelServiceInstanceGroupVO DBImpact Resolves: ZSTAC-80375 Change-Id: I6862697569686b7a6b6e787673696e71736f6169
<fix>[conf]: update ModelServiceInstanceGroupVO schema See merge request zstackio/zstack!8827
Fix status from lowercase to uppercase, to avoid Enum.valueOf failure DBImpact Resolves: ZSTAC-80451 Change-Id: I67646d73796f7076626d6a616664746c76757979
Resolves: ZSTAC-80172 Change-Id: I7774626b796a6b626a626b78656d6f69656a7274
<fix>[conf]: Fix gpu work status upgrade sql See merge request zstackio/zstack!8833
<fix>[sdk]: Update sdk See merge request zstackio/zstack!8835
<fix>[ceph]: change default password See merge request zstackio/zstack!8837
APIImpact Resolves: ZSTAC-12345 Change-Id: I736a6c736b7267716d6e66706c7963796379756c
Walkthrough本次变更引入插件抽象与SSO/OAuth2框架、主机/网络/VM分配与迁移多处控制流重构与扩展,新增多类扩展点与工厂管理器,强化NIC/IP分配与磁盘/主存储选择,新增主机电源/IPMI执行器,多项全局配置与系统标签,且包含大量数据库迁移与新表。 Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User as User
participant Client as OAuth2ProviderDriver impl
participant IdP as Identity Provider
participant App as ZStack SSO Service
User->>Client: Request login (OIDC/OAuth2)
Client->>IdP: buildRequestTokenData(authorization_code)
IdP-->>Client: access_token / id_token
Client->>IdP: buildRequestUserInfoData(headers)
IdP-->>Client: userinfo JSON
Client->>App: parseUserInfo() -> principal map
App-->>User: session established
note over Client,IdP: 默认实现可被覆盖;支持refresh/introspect
sequenceDiagram
autonumber
participant API as API Layer
participant VM as VmAllocateNicIpFlow
participant Net as L3/IPAM Service
participant DB as Database
participant Bus as CloudBus
API->>VM: Run(flow, spec with NICs)
VM->>DB: Load NICs & static IP tags
VM->>Net: AllocateIpMsg (per L3, v4/v6)
Net-->>VM: AllocateIpReply (UsedIp)
VM->>DB: Persist UsedIpVO & update VmNicVO
VM->>Bus: Progress report
VM-->>API: next() or fail(error)
note over VM,Net: 支持静态IP/IPv6与回滚(ReturnIp)
sequenceDiagram
autonumber
participant User as Operator
participant Manager as HostManagerImpl
participant Exec as HostIpmiPowerExecutor
participant DB as Database
participant Host as Physical Host
User->>Manager: APIGetHostPowerStatus / PowerOn/Off/Reset
Manager->>Exec: getPowerStatus()/powerOn()/...
Exec->>Host: ipmitool command
Host-->>Exec: stdout/return code
Exec->>DB: updateIpmiPowerStatus
Exec-->>Manager: Completion(success/error)
Manager-->>User: APIEvent with result
Estimated code review effort🎯 5 (Critical) | ⏱️ ~180 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 20
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (10)
compute/src/main/java/org/zstack/compute/vm/VmReleaseResourceFlow.java (1)
23-30: 字段初始化顺序导致 NullPointerExceptionLine 30 在实例字段初始化时调用
pluginRgty.getExtensionList(...),但此时 Spring 的@Autowired注入尚未完成,pluginRgty仍为 null,会导致运行时 NPE。Java 字段初始化顺序:
- 实例字段初始化器执行
- 构造函数执行
- Spring 依赖注入(@Autowired)在构造期间/之后完成
🔎 建议的修复方案
方案 1:使用 @PostConstruct(推荐)
+import javax.annotation.PostConstruct; + @Configurable(preConstruction = true, autowire = Autowire.BY_TYPE) public class VmReleaseResourceFlow implements Flow { private static final CLogger logger = Utils.getLogger(VmReleaseResourceFlow.class); @Autowired private PluginRegistry pluginRgty; - private final List<VmReleaseResourceExtensionPoint> extensions = pluginRgty.getExtensionList(VmReleaseResourceExtensionPoint.class); + private List<VmReleaseResourceExtensionPoint> extensions; + + @PostConstruct + private void init() { + extensions = pluginRgty.getExtensionList(VmReleaseResourceExtensionPoint.class); + }方案 2:使用构造器注入
@Configurable(preConstruction = true, autowire = Autowire.BY_TYPE) public class VmReleaseResourceFlow implements Flow { private static final CLogger logger = Utils.getLogger(VmReleaseResourceFlow.class); - @Autowired - private PluginRegistry pluginRgty; + private final PluginRegistry pluginRgty; + private final List<VmReleaseResourceExtensionPoint> extensions; - private final List<VmReleaseResourceExtensionPoint> extensions = pluginRgty.getExtensionList(VmReleaseResourceExtensionPoint.class); + @Autowired + public VmReleaseResourceFlow(PluginRegistry pluginRgty) { + this.pluginRgty = pluginRgty; + this.extensions = pluginRgty.getExtensionList(VmReleaseResourceExtensionPoint.class); + }compute/src/main/java/org/zstack/compute/allocator/HostCapacityReserveManagerImpl.java (2)
180-205: 关键缺陷:未设置 finder.hypervisorType在
getReservedHostCapacityByZones方法中,虽然查询时过滤了 hypervisorType (lines 188-190),但创建ReservedCapacityFinder后没有设置finder.hypervisorType字段。这与getReservedHostCapacityByClusters方法的实现不一致(该方法在 line 226 设置了此字段)。虽然查询已经按 hypervisorType 过滤了主机列表,但 finder 内部的
findReservedCapacityByHypervisorType()方法可能依赖此字段进行额外的逻辑处理,缺少此设置可能导致行为不一致。🔎 建议的修复
ReservedCapacityFinder finder = new ReservedCapacityFinder(); finder.hostUuids = huuids; + finder.hypervisorType = hypervisorType; Collection<ReservedHostCapacity> col = finder.find().values();
237-241: 潜在的运行时异常风险Line 240 直接调用
.iterator().next()而未检查集合是否为空。如果hostUuids为空列表,finder.find().values()将返回空集合,调用.iterator().next()会抛出NoSuchElementException。建议添加空值检查或确保调用方保证
hostUuids非空。🔎 建议的防御性修复
@Override public ReservedHostCapacity getReservedHostCapacityByHosts(List<String> hostUuids) { + if (hostUuids == null || hostUuids.isEmpty()) { + ReservedHostCapacity ret = new ReservedHostCapacity(); + ret.setReservedCpuCapacity(0); + ret.setReservedMemoryCapacity(0); + return ret; + } ReservedCapacityFinder finder = new ReservedCapacityFinder(); finder.hostUuids = hostUuids; return finder.find().values().iterator().next(); }conf/db/upgrade/V4.4.0__schema.sql (3)
9-9: 建议使用IF EXISTS确保幂等性DROP INDEX语句缺少IF EXISTS子句,如果升级脚本重复执行会导致错误。建议修改为:
🔎 建议的修复
-DROP INDEX idxLicenseHistoryVOHash ON LicenseHistoryVO; +DROP INDEX IF EXISTS idxLicenseHistoryVOHash ON LicenseHistoryVO;或者使用存储过程检查索引是否存在后再删除。
Based on learnings, ZStack数据库升级脚本需要通过information_schema检查来确保幂等性。
121-122: 外键约束删除缺少幂等性检查Lines 121、124、127的
DROP FOREIGN KEY语句缺少存在性检查,如果升级脚本重复执行将会失败。由于MySQL不支持DROP FOREIGN KEY IF EXISTS语法,建议使用存储过程通过information_schema检查约束是否存在后再删除。🔎 建议的幂等性处理模式
DELIMITER $$ CREATE PROCEDURE dropForeignKeyIfExists( IN tableName VARCHAR(64), IN constraintName VARCHAR(64) ) BEGIN DECLARE constraintExists INT DEFAULT 0; SELECT COUNT(*) INTO constraintExists FROM information_schema.TABLE_CONSTRAINTS WHERE CONSTRAINT_SCHEMA = 'zstack' AND TABLE_NAME = tableName AND CONSTRAINT_NAME = constraintName AND CONSTRAINT_TYPE = 'FOREIGN KEY'; IF constraintExists > 0 THEN SET @sql = CONCAT('ALTER TABLE `zstack`.`', tableName, '` DROP FOREIGN KEY `', constraintName, '`'); PREPARE stmt FROM @sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; END IF; END $$ DELIMITER ; CALL dropForeignKeyIfExists('HostNumaNodeVO', 'HostNumaNodeVO_HostEO_uuid_fk'); CALL dropForeignKeyIfExists('VmInstanceNumaNodeVO', 'VmInstanceNumaNodeVO_VmInstanceEO_uuid_fk'); CALL dropForeignKeyIfExists('HostAllocatedCpuVO', 'HostAllocatedCpuVO_HostEO_uuid_fk'); DROP PROCEDURE IF EXISTS dropForeignKeyIfExists;Based on learnings, ZStack数据库升级脚本需要通过information_schema检查表和约束的存在性来确保脚本可以安全重复执行。
Also applies to: 124-125, 127-128
130-140: Timestamp 列缺少 DEFAULT 值,违反编码规范
lastOpDate和createDate列应分别添加DEFAULT CURRENT_TIMESTAMP和DEFAULT CURRENT_TIMESTAMP NOT NULL。当前定义缺少默认值会导致新插入的记录这两个字段为 NULL,且不符合编码规范要求(应使用DEFAULT CURRENT_TIMESTAMP而非DEFAULT '0000-00-00 00:00:00')。此外,createDate应添加NOT NULL约束以保证数据完整性。compute/src/main/java/org/zstack/compute/vm/VmMigrationCheckL2NetworkOnHostFlow.java (1)
39-50: 循环内的数据库查询违反性能最佳实践。在 for 循环中对每个 L3 网络都执行
dbf.findByUuid()查询对应的 L2NetworkVO,这会导致 N+1 查询问题,影响性能。根据编码规范:"禁止循环里套查询,避免嵌套查询带来的性能问题。"
建议在循环外批量加载所有需要的 L2NetworkVO,然后在循环内从 Map 中获取。
🔎 建议的优化方案
List<CheckL2NetworkOnHostMsg> cmsgs = new ArrayList<CheckL2NetworkOnHostMsg>(); + // Batch load all L2NetworkVOs + List<String> l2Uuids = VmNicSpec.getL3NetworkInventoryOfSpec(spec.getL3Networks()) + .stream().map(L3NetworkInventory::getL2NetworkUuid).distinct().collect(Collectors.toList()); + Map<String, L2NetworkVO> l2NetworkMap = Q.New(L2NetworkVO.class) + .in(L2NetworkVO_.uuid, l2Uuids) + .list() + .stream() + .collect(Collectors.toMap(L2NetworkVO::getUuid, Function.identity())); + for (L3NetworkInventory l3 : VmNicSpec.getL3NetworkInventoryOfSpec(spec.getL3Networks())) { - CheckL2NetworkOnHostMsg msg = new CheckL2NetworkOnHostMsg(); - L2NetworkVO l2NetworkVO = dbf.findByUuid(l3.getL2NetworkUuid(), L2NetworkVO.class); + L2NetworkVO l2NetworkVO = l2NetworkMap.get(l3.getL2NetworkUuid()); + if (l2NetworkVO == null) { + continue; + } VSwitchType switchType = VSwitchType.valueOf(l2NetworkVO.getvSwitchType()); if (!switchType.isAttachToCluster()) { continue; } + CheckL2NetworkOnHostMsg msg = new CheckL2NetworkOnHostMsg(); msg.setL2NetworkUuid(l3.getL2NetworkUuid());需要添加
java.util.stream.Collectors和java.util.function.Function导入。compute/src/main/java/org/zstack/compute/vm/VmAllocateHostForMigrateVmFlow.java (1)
44-78:setAllowNoL3Networks(true)被重复设置第 46 行在
MigrateVmMessage条件块内设置msg.setAllowNoL3Networks(true),但第 78 行又无条件地设置了相同的值。这导致第 46 行的设置变得多余。如果所有迁移场景都需要
allowNoL3Networks=true,则应移除第 46 行的重复调用;如果只有MigrateVmMessage需要,则第 78 行可能是错误的。🔎 建议移除重复设置
if (spec.getMessage() instanceof MigrateVmMessage) { destHostUuid = ((MigrateVmMessage)spec.getMessage()).getHostUuid(); - msg.setAllowNoL3Networks(true); }compute/src/main/java/org/zstack/compute/vm/VmHostNameHelper.java (1)
21-37: 两个getHostName方法的 IP 版本判断逻辑不一致
getHostName(VmInstanceInventory)在第 27 行使用nic.getIpVersion() == IPv6Constants.IPv4判断,而getHostName(VmInstanceVO)在第 45 行使用NetworkUtils.isIpv4Address(nic.getIp())。这两种方式的语义不同:
- 前者依赖 NIC 的
ipVersion属性- 后者通过解析 IP 地址字符串判断
建议统一两个方法的判断逻辑以保持一致性。
Also applies to: 39-55
compute/src/main/java/org/zstack/compute/vm/InstantiateVmFromNewCreatedStruct.java (1)
127-145:fromMessage(CreateVmInstanceMsg)中缺少sshKeyPairUuids的初始化在
fromMessage(CreateVmInstanceMsg msg)方法(第 148-165 行)中,CreateVmInstanceMsg包含getSshKeyPairUuids()方法,但未将其值传递给 struct。应在返回前添加:struct.setDisableL3Networks(msg.getDisableL3Networks()); struct.setDiskAOs(msg.getDiskAOs()); + struct.setSshKeyPairUuids(msg.getSshKeyPairUuids()); return struct;注:
InstantiateNewCreatedVmInstanceMsg不包含此字段,故第一个fromMessage方法无需修改。
🟠 Major comments (47)
conf/db/upgrade/V4.7.0.2__schema.sql-10-10 (1)
10-10: 添加 NOT NULL 列需使用存储过程处理历史数据根据编码规范,当存在
NOT NULL约束时,必须使用存储过程或其他函数来处理历史数据。虽然此处action列有DEFAULT 'ACCEPT',MySQL 会自动为现有行填充默认值,但建议明确使用存储过程以确保升级场景的安全性和可控性。conf/db/upgrade/V4.6.31__schema.sql-65-66 (1)
65-66: 代码中包含中文文本,且第66行存在 WHERE 条件错误根据编码规范,代码中不应包含中文。此外,第66行的 WHERE 子句检查
subject = '',但实际更新的是recoverySubject,这很可能是一个复制粘贴错误。🔎 修复 WHERE 条件错误
-UPDATE SNSTextTemplateVO SET recoverySubject = replace('报警器 %{ALARM_NAME} %{TITLE_ALARM_RESOURCE_NAME} %{ALARM_CURRENT_STATUS}','%','$') WHERE recoverySubject IS NULL or subject = ''; +UPDATE SNSTextTemplateVO SET recoverySubject = replace('报警器 %{ALARM_NAME} %{TITLE_ALARM_RESOURCE_NAME} %{ALARM_CURRENT_STATUS}','%','$') WHERE recoverySubject IS NULL or recoverySubject = '';关于中文文本:根据编码规范,应使用英文替代。如果这是用户可见的模板内容,可以考虑将其存储在国际化资源文件中。Based on coding guidelines, "代码里不应当有有中文,包括报错、注释等都应当使用正确的、无拼写错误的英文来写"。
conf/db/upgrade/V5.3.6__schema.sql-29-41 (1)
29-41: SQL 注释中包含中文,违反编码规范。与上一个 INSERT 语句相同的问题:中文注释需要替换为英文,并且缺少幂等性检查。
🔎 建议修复
INSERT INTO SystemTagVO (`uuid`, `resourceUuid`, `resourceType`, `inherent`, `type`, `tag`, `createDate`, `lastOpDate`) SELECT - REPLACE(UUID(), '-', ''), -- 生成不含连字符的uuid - uuid, -- 使用DatasetVO的uuid作为resourceUuid - 'DatasetVO', -- resourceType - 1, -- inherent - 'System', -- type - 'dataset::datatype::Text', -- tag - CURRENT_TIMESTAMP(), -- createDate - CURRENT_TIMESTAMP() -- lastOpDate + REPLACE(UUID(), '-', ''), -- generate uuid without hyphens + uuid, -- use DatasetVO uuid as resourceUuid + 'DatasetVO', -- resourceType + 1, -- inherent + 'System', -- type + 'dataset::datatype::Text', -- tag + CURRENT_TIMESTAMP(), -- createDate + CURRENT_TIMESTAMP() -- lastOpDate FROM DatasetVO -WHERE `system` = true; +WHERE `system` = true +AND NOT EXISTS ( + SELECT 1 FROM SystemTagVO st + WHERE st.resourceUuid = DatasetVO.uuid + AND st.resourceType = 'DatasetVO' + AND st.tag = 'dataset::datatype::Text' +);conf/db/upgrade/V5.3.6__schema.sql-15-27 (1)
15-27: SQL 注释中包含中文,违反编码规范。根据编码规范要求,代码中不应包含中文,包括注释。请将所有中文注释替换为英文。
此外,此 INSERT 语句缺少幂等性检查。如果升级脚本重复执行,会插入重复的 SystemTagVO 记录。建议添加存在性检查以确保安全重复执行。
🔎 建议修复
INSERT INTO SystemTagVO (`uuid`, `resourceUuid`, `resourceType`, `inherent`, `type`, `tag`, `createDate`, `lastOpDate`) SELECT - REPLACE(UUID(), '-', ''), -- 生成不含连字符的uuid - uuid, -- 使用DatasetVO的uuid作为resourceUuid - 'DatasetVO', -- resourceType - 1, -- inherent - 'System', -- type - 'dataset::usage::scenarios::ModelEval', -- tag - CURRENT_TIMESTAMP(), -- createDate - CURRENT_TIMESTAMP() -- lastOpDate + REPLACE(UUID(), '-', ''), -- generate uuid without hyphens + uuid, -- use DatasetVO uuid as resourceUuid + 'DatasetVO', -- resourceType + 1, -- inherent + 'System', -- type + 'dataset::usage::scenarios::ModelEval', -- tag + CURRENT_TIMESTAMP(), -- createDate + CURRENT_TIMESTAMP() -- lastOpDate FROM DatasetVO -WHERE `system` = true; +WHERE `system` = true +AND NOT EXISTS ( + SELECT 1 FROM SystemTagVO st + WHERE st.resourceUuid = DatasetVO.uuid + AND st.resourceType = 'DatasetVO' + AND st.tag = 'dataset::usage::scenarios::ModelEval' +);compute/src/main/java/org/zstack/compute/allocator/AttachedVolumePrimaryStorageAllocatorFlow.java-40-42 (1)
40-42: 改进错误消息的准确性并优化重复调用当前实现存在以下问题:
- 错误消息不准确:错误提示"cannot find root volume"只适用于根卷为 null 的情况,但对于根卷存在但状态不是 Ready 的场景则会产生误导。
- 重复调用:
vm.getRootVolume()在第 40 行被调用了两次,应缓存结果以提高效率。- 类型安全性:使用字符串比较卷状态虽然可行,但直接比较枚举会更加类型安全。
🔎 建议的改进方案
- if (vm.getRootVolume() == null || !VolumeStatus.Ready.toString().equals(vm.getRootVolume().getStatus())) { - fail(Platform.operr("cannot find root volume of vm[uuid:%s]", vm.getUuid())); + VolumeInventory rootVolume = vm.getRootVolume(); + if (rootVolume == null) { + fail(Platform.operr("cannot find root volume of vm[uuid:%s]", vm.getUuid())); + } + if (!VolumeStatus.Ready.toString().equals(rootVolume.getStatus())) { + fail(Platform.operr("root volume[uuid:%s] of vm[uuid:%s] is not ready, current status: %s", + rootVolume.getUuid(), vm.getUuid(), rootVolume.getStatus())); }compute/src/main/java/org/zstack/compute/vm/VmCpuVendor.java-4-5 (1)
4-5: 枚举常量应使用全大写加下划线的命名规范。根据 Java 命名规范和代码规范要求,枚举常量应使用全大写字母并用下划线分隔单词(UPPER_SNAKE_CASE),而不是 CamelCase 风格。同时,
None作为常量名不够清晰,建议使用更具描述性的名称如UNKNOWN或UNSPECIFIED。🔎 建议的命名规范修正
public enum VmCpuVendor { - None, - AuthenticAMD; + UNKNOWN, + AUTHENTIC_AMD; }基于代码规范:常量命名要求全部大写,使用下划线分隔单词,并表达清楚避免使用含糊名称。
compute/src/main/java/org/zstack/compute/host/HostPowerExecutor.java-7-16 (1)
7-16: 需要添加 Javadoc 文档。根据编码规范,接口方法必须配有有效的 Javadoc 注释。当前接口及其方法缺少文档说明,建议补充:
- 接口的整体用途和职责
- 每个方法的功能描述
- 参数说明(特别是
force和returnEarly的具体含义)- 返回值说明(如
getPowerStatus的返回值含义)- 可能抛出的异常
根据编码规范的要求。
🔎 建议的 Javadoc 文档示例
/** - * @Author : jingwang - * @create 2023/4/13 10:01 AM + * Host power management executor interface. + * Provides operations to control host power state including power on/off/reset and status query. + * + * @author jingwang + * @since 2023/4/13 */ public interface HostPowerExecutor { + /** + * Powers off the specified host. + * + * @param host the host to power off + * @param force whether to force power off without graceful shutdown + * @param completion callback invoked when operation completes + * @param returnEarly whether to return immediately without waiting for operation completion + */ void powerOff(HostVO host, Boolean force, Completion completion, boolean returnEarly); + + /** + * Powers on the specified host. + * + * @param host the host to power on + * @param completion callback invoked when operation completes + */ void powerOn(HostVO host, Completion completion); + + /** + * Resets (reboots) the specified host. + * + * @param host the host to reset + * @param completion callback invoked when operation completes + * @param returnEarly whether to return immediately without waiting for operation completion + */ void powerReset(HostVO host, Completion completion, boolean returnEarly); + + /** + * Queries the current power status of the specified host. + * + * @param host the host to query + * @return the current power status + */ HostPowerStatus getPowerStatus(HostVO host); }conf/db/upgrade/V4.5.0__schema.sql-131-132 (1)
131-132: 时间戳列缺少NOT NULL和默认值
SSORedirectTemplateVO表的时间戳列定义不完整。🔎 建议修复
- `lastOpDate` timestamp ON UPDATE CURRENT_TIMESTAMP, - `createDate` timestamp, + `lastOpDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, + `createDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,conf/db/upgrade/V4.4.46__schema.sql-2-25 (1)
2-25: 重大问题:脚本缺乏幂等性保护该升级脚本不具备幂等性。如果脚本因故重复执行(如升级失败后重试),会再次尝试插入和更新相同的记录,导致数据重复或执行失败。
根据 ZStack 项目的最佳实践和从学习中获取的经验,数据库升级脚本必须通过检查
INFORMATION_SCHEMA确保可以安全重复执行。🔎 建议的修复方案
在存储过程开始处添加执行状态检查:
DELIMITER $$ CREATE PROCEDURE splitHybridLicenseRecords() BEGIN + -- Check if migration has already been performed + IF EXISTS (SELECT 1 FROM `zstack`.`LicenseHistoryVO` + WHERE `licenseType` = 'AddOn' AND `prodInfo` = 'hybrid') THEN + SELECT 'Migration already completed, skipping...' AS message; + ELSE IF EXISTS ( SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'LicenseHistoryVO' AND table_schema = 'zstack' AND column_name = 'capacity') THEN insert into `zstack`.`LicenseHistoryVO` (`uuid`, `cpuNum`, `hostNum`, `vmNum`, `expiredDate`, `issuedDate`, `uploadDate`, `licenseType`, `userName`, `prodInfo`, `createDate`, `lastOpDate`, `hash`, `source`, `managementNodeUuid`, `mergedTo`, `capacity`) select `uuid`, `cpuNum`, `hostNum`, `vmNum`, `expiredDate`, `issuedDate`, `uploadDate`, 'AddOn' as `licenseType`, `userName`, 'hybrid' as `prodInfo`, `createDate`, `lastOpDate`, `hash`, `source`, `managementNodeUuid`, `mergedTo`, `capacity` from `zstack`.`LicenseHistoryVO` where `licenseType`='Hybrid'; ELSE insert into `zstack`.`LicenseHistoryVO` (`uuid`, `cpuNum`, `hostNum`, `vmNum`, `expiredDate`, `issuedDate`, `uploadDate`, `licenseType`, `userName`, `prodInfo`, `createDate`, `lastOpDate`, `hash`, `source`, `managementNodeUuid`, `mergedTo`) select `uuid`, `cpuNum`, `hostNum`, `vmNum`, `expiredDate`, `issuedDate`, `uploadDate`, 'AddOn' as `licenseType`, `userName`, 'hybrid' as `prodInfo`, `createDate`, `lastOpDate`, `hash`, `source`, `managementNodeUuid`, `mergedTo` from `zstack`.`LicenseHistoryVO` where `licenseType`='Hybrid'; END IF; update `zstack`.`LicenseHistoryVO` set `licenseType`='Paid' where `licenseType`='Hybrid'; + END IF; SELECT CURTIME(); END $$ DELIMITER ;基于学习:ZStack 数据库升级脚本需要通过 information_schema 检查来确保幂等性。
conf/db/upgrade/V5.3.46__schema.sql-4-33 (1)
4-33: 确保存储过程的幂等性以避免重复执行时的数据损坏。该存储过程直接更新
AuditsVO表中的resourceUuid和resourceType字段,但缺少幂等性检查。如果升级脚本被重复执行,已经迁移过的记录(resourceUuid已是projectUuid)可能会被错误地再次处理,导致数据损坏。根据项目经验,数据库升级脚本应通过
information_schema检查或其他机制确保可以安全地重复执行。🔎 建议的幂等性改进方案
UPDATE AuditsVO a JOIN IAM2ProjectAccountRefVO i ON a.resourceUuid = i.accountUuid SET a.resourceUuid = i.projectUuid, a.resourceType = CASE WHEN a.resourceType = 'AccountVO' THEN 'IAM2ProjectVO' ELSE a.resourceType END -WHERE a.apiName = 'org.zstack.header.identity.APIUpdateQuotaMsg'; +WHERE a.apiName = 'org.zstack.header.identity.APIUpdateQuotaMsg' + AND a.resourceType = 'AccountVO';通过添加
AND a.resourceType = 'AccountVO'条件,确保只处理尚未迁移的记录。基于项目经验(learnings),数据库升级脚本需要具备幂等性保证。
build/deploydb.sh-51-54 (1)
51-54:else分支中使用已被移除的 MySQL 8.0 GRANT 语法第 51-54 行的
GRANT ... identified by语法在 MySQL 8.0 中已被移除,此脚本在 MySQL 8.0+ 环境下会执行失败。第 30-42 行的if分支已使用了正确的模式:先使用CREATE USER IF NOT EXISTS ... IDENTIFIED BY创建用户,再使用GRANT授予权限。请同样更新else分支以保持一致。conf/db/upgrade/V4.6.11__schema.sql-1-1 (1)
1-1: 必须使用存储过程处理历史数据中的 NULL 值。将
vSwitchType修改为NOT NULL时,应当使用存储过程先检查并处理历史数据中的 NULL 值,而不是直接设置默认值。根据编码规范,当添加NOT NULL约束时,必须使用存储过程或其他函数来处理历史数据。建议先用存储过程将现有的 NULL 值更新为 'LinuxBridge',然后再添加 NOT NULL 约束。
🔎 建议的修复方案
+DELIMITER $$ +CREATE PROCEDURE updateL2NetworkVSwitchType() + BEGIN + UPDATE `zstack`.`L2NetworkEO` SET `vSwitchType` = 'LinuxBridge' WHERE `vSwitchType` IS NULL; + END $$ +DELIMITER ; + +CALL updateL2NetworkVSwitchType(); +DROP PROCEDURE IF EXISTS updateL2NetworkVSwitchType; + ALTER TABLE `zstack`.`L2NetworkEO` MODIFY `vSwitchType` varchar(32) NOT NULL DEFAULT 'LinuxBridge' AFTER `type`;conf/db/upgrade/V4.5.1.1__schema.sql-1-2 (1)
1-2: 确保升级脚本幂等性。
ALTER TABLE ADD COLUMN语句在字段已存在时会失败。根据 ZStack 升级脚本最佳实践,应通过information_schema检查字段是否存在,以确保脚本可安全重复执行。建议使用存储过程将两个 ADD COLUMN 操作包装,并在执行前检查information_schema.COLUMNS中的字段存在性。关于
encryptSubjectDN字段的varchar(128)长度限制需要验证,以确保它对应用中实际使用的 X.509 Subject Distinguished Names 足够。如果长度无法保证,请根据编码规范改用text类型。compute/src/main/java/org/zstack/compute/host/PostHostExtensionPointForNuma.java-100-103 (1)
100-103: 同样的Trace日志性能问题与Line 81-84相同,此处的trace日志也在predicate中执行O(m×n)次,存在相同的性能隐患。
compute/src/main/java/org/zstack/compute/host/PostHostExtensionPointForNuma.java-81-84 (1)
81-84: Trace日志在predicate中可能影响性能在
noneMatch的predicate lambda内部记录trace日志会导致O(m×n)次日志调用(m个旧节点 × n个新节点),即使trace级别未启用,字符串拼接操作仍会执行。建议:
- 将详细比对日志移到最终的
forEach中,仅对实际需要删除的节点记录- 或使用条件判断:
if (logger.isTraceEnabled()) { ... }🔎 优化后的日志记录
oldHostNumaNodes.stream() .filter(oldNumaNode -> newHostNumaNodes.stream() .noneMatch(newNumaNode -> { - logger.trace(String.format("oldNumaNode.getNodeCPUs(): %s, newNumaNode.getNodeCPUs(): %s, equals: %s", oldNumaNode.getNodeCPUs(), newNumaNode.getNodeCPUs(), newNumaNode.getNodeCPUs().equals(oldNumaNode.getNodeCPUs()))); - logger.trace(String.format("oldNumaNode.getNodeDistance(): %s, newNumaNode.getNodeDistance(): %s, equals: %s", oldNumaNode.getNodeDistance(), newNumaNode.getNodeDistance(), newNumaNode.getNodeDistance().equals(oldNumaNode.getNodeDistance()))); - logger.trace(String.format("oldNumaNode.getNodeMemSize(): %s, newNumaNode.getNodeMemSize(): %s, equals: %s", oldNumaNode.getNodeMemSize(), newNumaNode.getNodeMemSize(), Objects.equals(newNumaNode.getNodeMemSize(), oldNumaNode.getNodeMemSize()))); - logger.trace(String.format("oldNumaNode.getNodeID(): %s, newNumaNode.getNodeID(): %s, equals: %s", oldNumaNode.getNodeID(), newNumaNode.getNodeID(), Objects.equals(newNumaNode.getNodeID(), oldNumaNode.getNodeID()))); - return newNumaNode.getNodeCPUs().equals(oldNumaNode.getNodeCPUs()) && newNumaNode.getNodeDistance().equals(oldNumaNode.getNodeDistance()) && Objects.equals(newNumaNode.getNodeMemSize(), oldNumaNode.getNodeMemSize()) && Objects.equals(newNumaNode.getNodeID(), oldNumaNode.getNodeID()); })) .forEach(oldNumaNode -> { - logger.trace("remove old numa node: " + JSONObjectUtil.toJsonString(oldNumaNode)); + if (logger.isTraceEnabled()) { + logger.trace("remove old numa node: " + JSONObjectUtil.toJsonString(oldNumaNode)); + } dbf.removeByPrimaryKey(oldNumaNode.getId(), HostNumaNodeVO.class); });conf/db/upgrade/V4.7.0__schema.sql-233-237 (1)
233-237:addHaStrategyConditionOnVmHaLevel幂等性问题与
addHostHaStatus类似,此过程在插入HaStrategyConditionVO时未检查是否已存在hostStorageState记录,可能导致重复执行时出错或产生重复数据。conf/db/upgrade/V4.7.0__schema.sql-272-273 (1)
272-273: 存储过程缺乏幂等性保护
addHostHaStatus过程直接执行 INSERT 而未检查记录是否已存在。如果升级脚本重复运行,会因uuid唯一约束导致失败。基于检索到的学习记录,ZStack 升级脚本应通过
information_schema检查来确保可安全重复执行。🔎 建议修复
- insert into HostHaStateVO(`uuid`, `state`,`lastOpDate`, `createDate`)values (hostUuid, 'None', current_time_stamp, current_time_stamp); + INSERT IGNORE INTO HostHaStateVO(`uuid`, `state`,`lastOpDate`, `createDate`)values (hostUuid, 'None', current_time_stamp, current_time_stamp);或在过程开始时检查表是否已有数据。
compute/src/main/java/org/zstack/compute/vm/devices/VmInstanceDeviceManagerImpl.java-239-260 (1)
239-260: 缺少功能开关检查。新方法
revertExistingDeviceAddressFromArchive与现有的revertDeviceAddressFromArchive(第217-237行)功能类似,但缺少了功能开关检查。现有方法在第218行检查deviceAddressRecordingDisabled()并在功能禁用时返回空列表,而此新方法没有进行该检查。这可能导致在设备地址记录功能被禁用时仍然创建设备地址记录,违反了功能开关的约定。
🔎 建议添加功能开关检查
@Override public List<VmInstanceDeviceAddressVO> revertExistingDeviceAddressFromArchive(String vmInstanceUuid, String archiveForResourceUuid) { + if (deviceAddressRecordingDisabled()) { + return Collections.emptyList(); + } + VmInstanceDeviceAddressGroupVO group = Q.New(VmInstanceDeviceAddressGroupVO.class) .eq(VmInstanceDeviceAddressGroupVO_.resourceUuid, archiveForResourceUuid) .find();compute/src/main/java/org/zstack/compute/vm/devices/VmInstanceDeviceManagerImpl.java-133-144 (1)
133-144: 验证缺失:未检查设备是否存在。此新方法
deleteVmDeviceAddress(String resourceUuid)直接删除设备地址记录,但未验证resourceUuid对应的设备是否存在。这与双参数版本的方法(第115-131行)不一致,后者通过checkParams进行了完整的参数验证。当
resourceUuid不存在时,SQL DELETE 会静默成功但不删除任何记录,可能掩盖潜在的调用错误。建议:
- 如果允许静默删除不存在的设备,请在注释中明确说明此行为
- 如果需要验证,应调用
vmDeviceExists(resourceUuid)并在设备不存在时返回错误🔎 建议的验证逻辑
@Override public ErrorCode deleteVmDeviceAddress(String resourceUuid) { if (resourceUuid == null) { return operr("missing parameter, resourceUuid is requested"); } + + if (!vmDeviceExists(resourceUuid) && !vmExists(resourceUuid)) { + return operr("cannot find vm device or vm with uuid: %s", resourceUuid); + } SQL.New(VmInstanceDeviceAddressVO.class) .eq(VmInstanceDeviceAddressVO_.resourceUuid, resourceUuid) .delete(); return null; }Committable suggestion skipped: line range outside the PR's diff.
compute/src/main/java/org/zstack/header/vm/SshKeyPairAssociateExtensionPoint.java-7-13 (1)
7-13: 必须添加接口方法的 Javadoc 注释。根据编码规范要求,接口方法必须配有有效的 Javadoc 注释。请为以下三个方法添加详细的文档说明:
associateSshKeyPair: 说明关联操作的行为、参数含义、返回的 ErrorCode 情况fetchAssociatedSshKeyPairs: 说明获取逻辑、返回值说明(特别是是否可能返回 null 或空列表)cloneSshKeyPairsToVm: 说明克隆行为、参数含义基于编码规范:"接口方法不应有多余的修饰符(例如 public),且必须配有有效的 Javadoc 注释。"
🔎 建议的 Javadoc 模板
public interface SshKeyPairAssociateExtensionPoint { + /** + * Associate SSH key pairs with a VM instance. + * + * @param vmUuid the UUID of the VM instance + * @param keyPairUuids the list of SSH key pair UUIDs to associate + * @return ErrorCode if association fails, null if successful + */ ErrorCode associateSshKeyPair(String vmUuid, List<String> keyPairUuids); + /** + * Fetch all associated SSH key pairs for a VM instance. + * + * @param vmUuid the UUID of the VM instance + * @return list of SSH key pair UUIDs, never null (returns empty list if none found) + */ List<String> fetchAssociatedSshKeyPairs(String vmUuid); + /** + * Clone SSH key pair associations from one VM to another. + * + * @param originVmUuid the UUID of the source VM + * @param destVmUuid the UUID of the destination VM + */ void cloneSshKeyPairsToVm(String originVmUuid, String destVmUuid); }compute/src/main/java/org/zstack/compute/vm/VmQuotaUtil.java-153-159 (1)
153-159:getRequiredMemory缺少 null 处理与
getRequiredCpu类似,当 VM 不存在时findValue()返回 null,该方法会直接返回 null,可能导致调用方出现 NPE。建议与
getRequiredCpu保持一致的 null 处理方式。🔎 建议的修复方案
@Transactional(readOnly = true) public Long getRequiredMemory(String vmInstanceUuid) { - return Q.New(VmInstanceVO.class) + Long memorySize = Q.New(VmInstanceVO.class) .select(VmInstanceVO_.memorySize) .eq(VmInstanceVO_.uuid, vmInstanceUuid) .findValue(); + return memorySize == null ? 0L : memorySize; }compute/src/main/java/org/zstack/compute/vm/VmQuotaUtil.java-143-151 (1)
143-151:getRequiredCpu存在潜在的 NPE 风险当
vmInstanceUuid对应的 VM 不存在时,findValue()返回 null,随后Integer.toUnsignedLong(cpuNum)会抛出NullPointerException。建议添加 null 检查或在方法文档中明确调用者需保证 VM 存在。
🔎 建议的修复方案
@Transactional(readOnly = true) public Long getRequiredCpu(String vmInstanceUuid) { Integer cpuNum = Q.New(VmInstanceVO.class) .select(VmInstanceVO_.cpuNum) .eq(VmInstanceVO_.uuid, vmInstanceUuid) .findValue(); + if (cpuNum == null) { + return 0L; + } return Integer.toUnsignedLong(cpuNum); }compute/src/main/java/org/zstack/compute/host/HostTrackImpl.java-38-40 (1)
38-40: 线程安全改进良好,但静态字段缺少配置入口。将
trackers从HashMap改为ConcurrentHashMap提升了线程安全性,这是必要的改进。但是,
alwaysStartRightNow静态字段被初始化为false却没有看到任何设置该字段的方法。如果这个字段用于控制测试行为(从 line 262 的使用来看),建议:
- 添加包可见或测试可见的 setter 方法
- 或使用系统属性/全局配置来控制该标志
- 添加注释说明该字段的用途和设置时机
🔎 建议添加配置方法
private static boolean alwaysStartRightNow = false; + + // Visible for testing + static void setAlwaysStartRightNow(boolean value) { + alwaysStartRightNow = value; + }compute/src/main/java/org/zstack/compute/host/HostTrackImpl.java-399-415 (1)
399-415: 事件处理器缺少清理逻辑。
onHostPingSkip()方法注册了两个事件处理器(HOST_PING_SKIP和HOST_PING_CANCEL_SKIP),但在stop()方法(line 418-420)中没有看到相应的注销逻辑。根据 ZStack 的组件生命周期管理,如果这些事件处理器在组件停止时不清理,可能导致:
- 内存泄漏(处理器持有组件引用)
- 在组件重启时重复注册处理器
- 停止后的组件仍然响应事件
建议在
stop()方法中添加事件处理器的注销逻辑。🔎 建议的修复方案
如果
EventFacade支持注销,添加清理逻辑:+ private EventCallback pingSkipCallback; + private EventCallback pingCancelSkipCallback; + private void onHostPingSkip() { - evtf.on(HostCanonicalEvents.HOST_PING_SKIP, new EventCallback() { + pingSkipCallback = new EventCallback() { @Override protected void run(Map tokens, Object data) { HostCanonicalEvents.HostPingSkipData d = (HostCanonicalEvents.HostPingSkipData) data; Long deadline = System.currentTimeMillis() / 1000 + d.getSkipTimeInSec(); skippedPingHostDeadline.put(d.getHostUuid(), deadline); } - }); - evtf.on(HostCanonicalEvents.HOST_PING_CANCEL_SKIP, new EventCallback() { + }; + evtf.on(HostCanonicalEvents.HOST_PING_SKIP, pingSkipCallback); + + pingCancelSkipCallback = new EventCallback() { @Override protected void run(Map tokens, Object data) { HostCanonicalEvents.HostPingSkipData d = (HostCanonicalEvents.HostPingSkipData) data; skippedPingHostDeadline.invalidate(d.getHostUuid()); } - }); + }; + evtf.on(HostCanonicalEvents.HOST_PING_CANCEL_SKIP, pingCancelSkipCallback); } @Override public boolean stop() { + // Clean up event handlers + if (pingSkipCallback != null) { + evtf.off(HostCanonicalEvents.HOST_PING_SKIP, pingSkipCallback); + } + if (pingCancelSkipCallback != null) { + evtf.off(HostCanonicalEvents.HOST_PING_CANCEL_SKIP, pingCancelSkipCallback); + } return true; }注:如果
EventFacade不支持off方法或采用其他清理机制,请调整相应实现。Committable suggestion skipped: line range outside the PR's diff.
compute/src/main/java/org/zstack/compute/host/HostTrackImpl.java-226-238 (1)
226-238: 默认的HostReconnectTaskFactory.createTaskWithLastConnectError实现会丢弃错误上下文。验证确认了该问题。默认接口实现显式忽略
errorCode参数,直接返回createTask(uuid, completion)。虽然KVMHostReconnectTaskFactory覆盖了该方法,但只处理特定错误类型(HOST_PASSWORD_HAS_BEEN_CHANGED),其他错误情况仍会调用父方法导致错误信息丢失。建议:
- 为所有
HostReconnectTaskFactory实现添加错误上下文处理,避免信息丢失- 在
createTaskWithLastConnectError中添加日志记录未被特定处理的错误类型,便于诊断compute/src/main/java/org/zstack/compute/allocator/HostCapacityReserveManagerImpl.java-72-110 (1)
72-110: hypervisorType 字段的使用存在混淆
findReservedCapacityByHypervisorType()方法的实现存在设计混淆:
- Lines 73-76:检查
finder.hypervisorType是否在扩展点映射中,如果不存在则提前返回- Lines 85-91:从数据库查询主机并按实际的 hypervisorType(从数据库获取)进行分组
- Lines 97-109:遍历分组后的所有 hypervisor 类型并调用对应的扩展点
问题是:
finder.hypervisorType字段似乎只用于 line 73-76 的早期返回检查和日志记录 (lines 94-95, 103-105),但实际处理时会遍历所有从数据库查询到的 hypervisor 类型,而不是只处理finder.hypervisorType指定的类型。这导致:
- 如果调用方已经在查询时过滤了 hypervisorType(如
getReservedHostCapacityByZones和getReservedHostCapacityByClusters),则finder.hypervisorType字段是冗余的- 如果调用方没有过滤(如
getReservedHostCapacityByHosts),则可能返回多种 hypervisor 类型的容量建议明确
hypervisorType字段的用途:是用于过滤还是仅用于日志记录?compute/src/main/java/org/zstack/compute/host/HostIpmiPowerExecutor.java-184-209 (1)
184-209: 修复泛型原始类型使用在多处
return new Pair(...)语句中使用了原始类型而非泛型类型。应该使用new Pair<>(...)或new Pair<HostPowerStatus, ErrorCode>(...)以保证类型安全。🔎 建议的修复
public static Pair<HostPowerStatus, ErrorCode> getPowerStatusWithErrorCode(HostIpmiVO ipmi) { if (mockedPowerStatus != null) { - return new Pair(mockedPowerStatus, null); + return new Pair<>(mockedPowerStatus, null); } if (isIpmiUnConfigured(ipmi)) { - return new Pair(HostPowerStatus.UN_CONFIGURED, + return new Pair<>(HostPowerStatus.UN_CONFIGURED, operr("ipmi information is not complete.")); } ShellResult rst = IPMIToolCaller.fromHostIpmiVo(ipmi).status(); if (rst.getRetCode() == 0) { if (rst.getStdout().trim().equals("Chassis Power is on")) { - return new Pair(HostPowerStatus.POWER_ON, null); + return new Pair<>(HostPowerStatus.POWER_ON, null); } else if (rst.getStdout().trim().equals("Chassis Power is off")) { - return new Pair(HostPowerStatus.POWER_OFF, null); + return new Pair<>(HostPowerStatus.POWER_OFF, null); } else { - return new Pair(HostPowerStatus.POWER_UNKNOWN, operr("host[%s] got unexpected return value", ipmi.getUuid())); + return new Pair<>(HostPowerStatus.POWER_UNKNOWN, operr("host[%s] got unexpected return value", ipmi.getUuid())); } } else { - return new Pair(HostPowerStatus.POWER_UNKNOWN, operr("host[%s] can not connect ipmi[%s], because:%s", + return new Pair<>(HostPowerStatus.POWER_UNKNOWN, operr("host[%s] can not connect ipmi[%s], because:%s", ipmi.getUuid(), ipmi.getIpmiAddress(), rst.getStderr())); } }compute/src/main/java/org/zstack/compute/host/HostIpmiPowerExecutor.java-236-242 (1)
236-242: 违反封装原则:IPMIToolCaller 的字段应为私有内部类
IPMIToolCaller的所有字段(hostname、port、username、password、passFile)都声明为public,这违反了封装原则。特别是password字段,暴露为公共字段存在安全风险。建议将这些字段改为
private,因为它们已通过工厂方法fromHostIpmiVo和fromHostVo进行初始化,外部不需要直接访问。🔎 建议的修复
protected static class IPMIToolCaller { private final String interfaceToUse = "lanplus"; - public String hostname; - public int port; - public String username; - public String password; - public String passFile; + private String hostname; + private int port; + private String username; + private String password; + private String passFile;Committable suggestion skipped: line range outside the PR's diff.
conf/db/upgrade/V4.3.35__schema.sql-1-8 (1)
1-8: 升级脚本缺少幂等性保障。数据库升级脚本直接执行
DROP FOREIGN KEY和ADD CONSTRAINT,但没有使用IF EXISTS检查。如果升级过程中断后重新执行,脚本会因为外键约束不存在而失败。根据 learnings 中的经验,ZStack 升级脚本需要通过
information_schema检查约束的存在性以确保可以安全重复执行。🔎 建议的修复方案
-ALTER TABLE `zstack`.`HostNumaNodeVO` DROP FOREIGN KEY `HostNumaNodeVO_HostEO_uuid_fk`; -ALTER TABLE `zstack`.`HostNumaNodeVO` ADD CONSTRAINT `HostNumaNodeVO_HostEO_uuid_fk` FOREIGN KEY (`hostUuid`) REFERENCES `HostEO` (`uuid`) ON DELETE CASCADE ; +-- Drop FK only if exists +SET @exist := (SELECT COUNT(*) FROM information_schema.TABLE_CONSTRAINTS + WHERE CONSTRAINT_SCHEMA = 'zstack' + AND TABLE_NAME = 'HostNumaNodeVO' + AND CONSTRAINT_NAME = 'HostNumaNodeVO_HostEO_uuid_fk' + AND CONSTRAINT_TYPE = 'FOREIGN KEY'); +SET @sqlstmt := IF(@exist > 0, 'ALTER TABLE `zstack`.`HostNumaNodeVO` DROP FOREIGN KEY `HostNumaNodeVO_HostEO_uuid_fk`', 'SELECT ''FK does not exist'''); +PREPARE stmt FROM @sqlstmt; +EXECUTE stmt; + +-- Add FK only if not exists +SET @exist := (SELECT COUNT(*) FROM information_schema.TABLE_CONSTRAINTS + WHERE CONSTRAINT_SCHEMA = 'zstack' + AND TABLE_NAME = 'HostNumaNodeVO' + AND CONSTRAINT_NAME = 'HostNumaNodeVO_HostEO_uuid_fk' + AND CONSTRAINT_TYPE = 'FOREIGN KEY'); +SET @sqlstmt := IF(@exist = 0, 'ALTER TABLE `zstack`.`HostNumaNodeVO` ADD CONSTRAINT `HostNumaNodeVO_HostEO_uuid_fk` FOREIGN KEY (`hostUuid`) REFERENCES `HostEO` (`uuid`) ON DELETE CASCADE', 'SELECT ''FK already exists'''); +PREPARE stmt FROM @sqlstmt; +EXECUTE stmt;对其他两个外键约束应用相同的模式。
conf/db/upgrade/V4.6.21__schema.sql-102-107 (1)
102-107: 表重命名操作缺少幂等性检查。根据已有的学习记录,直接使用
RENAME TABLE或ALTER TABLE ... RENAME AS不能保证幂等性。应通过information_schema.tables检查表的存在性,确保升级脚本可以安全地重复执行。🔎 建议的修改
-- Check if source table exists and target doesn't before renaming DROP PROCEDURE IF EXISTS renameScsiLunVOToLunVO; DELIMITER $$ CREATE PROCEDURE renameScsiLunVOToLunVO() BEGIN IF EXISTS (SELECT 1 FROM information_schema.tables WHERE table_schema = 'zstack' AND table_name = 'ScsiLunVO') AND NOT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_schema = 'zstack' AND table_name = 'LunVO') THEN ALTER TABLE `ScsiLunVO` RENAME AS LunVO; END IF; END $$ DELIMITER ; CALL renameScsiLunVOToLunVO(); DROP PROCEDURE IF EXISTS renameScsiLunVOToLunVO; # for compatibility - also check if view already exists DROP VIEW IF EXISTS `ScsiLunVO`; CREATE VIEW `ScsiLunVO` AS SELECT uuid, name, wwid, vendor, model, wwn, serial, type, hctl, path, size, state, source, multipathDeviceUuid, createDate, lastOpDate FROM `LunVO` WHERE source IN ('iSCSI', 'fiberChannel');abstraction/src/main/java/org/zstack/abstraction/sns/EndpointDriver.java-9-11 (1)
9-11: 为接口方法添加Javadoc文档根据编码规范,接口方法必须配有有效的Javadoc注释。当前
send方法缺少文档说明,特别是:
- 参数
message的用途和约束- 返回值
boolean的含义(成功/失败?)- 可能抛出的异常
🔎 建议的补充
public interface EndpointDriver extends PluginDriver { + /** + * Send a message to the endpoint. + * + * @param message the endpoint data to send + * @return true if the message was sent successfully, false otherwise + */ boolean send(PluginEndpointData message); }根据编码规范要求。
abstraction/src/main/java/org/zstack/abstraction/InvalidPluginDefinitionException.java-3-4 (1)
3-4: 建议为异常类添加标准构造函数当前异常类完全为空,缺少标准的构造函数。这会导致使用时无法传递有意义的错误信息或原因链。建议至少添加以下构造函数:
- 无参构造函数
- 带消息的构造函数
- 带消息和原因的构造函数
🔎 建议的补充
public class InvalidPluginDefinitionException extends RuntimeException { + public InvalidPluginDefinitionException() { + super(); + } + + public InvalidPluginDefinitionException(String message) { + super(message); + } + + public InvalidPluginDefinitionException(String message, Throwable cause) { + super(message, cause); + } + + public InvalidPluginDefinitionException(Throwable cause) { + super(cause); + } }根据Java异常类的最佳实践。
abstraction/src/main/java/org/zstack/abstraction/PluginMethod.java-3-5 (1)
3-5: 缺少@Retention和@Target元注解该注解缺少关键的元注解定义:
- 如果需要在运行时通过反射读取此注解(插件系统通常需要),需要添加
@Retention(RetentionPolicy.RUNTIME)- 应添加
@Target来限制注解的使用范围(如ElementType.METHOD)🔎 建议修复
package org.zstack.abstraction; +import java.lang.annotation.ElementType; +import java.lang.annotation.Retention; +import java.lang.annotation.RetentionPolicy; +import java.lang.annotation.Target; + +@Retention(RetentionPolicy.RUNTIME) +@Target(ElementType.METHOD) public @interface PluginMethod { boolean required() default true; }compute/src/main/java/org/zstack/compute/allocator/QuotaAllocatorFlow.java-35-51 (1)
35-51: 违反编码规范:新代码中使用了 TypedQuery根据编码规范:"新增代码里面避免使用
TypedQuery"。此处应使用项目推荐的Q.New()或SimpleQuery替代。此外,第 49 行调用
next(candidates)后缺少return语句,导致控制流继续执行到throwExceptionIfIAmTheFirstFlow(),在单元测试模式下可能引发意外行为。🔎 建议使用 Q.New() 并添加 return
if (CoreGlobalProperty.UNIT_TEST_ON) { if (candidates == null || candidates.isEmpty()) { - String sql = "select h from HostVO h where h.clusterUuid in (:cuuids)"; - Set<String> clusterUuids = new HashSet<>(); - clusterUuids.add(spec.getVmInstance().getClusterUuid()); - - TypedQuery<HostVO> hq = dbf.getEntityManager().createQuery(sql, HostVO.class); - hq.setParameter("cuuids", clusterUuids); - if (usePagination()) { - hq.setFirstResult(paginationInfo.getOffset()); - hq.setMaxResults(paginationInfo.getLimit()); - } - candidates = hq.getResultList(); + SimpleQuery<HostVO> q = dbf.createQuery(HostVO.class); + q.add(HostVO_.clusterUuid, SimpleQuery.Op.EQ, spec.getVmInstance().getClusterUuid()); + if (usePagination()) { + q.setStart(paginationInfo.getOffset()); + q.setLimit(paginationInfo.getLimit()); + } + candidates = q.list(); logger.debug(String.format("vm clusteruuid:%s, candidates size:%s", spec.getVmInstance().getClusterUuid(), candidates.size())); next(candidates); + return; } }compute/src/main/java/org/zstack/compute/vm/NewVmInstanceMsgBuilder.java-51-52 (1)
51-52: 清理被注释的代码代码中存在被注释掉的旧实现,应该删除以保持代码整洁。
🔎 建议删除被注释的代码
nicSpecs.add(vmNicSpec); -// nicSpecs.add(new VmNicSpec(l3Invs, -// vmNicParms != null && !vmNicParms.isEmpty() ? vmNicParms.get(i) : null)); }As per coding guidelines, 应删除被注释的代码(commented-out code)以保持代码整洁。
compute/src/main/java/org/zstack/compute/host/HostManagerImpl.java-219-229 (1)
219-229: 扩展点回调处理存在问题
deleteHostNetworkLabel和updateHostNetworkLabel方法使用safeForEach遍历扩展点,但每个扩展点都接收同一个Completion对象。如果有多个扩展点,第一个调用completion.success()或completion.fail()后,后续扩展点的回调将无效或产生意外行为。应该使用链式调用或
While模式来正确处理多个扩展点的异步回调。🔎 建议修复思路
private void deleteHostNetworkLabel(APIDeleteHostNetworkServiceTypeMsg msg, Completion completion) { HostNetworkLabelVO vo = dbf.findByUuid(msg.getUuid(), HostNetworkLabelVO.class); List<HostNetworkLabelExtensionPoint> exts = pluginRgty.getExtensionList(HostNetworkLabelExtensionPoint.class); if (exts.isEmpty()) { completion.success(); return; } new While<>(exts).each((ext, whileCompletion) -> { ext.deleteHostNetworkLabel(vo.toInventory(), new Completion(whileCompletion) { @Override public void success() { whileCompletion.done(); } @Override public void fail(ErrorCode errorCode) { whileCompletion.addError(errorCode); whileCompletion.done(); } }); }).run(new WhileDoneCompletion(completion) { @Override public void done(ErrorCodeList errorCodeList) { if (!errorCodeList.getCauses().isEmpty()) { completion.fail(errorCodeList.getCauses().get(0)); } else { completion.success(); } } }); }compute/src/main/java/org/zstack/compute/allocator/DesignatedHostAllocatorFlow.java-35-37 (1)
35-37: SQL 注入风险使用
String.join直接拼接 SQL 可能存在 SQL 注入风险。如果clusterUuids来自外部输入且包含恶意数据,可能导致安全问题。建议使用参数化查询。🔎 建议使用参数化查询
- if (!CollectionUtils.isEmpty(clusterUuids)) { - sql.append(String.format("h.clusterUuid in ('%s') and ", String.join("','", clusterUuids))); - } + // Move to parameterized query approach with TypedQuery.setParameter考虑重构整个方法使用参数化查询而非字符串拼接,或者至少对
clusterUuids中的值进行验证(确保是有效的 UUID 格式)。Committable suggestion skipped: line range outside the PR's diff.
compute/src/main/java/org/zstack/compute/vm/CpuTopology.java-31-44 (1)
31-44: 除法运算可能导致除零错误或产生零值当
cpuCores或cpuThreads为 0 时,除法运算会抛出ArithmeticException。此外,如果cpuNum小于除数,结果会是 0,这可能不是期望的行为。建议在计算前验证这些值:
🔎 建议修复
在
calculateValidTopology方法开头添加验证:if ((cpuSockets != null && cpuSockets <= 0) || (cpuCores != null && cpuCores <= 0) || (cpuThreads != null && cpuThreads <= 0)) { String errMsg = "cpu topology values must be positive"; if (throwException) { throw new OperationFailureException(operr(errMsg)); } return errMsg; }compute/src/main/java/org/zstack/compute/vm/VmInstantiateOtherDiskFlow.java-150-163 (1)
150-163: 回滚时使用的磁盘大小可能不正确在第 157 行,回滚逻辑使用
diskAO.getSize()释放主存储空间。但是,当通过diskOfferingUuid创建卷时(第 80-84 行),实际分配的大小是从DiskOfferingVO查询得到的size,而不是diskAO.getSize()(此时可能为 0)。这会导致释放的空间大小与实际分配的大小不匹配。
🔎 建议修复
将分配的大小保存到共享变量中,在回滚时使用:
+ Long allocatedDiskSize; private void setupCreateVolumeFromDiskSizeFlows(Long diskSize) { + allocatedDiskSize = diskSize; flow(new Flow() { ... @Override public void rollback(FlowRollback chain, Map data) { if (!isSuccessAllocatePS) { chain.rollback(); return; } ReleasePrimaryStorageSpaceMsg msg = new ReleasePrimaryStorageSpaceMsg(); - msg.setDiskSize(diskAO.getSize()); + msg.setDiskSize(allocatedDiskSize); ... } });Committable suggestion skipped: line range outside the PR's diff.
abstraction/src/main/java/org/zstack/abstraction/sso/OAuth2ProviderDriver.java-27-36 (1)
27-36: 常量名称存在拼写错误,需要修复
OAuth2PluginConstants.GRANT_TYPE_OCDE_VALUE的常量名中 "OCDE" 是拼写错误,应修改为GRANT_TYPE_CODE_VALUE。常量定义位于OAuth2PluginConstants.java第 20 行,当前定义为public static final String GRANT_TYPE_OCDE_VALUE = "authorization_code";,需要将常量名改为GRANT_TYPE_CODE_VALUE以符合GRANT_TYPE_PASSWORD_VALUE等其他常量的命名规范。compute/src/main/java/org/zstack/compute/vm/MacOperator.java-174-178 (1)
174-178:deviceId为负数时可能生成无效的 MAC 地址当
deviceId为负的short值时,Integer.toHexString(deviceId)会产生一个带有ffff前缀的 8 位十六进制字符串(例如 -1 会变成 "ffffffff")。这会导致生成的 MAC 地址格式不正确。建议添加范围检查或使用无符号处理:
🔎 建议修复
- String deviceIdStr = Integer.toHexString(deviceId); + String deviceIdStr = Integer.toHexString(deviceId & 0xFF); if (deviceIdStr.length() < 2) { deviceIdStr = "0" + deviceIdStr; + } else if (deviceIdStr.length() > 2) { + deviceIdStr = deviceIdStr.substring(deviceIdStr.length() - 2); }compute/src/main/java/org/zstack/compute/vm/CpuTopology.java-13-19 (1)
13-19: 构造函数缺少输入验证,可能抛出NumberFormatException当传入的字符串参数不是有效数字时(如空字符串或非数字字符),
Integer.valueOf()会抛出NumberFormatException。建议添加输入验证或异常处理。🔎 建议修复
public CpuTopology(int cpuNum, String cpuSockets, String cpuCores, String cpuThreads) { this.cpuNum = cpuNum; - this.cpuSockets = cpuSockets == null ? null : Integer.valueOf(cpuSockets); - this.cpuCores = cpuCores == null ? null : Integer.valueOf(cpuCores); - this.cpuThreads = cpuThreads == null ? null : Integer.valueOf(cpuThreads); + this.cpuSockets = parseIntOrNull(cpuSockets); + this.cpuCores = parseIntOrNull(cpuCores); + this.cpuThreads = parseIntOrNull(cpuThreads); + } + + private static Integer parseIntOrNull(String value) { + if (value == null || value.trim().isEmpty()) { + return null; + } + try { + return Integer.valueOf(value.trim()); + } catch (NumberFormatException e) { + return null; + } }Committable suggestion skipped: line range outside the PR's diff.
compute/src/main/java/org/zstack/compute/vm/VmInstanceUtils.java-166-175 (1)
166-175: 潜在的空指针异常风险Line 167 查询
InstanceOfferingVO后直接使用ivo.getState()和ivo.getType(),但如果传入的instanceOfferingUuid对应的记录不存在,Q.New(...).find()会返回null,导致 NPE。建议在使用前添加空值检查:
🔎 建议的修复方案
// InstanceOffering takes precedence over CPU/memory settings. InstanceOfferingVO ivo = Q.New(InstanceOfferingVO.class).eq(InstanceOfferingVO_.uuid, instanceOfferingUuid).find(); + if (ivo == null) { + throw new ApiMessageInterceptionException(operr("instance offering[uuid:%s] not found", instanceOfferingUuid)); + } + if (ivo.getState() == InstanceOfferingState.Disabled) { throw new ApiMessageInterceptionException(operr("instance offering[uuid:%s] is Disabled, can't create vm from it", instanceOfferingUuid)); }compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java-2801-2811 (1)
2801-2811: 迁移后清理VM_STATE_PAUSED_AFTER_MIGRATE的条件逻辑似乎反了当前实现:
if (!Objects.equals(d.getOldState(), VmInstanceState.Migrating.toString()) && Objects.equals(d.getNewState(), VmInstanceState.Running.toString())) { VmSystemTags.VM_STATE_PAUSED_AFTER_MIGRATE.delete(d.getVmUuid()); }这会在「旧状态不是 Migrating 且新状态是 Running」时删除 tag,也就是说:
Stopped -> Running、Unknown -> Running等都会删;- 唯独
Migrating -> Running不会 删,和方法名“deleteMigrateSystemTagWhenVmStateChangedToRunning”以及语义预期相反。更合理的条件应该是“从 Migrating 切到 Running 才清理”:
建议修改示例
- if (!Objects.equals(d.getOldState(), VmInstanceState.Migrating.toString()) && Objects.equals(d.getNewState(), VmInstanceState.Running.toString())) { + if (Objects.equals(d.getOldState(), VmInstanceState.Migrating.toString()) + && Objects.equals(d.getNewState(), VmInstanceState.Running.toString())) { VmSystemTags.VM_STATE_PAUSED_AFTER_MIGRATE.delete(d.getVmUuid()); }compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java-1145-1167 (1)
1145-1167: 避免对Collections.emptyList()调用add造成运行时异常
extEmitterHandleSystemTag()与extEmitterHandleSshKeyPair()都以Collections.emptyList()初始化result,在msg == null分支中马上result.add(...),一旦该分支被触发会直接抛出UnsupportedOperationException。虽然目前调用点总是传入非空
msg,但这属于明显的易错代码,后续重构时很容易踩雷,建议改为返回新的列表或直接返回单元素列表。建议修改示例
- private List<ErrorCode> extEmitterHandleSystemTag(final CreateVmInstanceMsg msg, final APICreateMessage cmsg, VmInstanceVO finalVo) { - List<ErrorCode> result = Collections.emptyList(); - if (msg == null) { - result.add(operr("CreateVmInstanceMsg cannot be null")); - return result; - } else if (cmsg != null && cmsg.getSystemTags() != null && !cmsg.getSystemTags().isEmpty()) { - return extEmitter.handleSystemTag(finalVo.getUuid(), cmsg.getSystemTags()); - } else if (cmsg == null && msg.getSystemTags() != null && !msg.getSystemTags().isEmpty()) { - return extEmitter.handleSystemTag(finalVo.getUuid(), msg.getSystemTags()); - } - return result; - } + private List<ErrorCode> extEmitterHandleSystemTag(final CreateVmInstanceMsg msg, final APICreateMessage cmsg, VmInstanceVO finalVo) { + if (msg == null) { + return Collections.singletonList(operr("CreateVmInstanceMsg cannot be null")); + } + if (cmsg != null && cmsg.getSystemTags() != null && !cmsg.getSystemTags().isEmpty()) { + return extEmitter.handleSystemTag(finalVo.getUuid(), cmsg.getSystemTags()); + } + if (cmsg == null && msg.getSystemTags() != null && !msg.getSystemTags().isEmpty()) { + return extEmitter.handleSystemTag(finalVo.getUuid(), msg.getSystemTags()); + } + return Collections.emptyList(); + }
extEmitterHandleSshKeyPair()同理可改为直接在msg == null分支返回单元素列表或抛出更显式的异常,而不是在空列表上add。conf/db/upgrade/V5.1.0__schema.sql-19-19 (1)
19-19:lastFailedReason设为NOT NULL但无默认值且无历史数据处理根据编码规范,当使用
NOT NULL约束时,必须使用存储过程或其他函数处理历史数据。新创建的任务记录可能尚未有失败原因。建议改为
DEFAULT ''或允许 NULL:- `lastFailedReason` varchar(1024) NOT NULL, + `lastFailedReason` varchar(1024) DEFAULT NULL,conf/db/upgrade/V5.3.28__schema.sql-141-152 (1)
141-152: 表结构改变需要数据迁移逻辑,不能直接 DROP TABLEHostHaStateVO 从 V4.7.0 到 V5.3.28 的表结构发生根本性改变(主键从 uuid 改为 id,字段改变,新增外键),直接 DROP TABLE 会丢失升级前的所有数据。应添加数据迁移逻辑,如存储过程,将旧表数据转换后迁移至新表,或至少通过 CREATE TABLE IF NOT EXISTS 来检查表是否已存在。
另外,行 148 的
DEFAULT '0000-00-00 00:00:00'应改为DEFAULT CURRENT_TIMESTAMP,以符合编码规范。
| if (autoSetClusterResourceConfigs.containsKey(architecture)) { | ||
| logger.debug(String.format("The [architecture:%s] has already configured by other class, do not do this", architecture)); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
严重逻辑错误:return 语句会导致整个方法提前退出
当检测到某个 architecture 已被配置时,使用 return 会立即退出整个 initClusterResourceConfig() 方法,导致后续的 ClusterArchitectureResourceConfig 子类不会被处理。应该使用 continue 跳过当前迭代。
🔎 建议修复
if (autoSetClusterResourceConfigs.containsKey(architecture)) {
logger.debug(String.format("The [architecture:%s] has already configured by other class, do not do this", architecture));
- return;
+ continue;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (autoSetClusterResourceConfigs.containsKey(architecture)) { | |
| logger.debug(String.format("The [architecture:%s] has already configured by other class, do not do this", architecture)); | |
| return; | |
| } | |
| if (autoSetClusterResourceConfigs.containsKey(architecture)) { | |
| logger.debug(String.format("The [architecture:%s] has already configured by other class, do not do this", architecture)); | |
| continue; | |
| } |
🤖 Prompt for AI Agents
In
compute/src/main/java/org/zstack/compute/cluster/arch/ClusterResourceConfigInitializerImpl.java
around lines 56 to 59, the code uses "return" when an architecture is already
configured which prematurely exits the entire initClusterResourceConfig() method
and prevents processing remaining ClusterArchitectureResourceConfig
implementations; change the control flow to use "continue" to skip only the
current iteration so subsequent config classes are still handled, ensuring the
loop iterates over all implementations while leaving the existing debug log
intact.
| detachDataVolume(hinvs); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
存在竞态条件风险:卷分离与主机删除并发执行。
detachDataVolume(hinvs) 是异步执行的(内部使用 While + CloudBusCallBack),但此处调用后立即继续执行主机删除流程(92-123行),没有等待卷分离完成。这可能导致:
- 主机删除消息在卷尚未分离完成时就被发送
- 资源冲突或清理不完整
- 删除操作失败或状态不一致
🔎 建议的修复方案
将 detachDataVolume 改为同步等待完成,或者将主机删除逻辑移到 detachDataVolume 的完成回调中:
- detachDataVolume(hinvs);
-
- List<HostDeletionMsg> msgs = new ArrayList<HostDeletionMsg>();
- for (HostInventory hinv : hinvs) {
- HostDeletionMsg msg = new HostDeletionMsg();
- msg.setForceDelete(action.isActionCode(CascadeConstant.DELETION_FORCE_DELETE_CODE));
- msg.setHostUuid(hinv.getUuid());
- bus.makeTargetServiceIdByResourceUuid(msg, HostConstant.SERVICE_ID, hinv.getUuid());
- msgs.add(msg);
- }
-
- bus.send(msgs, 10, new CloudBusListCallBack(completion) {
- @Override
- public void run(List<MessageReply> replies) {
+ detachDataVolume(hinvs, new Completion(completion) {
+ @Override
+ public void success() {
+ List<HostDeletionMsg> msgs = new ArrayList<HostDeletionMsg>();
+ for (HostInventory hinv : hinvs) {
+ HostDeletionMsg msg = new HostDeletionMsg();
+ msg.setForceDelete(action.isActionCode(CascadeConstant.DELETION_FORCE_DELETE_CODE));
+ msg.setHostUuid(hinv.getUuid());
+ bus.makeTargetServiceIdByResourceUuid(msg, HostConstant.SERVICE_ID, hinv.getUuid());
+ msgs.add(msg);
+ }
+
+ bus.send(msgs, 10, new CloudBusListCallBack(completion) {
+ @Override
+ public void run(List<MessageReply> replies) {
+ // existing logic
+ }
+ });
+ }
+
+ @Override
+ public void fail(ErrorCode errorCode) {
+ // decide whether to proceed or fail
+ completion.fail(errorCode);
+ }
+ });并相应修改 detachDataVolume 方法签名以接受 Completion 参数。
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In compute/src/main/java/org/zstack/compute/host/HostCascadeExtension.java
around lines 90-91, detachDataVolume(hinvs) is invoked asynchronously but the
host deletion flow continues immediately, causing a race where deletion may
proceed before volumes are detached; make detachDataVolume synchronous or chain
the deletion logic into its completion callback: change detachDataVolume to
accept a Completion (or return a Future) and either block/wait for its
successful completion before continuing the deletion steps at lines 92-123, or
move the host deletion code into the Completion's success branch and handle
errors in its fail branch to ensure volumes are detached before sending the host
delete messages.
| private void detachDataVolume(List<HostInventory> hinvs) { | ||
| List<DetachDataVolumeFromHostMsg> dmsgs = hinvs.stream().map(hostInv -> { | ||
| VolumeHostRefVO refVO = Q.New(VolumeHostRefVO.class).eq(VolumeHostRefVO_.hostUuid, hostInv.getUuid()).find(); | ||
| if (refVO == null) { | ||
| return null; | ||
| } | ||
| String volumeInstallPath = Q.New(VolumeVO.class).select(VolumeVO_.installPath) | ||
| .eq(VolumeVO_.uuid, refVO.getVolumeUuid()).findValue(); | ||
| return new VolumeHostRef(refVO, volumeInstallPath); | ||
| }).filter(Objects::nonNull).map(volumeHostRef -> { | ||
| DetachDataVolumeFromHostMsg dmsg = new DetachDataVolumeFromHostMsg(); | ||
| dmsg.setHostUuid(volumeHostRef.refVO.getHostUuid()); | ||
| dmsg.setMountPath(volumeHostRef.refVO.getMountPath()); | ||
| dmsg.setDevice(volumeHostRef.refVO.getDevice()); | ||
| dmsg.setVolumeUuid(volumeHostRef.refVO.getVolumeUuid()); | ||
| dmsg.setVolumeInstallPath(volumeHostRef.volumeInstallPath); | ||
| bus.makeTargetServiceIdByResourceUuid(dmsg, HostConstant.SERVICE_ID, volumeHostRef.refVO.getHostUuid()); | ||
| return dmsg; | ||
| }).collect(Collectors.toList()); | ||
| if (dmsgs.isEmpty()) { | ||
| return; | ||
| } | ||
| new While<>(dmsgs).each((umsg, com) -> bus.send(umsg, new CloudBusCallBack(com) { | ||
| @Override | ||
| public void run(MessageReply reply) { | ||
| if (!reply.isSuccess()) { | ||
| logger.warn(String.format("failed to detach volume on host, %s", reply.getError())); | ||
| } | ||
| com.done(); | ||
| } | ||
| })).run(new WhileDoneCompletion(null) { | ||
| @Override | ||
| public void done(ErrorCodeList errorCodeList) { | ||
| } | ||
| }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
detachDataVolume 方法存在多个严重问题。
问题 1 - 严重 BUG:仅分离单个卷(第 192 行)
Q.New(VolumeHostRefVO.class).eq(...).find() 仅返回一条记录。如果主机挂载了多个数据卷,只会分离其中一个,其余卷会被遗漏。
应该使用 list() 代替 find(),并配合 flatMap 处理多个卷。
问题 2 - 性能问题:循环中嵌套查询(第 191-198 行)
在 stream 的 map 中对每个主机执行两次数据库查询(第 192 行和第 196-197 行),形成 N+1 查询问题。违反编码规范:"禁止循环里套查询,避免嵌套查询带来的性能问题"。
建议批量查询所有相关的 VolumeHostRefVO 和 VolumeVO,再进行内存关联。
问题 3 - 异步执行未同步(第 212-224 行)
完成处理器为空(221-223 行),确认了"即发即弃"设计。结合第 90 行的调用方式,会导致卷分离与主机删除并发执行(已在 90-91 行标记)。
🔎 建议的完整修复方案
- private void detachDataVolume(List<HostInventory> hinvs) {
+ private void detachDataVolume(List<HostInventory> hinvs, Completion completion) {
+ // Batch query all VolumeHostRefVO for all hosts
+ List<String> hostUuids = hinvs.stream().map(HostInventory::getUuid).collect(Collectors.toList());
+ List<VolumeHostRefVO> refVOs = Q.New(VolumeHostRefVO.class)
+ .in(VolumeHostRefVO_.hostUuid, hostUuids)
+ .list();
+
+ if (refVOs.isEmpty()) {
+ completion.success();
+ return;
+ }
+
+ // Batch query all volumes
+ List<String> volumeUuids = refVOs.stream()
+ .map(VolumeHostRefVO::getVolumeUuid)
+ .collect(Collectors.toList());
+ List<VolumeVO> volumes = Q.New(VolumeVO.class)
+ .select(VolumeVO_.uuid, VolumeVO_.installPath)
+ .in(VolumeVO_.uuid, volumeUuids)
+ .list();
+
+ // Build volume install path map
+ Map<String, String> volumeInstallPathMap = volumes.stream()
+ .collect(Collectors.toMap(VolumeVO::getUuid, VolumeVO::getInstallPath));
+
+ // Build detach messages
- List<DetachDataVolumeFromHostMsg> dmsgs = hinvs.stream().map(hostInv -> {
- VolumeHostRefVO refVO = Q.New(VolumeHostRefVO.class).eq(VolumeHostRefVO_.hostUuid, hostInv.getUuid()).find();
- if (refVO == null) {
- return null;
- }
- String volumeInstallPath = Q.New(VolumeVO.class).select(VolumeVO_.installPath)
- .eq(VolumeVO_.uuid, refVO.getVolumeUuid()).findValue();
- return new VolumeHostRef(refVO, volumeInstallPath);
- }).filter(Objects::nonNull).map(volumeHostRef -> {
+ List<DetachDataVolumeFromHostMsg> dmsgs = refVOs.stream().map(refVO -> {
+ String volumeInstallPath = volumeInstallPathMap.get(refVO.getVolumeUuid());
+ if (volumeInstallPath == null) {
+ logger.warn(String.format("Cannot find install path for volume[uuid:%s]", refVO.getVolumeUuid()));
+ return null;
+ }
DetachDataVolumeFromHostMsg dmsg = new DetachDataVolumeFromHostMsg();
- dmsg.setHostUuid(volumeHostRef.refVO.getHostUuid());
- dmsg.setMountPath(volumeHostRef.refVO.getMountPath());
- dmsg.setDevice(volumeHostRef.refVO.getDevice());
- dmsg.setVolumeUuid(volumeHostRef.refVO.getVolumeUuid());
- dmsg.setVolumeInstallPath(volumeHostRef.volumeInstallPath);
- bus.makeTargetServiceIdByResourceUuid(dmsg, HostConstant.SERVICE_ID, volumeHostRef.refVO.getHostUuid());
+ dmsg.setHostUuid(refVO.getHostUuid());
+ dmsg.setMountPath(refVO.getMountPath());
+ dmsg.setDevice(refVO.getDevice());
+ dmsg.setVolumeUuid(refVO.getVolumeUuid());
+ dmsg.setVolumeInstallPath(volumeInstallPath);
+ bus.makeTargetServiceIdByResourceUuid(dmsg, HostConstant.SERVICE_ID, refVO.getHostUuid());
return dmsg;
- }).collect(Collectors.toList());
+ }).filter(Objects::nonNull).collect(Collectors.toList());
+
if (dmsgs.isEmpty()) {
+ completion.success();
return;
}
+
new While<>(dmsgs).each((umsg, com) -> bus.send(umsg, new CloudBusCallBack(com) {
@Override
public void run(MessageReply reply) {
if (!reply.isSuccess()) {
logger.warn(String.format("failed to detach volume on host, %s", reply.getError()));
}
com.done();
}
- })).run(new WhileDoneCompletion(null) {
+ })).run(new WhileDoneCompletion(completion) {
@Override
public void done(ErrorCodeList errorCodeList) {
+ // Detach failures are logged but don't fail the operation
+ // to ensure host deletion can proceed
+ completion.success();
}
});
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
compute/src/main/java/org/zstack/compute/host/HostCascadeExtension.java around
lines 190-225: the current detachDataVolume implementation only detaches one
VolumeHostRef per host (uses find()), performs N+1 queries inside the stream,
and fires async detach calls without properly waiting/error-aggregating; fix by:
1) replace the single-record query with a batch query — collect all host UUIDs
from hinvs, query VolumeHostRefVO.list() for those host UUIDs, and collect all
VolumeHostRefVOs (use flatMap semantics so multiple refs per host are handled);
2) batch-query VolumeVOs for all referenced volume UUIDs (one query) and build a
map volumeUuid->installPath, then build DetachDataVolumeFromHostMsg for every
VolumeHostRefVO using the map (no per-item DB queries); 3) send the messages
with the existing While<> but provide a proper WhileDoneCompletion that
aggregates errors (or returns a completion/future) so the caller can wait/handle
failures instead of "fire-and-forget" — at minimum log/collect errorCodeList in
done(...) and propagate or surface it to the caller (or change method signature
to return a Completion or throw on failure) so detach operations are
synchronized with host deletion.
| VmNicType type = nicManager.getVmNicType(spec.getVmInventory().getUuid(), nw); | ||
| if (type == null) { | ||
| errs.add(Platform.operr("there is no available nicType on L3 network [%s]", nw.getUuid())); | ||
| wcomp.allDone(); | ||
| } | ||
| VmInstanceNicFactory vnicFactory = vmMgr.getVmInstanceNicFactory(type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
控制流问题:wcomp.allDone() 后缺少 return 语句
当 type == null 时,代码调用了 wcomp.allDone() 但没有返回。这会导致后续代码继续执行,从而可能产生 NullPointerException(在第 116 行使用 type)。
🔎 建议修复
VmNicType type = nicManager.getVmNicType(spec.getVmInventory().getUuid(), nw);
if (type == null) {
errs.add(Platform.operr("there is no available nicType on L3 network [%s]", nw.getUuid()));
wcomp.allDone();
+ return;
}
VmInstanceNicFactory vnicFactory = vmMgr.getVmInstanceNicFactory(type);🤖 Prompt for AI Agents
In compute/src/main/java/org/zstack/compute/vm/VmAllocateNicFlow.java around
lines 111 to 116, the control flow calls wcomp.allDone() when type == null but
does not return, which allows execution to continue and use the null type;
modify the flow to immediately return after wcomp.allDone() (or alternatively
place the subsequent VmInstanceNicFactory lookup and usages inside an else
branch) so that no code runs when type is null and a NullPointerException cannot
occur.
| for (VolumeSpec vspec : volumeSpecs) { | ||
| CreateVolumeMsg msg = new CreateVolumeMsg(); | ||
| Set<String> tags = new HashSet<>(); | ||
| if (vspec != null && vspec.getTags() != null) { | ||
| tags.addAll(vspec.getTags()); | ||
| if (vspec != null) { | ||
| if (vspec.getTags() != null) { | ||
| tags.addAll(vspec.getTags()); | ||
| } | ||
| } else { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
严重:索引不匹配导致的数据关联错误
在 prepareMsg 方法中使用 continue 跳过 null 或非 Root/Data 类型的 VolumeSpec,但在 run 方法的回调中(第144行),直接使用 replies.indexOf(r) 从 spec.getVolumeSpecs() 获取 vspec,这会导致索引不匹配。
例如:如果 volumeSpecs 有 3 个元素 [A, null, B],msgs 只会有 2 个消息,但回调中 replies.indexOf(r) 返回 0 或 1 时,会错误地从 volumeSpecs 中取到 A 或 null,而不是正确的 A 或 B。
🔎 建议修复思路
应该在构建消息时同时维护一个有效的 VolumeSpec 列表,或者在消息中携带关联信息以便在回调中正确匹配:
protected List<CreateVolumeMsg> prepareMsg(Map<String, Object> ctx) {
+ List<VolumeSpec> validVolumeSpecs = new ArrayList<>();
// ...
for (VolumeSpec vspec : volumeSpecs) {
// ...
if (vspec == null) {
continue;
}
// ... build msg ...
+ validVolumeSpecs.add(vspec);
msgs.add(msg);
}
+ ctx.put("validVolumeSpecs", validVolumeSpecs);
return msgs;
}然后在 run 方法中使用 validVolumeSpecs 而非 spec.getVolumeSpecs()。
Committable suggestion skipped: line range outside the PR's diff.
| @@ -0,0 +1,9 @@ | |||
| ALTER TABLE `zstack`.`GuestOsCategoryVO` MODIFY COLUMN osRelease varchar(64) NOT NULL; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
修改列为 NOT NULL 未处理历史数据
根据编码规范,当添加 NOT NULL 约束时,必须使用存储过程或其他方式处理历史数据。如果 GuestOsCategoryVO 表中存在 osRelease 为 NULL 的记录,此 ALTER 语句将失败。
🔎 建议修复
+-- 先处理可能存在的 NULL 值
+UPDATE `zstack`.`GuestOsCategoryVO` SET osRelease = 'unknown' WHERE osRelease IS NULL;
+
ALTER TABLE `zstack`.`GuestOsCategoryVO` MODIFY COLUMN osRelease varchar(64) NOT NULL;或者使用存储过程检查并处理:
DELIMITER $$
CREATE PROCEDURE UpdateGuestOsCategoryOsRelease()
BEGIN
UPDATE `zstack`.`GuestOsCategoryVO` SET osRelease = 'unknown' WHERE osRelease IS NULL;
ALTER TABLE `zstack`.`GuestOsCategoryVO` MODIFY COLUMN osRelease varchar(64) NOT NULL;
END $$
DELIMITER ;
CALL UpdateGuestOsCategoryOsRelease();
DROP PROCEDURE IF EXISTS UpdateGuestOsCategoryOsRelease;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ALTER TABLE `zstack`.`GuestOsCategoryVO` MODIFY COLUMN osRelease varchar(64) NOT NULL; | |
| -- Handle possible existing NULL values | |
| UPDATE `zstack`.`GuestOsCategoryVO` SET osRelease = 'unknown' WHERE osRelease IS NULL; | |
| ALTER TABLE `zstack`.`GuestOsCategoryVO` MODIFY COLUMN osRelease varchar(64) NOT NULL; |
🤖 Prompt for AI Agents
In conf/db/upgrade/V4.7.0.3__schema.sql around line 1, the ALTER TABLE sets
osRelease to NOT NULL but does not handle existing NULL values; update existing
NULL osRelease rows to a safe default (e.g., 'unknown' or appropriate value)
before applying the NOT NULL constraint. Implement this as a transactional
migration: (1) update GuestOsCategoryVO rows where osRelease IS NULL to the
chosen default, (2) verify no NULLs remain, and (3) run the ALTER TABLE to
MODIFY COLUMN osRelease varchar(64) NOT NULL; optionally wrap the update + alter
in a stored procedure or transaction as per project conventions and drop the
procedure afterward if used.
| ALTER TABLE `zstack`.`BareMetal2BondingNicRefVO` modify column nicUuid varchar(32) DEFAULT NULL; | ||
| CALL ADD_COLUMN('BareMetal2BondingNicRefVO', 'provisionNicUuid', 'VARCHAR(32)', 1, NULL); | ||
| CALL ADD_CONSTRAINT('BareMetal2BondingNicRefVO', 'fkBareMetal2BondingNicRefVOProvisionNicVO', 'provisionNicUuid', 'BareMetal2InstanceProvisionNicVO', 'uuid', 'CASCADE'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# 查找ADD_COLUMN和ADD_CONSTRAINT存储过程定义
rg -n "CREATE.*PROCEDURE.*ADD_COLUMN" --type=sql -A 25 | head -60Repository: MatheMatrix/zstack
Length of output: 2150
🏁 Script executed:
# 查找ADD_CONSTRAINT存储过程定义
rg -n "CREATE.*PROCEDURE.*ADD_CONSTRAINT" --type=sql -A 25 | head -60Repository: MatheMatrix/zstack
Length of output: 2076
🏁 Script executed:
# 查看V4.7.13__schema.sql文件的完整内容和上下文
fd "V4.7.13__schema.sql" --type f -x cat -n {}Repository: MatheMatrix/zstack
Length of output: 1635
第25行应使用存储过程而非直接DDL修改,以确保幂等性
虽然nicUuid改为可空列是安全的向后兼容操作,但第25行直接使用ALTER TABLE修改列属性,在升级脚本重复执行时缺乏保护。根据ZStack最佳实践,应创建存储过程(如MODIFY_COLUMN)来检查列的当前状态,仅在必要时执行修改。
第26-27行做法正确:
ADD_COLUMN存储过程通过INFORMATION_SCHEMA.COLUMNS检查列存在性(beforeMigrate.sql:160-164),确保幂等性ADD_CONSTRAINT存储过程通过INFORMATION_SCHEMA.TABLE_CONSTRAINTS检查约束存在性(beforeMigrate.sql:197-201),确保幂等性
建议将第25行重构为类似的存储过程调用模式。
🤖 Prompt for AI Agents
In conf/db/upgrade/V4.7.13__schema.sql around lines 25-27, the direct ALTER
TABLE on line 25 should be replaced with a stored-procedure based change to
ensure idempotency; update the script to call the existing MODIFY_COLUMN (or add
a similar stored procedure) that checks INFORMATION_SCHEMA for the current
nicUuid column definition and only performs the nullable change if it differs,
mirroring how ADD_COLUMN and ADD_CONSTRAINT protect against repeat runs, so the
upgrade becomes safe to re-run without error.
| ALTER TABLE `zstack`.`L2NetworkEO` ADD COLUMN `isolated` boolean NOT NULL DEFAULT FALSE AFTER `virtualNetworkId`; | ||
| ALTER TABLE `zstack`.`L2NetworkEO` ADD COLUMN `pvlan` varchar(128) DEFAULT NULL AFTER `virtualNetworkId`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
列位置定义错误,需要修正。
第2行的 pvlan 列定义为 AFTER 'virtualNetworkId',但第1行已经在 virtualNetworkId 之后添加了 isolated 列。当这两条语句顺序执行时,MySQL 会将 pvlan 插入到 virtualNetworkId 和 isolated 之间,最终列顺序为:virtualNetworkId, pvlan, isolated, ...
然而第4行的视图定义显示列的顺序应该是 isolated, pvlan,这表明原意是 pvlan 应该在 isolated 之后。
🔎 建议的修复
ALTER TABLE `zstack`.`L2NetworkEO` ADD COLUMN `isolated` boolean NOT NULL DEFAULT FALSE AFTER `virtualNetworkId`;
-ALTER TABLE `zstack`.`L2NetworkEO` ADD COLUMN `pvlan` varchar(128) DEFAULT NULL AFTER `virtualNetworkId`;
+ALTER TABLE `zstack`.`L2NetworkEO` ADD COLUMN `pvlan` varchar(128) DEFAULT NULL AFTER `isolated`;🤖 Prompt for AI Agents
In conf/db/upgrade/V4.8.0.6__schema.sql around lines 1-2, the ALTER statements
add isolated then pvlan both AFTER `virtualNetworkId`, which yields column order
`virtualNetworkId, pvlan, isolated` but the view expects `isolated, pvlan`;
change the second ALTER to add pvlan AFTER `isolated` (or swap the two ALTERs so
pvlan is added after isolated) so the final column order is `virtualNetworkId,
isolated, pvlan` and matches the view definition.
| CONSTRAINT `fkEthernetVfPciDeviceVOL3NetworkEO` FOREIGN KEY (`l3NetworkUuid`) REFERENCES `L3NetworkEO` (`uuid`) ON DELETE SET NULL | ||
| ) ENGINE=InnoDB DEFAULT CHARSET=utf8; | ||
|
|
||
| ALTER TABLE ConsoleProxyVO ADD COLUMN `expiredDate` timestamp NOT NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConsoleProxyVO.expiredDate 以 NOT NULL 形式直接新增,升级时可能失败
当前语句:
ALTER TABLE ConsoleProxyVO ADD COLUMN `expiredDate` timestamp NOT NULL;在 MySQL 上若 ConsoleProxyVO 已有历史数据,这种“NOT NULL 且无默认值”的新增列在严格 SQL 模式下会直接失败,且不符合“NOT NULL 时需要用存储过程/函数处理历史数据”的升级规范。
建议改为两步或三步走,例如:
- 先以可为空形式新增列并为历史数据回填:
ALTER TABLE ConsoleProxyVO ADD COLUMN `expiredDate` timestamp NULL; UPDATE ConsoleProxyVO SET expiredDate = CURRENT_TIMESTAMP WHERE expiredDate IS NULL;
- 如确需强制非空,再修改为 NOT NULL:
ALTER TABLE ConsoleProxyVO MODIFY COLUMN `expiredDate` timestamp NOT NULL;
这样既满足历史数据处理要求,又避免升级脚本在已有环境上直接报错中断。
🤖 Prompt for AI Agents
In conf/db/upgrade/V5.0.0__schema.sql around line 81, adding
ConsoleProxyVO.expiredDate as NOT NULL directly will fail on MySQL with existing
rows; instead add the column as nullable, backfill existing rows with an
appropriate value (e.g. CURRENT_TIMESTAMP or other business-appropriate default)
and then alter the column to NOT NULL; implement this as a two-step (ADD COLUMN
... NULL; UPDATE ... SET expiredDate = ... WHERE expiredDate IS NULL) followed
by a MODIFY/ALTER to NOT NULL to avoid upgrade failures on strict SQL mode.
| DECLARE cur CURSOR FOR SELECT pci.uuid, pci.vendor FROM PciDeviceVO pci; | ||
| DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE; | ||
|
|
||
| OPEN cur; | ||
| read_loop: LOOP | ||
| FETCH cur INTO vendor, pciDeviceUuid; | ||
| IF done THEN | ||
| LEAVE read_loop; | ||
| END IF; | ||
| UPDATE MdevDeviceVO SET vendor = vendor WHERE parentUuid = pciDeviceUuid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
严重错误:FETCH 变量顺序与 SELECT 列顺序不匹配
游标定义为 SELECT pci.uuid, pci.vendor FROM PciDeviceVO pci,但 FETCH 语句为 FETCH cur INTO vendor, pciDeviceUuid,顺序相反。这会导致 uuid 被赋给 vendor,vendor 被赋给 pciDeviceUuid,最终 UPDATE 语句会将 uuid 值写入 vendor 字段。
🔎 建议修复
- DECLARE cur CURSOR FOR SELECT pci.uuid, pci.vendor FROM PciDeviceVO pci;
+ DECLARE cur CURSOR FOR SELECT pci.vendor, pci.uuid FROM PciDeviceVO pci;或修改 FETCH 语句:
- FETCH cur INTO vendor, pciDeviceUuid;
+ FETCH cur INTO pciDeviceUuid, vendor;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| DECLARE cur CURSOR FOR SELECT pci.uuid, pci.vendor FROM PciDeviceVO pci; | |
| DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE; | |
| OPEN cur; | |
| read_loop: LOOP | |
| FETCH cur INTO vendor, pciDeviceUuid; | |
| IF done THEN | |
| LEAVE read_loop; | |
| END IF; | |
| UPDATE MdevDeviceVO SET vendor = vendor WHERE parentUuid = pciDeviceUuid; | |
| DECLARE cur CURSOR FOR SELECT pci.uuid, pci.vendor FROM PciDeviceVO pci; | |
| DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE; | |
| OPEN cur; | |
| read_loop: LOOP | |
| FETCH cur INTO pciDeviceUuid, vendor; | |
| IF done THEN | |
| LEAVE read_loop; | |
| END IF; | |
| UPDATE MdevDeviceVO SET vendor = vendor WHERE parentUuid = pciDeviceUuid; |
🤖 Prompt for AI Agents
In conf/db/upgrade/V5.1.8__schema.sql around lines 160 to 169 the cursor SELECT
lists columns as (pci.uuid, pci.vendor) but the FETCH uses (vendor,
pciDeviceUuid) in reverse order; update the FETCH to match the SELECT order
(FETCH cur INTO pciDeviceUuid, vendor) or alternatively change the SELECT to
(pci.vendor, pci.uuid) so the fetched variables align, then keep the UPDATE as
UPDATE MdevDeviceVO SET vendor = vendor WHERE parentUuid = pciDeviceUuid.
Resolves: ZSTAC-79460
Change-Id: I6b77656f6668796677626b6c6b7361696e6a6274
sync from gitlab !8973