refactor: build pipeline with template + lang pack + GPU ID automation#2
refactor: build pipeline with template + lang pack + GPU ID automation#2
Conversation
Separate language packs and GPU ID database from script business logic.
Scripts are now templates in src/ with placeholders, built into
self-contained dist/ scripts via scripts/build.py.
- src/: template scripts with {{LANG_PACKS}} and {{GPU_IDS}} placeholders
- lang/: per-script language packs (ZH_CN, EN_US)
- data/gpu-ids.sh: auto-generated GPU PCI ID database (857 IDs, up from 294)
- scripts/build.py: template + lang + data → dist/ merger with validation
- scripts/extract_keys.py: gettext key analysis and translation coverage
- scripts/update_gpu_ids.py: fetch pci-ids.ucw.cz and generate GPU ID mapping
- .github/workflows/: CI (build+validate), release (build+upload), GPU ID auto-update
- Remove root-level scripts, old language_packs/, language_pack_generate.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of ChangesHello @pescn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the project's build and localization processes, enhancing maintainability, scalability, and accuracy. By introducing a template-based build pipeline, language packs are now dynamically embedded into scripts, and the GPU ID database is automatically updated, reducing manual effort and potential errors. The new structure clearly separates business logic from localization and data, streamlining development and ensuring consistency across various installation scripts. Highlights
Changelog
Ignored Files
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
The lang pack syntax validation wrote temp files to outdir before it was created, causing FileNotFoundError in CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This is a major and well-executed refactoring. The introduction of a build pipeline with templates, language packs, and data separation greatly improves the maintainability and scalability of the project. The automation of GPU ID updates is a fantastic addition that will reduce manual effort and improve accuracy. The new CI/CD workflows will ensure code quality and streamline releases. My review includes a few suggestions for improvement, including a bug fix in one of the script templates and an optimization for the GPU ID database generation.
- Add missing __addrepo_ol__() for Oracle Linux CUDA repo setup - Add Hopper to is_open_module_supported(), default unknown arch to open module - Fix local var=$(cmd) antipattern masking exit codes in 3 files - Add DEVICE_ID_RE validation in update_gpu_ids.py for PCI ID integrity - Support GETTEXT_DYNAMIC annotations for dynamic gettext key checking - Atomic file writes in build.py (temp file + rename) - Strip whitespace in --langs argument parsing - Remove dead final.summary.header key from cuda-install lang packs - Consolidate extract_keys.py to reuse build.py extraction logic - Fix source URL in generated gpu-ids.sh header - Pin Python to 3.12 in all GitHub Actions workflows - Add --check-keys step to release workflow - Fix cleanup_temp_files() to handle mktemp -d directories - Fix PLAN.md regex typo and EN_US.sh Chinese comment Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (5)
📝 WalkthroughWalkthrough本拉取请求引入了基于模板的构建系统:新增构建与校验脚本、语言包目录、GPU ID 数据源及生成器、CI/Release/更新工作流,并用模板(src/.sh)+语言包(lang/)+数据(data/)在构建时生成可发布的 dist/.sh;同时移除了若干旧的嵌入式翻译与生成工具及旧版 cuda-install.sh。 Changes
序列图sequenceDiagram
participant User as 用户/CI
participant Builder as build.py
participant Template as 模板 (src/*.sh)
participant LangPack as 语言包 (lang/*/*.sh)
participant Data as 数据 (data/*.sh)
participant Output as 输出 (dist/*.sh)
User->>Builder: 触发构建 (--target / CI / Release)
Builder->>Template: 读取模板并提取 gettext 键
Builder->>LangPack: 发现并读取匹配语言包,提取键
Builder->>Data: 读取相关数据文件 (gpu-ids)
Builder->>Builder: 校验键覆盖完整性 (check-keys)
Builder->>Template: 用 LangPack 内容替换 {{LANG_PACKS}}
Builder->>Template: 用 Data 内容替换 {{GPU_IDS}}
Builder->>Builder: 可选 bash -n 语法检查
Builder->>Output: 写入 dist/*.sh 并设为可执行
Output-->>User: 发布产物 / CI 报告
代码审查工作量评估🎯 5 (Critical) | ⏱️ ~120 分钟 可能相关的问题
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (20)
lang/cuda-install/ZH_CN.sh (1)
1-3: 建议添加 shell 指令注释以消除静态分析警告。此文件作为语言包被
source引入,无需 shebang,但 ShellCheck 会报 SC2148 警告。添加 shell 指令可消除该误报。建议
+# shellcheck shell=bash declare -A LANG_PACK_ZH_CN🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lang/cuda-install/ZH_CN.sh` around lines 1 - 3, Add a ShellCheck directive comment at the top of the file to silence SC2148 for this sourced language pack: insert a line like a shell directive comment (e.g. "# shellcheck shell=bash") before the existing declare -A LANG_PACK_ZH_CN and LANG_PACK_ZH_CN=( lines so ShellCheck recognizes the intended shell without adding a shebang; keep the rest of the file unchanged.scripts/extract_keys.py (2)
79-101:show_all()不返回失败状态。当存在缺失翻译键时,
show_all()只打印信息但不传递失败状态给调用方。如果未来将--all用于自动化验证,可能需要汇总结果并返回布尔值。目前作为开发者工具使用是可以的。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/extract_keys.py` around lines 79 - 101, The show_all() function currently only prints missing-key information and never returns a failure status; change show_all() to return a boolean (e.g., True when all templates have complete lang packs, False if any missing keys) and propagate detection of missing translations into that boolean by: 1) tracking a local flag (e.g., has_failures) inside show_all while iterating templates/keys/lang_dir; 2) when lang_dir is missing or when calling show_diff(template, lang_file) detect/receive a failure signal (modify show_diff to return a bool or raise a specific exception) and set the flag; and 3) return the flag (or its negation) at the end so callers can sys.exit(1) or use the result in CI. Update callers to check the returned boolean and exit with a nonzero code when false.
23-23:from build import ...依赖隐式的sys.path行为。当通过
python3 scripts/extract_keys.py运行时,Python 会自动将scripts/加入sys.path,因此from build import ...能正确解析为scripts/build.py。但如果从其他路径调用或作为模块导入时可能会失败。可考虑使用显式路径操作使其更健壮:
建议
-from build import extract_gettext_keys, extract_lang_pack_keys +# Ensure scripts/ is on the path so we can import build.py +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from build import extract_gettext_keys, extract_lang_pack_keys🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/extract_keys.py` at line 23, The import "from build import extract_gettext_keys, extract_lang_pack_keys" relies on implicit sys.path behavior and can fail when run from other CWDs or as a module; update scripts/extract_keys.py to perform an explicit, robust import: either use a package-relative import ("from .build import extract_gettext_keys, extract_lang_pack_keys") if the scripts folder is a package, or programmatically resolve the script directory via Path(__file__).resolve().parent and insert it into sys.path (or use importlib.machinery.SourceFileLoader) before importing those symbols so extract_gettext_keys and extract_lang_pack_keys are reliably loaded..github/workflows/ci.yml (1)
23-27: 建议:为 shell glob 添加防护。如果
dist/为空或不存在,dist/*.sh不会展开,bash 会尝试对字面字符串dist/*.sh执行语法检查并报错。虽然这在构建失败时能间接暴露问题,但错误信息会令人困惑。建议改进
- name: Syntax check run: | + shopt -s nullglob + files=(dist/*.sh) + if [ ${`#files`[@]} -eq 0 ]; then + echo "ERROR: No .sh files found in dist/" + exit 1 + fi - for f in dist/*.sh; do + for f in "${files[@]}"; do bash -n "$f" echo "OK: $f" done🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/ci.yml around lines 23 - 27, The for-loop using the glob dist/*.sh (see "for f in dist/*.sh", "bash -n \"$f\"", "echo \"OK: $f\"") will pass the literal pattern when no files exist; update the CI step to guard the glob before attempting syntax-checks — either enable bash nullglob for that shell run or add a pre-check (e.g., verify files exist via a glob check/compgen or test -e) and skip the loop when there are no matches so bash -n never receives the literal pattern.src/cuda-install.sh (3)
42-56:EXIT_SUCCESS、EXIT_NETWORK_FAILED、EXIT_REPO_ADD_FAILED已定义但未在脚本中使用。静态分析(SC2034)标记了这些常量。如果是为将来扩展保留的,建议添加注释说明;如果确实不需要,可以移除以减少噪声。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuda-install.sh` around lines 42 - 56, Remove or document the unused exit-code constants to silence SC2034: either delete EXIT_SUCCESS, EXIT_NETWORK_FAILED, and EXIT_REPO_ADD_FAILED if they are not used anywhere, or add a brief comment above their declarations (e.g., "reserved for future use") to indicate they are intentionally unused; update the block where EXIT_* constants are defined so the intent is clear for readers and static analysis tools referencing the constants EXIT_SUCCESS, EXIT_NETWORK_FAILED, and EXIT_REPO_ADD_FAILED.
278-294: 锁文件的 check-then-write 操作存在 TOCTOU 竞争条件。Line 281 检查
$lock_file是否存在,Line 292 才写入 PID。在检查与写入之间如果有另一个实例启动,两个进程可能同时认为没有锁并都写入 PID,导致并发冲突。可以考虑使用
mkdir作为原子锁操作(mkdir "$lock_dir"失败即表示锁已存在),或使用flock。♻️ 使用 flock 的示例
create_install_lock() { local lock_file="$STATE_DIR/.install.lock" - - if [[ -f "$lock_file" ]]; then - local lock_pid - lock_pid=$(cat "$lock_file" 2>/dev/null) - if [[ -n "$lock_pid" ]] && kill -0 "$lock_pid" 2>/dev/null; then - exit_with_code $EXIT_STATE_FILE_CORRUPTED "$(gettext "state.lock.another_running") $lock_pid)" - else - warn "$(gettext "state.lock.cleaning_orphaned")" - rm -f "$lock_file" - fi - fi - - echo $$ > "$lock_file" + exec 9>"$lock_file" + if ! flock -n 9; then + local lock_pid + lock_pid=$(cat "$lock_file" 2>/dev/null) + exit_with_code $EXIT_STATE_FILE_CORRUPTED "$(gettext "state.lock.another_running") $lock_pid)" + fi + echo $$ >&9 debug "$(gettext "state.lock.created") $lock_file (PID: $$)" }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuda-install.sh` around lines 278 - 294, The current create_install_lock function has a TOCTOU race between checking and writing $lock_file; replace the check-then-write with an atomic lock operation: either attempt to create a lock directory (e.g. mkdir on a dedicated lock dir under STATE_DIR) and treat mkdir failure as “another instance running” (call exit_with_code $EXIT_STATE_FILE_CORRUPTED with the other PID read from the existing lock if present), or open the lock file and acquire an exclusive flock (use exec to open a file descriptor, then flock -n; on failure call exit_with_code), and only write the current PID into the file after the atomic acquisition; keep existing warn and debug calls (warn "$(gettext "state.lock.cleaning_orphaned")" and debug "$(gettext "state.lock.created") ...") and ensure cleanup removes the lock dir/file on exit.
871-912:help函数名覆盖了 bash 内置命令help。虽然在此脚本上下文中不太可能调用 bash 内置
help,但建议重命名为show_help或usage以避免潜在的命名冲突。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuda-install.sh` around lines 871 - 912, The function named help shadows the bash builtin; rename the function (e.g., to show_help or usage) and update every place that calls help to the new name (preserve behavior such as printing the heredoc and exit 0), ensuring any exported/used symbols or traps referencing help are updated as well so no callers remain pointing to the old help function.scripts/update_gpu_ids.py (2)
150-165:classify_device的device_id参数未被使用。Ruff (ARG001) 正确标记了此问题。当前函数仅依赖
device_name进行分类。如果此参数是为将来扩展预留的(如基于 ID 范围的分类),建议添加_前缀以明确表达意图。♻️ 明确未使用意图
-def classify_device(device_id: str, device_name: str) -> str | None: +def classify_device(_device_id: str, device_name: str) -> str | None:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/update_gpu_ids.py` around lines 150 - 165, The parameter device_id in classify_device is unused (Ruff ARG001); rename it to _device_id in the function signature to indicate it's intentionally unused and silence the linter, keeping the rest of the logic that uses device_name, _NON_GPU_RE, CHIP_CODE_RE and CHIP_TO_ARCH unchanged; also update any callers if they rely on a positional/keyword name to pass the ID.
330-334: 文件写入非原子操作,中断可能导致data/gpu-ids.sh损坏。
write_text()直接写入目标文件。如果写入过程中脚本被中断(如网络超时后重试、SIGINT),可能生成不完整的文件。建议先写入临时文件,再通过os.replace()原子替换。♻️ 原子写入
+ import tempfile # Write to file output_path = Path(args.output) if args.output else OUTPUT_FILE output_path.parent.mkdir(parents=True, exist_ok=True) - output_path.write_text(output, encoding="utf-8") + fd, tmp_path = tempfile.mkstemp( + dir=str(output_path.parent), suffix=".tmp", prefix=".gpu-ids-" + ) + try: + with os.fdopen(fd, "w", encoding="utf-8") as f: + f.write(output) + os.replace(tmp_path, str(output_path)) + except BaseException: + os.unlink(tmp_path) + raise print(f"\nWritten to {output_path} ({len(output.splitlines())} lines)")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/update_gpu_ids.py` around lines 330 - 334, The current non-atomic write using output_path.write_text(output, encoding="utf-8") can leave data/gpu-ids.sh corrupted if interrupted; modify the write to first write to a temporary file in the same directory (e.g., output_path.with_suffix(".tmp") or using tempfile.NamedTemporaryFile) and then atomically replace the target using os.replace(temp_path, output_path); ensure parent dir creation remains (output_path.parent.mkdir(...)) and update the final print to reference output_path after the replace so the behavior of OUTPUT_FILE/output_path is preserved..github/workflows/update-gpu-ids.yml (2)
28-35:$GITHUB_OUTPUT和$GITHUB_STEP_SUMMARY应使用双引号。actionlint/shellcheck 标记了 SC2086。虽然在 GitHub Actions 环境中通常不会有问题,但引用这些变量是 shell 最佳实践,可防止路径中包含空格时的意外行为。
♻️ 添加双引号
- name: Check for changes id: diff run: | - if git diff --quiet data/gpu-ids.sh; then - echo "changed=false" >> $GITHUB_OUTPUT + if git diff --quiet -- data/gpu-ids.sh; then + echo "changed=false" >> "$GITHUB_OUTPUT" else - echo "changed=true" >> $GITHUB_OUTPUT - echo "## GPU ID Database Changes" >> $GITHUB_STEP_SUMMARY - git diff --stat data/gpu-ids.sh >> $GITHUB_STEP_SUMMARY + echo "changed=true" >> "$GITHUB_OUTPUT" + echo "## GPU ID Database Changes" >> "$GITHUB_STEP_SUMMARY" + git diff --stat data/gpu-ids.sh >> "$GITHUB_STEP_SUMMARY" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/update-gpu-ids.yml around lines 28 - 35, Wrap the GitHub Actions output and step summary variable expansions in double quotes to avoid SC2086: change all occurrences of $GITHUB_OUTPUT and $GITHUB_STEP_SUMMARY used as redirection targets (e.g., echo "changed=false" >> $GITHUB_OUTPUT, echo "## GPU ID Database Changes" >> $GITHUB_STEP_SUMMARY, git diff --stat data/gpu-ids.sh >> $GITHUB_STEP_SUMMARY) to use quoted expansions (>> "$GITHUB_OUTPUT" and >> "$GITHUB_STEP_SUMMARY") so the shell treats them safely if they contain spaces or special characters.
37-51: 将peter-evans/create-pull-request固定到 commit SHA 以增强供应链安全当前使用
@v7标签引用。GitHub 官方安全加固指南建议将第三方 action 固定到完整的 commit SHA,这是防止标签被篡改的供应链攻击的主要防护措施。对于具有contents:write和pull-requests:write权限的工作流尤为重要。建议将
@v7替换为@22a9089034f40e5a961c8808d113e2c98fb63676(v7.0.11),并在注释中保留人类可读的版本号以便维护:uses: peter-evans/create-pull-request@22a9089034f40e5a961c8808d113e2c98fb63676 # v7.0.11🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/update-gpu-ids.yml around lines 37 - 51, Replace the loose action tag "peter-evans/create-pull-request@v7" with the fixed commit SHA "peter-evans/create-pull-request@22a9089034f40e5a961c8808d113e2c98fb63676" in the workflow step that uses the Create PR action (the step shown as uses: peter-evans/create-pull-request@v7), and add an inline comment after the uses reference containing the human-readable version "v7.0.11" so maintainers can see the version while the action is pinned to the specific commit for supply-chain security.scripts/build.py (4)
106-141:check_keys对于使用 gettext 但没有语言包目录的模板仅输出警告而不报错。当前行为:如果模板引用了 gettext key 但
lang/<script>/目录不存在,仅打印 Warning(第 120 行)而all_ok保持为True。这意味着--check-keys不会因此失败。如果这是有意为之(允许开发中的模板),建议在注释中说明设计意图。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/build.py` around lines 106 - 141, check_keys currently logs a warning when a template uses gettext but find_lang_packs returns empty, leaving all_ok true so --check-keys doesn't fail; change this to treat missing language packs as an error by setting all_ok = False (and optionally change the warning message to an error) when lang_packs is empty so the function returns False; update the behavior in check_keys around the find_lang_packs branch (the block that prints "Warning: {script_name} uses gettext but has no language packs") to set all_ok = False and emit an error-level message instead.
76-95:extract_gettext_keys的正则可能遗漏转义引号的情况。当前正则
r'gettext\s+"([^"]+)"'和r"gettext\s+'([^']+)'"不处理包含转义引号的 key(如gettext "key\"with\"quotes")。如果模板中不会出现这种情况则无需修改,但建议在文档或注释中明确这一约束。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/build.py` around lines 76 - 95, The regexes in extract_gettext_keys miss keys containing escaped quotes; update the double- and single-quoted patterns to accept backslash-escaped characters (e.g. use r'gettext\s+"((?:[^"\\]|\\.)*)"' and r"gettext\s+'((?:[^'\\]|\\.)*)'"), then unescape the captured string (replace backslash-escaped quotes and other escapes) before checking startswith("$") and adding to the keys set; keep the existing dynamic-annotation handling unchanged.
39-51:find_templates直接调用sys.exit(1)会阻碍函数的可测试性。
find_templates和其他函数在遇到错误时直接调用sys.exit(1),这使得该函数难以在单元测试中使用。考虑抛出异常,由main()统一处理退出逻辑。不过作为 CLI 构建工具,当前的方式也是可接受的。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/build.py` around lines 39 - 51, The function find_templates currently calls sys.exit(1) on errors which prevents unit testing; change find_templates to raise explicit exceptions (e.g., FileNotFoundError or a custom TemplateNotFoundError) instead of calling sys.exit, update any callers (notably main) to catch these exceptions and call sys.exit(1) there with the appropriate error message, and ensure the error messages previously printed to stderr are preserved when handling the exception in main.
170-189: 语言包文件被重复读取。第 173 行通过
read_file(p)读取所有语言包内容用于拼接,但第 178 行在语法检查时又对每个语言包调用了read_file(pack_path)。可以缓存读取结果以避免重复 I/O。♻️ 建议缓存语言包内容
if LANG_PACKS_PLACEHOLDER in template_content: lang_packs = find_lang_packs(script_name, langs) if lang_packs: - lang_content = "\n".join(read_file(p) for p in lang_packs) + pack_contents = {p: read_file(p) for p in lang_packs} + lang_content = "\n".join(pack_contents[p] for p in lang_packs) # Validate each lang pack syntax if not skip_syntax_check: for pack_path in lang_packs: # Wrap in a function to make it parseable - test_script = f"#!/bin/bash\n{read_file(pack_path)}\n" + test_script = f"#!/bin/bash\n{pack_contents[pack_path]}\n" test_file = outdir / f".syntax_test_{pack_path.name}"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/build.py` around lines 170 - 189, The code currently reads each language pack twice (once in the join for lang_content and again during syntax validation); cache the results from read_file by building a mapping like lang_contents = {p: read_file(p) for p in lang_packs} (or a list of tuples) and then use those cached strings both when creating lang_content (join cached values) and when constructing test_script for validate_syntax; update the block that references lang_packs, read_file, validate_syntax, test_script and test_file to use the cached content and avoid repeat I/O, preserving the skip_syntax_check flow and file unlink behavior.src/nvidia-install.sh (5)
1953-1953: 嵌套命令替换缺少引号导致 word splitting。log_step "... ($(if $USE_OPEN_MODULES; then echo $(gettext "nvidia_driver.type.open"); else echo $(gettext "nvidia_driver.type.proprietary"); fi), $INSTALL_TYPE)..."内层
echo $(gettext ...)缺少引号,翻译文本中的空格会导致 word splitting。同样的问题出现在第 2307、2492-2497 等多行。实际影响较小(echo会用空格重新拼接),但建议统一加上引号以保持一致性。♻️ 示例修复
-log_step "$(gettext "nvidia_driver.install.starting") ($(if $USE_OPEN_MODULES; then echo $(gettext "nvidia_driver.type.open"); else echo $(gettext "nvidia_driver.type.proprietary"); fi), $INSTALL_TYPE)..." +log_step "$(gettext "nvidia_driver.install.starting") ($(if $USE_OPEN_MODULES; then echo "$(gettext "nvidia_driver.type.open")"; else echo "$(gettext "nvidia_driver.type.proprietary")"; fi), $INSTALL_TYPE)..."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nvidia-install.sh` at line 1953, The command substitution inside the log_step call can perform word-splitting because the inner echo uses an unquoted $(gettext ...); update the nested substitutions so translated strings are quoted (e.g. use "$(gettext "nvidia_driver.type.open")" and "$(gettext "nvidia_driver.type.proprietary")") and similarly quote any other echo $(gettext ...) occurrences (seen near lines with USE_OPEN_MODULES, log_step, and the INSTALL_TYPE usage) to avoid spurious word splitting while preserving the original behavior.
44-98: 多个EXIT_*常量已声明但未在脚本中使用。Shellcheck 标记了大量未使用的退出码常量,包括
EXIT_SUCCESS、EXIT_PERMISSION_DENIED、EXIT_GPU_ARCH_INCOMPATIBLE、EXIT_NETWORK_FAILED、EXIT_REPO_DOWNLOAD_FAILED等。这些常量被定义但从未被引用。如果它们是为未来功能预留的,建议添加注释说明;否则应移除以减少代码噪音。作为
readonly常量,Shellcheck 的 SC2034 告警在这里是合理的。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nvidia-install.sh` around lines 44 - 98, Many EXIT_* readonly constants (e.g. EXIT_SUCCESS, EXIT_PERMISSION_DENIED, EXIT_GPU_ARCH_INCOMPATIBLE, EXIT_NETWORK_FAILED, EXIT_REPO_DOWNLOAD_FAILED) are defined but never used; either remove unused declarations from the readonly block or explicitly mark them as intentionally reserved to silence ShellCheck (add a brief comment like "reserved for future use" above the block and apply a ShellCheck disable SC2034 for those symbols). Update the readonly declarations in the existing constants block accordingly so the intent is clear and ShellCheck warnings are addressed.
831-862: GPU 检测循环中多处local var=$(cmd)模式掩盖了命令返回值。Shellcheck SC2155 指出了多处问题,其中第 839 行最为关键:
local device_id=$(lspci -s "$pci_address" -nn | grep -oP '10de:\K[0-9a-fA-F]{4}' | tr '[:upper:]' '[:lower:]')如果
lspci -s执行失败,local声明会将返回码覆盖为 0,错误被静默忽略。在 GPU 检测这种关键路径中,建议将声明和赋值分开。♻️ 建议修复示例
- local pci_address=$(echo "$line" | awk '{print $1}') - local device_id=$(lspci -s "$pci_address" -nn | grep -oP '10de:\K[0-9a-fA-F]{4}' | tr '[:upper:]' '[:lower:]') + local pci_address + pci_address=$(echo "$line" | awk '{print $1}') + local device_id + device_id=$(lspci -s "$pci_address" -nn | grep -oP '10de:\K[0-9a-fA-F]{4}' | tr '[:upper:]' '[:lower:]')🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nvidia-install.sh` around lines 831 - 862, The device detection loop currently uses patterns like `local device_id=$(...)` which masks the command's exit status; separate declaration and assignment for variables like `device_id` (and any other `local var=$(cmd)` such as `gpu_info`/`pci_address`) so failures aren't silenced: declare `local device_id` first, then run `device_id=$(lspci -s "$pci_address" -nn | grep -oP '10de:\K[0-9a-fA-F]{4}' | tr '[:upper:]' '[:lower:]')` and immediately check the command exit code or `-n "$device_id"`; if the command fails, log an error and set `has_incompatible_gpu=true` accordingly before calling `detect_gpu_architecture` or `is_open_module_supported` (respect `USE_OPEN_MODULES` flow) to avoid proceeding with invalid data.
402-459:show_usage帮助文本硬编码为中文,未使用 gettext 国际化。尽管脚本支持 zh_CN 和 en_US 两种语言,但
show_usage中所有帮助文本(选项说明、示例等)都是纯中文。当用户通过--lang en_US --help使用时,仍然会看到中文帮助信息。第 502 行的注释说明
pre_parse_arguments在语言选择之前运行因此不需要 i18n,但show_usage同样被parse_arguments(第 520 行,在语言选择之后)调用。建议至少在未来迭代中将帮助文本纳入国际化。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nvidia-install.sh` around lines 402 - 459, The help text in show_usage() is hardcoded Chinese and not internationalized; modify show_usage to use gettext (e.g., wrap all user-visible strings with _() or call a language lookup) and ensure gettext is initialized (bindtextdomain/textdomain) before parse_arguments may call show_usage; alternatively move language selection earlier or have pre_parse_arguments set LANG/initialize i18n so show_usage prints translated text for en_US and zh_CN; update message keys for all option descriptions, examples and notes in show_usage and add corresponding translations for en_US.
256-258: 信号处理中 INT/TERM 与 EXIT trap 的交互会导致清理函数执行两次。当收到 INT 信号时:
- INT trap 执行
cleanup_on_exit INT,调用清理函数后exit 130exit 130触发 EXIT trap,再次执行cleanup_on_exit EXIT,清理函数再次运行当前清理函数(
cleanup_temp_files、cleanup_lock_files)是幂等的,所以不会造成错误,但存在冗余执行。可以通过设置一个标志位避免重复清理。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nvidia-install.sh` around lines 256 - 258, The INT/TERM and EXIT traps can cause cleanup_on_exit to run twice; change cleanup_on_exit to be idempotent by adding a shell-level guard (e.g. a CLEANUP_DONE flag) that returns immediately if set, and set that flag at the start of cleanup_on_exit before calling cleanup_temp_files and cleanup_lock_files; ensure cleanup_on_exit still performs the appropriate exit code behavior after setting the flag so subsequent EXIT/INT/TERM triggers do nothing extra.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (24)
.github/workflows/ci.yml.github/workflows/release.yml.github/workflows/update-gpu-ids.yml.gitignorePLAN.mdcuda-install.shdata/gpu-ids-overrides.shdata/gpu-ids.shlang/cuda-install/EN_US.shlang/cuda-install/ZH_CN.shlang/nvidia-install/EN_US.shlang/nvidia-install/ZH_CN.shlanguage_pack_generate.pylanguage_packs/EN_US.potlanguage_packs/ZH_CN.potnvidia-install.shscripts/build.pyscripts/extract_keys.pyscripts/update_gpu_ids.pysrc/cuda-install.shsrc/nvidia-container-installer.shsrc/nvidia-container-uninstall.shsrc/nvidia-install.shsrc/nvidia-uninstall.sh
💤 Files with no reviewable changes (4)
- cuda-install.sh
- language_packs/EN_US.pot
- language_packs/ZH_CN.pot
- language_pack_generate.py
🧰 Additional context used
🧬 Code graph analysis (2)
scripts/build.py (1)
scripts/extract_keys.py (1)
main(103-143)
src/nvidia-container-uninstall.sh (1)
src/nvidia-uninstall.sh (3)
log_error(53-55)parse_arguments(107-141)log_info(41-43)
🪛 actionlint (1.7.10)
.github/workflows/update-gpu-ids.yml
[error] 28-28: shellcheck reported issue in this script: SC2086:info:2:27: Double quote to prevent globbing and word splitting
(shellcheck)
[error] 28-28: shellcheck reported issue in this script: SC2086:info:4:26: Double quote to prevent globbing and word splitting
(shellcheck)
[error] 28-28: shellcheck reported issue in this script: SC2086:info:5:40: Double quote to prevent globbing and word splitting
(shellcheck)
[error] 28-28: shellcheck reported issue in this script: SC2086:info:6:38: Double quote to prevent globbing and word splitting
(shellcheck)
🪛 LanguageTool
PLAN.md
[style] ~157-~157: This phrase is redundant (‘I’ stands for ‘Interface’). Use simply “CLIInterface”.
Context: ...h 6. chmod +x dist/.sh ### CLI Interface bash # Build all scripts python3 sc...
(ACRONYM_TAUTOLOGY)
[uncategorized] ~222-~222: The official name of this software platform is spelled with a capital “H”.
Context: ... ### Separate CI workflow for PRs (`.github/workflows/ci.yml`) yaml name: CI o...
(GITHUB)
[style] ~357-~357: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...d reproducibility - scripts/build.py should be deterministic (same input → same out...
(MISSING_IT_THERE)
[uncategorized] ~473-~473: The official name of this software platform is spelled with a capital “H”.
Context: ...=========== #### 4. CI Automation (`.github/workflows/update-gpu-ids.yml`) yaml...
(GITHUB)
🪛 markdownlint-cli2 (0.20.0)
PLAN.md
[warning] 28-28: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 44-44: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 147-147: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 400-400: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 523-523: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.15.0)
scripts/update_gpu_ids.py
[error] 101-101: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[error] 102-102: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[warning] 150-150: Unused function argument: device_id
(ARG001)
[warning] 273-273: Do not catch blind exception: Exception
(BLE001)
scripts/build.py
[error] 146-146: subprocess call: check for execution of untrusted input
(S603)
[error] 147-147: Starting a process with a partial executable path
(S607)
🪛 Shellcheck (0.11.0)
data/gpu-ids.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 172-172: GPU_ARCH_DB appears unused. Verify use (or export if used externally).
(SC2034)
lang/cuda-install/ZH_CN.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 3-3: LANG_PACK_ZH_CN appears unused. Verify use (or export if used externally).
(SC2034)
lang/cuda-install/EN_US.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 3-3: LANG_PACK_EN_US appears unused. Verify use (or export if used externally).
(SC2034)
lang/nvidia-install/ZH_CN.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 4-4: LANG_PACK_ZH_CN appears unused. Verify use (or export if used externally).
(SC2034)
src/cuda-install.sh
[warning] 43-43: EXIT_SUCCESS appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 50-50: EXIT_NETWORK_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 51-51: EXIT_REPO_ADD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 249-249: default appears unused. Verify use (or export if used externally).
(SC2034)
src/nvidia-install.sh
[warning] 44-44: EXIT_SUCCESS appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 48-48: EXIT_PERMISSION_DENIED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 54-54: EXIT_GPU_ARCH_INCOMPATIBLE appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 64-64: EXIT_MODULE_ARCH_MISMATCH appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 74-74: EXIT_DRIVER_UNINSTALL_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 78-78: EXIT_NETWORK_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 79-79: EXIT_REPO_DOWNLOAD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 80-80: EXIT_KEYRING_DOWNLOAD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 83-83: EXIT_PACKAGE_MANAGER_UNAVAILABLE appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 84-84: EXIT_REPO_ADD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 85-85: EXIT_DEPENDENCY_INSTALL_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 86-86: EXIT_KERNEL_HEADERS_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 87-87: EXIT_NVIDIA_INSTALL_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 90-90: EXIT_KERNEL_VERSION_ISSUE appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 91-91: EXIT_DKMS_BUILD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 92-92: EXIT_MODULE_SIGNING_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 93-93: EXIT_DRIVER_VERIFICATION_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 583-583: DRY_RUN appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 833-833: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 838-838: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 839-839: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 842-842: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1146-1146: Declare and assign separately to avoid masking return values.
(SC2155)
[error] 1229-1229: -f doesn't work with globs. Use a for loop.
(SC2144)
[warning] 1230-1230: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1312-1312: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1475-1475: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1482-1482: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1488-1488: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1770-1770: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1771-1771: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1953-1953: Quote this to prevent word splitting.
(SC2046)
[warning] 1953-1953: Quote this to prevent word splitting.
(SC2046)
[warning] 2062-2062: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2230-2230: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2258-2258: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2263-2263: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2307-2307: Quote this to prevent word splitting.
(SC2046)
[warning] 2307-2307: Quote this to prevent word splitting.
(SC2046)
[warning] 2307-2307: Quote this to prevent word splitting.
(SC2046)
[warning] 2307-2307: Quote this to prevent word splitting.
(SC2046)
[warning] 2313-2313: Declare and assign separately to avoid masking return values.
(SC2155)
[error] 2327-2327: -f doesn't work with globs. Use a for loop.
(SC2144)
[warning] 2328-2328: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2449-2449: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2492-2492: Quote this to prevent word splitting.
(SC2046)
[warning] 2492-2492: Quote this to prevent word splitting.
(SC2046)
[warning] 2494-2494: Quote this to prevent word splitting.
(SC2046)
[warning] 2494-2494: Quote this to prevent word splitting.
(SC2046)
[warning] 2495-2495: Quote this to prevent word splitting.
(SC2046)
[warning] 2495-2495: Quote this to prevent word splitting.
(SC2046)
[warning] 2496-2496: Quote this to prevent word splitting.
(SC2046)
[warning] 2496-2496: Quote this to prevent word splitting.
(SC2046)
[warning] 2497-2497: Quote this to prevent word splitting.
(SC2046)
[warning] 2497-2497: Quote this to prevent word splitting.
(SC2046)
[warning] 2570-2570: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2578-2578: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2579-2579: Declare and assign separately to avoid masking return values.
(SC2155)
lang/nvidia-install/EN_US.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 4-4: LANG_PACK_EN_US appears unused. Verify use (or export if used externally).
(SC2034)
data/gpu-ids-overrides.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
🔇 Additional comments (18)
src/nvidia-uninstall.sh (1)
6-7: 仅元数据变更,LGTM。日期和许可证头更新,无功能变更。
src/nvidia-container-installer.sh (1)
301-302: 良好的改进,LGTM。将
local lock_pid=$(...)拆分为声明和赋值两行,避免local掩盖cat的退出码。release_lock中的同类改动也一致。.gitignore (1)
1-5: LGTM。与仓库重构一致,
dist/和__pycache__/的忽略规则合理。src/nvidia-container-uninstall.sh (1)
168-215: 新增参数解析和非交互模式,LGTM。
parse_arguments实现简洁,与nvidia-uninstall.sh中的模式一致。AUTO_YES标志的集成逻辑正确——交互确认被正确跳过。.github/workflows/release.yml (1)
1-37: 发布工作流结构合理,LGTM。构建 → 语法检查 → 翻译验证 → 上传资产的流水线清晰。
permissions: contents: write正确配置了发布资产上传所需权限。注意:语法检查步骤存在与
ci.yml相同的 glob 防护问题(已在 CI 工作流评论中提及)。src/cuda-install.sh (1)
30-30:set -eo pipefail与模板占位符的整体设计合理。
set -eo pipefail提供了良好的错误检测,模板中# {{LANG_PACKS}}占位符的设计与构建管线(scripts/build.py)的替换逻辑一致。Also applies to: 72-74
data/gpu-ids-overrides.sh (1)
1-37: 覆盖文件结构清晰,格式规范。此文件仅为数据文件,由
scripts/update_gpu_ids.py中的load_overrides()通过正则解析,不直接执行。ShellCheck SC2148(缺少 shebang)为误报,可忽略。设备 ID 和架构映射均有效。lang/cuda-install/EN_US.sh (1)
1-101: 语言包结构完整,翻译键覆盖全面。ShellCheck 的 SC2148(缺少 shebang)和 SC2034(变量未使用)均为误报——此文件通过构建管线注入模板,不直接执行。与
src/cuda-install.sh中的gettext()调用交叉检查,翻译键覆盖一致。lang/nvidia-install/EN_US.sh (1)
1-350: NVIDIA 安装器英文语言包内容丰富,结构规范。约 350 个翻译键涵盖了 GPU 检测、Secure Boot、nouveau 处理、驱动验证等完整流程。ShellCheck 警告为误报(同其他语言包文件)。建议依赖 CI 中的
--check-keys验证翻译完整性。lang/nvidia-install/ZH_CN.sh (1)
1-350: 中文语言包翻译完整,与 EN_US 键集一致。抽查确认键覆盖与
lang/nvidia-install/EN_US.sh一致。翻译质量良好,ShellCheck 警告为误报。data/gpu-ids.sh (1)
1-175: 自动生成的 GPU 架构数据库结构正确,覆盖项已正确合并。抽查确认
data/gpu-ids-overrides.sh中的覆盖 ID(如1ffa→Turing、2545→Ampere、2688→Ada Lovelace)均已出现在对应架构数组中。ShellCheck 警告为误报(此文件由模板构建管线 source 引入)。scripts/update_gpu_ids.py (2)
96-106:fetch_pci_ids接受任意字符串作为 URL,但 CLI 已通过argparse.choices限制。Ruff S310 标记了
urllib.request.urlopen可能接受file://等非预期 scheme。由于argparse在 Line 250 限制了--source的选项,CLI 调用路径是安全的。但如果此函数被其他代码直接调用(如测试或其他脚本),SOURCES.get(source, source)的回退逻辑允许任意 URL。风险较低,仅作提示。
184-241:generate_gpu_ids_sh函数逻辑清晰,生成格式规范。按时间顺序排列架构、每行 10 个 ID 的格式化、以及模块支持注释的生成逻辑均正确。输出与
data/gpu-ids.sh的实际内容一致。PLAN.md (1)
1-546: 文档结构清晰,计划合理。PLAN.md 作为迁移计划文档,内容全面且组织良好,涵盖了模板构建、语言包分离、GPU ID 自动化和 CI/CD 流程等核心内容。
有几处小的 Markdown 格式问题:第 28、44、147、400、523 行的 fenced code block 缺少语言标识符(如
text或plaintext),这会触发 markdownlint MD040 警告。scripts/build.py (2)
144-155: Ruff S603/S607 关于subprocess的告警在此场景下属于误报。这是一个构建工具,调用
bash -n对构建系统自身生成的文件做语法检查是预期行为,不存在执行不可信输入的风险。可以通过添加# noqa: S603, S607注释来抑制告警。
202-216: 原子写入模式实现合理。使用临时文件 → 语法验证 → rename 的流程确保了只有通过验证的脚本才会出现在最终输出路径。
temp_path.rename(output_path)在同一目录下是原子操作,设计良好。src/nvidia-install.sh (2)
134-136: 模板占位符和 gettext 函数设计合理。
# {{LANG_PACKS}}占位符和gettext()函数的实现逻辑清晰:
- 占位符作为注释不影响模板直接运行
gettext()有完善的 fallback 机制(未找到翻译时返回 key 本身)- 使用
printf '%s'避免了echo对反斜杠的解释问题这些设计与 PLAN.md 中描述的 graceful degradation 策略一致。
2416-2459:main()中pre_parse_arguments和parse_arguments双重解析设计合理但有重复。
pre_parse_arguments提前处理-y、-q、--help、--lang等参数以在语言选择前生效,parse_arguments再次完整解析所有参数。这个设计是合理的,但需要注意两处解析逻辑的同步维护。另外,第 2443 行
parse_arguments "$@"会再次处理--help和--show-exit-codes,与pre_parse_arguments中的处理重复但行为一致,不会造成问题。
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/update_gpu_ids.py`:
- Around line 294-300: The loop over overrides in scripts/update_gpu_ids.py
currently skips entries whose arch isn't in CHIP_TO_ARCH.values() without
notifying the user; update the for device_id, arch in overrides.items() handling
to emit a warning (to stderr) when arch is not recognized (i.e., not in
CHIP_TO_ARCH.values()), referencing DEVICE_ID_RE, CHIP_TO_ARCH and arch_devices
so maintainers can locate the code; keep existing behavior of appending to
arch_devices[arch] when valid and only add the warning branch for invalid arch
values (include the unrecognized arch and device_id in the warning message).
In `@src/cuda-install.sh`:
- Around line 338-369: The rollback_installation function currently runs eval
"$action" on lines from ROLLBACK_FILE which allows command injection; fix by (1)
ensuring the file is created with strict permissions and ownership in
save_rollback_info (e.g., set umask, chown to root and chmod 600 immediately
after writing) and (2) eliminate eval by switching to a whitelist parser: change
rollback_installation to read each line, split into a safe token/args, and
dispatch to a predefined set of handlers (e.g., case statement or functions like
undo_install_package, remove_file, restore_config) that only accept validated
arguments; reference ROLLBACK_FILE, rollback_installation, save_rollback_info
and remove use of eval in the loop.
- Around line 247-267: The confirm() function sets a default variable but never
uses it, so an empty input always counts as "no"; update confirm() (function
name) to honor the default by treating an empty read as the default: after the
read into answer, if answer is empty assign answer="$default" (or assign a
lowercase version for comparison), then continue using case "${answer,,}" to
match y|yes -> return 0 and everything else -> return 1; keep existing AUTO_YES
and QUIET_MODE short-circuit behavior unchanged.
In `@src/nvidia-install.sh`:
- Around line 582-584: The script accepts -n/--dry-run but never uses the
DRY_RUN variable, so passing it does not prevent side effects; either remove the
flag or make it functional by initializing DRY_RUN="" by default and setting
DRY_RUN="echo" when -n/--dry-run is seen, then prefix every side-effecting
command with $DRY_RUN (examples: package managers like dnf/apt/rpm, systemctl,
modprobe, mv/cp/rm, tee/redirections, sed -i, install/uninstall commands)
throughout functions and top-level calls so the dry run prints actions rather
than executing them; also update usage/help text to reflect the behavior.
- Around line 722-728: The loop in rollback_installation reads $ROLLBACK_FILE
and uses eval "$action", which risks command injection; change it to safely
parse and execute allowed rollback actions without eval: have save_rollback_info
write a canonical, machine-parsable representation (e.g., a verb and
space-separated args or JSON) and in rollback_installation validate each line
against a whitelist of allowed verbs, reject/skip lines with unsafe
metacharacters, split the line into an array of argv-style tokens (no eval) and
call the corresponding handler function (e.g., handle_remove, handle_restore) or
exec the command with arguments using the array; ensure tac usage to preserve
reverse order and set rollback_failed when a validated execution returns
non-zero.
- Around line 524-568: In parse_arguments, several option handlers that consume
a value (-t|--type, -m|--modules, -v|--version, --lang) use $2 and shift 2
without verifying the argument exists; update each branch in the parse_arguments
function to first check existence like pre_parse_arguments does (e.g. [[ -n
"${2:-}" ]]) and call exit_with_code $EXIT_INVALID_ARGS with the same gettext
message when missing, only then assign
INSTALL_TYPE/USE_OPEN_MODULES/DRIVER_VERSION/LANG_CURRENT and shift 2; ensure
the -m branch still validates "open"/"proprietary" after the presence check.
- Around line 1229-1235: The glob check using [[ -f
/sys/firmware/efi/efivars/SecureBoot-* ]] fails because the wildcard isn't
expanded; replace it with a proper glob test (e.g. use compgen -G
"/sys/firmware/efi/efivars/SecureBoot-*" >/dev/null 2>&1) or iterate with for
file in /sys/firmware/efi/efivars/SecureBoot-*; do if [[ -f "$file" ]]; then
secure_boot_value=$(od -An -t u1 "$file" 2>/dev/null | tr -d ' '); if [[
"$secure_boot_value" =~ 1$ ]]; then secure_boot_enabled=true;
secure_boot_method="efivars"; fi; break; fi; done; and apply the same fix to the
other occurrence that uses the same wildcard pattern so secure_boot_value /
secure_boot_enabled / secure_boot_method are determined reliably.
---
Nitpick comments:
In @.github/workflows/ci.yml:
- Around line 23-27: The for-loop using the glob dist/*.sh (see "for f in
dist/*.sh", "bash -n \"$f\"", "echo \"OK: $f\"") will pass the literal pattern
when no files exist; update the CI step to guard the glob before attempting
syntax-checks — either enable bash nullglob for that shell run or add a
pre-check (e.g., verify files exist via a glob check/compgen or test -e) and
skip the loop when there are no matches so bash -n never receives the literal
pattern.
In @.github/workflows/update-gpu-ids.yml:
- Around line 28-35: Wrap the GitHub Actions output and step summary variable
expansions in double quotes to avoid SC2086: change all occurrences of
$GITHUB_OUTPUT and $GITHUB_STEP_SUMMARY used as redirection targets (e.g., echo
"changed=false" >> $GITHUB_OUTPUT, echo "## GPU ID Database Changes" >>
$GITHUB_STEP_SUMMARY, git diff --stat data/gpu-ids.sh >> $GITHUB_STEP_SUMMARY)
to use quoted expansions (>> "$GITHUB_OUTPUT" and >> "$GITHUB_STEP_SUMMARY") so
the shell treats them safely if they contain spaces or special characters.
- Around line 37-51: Replace the loose action tag
"peter-evans/create-pull-request@v7" with the fixed commit SHA
"peter-evans/create-pull-request@22a9089034f40e5a961c8808d113e2c98fb63676" in
the workflow step that uses the Create PR action (the step shown as uses:
peter-evans/create-pull-request@v7), and add an inline comment after the uses
reference containing the human-readable version "v7.0.11" so maintainers can see
the version while the action is pinned to the specific commit for supply-chain
security.
In `@lang/cuda-install/ZH_CN.sh`:
- Around line 1-3: Add a ShellCheck directive comment at the top of the file to
silence SC2148 for this sourced language pack: insert a line like a shell
directive comment (e.g. "# shellcheck shell=bash") before the existing declare
-A LANG_PACK_ZH_CN and LANG_PACK_ZH_CN=( lines so ShellCheck recognizes the
intended shell without adding a shebang; keep the rest of the file unchanged.
In `@scripts/build.py`:
- Around line 106-141: check_keys currently logs a warning when a template uses
gettext but find_lang_packs returns empty, leaving all_ok true so --check-keys
doesn't fail; change this to treat missing language packs as an error by setting
all_ok = False (and optionally change the warning message to an error) when
lang_packs is empty so the function returns False; update the behavior in
check_keys around the find_lang_packs branch (the block that prints "Warning:
{script_name} uses gettext but has no language packs") to set all_ok = False and
emit an error-level message instead.
- Around line 76-95: The regexes in extract_gettext_keys miss keys containing
escaped quotes; update the double- and single-quoted patterns to accept
backslash-escaped characters (e.g. use r'gettext\s+"((?:[^"\\]|\\.)*)"' and
r"gettext\s+'((?:[^'\\]|\\.)*)'"), then unescape the captured string (replace
backslash-escaped quotes and other escapes) before checking startswith("$") and
adding to the keys set; keep the existing dynamic-annotation handling unchanged.
- Around line 39-51: The function find_templates currently calls sys.exit(1) on
errors which prevents unit testing; change find_templates to raise explicit
exceptions (e.g., FileNotFoundError or a custom TemplateNotFoundError) instead
of calling sys.exit, update any callers (notably main) to catch these exceptions
and call sys.exit(1) there with the appropriate error message, and ensure the
error messages previously printed to stderr are preserved when handling the
exception in main.
- Around line 170-189: The code currently reads each language pack twice (once
in the join for lang_content and again during syntax validation); cache the
results from read_file by building a mapping like lang_contents = {p:
read_file(p) for p in lang_packs} (or a list of tuples) and then use those
cached strings both when creating lang_content (join cached values) and when
constructing test_script for validate_syntax; update the block that references
lang_packs, read_file, validate_syntax, test_script and test_file to use the
cached content and avoid repeat I/O, preserving the skip_syntax_check flow and
file unlink behavior.
In `@scripts/extract_keys.py`:
- Around line 79-101: The show_all() function currently only prints missing-key
information and never returns a failure status; change show_all() to return a
boolean (e.g., True when all templates have complete lang packs, False if any
missing keys) and propagate detection of missing translations into that boolean
by: 1) tracking a local flag (e.g., has_failures) inside show_all while
iterating templates/keys/lang_dir; 2) when lang_dir is missing or when calling
show_diff(template, lang_file) detect/receive a failure signal (modify show_diff
to return a bool or raise a specific exception) and set the flag; and 3) return
the flag (or its negation) at the end so callers can sys.exit(1) or use the
result in CI. Update callers to check the returned boolean and exit with a
nonzero code when false.
- Line 23: The import "from build import extract_gettext_keys,
extract_lang_pack_keys" relies on implicit sys.path behavior and can fail when
run from other CWDs or as a module; update scripts/extract_keys.py to perform an
explicit, robust import: either use a package-relative import ("from .build
import extract_gettext_keys, extract_lang_pack_keys") if the scripts folder is a
package, or programmatically resolve the script directory via
Path(__file__).resolve().parent and insert it into sys.path (or use
importlib.machinery.SourceFileLoader) before importing those symbols so
extract_gettext_keys and extract_lang_pack_keys are reliably loaded.
In `@scripts/update_gpu_ids.py`:
- Around line 150-165: The parameter device_id in classify_device is unused
(Ruff ARG001); rename it to _device_id in the function signature to indicate
it's intentionally unused and silence the linter, keeping the rest of the logic
that uses device_name, _NON_GPU_RE, CHIP_CODE_RE and CHIP_TO_ARCH unchanged;
also update any callers if they rely on a positional/keyword name to pass the
ID.
- Around line 330-334: The current non-atomic write using
output_path.write_text(output, encoding="utf-8") can leave data/gpu-ids.sh
corrupted if interrupted; modify the write to first write to a temporary file in
the same directory (e.g., output_path.with_suffix(".tmp") or using
tempfile.NamedTemporaryFile) and then atomically replace the target using
os.replace(temp_path, output_path); ensure parent dir creation remains
(output_path.parent.mkdir(...)) and update the final print to reference
output_path after the replace so the behavior of OUTPUT_FILE/output_path is
preserved.
In `@src/cuda-install.sh`:
- Around line 42-56: Remove or document the unused exit-code constants to
silence SC2034: either delete EXIT_SUCCESS, EXIT_NETWORK_FAILED, and
EXIT_REPO_ADD_FAILED if they are not used anywhere, or add a brief comment above
their declarations (e.g., "reserved for future use") to indicate they are
intentionally unused; update the block where EXIT_* constants are defined so the
intent is clear for readers and static analysis tools referencing the constants
EXIT_SUCCESS, EXIT_NETWORK_FAILED, and EXIT_REPO_ADD_FAILED.
- Around line 278-294: The current create_install_lock function has a TOCTOU
race between checking and writing $lock_file; replace the check-then-write with
an atomic lock operation: either attempt to create a lock directory (e.g. mkdir
on a dedicated lock dir under STATE_DIR) and treat mkdir failure as “another
instance running” (call exit_with_code $EXIT_STATE_FILE_CORRUPTED with the other
PID read from the existing lock if present), or open the lock file and acquire
an exclusive flock (use exec to open a file descriptor, then flock -n; on
failure call exit_with_code), and only write the current PID into the file after
the atomic acquisition; keep existing warn and debug calls (warn "$(gettext
"state.lock.cleaning_orphaned")" and debug "$(gettext "state.lock.created")
...") and ensure cleanup removes the lock dir/file on exit.
- Around line 871-912: The function named help shadows the bash builtin; rename
the function (e.g., to show_help or usage) and update every place that calls
help to the new name (preserve behavior such as printing the heredoc and exit
0), ensuring any exported/used symbols or traps referencing help are updated as
well so no callers remain pointing to the old help function.
In `@src/nvidia-install.sh`:
- Line 1953: The command substitution inside the log_step call can perform
word-splitting because the inner echo uses an unquoted $(gettext ...); update
the nested substitutions so translated strings are quoted (e.g. use "$(gettext
"nvidia_driver.type.open")" and "$(gettext "nvidia_driver.type.proprietary")")
and similarly quote any other echo $(gettext ...) occurrences (seen near lines
with USE_OPEN_MODULES, log_step, and the INSTALL_TYPE usage) to avoid spurious
word splitting while preserving the original behavior.
- Around line 44-98: Many EXIT_* readonly constants (e.g. EXIT_SUCCESS,
EXIT_PERMISSION_DENIED, EXIT_GPU_ARCH_INCOMPATIBLE, EXIT_NETWORK_FAILED,
EXIT_REPO_DOWNLOAD_FAILED) are defined but never used; either remove unused
declarations from the readonly block or explicitly mark them as intentionally
reserved to silence ShellCheck (add a brief comment like "reserved for future
use" above the block and apply a ShellCheck disable SC2034 for those symbols).
Update the readonly declarations in the existing constants block accordingly so
the intent is clear and ShellCheck warnings are addressed.
- Around line 831-862: The device detection loop currently uses patterns like
`local device_id=$(...)` which masks the command's exit status; separate
declaration and assignment for variables like `device_id` (and any other `local
var=$(cmd)` such as `gpu_info`/`pci_address`) so failures aren't silenced:
declare `local device_id` first, then run `device_id=$(lspci -s "$pci_address"
-nn | grep -oP '10de:\K[0-9a-fA-F]{4}' | tr '[:upper:]' '[:lower:]')` and
immediately check the command exit code or `-n "$device_id"`; if the command
fails, log an error and set `has_incompatible_gpu=true` accordingly before
calling `detect_gpu_architecture` or `is_open_module_supported` (respect
`USE_OPEN_MODULES` flow) to avoid proceeding with invalid data.
- Around line 402-459: The help text in show_usage() is hardcoded Chinese and
not internationalized; modify show_usage to use gettext (e.g., wrap all
user-visible strings with _() or call a language lookup) and ensure gettext is
initialized (bindtextdomain/textdomain) before parse_arguments may call
show_usage; alternatively move language selection earlier or have
pre_parse_arguments set LANG/initialize i18n so show_usage prints translated
text for en_US and zh_CN; update message keys for all option descriptions,
examples and notes in show_usage and add corresponding translations for en_US.
- Around line 256-258: The INT/TERM and EXIT traps can cause cleanup_on_exit to
run twice; change cleanup_on_exit to be idempotent by adding a shell-level guard
(e.g. a CLEANUP_DONE flag) that returns immediately if set, and set that flag at
the start of cleanup_on_exit before calling cleanup_temp_files and
cleanup_lock_files; ensure cleanup_on_exit still performs the appropriate exit
code behavior after setting the flag so subsequent EXIT/INT/TERM triggers do
nothing extra.
- Add warning for unknown architecture in GPU ID overrides - Fix confirm() default parameter being unused in cuda-install.sh - Add $2 existence check for --type/--modules/--version/--lang args - Remove dead --dry-run flag from nvidia-install.sh (never used) - Fix SecureBoot glob detection: -f doesn't expand wildcards, use compgen - Add args.error.missing_value translation key to both lang packs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (10)
scripts/update_gpu_ids.py (2)
206-206:arch_order列表在两处重复定义,违反 DRY。Lines 206 和 314 各自硬编码了相同的架构顺序列表。如果后续新增架构(如 Blackwell 之后的),容易遗漏其中一处。建议提取为模块级常量。
♻️ 建议提取为常量
+# Ordered architectures (chronological) +ARCH_ORDER = ["Kepler", "Maxwell", "Pascal", "Volta", "Turing", "Ampere", "Hopper", "Ada Lovelace", "Blackwell"] + ... def generate_gpu_ids_sh(...): ... - arch_order = ["Kepler", "Maxwell", "Pascal", "Volta", "Turing", "Ampere", "Hopper", "Ada Lovelace", "Blackwell"] + arch_order = ARCH_ORDER ... def main(): ... - arch_order = ["Kepler", "Maxwell", "Pascal", "Volta", "Turing", "Ampere", "Hopper", "Ada Lovelace", "Blackwell"] + arch_order = ARCH_ORDERAlso applies to: 314-314
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/update_gpu_ids.py` at line 206, The arch_order list is duplicated; extract it to a single module-level constant (e.g., GPU_ARCH_ORDER or ARCH_ORDER) and replace both inline definitions with references to that constant; update any functions or code that currently define or use the local arch_order (search for the occurrences at the two places where arch_order is declared) to import/use the new constant so future architecture additions are made in one place.
150-165:classify_device的device_id参数未被使用。静态分析 (Ruff ARG001) 也标记了此问题。当前函数签名包含
device_id但函数体内从未引用。保留它作为未来扩展预留是可以的,但建议加_前缀表示有意忽略,消除告警。♻️ 建议
-def classify_device(device_id: str, device_name: str) -> str | None: +def classify_device(_device_id: str, device_name: str) -> str | None:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/update_gpu_ids.py` around lines 150 - 165, The unused parameter warning comes from classify_device(device_id: str, device_name: str) where device_id is never referenced; to silence the linter and indicate the parameter is intentionally unused, rename the parameter to _device_id (or prefix it with an underscore) in the function signature and any callers if needed, leaving the function body and logic (CHIP_CODE_RE, _NON_GPU_RE, CHIP_TO_ARCH) unchanged so behavior is identical.src/cuda-install.sh (2)
320-323:save_rollback_info写入文件时未设置严格权限,与eval回滚组合增大攻击面。
ROLLBACK_FILE在写入时没有设置限制性权限。虽然STATE_DIR在 Line 278 设置了755,但这意味着任何用户都可以读取回滚文件,而 root 之外有写权限的进程也可能篡改。建议创建时将目录和文件权限收紧。🛡️ 建议
create_state_dir() { if ! mkdir -p "$STATE_DIR" 2>/dev/null; then exit_with_code $EXIT_STATE_DIR_FAILED "$(gettext "state.dir.create_failed") $STATE_DIR" fi - chmod 755 "$STATE_DIR" + chmod 700 "$STATE_DIR" create_install_lock }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuda-install.sh` around lines 320 - 323, The save_rollback_info function writes to ROLLBACK_FILE without enforcing restrictive permissions, increasing the attack surface when combined with eval-based rollback; change it to ensure the STATE_DIR and ROLLBACK_FILE are created/owned by root and use restrictive perms (e.g., STATE_DIR 700 and ROLLBACK_FILE 600) before writing, by creating the file with secure permissions (install -m 600 or touch + chown/chmod) or setting a local umask only around the write, then append the action to the file; reference save_rollback_info, ROLLBACK_FILE and STATE_DIR when making the change.
131-132: 日志输出到 stderr 的重定向>&$target依赖变量展开,可用更健壮的写法。
>&$target在 bash 中可以工作,但可读性不佳且在某些 shell 版本中行为不一致。可以考虑使用条件分支代替。♻️ 可选改进
- local target=1; [[ "$level" == "ERROR" ]] && target=2 - echo -e "[$(date +'%Y-%m-%d %H:%M:%S')] [${color}${level}${NC}] $*" >&$target + local msg="[$(date +'%Y-%m-%d %H:%M:%S')] [${color}${level}${NC}] $*" + if [[ "$level" == "ERROR" ]]; then + echo -e "$msg" >&2 + else + echo -e "$msg" + fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cuda-install.sh` around lines 131 - 132, The current echo redirection uses indirect variable expansion ("target" and ">&$target") which is fragile; change the echo call to an explicit conditional branch based on "level" (or reuse the existing "target" check) so that when level == "ERROR" you redirect the message to stderr (>&2) and otherwise to stdout (>&1) — e.g., use an if/else around the echo that uses explicit >&2 or >&1 instead of ">&$target" to improve readability and shell compatibility.src/nvidia-install.sh (4)
164-206: 信号处理:INT/TERM 触发后清理函数会执行两次。当收到 INT 信号时:INT trap →
cleanup_on_exit INT→ 清理 →exit 130→ 触发 EXIT trap →cleanup_on_exit EXIT→ 再次清理 →exit 130。由于
cleanup_temp_files和cleanup_lock_files都是幂等的(rm -f、find -delete),不会造成功能问题,但可以通过设置一个 guard 变量避免重复执行。这是一个低优先级的优化。Also applies to: 255-257
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nvidia-install.sh` around lines 164 - 206, The cleanup handler cleanup_on_exit is executed twice when a signal (INT/TERM) causes an exit because the handler runs for the signal and then again for EXIT; add a simple guard variable (e.g., CLEANUP_DONE) at the top of cleanup_on_exit to return immediately if already set, and set CLEANUP_DONE=true early in the handler before invoking cleanup_temp_files and cleanup_lock_files (and before calling exit) so subsequent ENTER/EXIT trap invocations are no-ops; update uses of cleanup_temp_files and cleanup_lock_files inside cleanup_on_exit to assume idempotence but rely on the guard to avoid duplicate work.
402-457:show_usage帮助信息未经gettext国际化,英文用户始终看到中文。
show_usage在parse_arguments(语言选择之后)中也会被调用,但整个帮助文本是硬编码中文的。如果面向国际用户,可考虑将帮助文本也纳入语言包或至少提供双语版本。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nvidia-install.sh` around lines 402 - 457, The show_usage help text is hardcoded in Chinese causing English users to always see Chinese; update show_usage to use gettext (or an internal _() wrapper) for all strings or provide bilingual branches based on the selected language variable set in parse_arguments so the displayed text respects NVIDIA_INSTALLER_LANG (or the language chosen via parse_arguments); specifically replace the literal Chinese text in the show_usage function with translatable strings (e.g., _("用法: %s [选项]") or the English equivalent) and ensure parse_arguments sets the active locale or language selection before show_usage is ever invoked so the correct translation is shown.
1962-1962:install_nvidia_driver中的$(if ...)命令替换未被引用,存在字段分割风险。
log_step的参数中使用了未引用的$(if $USE_OPEN_MODULES; then echo $(gettext ...); ... fi),ShellCheck SC2046 也标记了此问题。同样的模式出现在第 2316、2504-2509 行。虽然gettext返回值不太可能包含空格导致问题,但引用$()是更稳健的做法。示例修复(第 1962 行)
- log_step "$(gettext "nvidia_driver.install.starting") ($(if $USE_OPEN_MODULES; then echo $(gettext "nvidia_driver.type.open"); else echo $(gettext "nvidia_driver.type.proprietary"); fi), $INSTALL_TYPE)..." + local module_type_label + if [[ "$USE_OPEN_MODULES" == "true" ]]; then + module_type_label="$(gettext "nvidia_driver.type.open")" + else + module_type_label="$(gettext "nvidia_driver.type.proprietary")" + fi + log_step "$(gettext "nvidia_driver.install.starting") ($module_type_label, $INSTALL_TYPE)..."此模式也适用于第 2316、2504-2509 行类似的
$(if $VAR; ...)用法——使用字符串比较[[ "$VAR" == "true" ]]代替直接将变量作为命令执行更安全。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nvidia-install.sh` at line 1962, The log_step invocation inside install_nvidia_driver uses an unquoted command substitution ($(if $USE_OPEN_MODULES; ...)) which risks field-splitting; instead compute the driver type into a variable using a safe test (e.g., use [[ "$USE_OPEN_MODULES" == "true" ]] or an explicit comparison) and assign the result of the command substitution to that variable, then pass it to log_step quoted (e.g., driver_type="$(...)" and log_step "... ($driver_type, $INSTALL_TYPE)..."), and apply the same fix pattern to the other occurrences mentioned (the similar if-based substitutions at the other locations).
1094-1116: 在check_existing_nvidia_installation内部定义分派函数。
__check_pkg_nvidia_deb__、__check_pkg_nvidia_rpm__、__check_pkg_nvidia_suse__等函数每次调用外部函数时都会重新定义。虽然功能正确,但与脚本其他部分在外部定义分派函数的模式不一致(如__check_distro_support_*__)。可以考虑将这些函数提取到外部作用域。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nvidia-install.sh` around lines 1094 - 1116, The three package-check helper functions (__check_pkg_nvidia_deb__, __check_pkg_nvidia_rpm__, __check_pkg_nvidia_suse__) are being defined inside check_existing_nvidia_installation and thus redefined on every call; pull these functions out to the script's outer scope so they are defined once (keep their names and existing behaviors including setting existing_driver and installation_method and calling log_warning and the package listing commands). Leave the dynamic dispatch logic inside the original function (the local fn="__check_pkg_nvidia_${PKG_FAMILY}__" + declare -F check + call) so runtime selection still uses PKG_FAMILY; ensure any variables those helpers write (existing_driver, installation_method) have the intended scope (export or global) so behavior is unchanged after moving the functions.lang/nvidia-install/ZH_CN.sh (1)
1-4: 添加shellcheckshell 指令以消除 SC2148 警告。此文件会被构建管道 source 到主脚本中,不需要 shebang,但在文件顶部添加
# shellcheck shell=bash可以让静态分析工具正确推断目标 shell,同时避免 SC2148 和 SC2034 的误报。建议修改
+# shellcheck shell=bash +# shellcheck disable=SC2034 declare -A LANG_PACK_ZH_CN🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lang/nvidia-install/ZH_CN.sh` around lines 1 - 4, Add a ShellCheck directive comment at the top of the file so static analysis treats this snippet as bash and suppresses SC2148/SC2034; specifically, insert the comment "# shellcheck shell=bash" before the declaration of LANG_PACK_ZH_CN (the lines containing "declare -A LANG_PACK_ZH_CN" and "LANG_PACK_ZH_CN=(") so tools correctly infer the shell without adding a shebang.lang/nvidia-install/EN_US.sh (1)
1-4: 同 ZH_CN.sh,建议添加shellcheckshell 指令。建议修改
+# shellcheck shell=bash +# shellcheck disable=SC2034 declare -A LANG_PACK_EN_US🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lang/nvidia-install/EN_US.sh` around lines 1 - 4, Add the ShellCheck directive at the top of this file (same change as ZH_CN.sh): insert a comment like "# shellcheck shell=bash" (or "# shellcheck shell=sh" if you prefer POSIX) before the declare -A LANG_PACK_EN_US so ShellCheck knows the intended shell; this touches the file header near the existing "declare -A LANG_PACK_EN_US" / "LANG_PACK_EN_US=(" symbols.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
lang/nvidia-install/EN_US.shlang/nvidia-install/ZH_CN.shscripts/update_gpu_ids.pysrc/cuda-install.shsrc/nvidia-install.sh
🧰 Additional context used
🪛 Ruff (0.15.0)
scripts/update_gpu_ids.py
[error] 101-101: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[error] 102-102: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[warning] 150-150: Unused function argument: device_id
(ARG001)
[warning] 273-273: Do not catch blind exception: Exception
(BLE001)
🪛 Shellcheck (0.11.0)
src/nvidia-install.sh
[warning] 44-44: EXIT_SUCCESS appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 48-48: EXIT_PERMISSION_DENIED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 54-54: EXIT_GPU_ARCH_INCOMPATIBLE appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 64-64: EXIT_MODULE_ARCH_MISMATCH appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 74-74: EXIT_DRIVER_UNINSTALL_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 78-78: EXIT_NETWORK_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 79-79: EXIT_REPO_DOWNLOAD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 80-80: EXIT_KEYRING_DOWNLOAD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 83-83: EXIT_PACKAGE_MANAGER_UNAVAILABLE appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 84-84: EXIT_REPO_ADD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 85-85: EXIT_DEPENDENCY_INSTALL_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 86-86: EXIT_KERNEL_HEADERS_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 87-87: EXIT_NVIDIA_INSTALL_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 90-90: EXIT_KERNEL_VERSION_ISSUE appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 91-91: EXIT_DKMS_BUILD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 92-92: EXIT_MODULE_SIGNING_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 93-93: EXIT_DRIVER_VERIFICATION_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 839-839: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 844-844: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 845-845: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 848-848: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1152-1152: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1321-1321: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1484-1484: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1491-1491: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1497-1497: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1779-1779: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1780-1780: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 1962-1962: Quote this to prevent word splitting.
(SC2046)
[warning] 1962-1962: Quote this to prevent word splitting.
(SC2046)
[warning] 2071-2071: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2239-2239: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2267-2267: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2272-2272: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2316-2316: Quote this to prevent word splitting.
(SC2046)
[warning] 2316-2316: Quote this to prevent word splitting.
(SC2046)
[warning] 2316-2316: Quote this to prevent word splitting.
(SC2046)
[warning] 2316-2316: Quote this to prevent word splitting.
(SC2046)
[warning] 2322-2322: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2461-2461: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2504-2504: Quote this to prevent word splitting.
(SC2046)
[warning] 2504-2504: Quote this to prevent word splitting.
(SC2046)
[warning] 2506-2506: Quote this to prevent word splitting.
(SC2046)
[warning] 2506-2506: Quote this to prevent word splitting.
(SC2046)
[warning] 2507-2507: Quote this to prevent word splitting.
(SC2046)
[warning] 2507-2507: Quote this to prevent word splitting.
(SC2046)
[warning] 2508-2508: Quote this to prevent word splitting.
(SC2046)
[warning] 2508-2508: Quote this to prevent word splitting.
(SC2046)
[warning] 2509-2509: Quote this to prevent word splitting.
(SC2046)
[warning] 2509-2509: Quote this to prevent word splitting.
(SC2046)
[warning] 2582-2582: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2590-2590: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 2591-2591: Declare and assign separately to avoid masking return values.
(SC2155)
src/cuda-install.sh
[warning] 43-43: EXIT_SUCCESS appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 50-50: EXIT_NETWORK_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 51-51: EXIT_REPO_ADD_FAILED appears unused. Verify use (or export if used externally).
(SC2034)
lang/nvidia-install/EN_US.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 4-4: LANG_PACK_EN_US appears unused. Verify use (or export if used externally).
(SC2034)
lang/nvidia-install/ZH_CN.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 4-4: LANG_PACK_ZH_CN appears unused. Verify use (or export if used externally).
(SC2034)
🔇 Additional comments (8)
src/cuda-install.sh (1)
30-30:set -eo pipefail与main中多处命令交互需注意。
set -e会导致任何非零返回立即退出。main中 Line 802 的[ "$EUID" -ne 0 ]以及部分grep -q调用在条件语句中使用时不受影响,但注意is_step_completed(Line 314) 中的grep -q在if中使用是安全的。整体看set -e的使用是合理的,但要留意新增代码中的裸命令调用。Also applies to: 797-873
lang/nvidia-install/ZH_CN.sh (1)
4-351: 语言包键值完整性需要与 EN_US 保持同步。两个语言包的键集合必须完全一致,否则
gettext()在某个语言下会回退为返回 key 本身。CI 中已有--check-keys校验机制,但建议确认该检查在 PR 流水线中确实运行。经初步对比,两个文件的键数量和键名看起来一致,LGTM。lang/nvidia-install/EN_US.sh (1)
4-351: 语言包内容完整,翻译质量良好。键集合与 ZH_CN 一致,英文文本表达清晰。
src/nvidia-install.sh (5)
346-362:confirm()函数逻辑清晰,默认值处理正确。AUTO_YES 短路、default="Y" / "N" 的 read+regex 分支行为与调用点的语义一致。
515-596:parse_arguments已正确添加$2存在性检查,LGTM。
-t、-m、-v、--lang等选项现在都有[[ -z "${2:-}" ]]防护,与pre_parse_arguments保持一致。--dry-run也已移除。
1234-1244: Secure Boot glob 检测已使用compgen修复,LGTM。此前提出的
[[ -f /sys/firmware/efi/efivars/SecureBoot-* ]]问题已在第 1236 行和第 2337 行正确修复。
31-33:set -eo pipefail启用,确保错误及早暴露。与整个脚本中大量使用
|| true、|| log_warning来处理可预期失败的模式配合良好。
783-798: 所有必需的函数和数组声明都已正确配置,无需进一步处理。验证确认:
init_gpu_database函数定义于data/gpu-ids.sh第 12 行,GPU_ARCH_DB关联数组声明于第 9 行,并在第 140-172 行中使用 Kepler、Maxwell、Pascal、Volta、Turing、Ampere、Hopper、Ada Lovelace、Blackwell 等架构进行了正确填充。该文件存在且可读,第 830 行的函数调用和第 791 行的数组引用都能正确获取所需数据。
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/update_gpu_ids.py`:
- Around line 96-106: fetch_pci_ids currently uses url = SOURCES.get(source,
source) which allows arbitrary schemes (e.g. file://) to reach
urllib.request.urlopen; validate the resolved URL's scheme before opening: after
resolving url from SOURCES, parse it with urllib.parse.urlparse and only allow
http and https (or alternatively require source be a key in SOURCES), otherwise
raise a ValueError; update fetch_pci_ids to perform this scheme check (and
document the behavior) so only safe remote URLs are passed to
urllib.request.urlopen.
In `@src/cuda-install.sh`:
- Around line 960-963: 在处理 --lang 分支时为 $2 增加存在性检查:在 case 分支中判断 if [ -n "${2:-}"
]; then LANG_CURRENT="${2}"; shift; else print a clear error/usage message and
exit 1; fi。确保使用与其他参数相同的检查风格并引用变量名 LANG_CURRENT、选项 --lang 和 shift,以避免在 set -e/-o
pipefail 下因缺少参数导致脚本非预期退出。
- Around line 285-294: The error message construction when detecting a running
lock PID in the lock handling block is malformed due to an extra unmatched
closing parenthesis; update the exit_with_code invocation in the lock handling
code (where lock_file and lock_pid are used and EXIT_STATE_FILE_CORRUPTED is
passed) so the PID is correctly wrapped in parentheses and the quotes are
balanced (e.g. combine the gettext message and the PID as a single, properly
quoted string like: gettext message followed by space and the PID in
parentheses) to remove the stray ")" and ensure correct output formatting.
In `@src/nvidia-install.sh`:
- Around line 800-815: The default branch in is_open_module_supported currently
returns 0 (support open modules) for unknown architectures which can misselect
open drivers for unknown/old GPUs; update is_open_module_supported to treat
"Unknown" (and empty) architectures as requiring proprietary modules by
returning 1 in the *) case so its behavior matches detect_gpu_architecture's
handling of unknown device IDs and the translation key
detect.gpu.warning.unknown_device_id, ensuring the script falls back to
proprietary drivers when architecture is not recognized.
- Around line 193-205: The case handling for the "$signal" variable currently
maps both "INT" and "TERM" to exit 130; update the case in the shutdown/cleanup
logic so INT maps to exit 130 and TERM maps to exit 143 (standard
128+signal-number behavior), leaving the "EXIT" branch to exit $exit_code and
the default to exit 1; locate the case block that inspects "$signal" (the
switch/case around "INT"|"TERM"|"EXIT") and change the exit code for "TERM"
accordingly.
- Around line 2428-2458: In main(), create_state_dir is being called before
check_root which causes non-root users to hit a generic "failed to create state
dir" error; move the check_root call so it runs before create_state_dir (i.e.,
call check_root prior to invoking create_state_dir in main) so permission checks
produce the proper "requires root" message; update main's call order and ensure
no other early filesystem operations (like create_state_dir) occur before
check_root.
- Around line 2636-2643: The script currently treats AUTO_YES as consent to
reboot because the condition in the failure branch uses [[ "$AUTO_YES" == "true"
]] || [[ "$REBOOT_AFTER_INSTALL" == "true" ]]; change this so only explicit
reboot consent triggers a reboot: remove AUTO_YES from that conditional so it
checks only REBOOT_AFTER_INSTALL before calling log_info, cleanup_lock_files and
reboot; update the failure branch that removes
STATE_FILE/ROLLBACK_FILE/$STATE_DIR/nouveau_status/$STATE_DIR/driver_status to
behave the same as the successful-install branch (i.e., AUTO_YES should not
cause reboot), keeping the same cleanup and reboot actions under the
REBOOT_AFTER_INSTALL variable and leaving AUTO_YES solely as “auto-confirm” with
no reboot side-effect.
---
Duplicate comments:
In `@scripts/update_gpu_ids.py`:
- Around line 294-302: The override-handling loop in update_gpu_ids.py now
validates device IDs with DEVICE_ID_RE, maps known architectures using
CHIP_TO_ARCH, and appends to arch_devices[arch], and when arch is unknown it
emits a warning; no further changes needed—keep the current handling in the for
device_id, arch in overrides.items() block (DEVICE_ID_RE, CHIP_TO_ARCH,
arch_devices) as implemented.
In `@src/cuda-install.sh`:
- Around line 247-271: The confirm() function did not previously apply the
provided default when the user pressed Enter; fix it by setting
answer="$default" when read returns an empty string—this is now implemented: in
confirm(), after read -r answer check [[ -z "$answer" ]] and assign
answer="$default", then perform the case-insensitive match (case "${answer,,}")
to return 0 for y/yes; ensure this behavior coexists with AUTO_YES and
QUIET_MODE checks and the prompt uses gettext as shown.
- Around line 342-373: The rollback_installation function currently uses eval
"$action" to execute lines from $ROLLBACK_FILE (rollback.list), which allows
command injection; change it to avoid shell evaluation by parsing each line into
an argv array and executing it directly (e.g., split into parts and call
"${parts[@]}") and add a whitelist check against allowed command names before
running; update the loop that reads from $ROLLBACK_FILE to parse each action
into an array, validate the command (first element) is in a ROLLBACK_WHITELIST,
and only then execute the array form, setting rollback_failed on non-zero exit —
do not use eval anywhere in rollback_installation or related helpers.
In `@src/nvidia-install.sh`:
- Around line 728-734: The loop currently uses eval "$action" which allows
command-injection when processing ROLLBACK_FILE; change it to parse each line
into an argument array and execute it without eval: in the while loop that reads
into variable action, use read -r -a action_parts <<< "$action" (or use mapfile
-t and split), check command existence with command -v "${action_parts[0]}"
before running, then run the command as: if ! "${action_parts[@]}"; then ...;
this replaces eval "$action" with a safe exec path and keeps
log_info/log_warning and rollback_failed behavior intact.
---
Nitpick comments:
In `@lang/nvidia-install/EN_US.sh`:
- Around line 1-4: Add the ShellCheck directive at the top of this file (same
change as ZH_CN.sh): insert a comment like "# shellcheck shell=bash" (or "#
shellcheck shell=sh" if you prefer POSIX) before the declare -A LANG_PACK_EN_US
so ShellCheck knows the intended shell; this touches the file header near the
existing "declare -A LANG_PACK_EN_US" / "LANG_PACK_EN_US=(" symbols.
In `@lang/nvidia-install/ZH_CN.sh`:
- Around line 1-4: Add a ShellCheck directive comment at the top of the file so
static analysis treats this snippet as bash and suppresses SC2148/SC2034;
specifically, insert the comment "# shellcheck shell=bash" before the
declaration of LANG_PACK_ZH_CN (the lines containing "declare -A
LANG_PACK_ZH_CN" and "LANG_PACK_ZH_CN=(") so tools correctly infer the shell
without adding a shebang.
In `@scripts/update_gpu_ids.py`:
- Line 206: The arch_order list is duplicated; extract it to a single
module-level constant (e.g., GPU_ARCH_ORDER or ARCH_ORDER) and replace both
inline definitions with references to that constant; update any functions or
code that currently define or use the local arch_order (search for the
occurrences at the two places where arch_order is declared) to import/use the
new constant so future architecture additions are made in one place.
- Around line 150-165: The unused parameter warning comes from
classify_device(device_id: str, device_name: str) where device_id is never
referenced; to silence the linter and indicate the parameter is intentionally
unused, rename the parameter to _device_id (or prefix it with an underscore) in
the function signature and any callers if needed, leaving the function body and
logic (CHIP_CODE_RE, _NON_GPU_RE, CHIP_TO_ARCH) unchanged so behavior is
identical.
In `@src/cuda-install.sh`:
- Around line 320-323: The save_rollback_info function writes to ROLLBACK_FILE
without enforcing restrictive permissions, increasing the attack surface when
combined with eval-based rollback; change it to ensure the STATE_DIR and
ROLLBACK_FILE are created/owned by root and use restrictive perms (e.g.,
STATE_DIR 700 and ROLLBACK_FILE 600) before writing, by creating the file with
secure permissions (install -m 600 or touch + chown/chmod) or setting a local
umask only around the write, then append the action to the file; reference
save_rollback_info, ROLLBACK_FILE and STATE_DIR when making the change.
- Around line 131-132: The current echo redirection uses indirect variable
expansion ("target" and ">&$target") which is fragile; change the echo call to
an explicit conditional branch based on "level" (or reuse the existing "target"
check) so that when level == "ERROR" you redirect the message to stderr (>&2)
and otherwise to stdout (>&1) — e.g., use an if/else around the echo that uses
explicit >&2 or >&1 instead of ">&$target" to improve readability and shell
compatibility.
In `@src/nvidia-install.sh`:
- Around line 164-206: The cleanup handler cleanup_on_exit is executed twice
when a signal (INT/TERM) causes an exit because the handler runs for the signal
and then again for EXIT; add a simple guard variable (e.g., CLEANUP_DONE) at the
top of cleanup_on_exit to return immediately if already set, and set
CLEANUP_DONE=true early in the handler before invoking cleanup_temp_files and
cleanup_lock_files (and before calling exit) so subsequent ENTER/EXIT trap
invocations are no-ops; update uses of cleanup_temp_files and cleanup_lock_files
inside cleanup_on_exit to assume idempotence but rely on the guard to avoid
duplicate work.
- Around line 402-457: The show_usage help text is hardcoded in Chinese causing
English users to always see Chinese; update show_usage to use gettext (or an
internal _() wrapper) for all strings or provide bilingual branches based on the
selected language variable set in parse_arguments so the displayed text respects
NVIDIA_INSTALLER_LANG (or the language chosen via parse_arguments); specifically
replace the literal Chinese text in the show_usage function with translatable
strings (e.g., _("用法: %s [选项]") or the English equivalent) and ensure
parse_arguments sets the active locale or language selection before show_usage
is ever invoked so the correct translation is shown.
- Line 1962: The log_step invocation inside install_nvidia_driver uses an
unquoted command substitution ($(if $USE_OPEN_MODULES; ...)) which risks
field-splitting; instead compute the driver type into a variable using a safe
test (e.g., use [[ "$USE_OPEN_MODULES" == "true" ]] or an explicit comparison)
and assign the result of the command substitution to that variable, then pass it
to log_step quoted (e.g., driver_type="$(...)" and log_step "... ($driver_type,
$INSTALL_TYPE)..."), and apply the same fix pattern to the other occurrences
mentioned (the similar if-based substitutions at the other locations).
- Around line 1094-1116: The three package-check helper functions
(__check_pkg_nvidia_deb__, __check_pkg_nvidia_rpm__, __check_pkg_nvidia_suse__)
are being defined inside check_existing_nvidia_installation and thus redefined
on every call; pull these functions out to the script's outer scope so they are
defined once (keep their names and existing behaviors including setting
existing_driver and installation_method and calling log_warning and the package
listing commands). Leave the dynamic dispatch logic inside the original function
(the local fn="__check_pkg_nvidia_${PKG_FAMILY}__" + declare -F check + call) so
runtime selection still uses PKG_FAMILY; ensure any variables those helpers
write (existing_driver, installation_method) have the intended scope (export or
global) so behavior is unchanged after moving the functions.
Reject non-HTTPS URLs to prevent file:// or other scheme injection when fetch_pci_ids is called with arbitrary source strings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix unmatched ')' in cuda-install.sh lock error message - Add $2 existence check for --lang in cuda-install.sh parse_arguments - Fix TERM signal exit code: use 143 (128+15) instead of 130 in both scripts - Move check_root before create_state_dir in nvidia-install.sh main() - Fix AUTO_YES triggering reboot without --auto-reboot in failure branch - Add args.error.missing_value key to cuda-install lang packs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
src/as templates with# {{LANG_PACKS}}and# {{GPU_IDS}}placeholders.scripts/build.pymerges templates + language packs + GPU data → self-containeddist/*.shlang/<script>/{ZH_CN,EN_US}.sh, eliminating cross-script duplication and manual copy-paste workflowscripts/update_gpu_ids.pyfetches pci-ids.ucw.cz and auto-generatesdata/gpu-ids.sh— 857 GPU device IDs (up from 294 hand-curated), with 100% backward coverage + manual override supportci.yml(build + syntax + translation validation on PR),release.yml(build + upload assets on release),update-gpu-ids.yml(weekly GPU ID update with auto-PR)language_packs/,language_pack_generate.pyNew Repository Structure
Developer Workflow
Test plan
python3 scripts/build.pybuilds all 5 scripts successfullybash -n dist/*.shpasses for all outputspython3 scripts/build.py --check-keysreports no missing translationssudo bash dist/nvidia-install.sh --helpon target systemsudo bash dist/cuda-install.sh --helpon target system🤖 Generated with Claude Code
Summary by CodeRabbit
发布说明
新功能
改进
基础设施
移除