feat: add Nvidia Jetson Thor Developer Kit GPU detection / --runtime nvidia #700

whitesscott wants to merge 3 commits into NVIDIA:main from
Conversation
📝 Walkthrough

Converted the Node nim library into a CLI entrypoint, added Jetson/Thor detection and system-memory GPU estimation, tightened NVIDIA VRAM parsing, conditionally enabled the Docker NVIDIA runtime on Jetson, refactored install.sh GPU probing and semver logic, and updated a pre-commit hook.
Sequence Diagram(s)

sequenceDiagram
participant User as User/Caller
participant NimCLI as bin/lib/nim.js (Node CLI)
participant Installer as install.sh
participant OS as Host OS (/proc, /etc, system_profiler)
participant Docker as docker
User->>NimCLI: run `nim detect-gpu` or `nim start ...`
NimCLI->>OS: probe `/etc/nv_tegra_release`, `/proc/device-tree`, `/proc/meminfo`
NimCLI->>OS: run `nvidia-smi` when available
alt Jetson/Thor detected
NimCLI->>NimCLI: compute gpuMemory = systemMemory/2, set unifiedMemory:true
NimCLI->>Docker: docker run ... --runtime nvidia ...
else Standard NVIDIA or macOS
NimCLI->>OS: parse `nvidia-smi` or `system_profiler`
NimCLI->>Docker: docker run ... (no special runtime)
end
Note over Installer,OS: install.sh uses similar probes for installer-time decisions
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 4
🧹 Nitpick comments (1)
install.sh (1)
339-342: Redundant stdout redirection.
>/dev/stdout is a no-op since stdout already goes to stdout by default.

♻️ Suggested fix

- if nvidia-smi -L >/dev/stdout; then
+ if nvidia-smi -L >/dev/null 2>&1; then

If the intent is to suppress output while checking exit status, redirect to /dev/null. If output should be visible, remove the redirection entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install.sh` around lines 339 - 342, The call to nvidia-smi in the conditional uses a redundant redirection (nvidia-smi -L >/dev/stdout); update that invocation so it either emits output (remove the >/dev/stdout part) or suppresses output properly (redirect both stdout and stderr to /dev/null) depending on the intended behavior in the install.sh check; change the line containing the nvidia-smi -L invocation accordingly.
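The exit-status-only pattern the review recommends can be sketched generically (`check_quiet` is an illustrative name, not code from this PR):

```shell
# Run a command purely for its exit status; discard stdout and stderr.
check_quiet() {
  "$@" >/dev/null 2>&1
}

# A successful command takes the "then" branch; a failing one does not.
if check_quiet ls /; then
  echo "present"
else
  echo "absent"
fi
```

The same shape works for the `nvidia-smi -L` probe: the branch taken depends solely on the exit code, and no output leaks into the installer log.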
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f2d86b11-72ba-4fbd-9dc7-9573cc49220b
📒 Files selected for processing (2)
bin/lib/nim.js, install.sh
return {
  type: "nvidia",
  name: "NVIDIA Jetson",
  count: gpuCount,
  totalMemoryMB: 0,
  perGpuMB: 0,
  nimCapable: true,
  jetson: true,
  unifiedMemory: true,
};
Returning nimCapable: true with totalMemoryMB: 0 may cause downstream issues.
When getSystemMemoryMB() returns 0 (detection failure), this block returns nimCapable: true with totalMemoryMB: 0 and perGpuMB: 0. This could lead to unexpected behavior when model selection logic checks memory requirements against 0 MB.
Consider returning null (no GPU detected) or setting nimCapable: false when memory detection fails.
🔧 Suggested fix
- return {
- type: "nvidia",
- name: "NVIDIA Jetson",
- count: gpuCount,
- totalMemoryMB: 0,
- perGpuMB: 0,
- nimCapable: true,
- jetson: true,
- unifiedMemory: true,
- };
+ // Memory detection failed - still report Jetson but mark as not NIM-capable
+ return {
+ type: "nvidia",
+ name: "NVIDIA Jetson",
+ count: gpuCount,
+ totalMemoryMB: 0,
+ perGpuMB: 0,
+ nimCapable: false,
+ jetson: true,
+ unifiedMemory: true,
+ };

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Memory detection failed - still report Jetson but mark as not NIM-capable
return {
  type: "nvidia",
  name: "NVIDIA Jetson",
  count: gpuCount,
  totalMemoryMB: 0,
  perGpuMB: 0,
  nimCapable: false,
  jetson: true,
  unifiedMemory: true,
};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bin/lib/nim.js` around lines 118 - 127, The current NVIDIA Jetson detection
returns nimCapable: true even when getSystemMemoryMB() fails (returns 0),
producing totalMemoryMB: 0 and perGpuMB: 0; update the logic in the function
that constructs this return object (the block that sets type: "nvidia", name:
"NVIDIA Jetson", count: gpuCount, totalMemoryMB, perGpuMB, nimCapable, jetson,
unifiedMemory) so that if getSystemMemoryMB() === 0 you either return null (no
GPU detected) or set nimCapable: false and avoid advertising 0 MB memory —
compute perGpuMB only when totalMemoryMB > 0 and ensure nimCapable reflects
successful memory detection.
# 2) Jetson / Thor fallback: use total system RAM as shared GPU memory budget proxy
if [[ -f /etc/nv_tegra_release ]] || [[ -r /proc/device-tree/model ]]; then
  local mem_kb
  mem_kb="$(awk '/MemTotal:/ {print $2}' /proc/meminfo)"
  if [[ -n "${mem_kb:-}" && "$mem_kb" -gt 0 ]]; then
    echo $((mem_kb / 1024))
    return
  fi
fi
Inconsistent Jetson memory calculation between install.sh and nim.js.
install.sh returns full system RAM for Jetson devices, but nim.js (line 103) uses half: Math.floor(getSystemMemoryMB() / 2).
This inconsistency means:

- install.sh might pull nemotron-3-super:120b on a 128GB Jetson Thor (reports 128GB)
- nim.js would report only 64GB for the same device, potentially failing model requirements checks
🔧 Suggested fix: Align with nim.js by using half of system RAM
# 2) Jetson / Thor fallback: use total system RAM as shared GPU memory budget proxy
if [[ -f /etc/nv_tegra_release ]] || [[ -r /proc/device-tree/model ]]; then
local mem_kb
mem_kb="$(awk '/MemTotal:/ {print $2}' /proc/meminfo)"
if [[ -n "${mem_kb:-}" && "$mem_kb" -gt 0 ]]; then
- echo $((mem_kb / 1024))
+ echo $((mem_kb / 1024 / 2)) # Use half of system RAM as GPU budget (unified memory)
return
fi
fi

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# 2) Jetson / Thor fallback: use total system RAM as shared GPU memory budget proxy
if [[ -f /etc/nv_tegra_release ]] || [[ -r /proc/device-tree/model ]]; then
  local mem_kb
  mem_kb="$(awk '/MemTotal:/ {print $2}' /proc/meminfo)"
  if [[ -n "${mem_kb:-}" && "$mem_kb" -gt 0 ]]; then
    echo $((mem_kb / 1024 / 2)) # Use half of system RAM as GPU budget (unified memory)
    return
  fi
fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@install.sh` around lines 380 - 388, The Jetson/Tegra fallback in the
install.sh Jetson detection block currently returns total system RAM but should
match nim.js behavior (which uses Math.floor(getSystemMemoryMB() / 2)); modify
the Jetson branch that computes mem_kb to echo half the system RAM in MB instead
of the full amount (i.e., compute mem_kb -> MB and divide by 2) and then return,
so the echoed value aligns with nim.js’s getSystemMemoryMB halving logic.
# macOS — report unified memory as VRAM

# 2) Jetson / Thor fallback: use total system RAM as shared GPU memory budget proxy
if [[ -f /etc/nv_tegra_release ]] || [[ -r /proc/device-tree/model ]]; then
Missing /proc/device-tree/compatible check in get_vram_mb() Jetson fallback.
detect_gpu() (line 351) checks /proc/device-tree/compatible for nvidia,tegra|nvidia,thor, but get_vram_mb() only checks /etc/nv_tegra_release and /proc/device-tree/model.
A device detected solely via compatible would report 0 VRAM, causing incorrect model selection.
🔧 Suggested fix: Add compatible check

  # 2) Jetson / Thor fallback: use total system RAM as shared GPU memory budget proxy
- if [[ -f /etc/nv_tegra_release ]] || [[ -r /proc/device-tree/model ]]; then
+ if [[ -f /etc/nv_tegra_release ]] \
+   || { [[ -r /proc/device-tree/compatible ]] && tr '\0' '\n' </proc/device-tree/compatible | grep -qiE 'nvidia,tegra|nvidia,thor'; } \
+   || { [[ -r /proc/device-tree/model ]] && tr '\0' '\n' </proc/device-tree/model | grep -qiE 'jetson|thor|nvidia'; }; then
    local mem_kb

Consider extracting a shared is_jetson_platform() function to avoid duplicating detection logic.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if [[ -f /etc/nv_tegra_release ]] \
  || { [[ -r /proc/device-tree/compatible ]] && tr '\0' '\n' </proc/device-tree/compatible | grep -qiE 'nvidia,tegra|nvidia,thor'; } \
  || { [[ -r /proc/device-tree/model ]] && tr '\0' '\n' </proc/device-tree/model | grep -qiE 'jetson|thor|nvidia'; }; then
  local mem_kb
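A shared helper along the lines the review suggests might look like the sketch below; the function name is the reviewer's proposal, not existing code, and the detection strings mirror the ones already used by detect_gpu():

```shell
# Sketch of a shared Jetson-family predicate for install.sh.
is_jetson_platform() {
  # JetPack releases ship this file on Jetson-family devices.
  [[ -f /etc/nv_tegra_release ]] && return 0
  # Device-tree strings matched elsewhere by detect_gpu().
  if [[ -r /proc/device-tree/compatible ]] \
    && tr '\0' '\n' </proc/device-tree/compatible | grep -qiE 'nvidia,tegra|nvidia,thor'; then
    return 0
  fi
  if [[ -r /proc/device-tree/model ]] \
    && tr '\0' '\n' </proc/device-tree/model | grep -qiE 'jetson|thor|nvidia'; then
    return 0
  fi
  return 1
}

is_jetson_platform && echo "jetson" || echo "not jetson"
```

Both detect_gpu() and get_vram_mb() could then call this single predicate, so the VRAM fallback fires whenever detection does.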
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@install.sh` at line 381, The Jetson detection is inconsistent: detect_gpu()
checks /proc/device-tree/compatible for "nvidia,tegra" or "nvidia,thor" but
get_vram_mb() only checks /etc/nv_tegra_release and /proc/device-tree/model, so
devices detected via compatible will return 0 VRAM; update get_vram_mb() to also
test /proc/device-tree/compatible for those strings (or better, create a shared
helper is_jetson_platform() used by both detect_gpu() and get_vram_mb()) so the
same Jetson detection logic is reused and VRAM fallback runs when appropriate.
♻️ Duplicate comments (2)
bin/lib/nim.js (2)
27-50: ⚠️ Potential issue | 🔴 Critical

Hoist isJetsonPlatform() to module scope.

Line 223 calls isJetsonPlatform() outside detectGpu(), but the helper now only exists inside detectGpu(). That raises a ReferenceError before docker run on every call to startNimContainer(), not just on Jetson.

🔧 Minimal fix

+function isJetsonPlatform() {
+  try {
+    const nvTegra = runCapture("test -f /etc/nv_tegra_release && echo yes", {
+      ignoreError: true,
+    });
+    if (nvTegra && nvTegra.trim() === "yes") return true;
+  } catch {}
+
+  try {
+    const compat = runCapture("tr '\\0' '\\n' < /proc/device-tree/compatible", {
+      ignoreError: true,
+    });
+    if (compat && /nvidia,tegra|nvidia,thor/i.test(compat)) return true;
+  } catch {}
+
+  try {
+    const model = runCapture("tr '\\0' '\\n' < /proc/device-tree/model", {
+      ignoreError: true,
+    });
+    if (model && /jetson|thor|nvidia/i.test(model)) return true;
+  } catch {}
+
+  return false;
+}
+
 function detectGpu() {
-  function isJetsonPlatform() {
-    try {
-      const nvTegra = runCapture("test -f /etc/nv_tegra_release && echo yes", {
-        ignoreError: true,
-      });
-      if (nvTegra && nvTegra.trim() === "yes") return true;
-    } catch {}
-
-    try {
-      const compat = runCapture("tr '\\0' '\\n' < /proc/device-tree/compatible", {
-        ignoreError: true,
-      });
-      if (compat && /nvidia,tegra|nvidia,thor/i.test(compat)) return true;
-    } catch {}
-
-    try {
-      const model = runCapture("tr '\\0' '\\n' < /proc/device-tree/model", {
-        ignoreError: true,
-      });
-      if (model && /jetson|thor|nvidia/i.test(model)) return true;
-    } catch {}
-
-    return false;
-  }
-
   function getSystemMemoryMB() {

Also applies to: 223-225
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 27 - 50, The helper isJetsonPlatform() is currently defined inside detectGpu(), but startNimContainer() (and other callers) invoke isJetsonPlatform() at module scope, causing a ReferenceError; move the isJetsonPlatform() function definition out of detectGpu() into the module/global scope so it is available to startNimContainer() and any other top-level callers, then update detectGpu() to call the now-top-level isJetsonPlatform() (no other logic changes required).
118-127: ⚠️ Potential issue | 🟡 Minor

Don’t mark Jetson as NIM-capable when memory probing failed.

This branch returns nimCapable: true with both memory fields at 0. That makes the GPU descriptor contradictory and can mislead downstream model gating.

🔧 Minimal fix

   return {
     type: "nvidia",
     name: "NVIDIA Jetson",
     count: gpuCount,
     totalMemoryMB: 0,
     perGpuMB: 0,
-    nimCapable: true,
+    nimCapable: false,
     jetson: true,
     unifiedMemory: true,
   };

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 118 - 127, The NVIDIA Jetson branch currently returns nimCapable: true even when memory probing failed (totalMemoryMB and perGpuMB are 0), which is contradictory; update the return object in the Jetson branch (the object with fields type, name, count, totalMemoryMB, perGpuMB, nimCapable, jetson, unifiedMemory) so that nimCapable is set to true only when memory probing succeeded (e.g., totalMemoryMB and perGpuMB are non-zero or a probeSuccess flag) and set to false when memory values are 0; ensure any downstream callers still receive the jetson/unifiedMemory flags but rely on nimCapable to indicate usable memory info.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 82bb7514-3484-4304-a3d4-06bf8d5a0870
📒 Files selected for processing (1)
bin/lib/nim.js
The pyright pre-push hook type-checks test files that import pytest. Adds pytest to uv run.
8d418f0 to 5b549a6
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
bin/lib/nim.js (1)
212-235: ⚠️ Potential issue | 🟠 Major

Don't force-delete containers under a generic nim- prefix.

Line 235 removes any existing container with the computed name, and nim-${sandboxName} is generic enough to collide with unrelated local containers. Starting NemoClaw can now delete something it does not own; keep a project-specific prefix or verify an ownership label before docker rm -f.

🔧 Safer default

 function containerName(sandboxName) {
-  return `nim-${sandboxName}`;
+  return `nemoclaw-nim-${sandboxName}`;
 }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 212 - 235, The current startNimContainer/containerName logic uses a generic "nim-" prefix and unconditionally runs `docker rm -f`, which can delete unrelated containers; change container ownership handling so we only remove containers we own: update containerName to include a project-specific unique prefix (e.g., include a fixed project id or env var) or, better, before calling `docker rm -f` in startNimContainer, verify ownership by checking for a specific Docker label (e.g., "nim.sandbox.owner" or "nim.project=<project>") on the container via `docker ps`/`docker inspect` and only remove if the label matches; alternatively stop/remove without forcing all containers by name if label absent. Ensure references to containerName and startNimContainer are updated to use the new prefix/label scheme consistently.
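The ownership check the review proposes could be factored as below. This is a sketch: the "nim.project" label name and the "nemoclaw" value come from the review comment and are hypothetical, and the label would be read with `docker inspect -f '{{ index .Config.Labels "nim.project" }}' <name>` before any `docker rm -f`.

```shell
# Decide whether a container may be force-removed, given its ownership label.
# $1: value of the container's "nim.project" label (hypothetical label name).
should_remove_container() {
  [[ "$1" == "nemoclaw" ]]
}

# Only containers carrying our label are eligible for removal.
should_remove_container "nemoclaw" && echo "remove ours"
should_remove_container "someone-else" || echo "keep foreign container"
```

Keeping the predicate separate from the `docker` calls makes the ownership rule easy to test without a Docker daemon.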
♻️ Duplicate comments (1)
bin/lib/nim.js (1)
133-142: ⚠️ Potential issue | 🟡 Minor

Treat a 0 MB unified-memory probe as detection failure.

Lines 133-142 and 153-161 still return nimCapable: true when getSystemMemoryMB() comes back 0, so downstream code sees a usable NVIDIA target with no memory. Return null or set nimCapable: false whenever the unified-memory probe fails.

🔧 Suggested fix

   return {
     type: "nvidia",
     name: "NVIDIA Jetson",
     count: gpuCount,
     totalMemoryMB: 0,
     perGpuMB: 0,
-    nimCapable: true,
+    nimCapable: false,
     jetson: true,
     unifiedMemory: true,
   };

Apply the same totalMemoryMB > 0 guard to the GB10 branch before returning nimCapable: true.

Also applies to: 153-161
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 133 - 142, The NVIDIA/Jetson detection currently returns nimCapable: true even when getSystemMemoryMB() yields 0; update the logic in the Jetson branch (the block that builds the object with properties type:"nvidia", name:"NVIDIA Jetson", totalMemoryMB, perGpuMB, nimCapable, jetson, unifiedMemory) and the GB10 branch to treat totalMemoryMB <= 0 as a probe failure by either returning null or setting nimCapable: false instead of true; ensure the same totalMemoryMB > 0 guard is applied before setting nimCapable to true so downstream code never receives a nimCapable device with 0 MB.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 58f27f0d-d8e9-4881-be61-af4d480deda1
📒 Files selected for processing (2)
.pre-commit-config.yaml, bin/lib/nim.js
✅ Files skipped from review due to trivial changes (1)
- .pre-commit-config.yaml
function isJetsonPlatform() {
  try {
    const nvTegra = runCapture("test -f /etc/nv_tegra_release && echo yes", {
      ignoreError: true,
    });
    if (nvTegra && nvTegra.trim() === "yes") return true;
  } catch {}

  try {
    const compat = runCapture("tr '\\0' '\\n' < /proc/device-tree/compatible", {
      ignoreError: true,
    });
    if (compat && /nvidia,tegra|nvidia,thor/i.test(compat)) return true;
  } catch {}

  try {
    const model = runCapture("tr '\\0' '\\n' < /proc/device-tree/model", {
      ignoreError: true,
    });
    if (model && /jetson|thor|nvidia/i.test(model)) return true;
  } catch {}

  return false;
Only halve unified memory on Thor.
isJetsonPlatform() intentionally matches both Jetson and Thor, but Line 118 halves MemTotal for every match. The Thor-specific fallback in the PR objective now underreports regular Jetson boards by 50%, which can incorrectly disqualify models that should fit; split Thor detection out before applying the / 2 adjustment.
Also applies to: 105-126
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bin/lib/nim.js` around lines 41 - 63, The code currently uses
isJetsonPlatform() (which matches both Jetson and Thor) to decide halving
MemTotal; change the logic so only Thor triggers the halving: add a new
isThorPlatform() (or extend isJetsonPlatform() to expose a Thor-specific check)
that replicates the existing compatible/model/nv_tegra checks but only matches
"thor", then replace the MemTotal /= 2 decision to use isThorPlatform() instead
of isJetsonPlatform(); keep the original isJetsonPlatform() unchanged so regular
Jetson boards are not halved.
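A Thor-only predicate along the lines this prompt describes might look like the sketch below; the device-tree match strings are assumptions carried over from the broader Jetson checks, and real Thor hardware may expose different identifiers.

```shell
# Sketch: match only Thor, unlike a Jetson-wide check, so regular Jetson
# boards are not subjected to the MemTotal halving.
is_thor_platform() {
  if [[ -r /proc/device-tree/compatible ]] \
    && tr '\0' '\n' </proc/device-tree/compatible | grep -qi 'nvidia,thor'; then
    return 0
  fi
  if [[ -r /proc/device-tree/model ]] \
    && tr '\0' '\n' </proc/device-tree/model | grep -qi 'thor'; then
    return 0
  fi
  return 1
}

is_thor_platform && echo "thor" || echo "not thor"
```

The halving decision would then read `is_thor_platform` while the existing `isJetsonPlatform` logic continues to gate the unified-memory and runtime flags.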
  return false;
}

function detectGpu() {
🧩 Analysis chain
🏁 Script executed:

#!/bin/bash
# Verify that callers still use bin/lib/nim.js as a library and that this file exports nothing.
printf '\n-- nim.detectGpu() call sites --\n'
rg -n -C2 --type=js --type=ts '\bnim\.detectGpu\s*\('
printf '\n-- exports currently declared in bin/lib/nim.js --\n'
sed -n '1,320p' bin/lib/nim.js | rg -n '^\s*export\s+'

Repository: NVIDIA/NemoClaw
Length of output: 1248
Restore detectGpu() as an importable API and guard the CLI entrypoint.
bin/lib/onboard.js at line 405 (and test files) still call nim.detectGpu(), but detectGpu() is no longer exported here and line 279 now runs main() unconditionally on import, breaking the GPU-detection flow before callers can use the symbol.
- Export the detectGpu function
- Guard main() execution to only run when the file is the CLI entrypoint, not when imported as a module

🔧 Proposed fix

+import { pathToFileURL } from "node:url";

-function detectGpu() {
+export function detectGpu() {

-main();
+if (process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href) {
+  main();
+}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
import { pathToFileURL } from "node:url";

// ... other code ...

export function detectGpu() {
  // function body
}

// ... other code ...

if (process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href) {
  main();
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bin/lib/nim.js` at line 66, Restore detectGpu by exporting it from this
module (e.g., add it to module.exports or export the function so callers can
call nim.detectGpu()), and prevent main() from running on import by wrapping the
unconditional call in a CLI guard (use the Node check like if (require.main ===
module) { main(); } or the ESM equivalent) so main() only executes when this
file is the entrypoint; update the exports to include detectGpu and leave main()
guarded.
Summary
This PR adds basic Jetson/Thor support to NemoClaw install-time GPU detection and NIM container startup.
Changes
- Detect Jetson/Thor via /etc/nv_tegra_release
- Estimate GPU memory from system RAM where nvidia-smi does not report desktop-style VRAM
- Pass --runtime nvidia into containers
Why
On Jetson/Thor, GPU discovery and memory reporting differ from discrete GPU systems. These changes let NemoClaw recognize the platform and proceed further on Jetson devices.
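The memory estimate this relies on can be sketched in shell (assumes a Linux /proc/meminfo; the halving mirrors nim.js's Math.floor(getSystemMemoryMB() / 2) treatment of unified memory):

```shell
# Report half of total system RAM, in MB, as the shared GPU memory budget.
gpu_budget_mb() {
  local mem_kb
  mem_kb="$(awk '/MemTotal:/ {print $2}' /proc/meminfo)"
  [[ -n "$mem_kb" && "$mem_kb" -gt 0 ]] || return 1
  echo $((mem_kb / 1024 / 2))
}

gpu_budget_mb
```

On a 128GB Thor this would report roughly 64GB, the conservative budget discussed in the review comments above.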
Notes
- pyright failure resolving pytest in nemoclaw-blueprint/orchestrator/test_endpoint_validation.py

Summary by CodeRabbit
New Features
Bug Fixes
Chores