feat(tooling): SPDX --fix, hook ordering, and strict shellcheck fixes#670
feat(tooling): SPDX --fix, hook ordering, and strict shellcheck fixes#670HagegeR wants to merge 7 commits intoNVIDIA:mainfrom
Conversation
Drop --severity=warning and --exclude=SC1091 so local hooks match strict shellcheck; script issues are fixed in-tree instead.
Add shellcheck shell=bash directive so SC2148 is satisfied without weakening global hook rules.
Use >/path form so shellcheck does not flag spacing around redirects.
Replace SC2015 if/else chains with explicit if/else, add source directives for dynamic sources, fix exit-status checks, and remove unused variables so strict shellcheck passes.
📝 WalkthroughWalkthroughThis pull request introduces automated SPDX license header insertion across the codebase. The Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment Tip CodeRabbit can use Trivy to scan for security misconfigurations and secrets in Infrastructure as Code files.Add a .trivyignore file to your project to customize which findings Trivy reports. |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@package.json`:
- Around line 40-42: The lint-staged glob "nemoclaw-blueprint/**/*.py" is
including blueprint __init__.py files (e.g.
nemoclaw-blueprint/orchestrator/__init__.py) while .pre-commit-config.yaml
explicitly excludes them; update the lint-staged entry for
"nemoclaw-blueprint/**/*.py" to also exclude __init__.py files (for example by
adding a negative glob such as "nemoclaw-blueprint/**/__init__.py" prefixed with
"!" or an equivalent exclusion) so the SPDX header fixer and lint-staged
behavior match.
In `@scripts/check-spdx-headers.sh`:
- Around line 65-82: The temp-file rewrite currently creates a 0600 file and
then mv's it over the original, losing executable bits; before mv "$tmp" "$file"
set the temp file's mode to match the original (e.g. run chmod
--reference="$file" "$tmp" or use stat to get and apply the original mode)
inside the insert_spdx block that creates tmp so the original file permissions
(including +x) are preserved when moving the temp into place.
In `@test/e2e/test-double-onboard.sh`:
- Around line 125-129: The pipeline using echo "$outputN" | grep -q should be
replaced with here-strings to avoid SIGPIPE when running with set -o pipefail:
for each check that currently uses echo "$output1" | grep -q "..." (and the
similar occurrences for output2 and output3 around the other ranges), change to
using grep -q with a here-string (grep -q "pattern" <<< "$output1") so grep
reads directly from the here-string; update the three blocks that test for
"Sandbox '${SANDBOX_A}' created" and the other two sandbox/step checks (the ones
referencing output1/output2/output3 and variables like SANDBOX_A) to use this
here-string form and keep the existing pass/fail calls unchanged.
In `@test/e2e/test-full-e2e.sh`:
- Around line 207-211: Replace the pipeline-based grep checks that read from
shell variables (e.g., echo "$list_output" | grep -Fq -- "$SANDBOX_NAME", echo
"$policy_output" | grep -q ...) with here-strings to avoid pipefail/SIGPIPE
issues; locate the grep invocations that use echo of variables such as
list_output, policy_output (and similar checks around the sandbox/name
assertions) and change them to use grep -Fq -- "$SANDBOX_NAME" <<<"$list_output"
(and grep -q <<<"$policy_output" for the policy checks) so grep reads directly
from the variable without a pipe.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3f48d530-7af6-4c51-aa87-eb084f0456c9
📒 Files selected for processing (8)
.pre-commit-config.yamlpackage.jsonscripts/check-spdx-headers.shscripts/husky-env.shscripts/test-inference-local.shscripts/test-inference.shtest/e2e/test-double-onboard.shtest/e2e/test-full-e2e.sh
| "nemoclaw-blueprint/**/*.py": [ | ||
| "bash scripts/check-spdx-headers.sh", | ||
| "bash scripts/check-spdx-headers.sh --fix", | ||
| "uvx ruff check --fix", |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
find nemoclaw-blueprint -type f -name '__init__.py' | sortRepository: NVIDIA/NemoClaw
Length of output: 102
🏁 Script executed:
cat -n .pre-commit-config.yaml | head -100Repository: NVIDIA/NemoClaw
Length of output: 4520
🏁 Script executed:
cat -n package.json | sed -n '35,50p'Repository: NVIDIA/NemoClaw
Length of output: 719
Keep blueprint __init__.py files out of the lint-staged fixer.
.pre-commit-config.yaml explicitly excludes nemoclaw-blueprint/**/__init__.py from the SPDX headers fixer (line 50), but the lint-staged glob nemoclaw-blueprint/**/*.py (lines 40–43) sends every staged blueprint __init__.py file to check-spdx-headers.sh --fix. At least one such file exists (nemoclaw-blueprint/orchestrator/__init__.py). This creates inconsistent behavior between the two hooks and should be excluded from lint-staged as well.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@package.json` around lines 40 - 42, The lint-staged glob
"nemoclaw-blueprint/**/*.py" is including blueprint __init__.py files (e.g.
nemoclaw-blueprint/orchestrator/__init__.py) while .pre-commit-config.yaml
explicitly excludes them; update the lint-staged entry for
"nemoclaw-blueprint/**/*.py" to also exclude __init__.py files (for example by
adding a negative glob such as "nemoclaw-blueprint/**/__init__.py" prefixed with
"!" or an equivalent exclusion) so the SPDX header fixer and lint-staged
behavior match.
| local tmp | ||
| tmp="$(mktemp "${TMPDIR:-/tmp}/nemoclaw-spdx.XXXXXX")" | ||
| { | ||
| IFS= read -r first || true | ||
| if [[ "$first" == '#!'* ]]; then | ||
| printf '%s\n' "$first" | ||
| spdx_block "$style" | ||
| printf '\n' | ||
| cat | ||
| else | ||
| spdx_block "$style" | ||
| printf '\n' | ||
| if [[ -n "${first:-}" ]]; then | ||
| printf '%s\n' "$first" | ||
| fi | ||
| cat | ||
| fi | ||
| } <"$file" >"$tmp" && mv "$tmp" "$file" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT
src="$tmpdir/pre-commit"
printf '#!/usr/bin/env bash\necho ok\n' >"$src"
chmod 755 "$src"
tmp="$(mktemp "${TMPDIR:-/tmp}/nemoclaw-spdx.XXXXXX")"
printf '#!/usr/bin/env bash\n# SPDX-License-Identifier: Apache-2.0\n\necho ok\n' >"$tmp"
mv "$tmp" "$src"
stat -c '%a %n' "$src" 2>/dev/null || stat -f '%Lp %N' "$src"Repository: NVIDIA/NemoClaw
Length of output: 104
🏁 Script executed:
#!/bin/bash
# First, find the script file
find . -name "check-spdx-headers.sh" -type fRepository: NVIDIA/NemoClaw
Length of output: 90
🏁 Script executed:
#!/bin/bash
# Test mktemp + mv behavior without stat (which isn't available)
# Create a test directory
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT
# Create original file with executable permissions
src="$tmpdir/test-script"
printf '#!/usr/bin/env bash\necho ok\n' >"$src"
chmod 755 "$src"
echo "Original file permissions (using ls -l):"
ls -l "$src"
# Simulate what mktemp does
tmp="$(mktemp "${TMPDIR:-/tmp}/test.XXXXXX")"
printf '#!/usr/bin/env bash\n# Header\necho ok\n' >"$tmp"
echo "Temp file permissions after creation (using ls -l):"
ls -l "$tmp"
# Now do the mv
mv "$tmp" "$src"
echo "Original file permissions after mv (using ls -l):"
ls -l "$src"
# Clean up
rm -f "$src"Repository: NVIDIA/NemoClaw
Length of output: 423
🏁 Script executed:
cat ./scripts/check-spdx-headers.shRepository: NVIDIA/NemoClaw
Length of output: 2527
🏁 Script executed:
# Check if this script is used on executable files/hooks
grep -r "check-spdx-headers.sh" --include="*.md" --include="*.yaml" --include="*.yml" --include="*.json"Repository: NVIDIA/NemoClaw
Length of output: 390
Preserve the original file permissions when rewriting scripts with mktemp and mv.
The insert_spdx function uses mktemp to create a temporary file (which defaults to mode 0600) and then overwrites the original file with mv "$tmp" "$file". Since mv preserves the source file's permissions, any executable script or hook processed by --fix will lose its executable bit (+x), breaking subsequent hook execution and failing shebang/executable validation.
Suggested fix
insert_spdx() {
local file=$1
local style
style=$(comment_style_for "$file")
local tmp
+ local mode
tmp="$(mktemp "${TMPDIR:-/tmp}/nemoclaw-spdx.XXXXXX")"
+ mode="$(stat -c '%a' "$file" 2>/dev/null || stat -f '%Lp' "$file")"
{
IFS= read -r first || true
if [[ "$first" == '#!'* ]]; then
printf '%s\n' "$first"
spdx_block "$style"
printf '\n'
cat
else
spdx_block "$style"
printf '\n'
if [[ -n "${first:-}" ]]; then
printf '%s\n' "$first"
fi
cat
fi
- } <"$file" >"$tmp" && mv "$tmp" "$file"
+ } <"$file" >"$tmp" && chmod "$mode" "$tmp" && mv "$tmp" "$file"
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| local tmp | |
| tmp="$(mktemp "${TMPDIR:-/tmp}/nemoclaw-spdx.XXXXXX")" | |
| { | |
| IFS= read -r first || true | |
| if [[ "$first" == '#!'* ]]; then | |
| printf '%s\n' "$first" | |
| spdx_block "$style" | |
| printf '\n' | |
| cat | |
| else | |
| spdx_block "$style" | |
| printf '\n' | |
| if [[ -n "${first:-}" ]]; then | |
| printf '%s\n' "$first" | |
| fi | |
| cat | |
| fi | |
| } <"$file" >"$tmp" && mv "$tmp" "$file" | |
| local tmp | |
| local mode | |
| tmp="$(mktemp "${TMPDIR:-/tmp}/nemoclaw-spdx.XXXXXX")" | |
| mode="$(stat -c '%a' "$file" 2>/dev/null || stat -f '%Lp' "$file")" | |
| { | |
| IFS= read -r first || true | |
| if [[ "$first" == '#!'* ]]; then | |
| printf '%s\n' "$first" | |
| spdx_block "$style" | |
| printf '\n' | |
| cat | |
| else | |
| spdx_block "$style" | |
| printf '\n' | |
| if [[ -n "${first:-}" ]]; then | |
| printf '%s\n' "$first" | |
| fi | |
| cat | |
| fi | |
| } <"$file" >"$tmp" && chmod "$mode" "$tmp" && mv "$tmp" "$file" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/check-spdx-headers.sh` around lines 65 - 82, The temp-file rewrite
currently creates a 0600 file and then mv's it over the original, losing
executable bits; before mv "$tmp" "$file" set the temp file's mode to match the
original (e.g. run chmod --reference="$file" "$tmp" or use stat to get and apply
the original mode) inside the insert_spdx block that creates tmp so the original
file permissions (including +x) are preserved when moving the temp into place.
| if echo "$output1" | grep -q "Sandbox '${SANDBOX_A}' created"; then | ||
| pass "Sandbox '$SANDBOX_A' created (step 3 completed)" | ||
| else | ||
| fail "Sandbox creation not confirmed in output" | ||
| fi |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
blob="$( { printf 'needle\n'; seq 1 200000; } )"
if printf '%s\n' "$blob" | grep -q 'needle'; then
echo "pipeline returned success"
else
echo "pipeline returned failure"
fiRepository: NVIDIA/NemoClaw
Length of output: 102
🏁 Script executed:
git ls-files test/e2e/test-double-onboard.shRepository: NVIDIA/NemoClaw
Length of output: 90
🏁 Script executed:
cat -n test/e2e/test-double-onboard.sh | head -200 | tail -50Repository: NVIDIA/NemoClaw
Length of output: 2020
🏁 Script executed:
# Create a simpler test without seq to verify SIGPIPE behavior
#!/bin/bash
# Test 1: with pipefail using echo | grep -q
set -euo pipefail
# Create a blob without seq
blob=$(printf 'needle\n'; for i in {1..10000}; do echo "line $i"; done)
echo "=== Test 1: echo pipe grep -q with pipefail ==="
if printf '%s\n' "$blob" | grep -q 'needle'; then
echo "Test 1: PASSED - grep found the pattern"
else
echo "Test 1: FAILED - grep did not find the pattern"
fi
echo ""
echo "=== Test 2: using here-string instead ==="
set +e # disable pipefail for comparison
if grep -q 'needle' <<<"$blob"; then
echo "Test 2: PASSED - grep found the pattern"
else
echo "Test 2: FAILED - grep did not find the pattern"
fiRepository: NVIDIA/NemoClaw
Length of output: 229
🏁 Script executed:
cat -n test/e2e/test-double-onboard.sh | sed -n '20,30p'Repository: NVIDIA/NemoClaw
Length of output: 254
🏁 Script executed:
cat -n test/e2e/test-double-onboard.sh | sed -n '125,130p'Repository: NVIDIA/NemoClaw
Length of output: 284
🏁 Script executed:
cat -n test/e2e/test-double-onboard.sh | sed -n '226,250p'Repository: NVIDIA/NemoClaw
Length of output: 1010
🏁 Script executed:
# Test with much larger output to try to trigger SIGPIPE
#!/bin/bash
set -euo pipefail
# Create a very large blob (simulate real onboard logs)
blob=$(printf 'needle\n'; for i in {1..100000}; do echo "Line $i with some content that makes it bigger"; done)
echo "=== Test with 100k lines ==="
if printf '%s\n' "$blob" | grep -q 'needle'; then
echo "PASSED - grep found pattern despite large output"
else
echo "FAILED - pattern not found or SIGPIPE killed the pipeline"
fiRepository: NVIDIA/NemoClaw
Length of output: 145
🌐 Web query:
bash pipefail grep -q SIGPIPE issue known problem
💡 Result:
This is a well-known interaction between set -o pipefail and commands like grep -q in pipelines.
What’s happening
grep -q patternexits as soon as it finds a match, and in doing so it closes its read end of the pipe. (stackoverflow.com)- The upstream command (the producer) may still be writing; when it writes to a pipe with no reader, it typically gets SIGPIPE and exits with status 128+13 = 141. (blog.philz.dev)
- With
pipefailenabled, bash sets the pipeline’s status to the rightmost non-zero exit in the pipeline (rather than justgrep’s status). (cl.cam.ac.uk)
So you can get a failing pipeline (141) even thoughgrep -qsucceeded (0).
Typical symptom
set -o pipefail
producer | grep -q 'match' # pipeline may exit 141 due to producer SIGPIPECommon mitigations
- Disable
pipefailjust for that pipeline:
(set +o pipefail; producer | grep -q 'match')- Treat SIGPIPE from the producer as non-fatal (if that’s correct for your case):
producer | grep -q 'match'
rcs=("${PIPESTATUS[@]}")
if (( rcs[1] == 0 )) && (( rcs[0] == 0 || rcs[0] == 141 )); then
: matched
fiThis isn’t specific to grep; any downstream stage that intentionally stops early (like head, sed 1q, etc.) can trigger the same SIGPIPE + pipefail behavior. (stackoverflow.com)
Citations:
- 1: https://stackoverflow.com/questions/77843350/bash-surprising-behavior-of-fc-list-grep-q-with-pipefail?utm_source=openai
- 2: https://blog.philz.dev/blog/pipefail-fail/?utm_source=openai
- 3: https://www.cl.cam.ac.uk/teaching/2526/UnixTools/bash.pdf?utm_source=openai
- 4: https://stackoverflow.com/questions/32684119/exit-when-one-process-in-pipe-fails?utm_source=openai
Replace echo "$outputN" | grep -q with here-string syntax to avoid SIGPIPE failures under pipefail.
The script runs with set -o pipefail. When output1, output2, or output3 contain large logs, grep -q exits immediately after finding a match and closes the pipe. The upstream echo then receives SIGPIPE and fails. With pipefail, this causes the pipeline to exit non-zero even though the pattern was found, triggering false test failures.
Applies to lines 125–129, 175–197, and 226–248.
Fix
Replace all instances of:
- if echo "$outputN" | grep -q "pattern"; then
+ if grep -q "pattern" <<<"$outputN"; then📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if echo "$output1" | grep -q "Sandbox '${SANDBOX_A}' created"; then | |
| pass "Sandbox '$SANDBOX_A' created (step 3 completed)" | |
| else | |
| fail "Sandbox creation not confirmed in output" | |
| fi | |
| if grep -q "Sandbox '${SANDBOX_A}' created" <<<"$output1"; then | |
| pass "Sandbox '$SANDBOX_A' created (step 3 completed)" | |
| else | |
| fail "Sandbox creation not confirmed in output" | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/test-double-onboard.sh` around lines 125 - 129, The pipeline using
echo "$outputN" | grep -q should be replaced with here-strings to avoid SIGPIPE
when running with set -o pipefail: for each check that currently uses echo
"$output1" | grep -q "..." (and the similar occurrences for output2 and output3
around the other ranges), change to using grep -q with a here-string (grep -q
"pattern" <<< "$output1") so grep reads directly from the here-string; update
the three blocks that test for "Sandbox '${SANDBOX_A}' created" and the other
two sandbox/step checks (the ones referencing output1/output2/output3 and
variables like SANDBOX_A) to use this here-string form and keep the existing
pass/fail calls unchanged.
| if echo "$list_output" | grep -Fq -- "$SANDBOX_NAME"; then | ||
| pass "nemoclaw list contains '${SANDBOX_NAME}'" | ||
| else | ||
| fail "nemoclaw list does not contain '${SANDBOX_NAME}'" | ||
| fi |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
blob="$( { printf 'needle\n'; seq 1 200000; } )"
if printf '%s\n' "$blob" | grep -q 'needle'; then
echo "pipeline returned success"
else
echo "pipeline returned failure"
fiRepository: NVIDIA/NemoClaw
Length of output: 102
🏁 Script executed:
# First, let's examine the test file to see what's actually at those lines
head -n 350 test/e2e/test-full-e2e.sh | tail -n +200Repository: NVIDIA/NemoClaw
Length of output: 5892
🏁 Script executed:
# Also check if set -uo pipefail is actually set
grep -n "set.*pipefail" test/e2e/test-full-e2e.shRepository: NVIDIA/NemoClaw
Length of output: 78
🏁 Script executed:
# Let's verify the SIGPIPE issue with a corrected test that doesn't require seq
cat > /tmp/test_pipefail.sh << 'EOF'
#!/bin/bash
set -euo pipefail
# Create a blob without seq
blob="$( { printf 'needle\n'; printf 'line2\n'; printf 'line3\n'; } )"
echo "=== Test 1: echo | grep -q (potentially problematic) ==="
if printf '%s\n' "$blob" | grep -q 'needle'; then
echo "Test 1: pipeline returned success"
else
echo "Test 1: pipeline returned failure"
fi
echo "=== Test 2: here-string with grep -q (suggested fix) ==="
if grep -q 'needle' <<<"$blob"; then
echo "Test 2: here-string returned success"
else
echo "Test 2: here-string returned failure"
fi
EOF
bash /tmp/test_pipefail.shRepository: NVIDIA/NemoClaw
Length of output: 243
Use here-strings for these variable-based grep -q checks.
The outputs are already in shell variables, so echo "$policy_output" | grep -q ... introduces a pipefail/SIGPIPE risk where grep -q exits early after finding a match, causing a broken pipe to the echo process. With set -uo pipefail enabled at line 27, this can cause false assertion failures. Use here-strings instead: grep -q ... <<<"$policy_output" is deterministic and avoids this hazard.
Suggested fix
- if echo "$policy_output" | grep -qi "network_policies"; then
+ if grep -qi "network_policies" <<<"$policy_output"; thenThe same change applies to lines 207–211, 226–230, and 345–349.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/test-full-e2e.sh` around lines 207 - 211, Replace the pipeline-based
grep checks that read from shell variables (e.g., echo "$list_output" | grep -Fq
-- "$SANDBOX_NAME", echo "$policy_output" | grep -q ...) with here-strings to
avoid pipefail/SIGPIPE issues; locate the grep invocations that use echo of
variables such as list_output, policy_output (and similar checks around the
sandbox/name assertions) and change them to use grep -Fq -- "$SANDBOX_NAME"
<<<"$list_output" (and grep -q <<<"$policy_output" for the policy checks) so
grep reads directly from the variable without a pipe.
Summary
Adds --fix option to scripts/check-spdx-headers.sh (with exclusions for init.py), run the corresponding hook at higher priority with the new option, and align to lint-staged in package.json.
Fill missing SPDX headers on shell scripts, align shellcheck with default pre-commit rules (no relaxed args), and updates husky-env.sh, test-inference*.sh, and e2e scripts so strict shellcheck passes.
Changes
Type of Change
Testing
make checkpasses.npm testpasses.make docsbuilds without warnings. (for doc-only changes)Checklist
General
Code Changes
make formatapplied (TypeScript and Python).Doc Changes
update-docsagent skill to draft changes while complying with the style guide. For example, prompt your agent with "/update-docscatch up the docs for the new changes I made in this PR."Summary by CodeRabbit
New Features
Chores