test(e2e): migrate test-skill-agent-e2e.sh to vitest#5222
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds gated live Vitest scenarios (skill-agent, gateway-guard-recovery), extends Sandbox/Gateway fixture clients with in-sandbox probe/disruption helpers, adds unit tests and ScriptedRunner, updates workflow to accept a jobs selector with new free-standing jobs, and tightens workflow-boundary validation and artifact rules. ChangesE2E Tests & Fixtures
Workflows, Validators & CI
Sequence Diagram(s)sequenceDiagram
participant GitHubActions
participant Job as skill-agent-vitest
participant Docker
participant Nemoclaw as NemoclawOnboard
participant Sandbox
participant SkillInjector
participant Agent
GitHubActions->>Job: triggered via inputs.jobs selector
Job->>Docker: attempt docker login (up to 3 tries) / anonymous fallback
Job->>Nemoclaw: run onboarding with injected env
Nemoclaw->>Sandbox: create/recreate sandbox
Job->>SkillInjector: run add-skill helper to inject fixture
SkillInjector->>Sandbox: place SKILL.md fixture
Job->>Agent: run verification script (retry loop)
Agent->>Sandbox: read SKILL.md and verify
Agent-->>Job: exit code + delimited agent output
Job->>Job: evaluate output, detect flakes, write scenario-result.json
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
Vitest E2E Scenario RecommendationRequired Vitest E2E scenarios: Dispatch required Vitest E2E scenarios:
Full Vitest E2E advisor summaryVitest E2E Scenario AdvisorBase: Required Vitest E2E scenarios
Optional Vitest E2E scenarios
Relevant changed files
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/e2e-scenario/live/skill-agent.test.ts (1)
74-85: ⚡ Quick winConsider refactoring the inline shell script for readability.
Line 78 contains a 500+ character inline shell script that is difficult to read and maintain. Consider extracting it to a multi-line template literal or a separate helper function.
♻️ Suggested refactor
async function verifySkillFixturePresent( sandbox: SandboxClient, sandboxName: string, ): Promise<boolean> { - const script = `token=${shellQuote(VERIFY_PHRASE)}; skill=${shellQuote(SKILL_ID)}; found=0; for path in "/sandbox/.openclaw/skills/${SKILL_ID}/SKILL.md" "\${HOME:-/home/sandbox}/.openclaw/skills/${SKILL_ID}/SKILL.md" "/home/sandbox/.openclaw/skills/${SKILL_ID}/SKILL.md" "/home/openclaw/.openclaw/skills/${SKILL_ID}/SKILL.md"; do if [ -f "$path" ] && grep -Fq "$token" "$path"; then echo "SKILL_TOKEN_PATH=$path"; found=1; fi; done; test "$found" = 1`; + const script = ` + token=${shellQuote(VERIFY_PHRASE)} + skill=${shellQuote(SKILL_ID)} + found=0 + for path in \\ + "/sandbox/.openclaw/skills/${SKILL_ID}/SKILL.md" \\ + "\${HOME:-/home/sandbox}/.openclaw/skills/${SKILL_ID}/SKILL.md" \\ + "/home/sandbox/.openclaw/skills/${SKILL_ID}/SKILL.md" \\ + "/home/openclaw/.openclaw/skills/${SKILL_ID}/SKILL.md" + do + if [ -f "$path" ] && grep -Fq "$token" "$path"; then + echo "SKILL_TOKEN_PATH=$path" + found=1 + fi + done + test "$found" = 1 + `.trim(); const result = await sandbox.execShell(sandboxName, trustedSandboxShellScript(script), {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-scenario/live/skill-agent.test.ts` around lines 74 - 85, The inline one-liner shell script assigned to the variable script in verifySkillFixturePresent is hard to read and maintain; refactor by extracting the shell logic into a clearly formatted multi-line template literal or a small helper function (e.g., buildVerifySkillScript or verifySkillShell) and then pass the returned script into trustedSandboxShellScript when calling sandbox.execShell; keep the same token/tokenization (VERIFY_PHRASE, SKILL_ID) and preserve env/timeout/artifactName behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/e2e-scenario/live/skill-agent.test.ts`:
- Around line 74-85: The inline one-liner shell script assigned to the variable
script in verifySkillFixturePresent is hard to read and maintain; refactor by
extracting the shell logic into a clearly formatted multi-line template literal
or a small helper function (e.g., buildVerifySkillScript or verifySkillShell)
and then pass the returned script into trustedSandboxShellScript when calling
sandbox.execShell; keep the same token/tokenization (VERIFY_PHRASE, SKILL_ID)
and preserve env/timeout/artifactName behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 1426949d-9f5d-4057-a085-174ac87fe5ca
📒 Files selected for processing (1)
test/e2e-scenario/live/skill-agent.test.ts
PR Review AdvisorFindings: 0 needs attention, 1 worth checking, 0 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Consider writing more tests for
This is an automated advisory review. A human maintainer must make the final merge decision. |
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27354513166
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27355031920
|
Vitest E2E Scenario Results — ❌ Some jobs failedRun: 27364843413
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27365416785
|
Vitest E2E Scenario Results — ❌ Some jobs failedRun: 27365712997
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27368166084
|
Vitest E2E Scenario Results — ❌ Some jobs failedRun: 27368570241
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27369508373
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27370270382
|
Vitest E2E Scenario Results — ✅ All jobs passedRun: 27371575011
|
Vitest E2E Scenario Results — ❌ Some jobs failedRun: 27371918493
|
…l-agent-e2e # Conflicts: # .github/workflows/e2e-vitest-scenarios.yaml # test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts # tools/e2e-scenarios/workflow-boundary.mts
Summary
Migrate
test/e2e/test-skill-agent-e2e.shwith the simplest equivalent Vitest coverage.Related Issues
Refs #5098
Refs #2644
Contract mapping
test/e2e-scenario/live/skill-agent.test.tschecksdocker infoandNVIDIA_API_KEYshape.skill-agent.test.tsrunsnode bin/nemoclaw.js onboard --fresh --non-interactive --yes --yes-i-accept-third-party-softwarewithNEMOCLAW_SANDBOX_NAME=e2e-skill-agent.install.sh, shell profile/NVM/local-bin sourcing, installednemoclawlookup); this conversion preserves the skill-agent/onboard boundary and leaves installer fidelity to existing installer coverage / separate migrations.skill-smoke-fixtureis injected and queryable in sandbox skill roots.skill-agent.test.tsinvokes existingadd-sandbox-skill.shand verifiesSKILL_SMOKE_VERIFY_K9X2viaopenshell sandbox exec.SKILL.mdand returnsSKILL_SMOKE_VERIFY_K9X2, with retry/fuzzy token matching and external provider/transport timeout handling.skill-agent.test.tsinvokes existingverify-sandbox-skill-via-agent.sh, retries, fuzzy-checks the delimited agent output section, preserves helper fail-closed markers before fuzzy token success, and only skips timeout/rate-limit external flakes after fixture presence is proven; OpenClaw tool/runtime errors fail closed.Simplicity check
.github/workflows/nightly-e2e.yamljobskill-agent-e2e,ubuntu-latest, Docker/OpenShell,NVIDIA_API_KEY, 30 minute timeout..github/workflows/e2e-vitest-scenarios.yamljobskill-agent-vitest, sameubuntu-latest+ Docker/OpenShell +NVIDIA_API_KEYrunner class, 30 minute timeout.skill-agent-vitestjob, validates it in the workflow boundary test, and adds allowlisted free-standingjobs=dispatch support ine2e-vitest-scenarios.yaml.gh workflow run e2e-vitest-scenarios.yaml --repo NVIDIA/NemoClaw --ref e2e-migrate/test-skill-agent-e2e -f jobs=skill-agent-vitest -f pr_number=5222.Verification
npm run build:clinpx tsc --noEmit --allowImportingTsExtensions --module NodeNext --moduleResolution NodeNext --target ES2022 --types vitest,node test/e2e-scenario/live/skill-agent.test.ts test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts tools/e2e-scenarios/workflow-boundary.mtsNEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/skill-agent.test.ts --silent=false --reporter=default(local: classifier tests pass; live test skips withoutNVIDIA_API_KEY)npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts --silent=false --reporter=defaultnpx @biomejs/biome check test/e2e-scenario/live/skill-agent.test.ts test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.tsgit diff --checkskill-agent-vitestpassed.skill-agent-vitestpassed.Summary by CodeRabbit
New Features
Tests
Chores