Agent Diagnostic
Loaded skill: openshell-cli (from .agents/skills/openshell-cli/SKILL.md)
Checked Quick Reference table — confirmed correct syntax is:
openshell sandbox create -- claude ← double-dash (run command)
openshell sandbox create --from openclaw ← --from (community image)
Following Workflow 1 (Getting Started) from the skill:
Step 1 — gateway start:
openshell gateway start --port 8082 --recreate
✓ Gateway ready at https://127.0.0.1:8082
Step 3 — sandbox create:
openshell sandbox create --from claude
(used --from instead of -- by mistake — this is the bug trigger)
Pod did not start. Ran openshell doctor exec per debug-openshell-cluster skill:
openshell doctor exec -- kubectl describe pod -n openshell
Output (attempt 1, no registry token):
Failed to pull image "ghcr.io/nvidia/openshell-community/sandboxes/claude:latest"
failed to authorize: unexpected status 403 Forbidden
Skill says --registry-token can be used for private community images.
Added GitHub PAT, recreated gateway. Re-ran sandbox create --from claude.
Output (attempt 2, with registry token):
Failed to pull image "ghcr.io/nvidia/openshell-community/sandboxes/claude:latest"
rpc error: code = NotFound
ghcr.io/nvidia/openshell-community/sandboxes/claude:latest: not found
Checked GHCR package list for nvidia/openshell-community — images present:
sandboxes/base, sandboxes/openclaw, sandboxes/openclaw-nvidia, sandboxes/ollama
sandboxes/claude does NOT exist.
Consulted skill Workflow 5 (BYOC) — --from resolves a name to
ghcr.io/nvidia/openshell-community/sandboxes/<name>:latest.
claude is not a community image. It is an agent binary inside sandboxes/base.
Correct command per Quick Reference table:
openshell sandbox create -- claude
✓ Sandbox created, claude launched successfully.
Root cause: --from accepts any string without validating against the known
community image catalog. Agent names (claude, opencode, codex) are not valid
--from values but the CLI does not reject them — it lets Kubernetes fail at
image pull time with an error that gives no hint the CLI syntax is wrong.
Description
openshell sandbox create --from claude is accepted by the CLI but fails at
runtime with ImagePullBackOff because sandboxes/claude:latest does not exist.
The error output (403 or "not found") gives no indication that the syntax is
wrong. The user is sent down an auth debugging path for what is a syntax error.
The --from flag accepts four kinds of values (per Workflow 5 in openshell-cli skill):
- community image name → openshell sandbox create --from openclaw
- local Dockerfile path → openshell sandbox create --from ./Dockerfile
- image reference → openshell sandbox create --from registry.io/img:v1
- directory → openshell sandbox create --from ./my-sandbox-dir
claude, opencode, and codex are not community image names — they are agent
commands run inside sandboxes/base using the double-dash syntax:
openshell sandbox create -- claude
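The four --from forms listed above, and the way a bare name turns into a GHCR reference, can be sketched roughly as follows. This is an illustration only, not the CLI's actual code: the function names, the FromKind enum, and the string heuristics are all hypothetical, and the real resolver may inspect the filesystem instead of the string shape. The image prefix is taken from the pull errors in this report.

```python
from enum import Enum

# Prefix observed in the pull errors in this report.
COMMUNITY_PREFIX = "ghcr.io/nvidia/openshell-community/sandboxes"


class FromKind(Enum):
    DOCKERFILE = "local Dockerfile path"
    DIRECTORY = "directory"
    IMAGE_REF = "image reference"
    COMMUNITY = "community image name"


def classify_from(value: str) -> FromKind:
    """Guess which kind of --from value was given (hypothetical heuristics)."""
    last = value.rstrip("/").rsplit("/", 1)[-1]
    if last == "Dockerfile" or last.startswith("Dockerfile."):
        return FromKind.DOCKERFILE
    if value.startswith(("./", "../", "/")):
        return FromKind.DIRECTORY
    if "/" in value or ":" in value:
        return FromKind.IMAGE_REF
    # Anything else is treated as a bare community image name.
    return FromKind.COMMUNITY


def community_image_ref(name: str) -> str:
    """Expand a bare name into the community image reference."""
    return f"{COMMUNITY_PREFIX}/{name}:latest"


if __name__ == "__main__":
    for v in ["openclaw", "./Dockerfile", "registry.io/img:v1", "./my-sandbox-dir", "claude"]:
        print(f"{v!r}: {classify_from(v).value}")
    # 'claude' falls through to the community branch, which is the bug trigger.
    print(community_image_ref("claude"))
```

Under these assumptions any bare word, including an agent name such as claude, falls through to the community branch and is expanded without being checked against anything.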
Expected: the CLI should reject --from claude at parse time, for example:
error: 'claude' is not a community sandbox image
hint: to run claude as an agent, use: openshell sandbox create -- claude
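A minimal sketch of the requested check, assuming the CLI can consult a list of known community images before creating the pod. The function name and both sets are hypothetical: the image names are the ones seen in the GHCR package list in the diagnostic above, and the agent names are the ones mentioned in this report. A real implementation would probably fetch the catalog rather than hard-code it.

```python
from typing import Optional

# Images seen in the GHCR package list for nvidia/openshell-community (see diagnostic).
KNOWN_COMMUNITY_IMAGES = {"base", "openclaw", "openclaw-nvidia", "ollama"}

# Agent commands named in this report; they run inside sandboxes/base.
AGENT_COMMANDS = {"claude", "opencode", "codex"}


def check_from_name(name: str) -> Optional[str]:
    """Return an error message for an invalid bare --from name, or None if it is valid."""
    if name in KNOWN_COMMUNITY_IMAGES:
        return None
    lines = [f"error: '{name}' is not a community sandbox image"]
    if name in AGENT_COMMANDS:
        # Point the user at the double-dash syntax instead of letting the pull fail.
        lines.append(
            f"hint: to run {name} as an agent, use: openshell sandbox create -- {name}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(check_from_name("claude"))
```

Only bare names would go through this check; Dockerfile paths, directories, and full image references would skip it.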
Reproduction Steps
- uv tool install -U openshell
- openshell gateway start
- openshell sandbox create --from claude
- Pod enters ErrImagePull / ImagePullBackOff
- openshell doctor exec -- kubectl describe pod -n openshell
→ Error says 403 or "not found", with no hint that the syntax is wrong
- openshell sandbox create -- claude ← this is the correct syntax
Environment
OS: macOS 14.4 (Apple Silicon)
Docker: Docker Desktop 28.1.1
OpenShell: v0.0.9
Logs
# Without registry token:
Failed to pull image "ghcr.io/nvidia/openshell-community/sandboxes/claude:latest":
failed to authorize: failed to fetch anonymous token: unexpected status 403 Forbidden
# With registry token:
rpc error: code = NotFound
ghcr.io/nvidia/openshell-community/sandboxes/claude:latest: not found
Agent-First Checklist
- I pointed my agent at the repo and had it investigate this issue
- I loaded relevant skills (e.g., debug-openshell-cluster, debug-inference, openshell-cli)
- My agent could not resolve this; the diagnostic above explains why