diff --git a/SKILL.md b/SKILL.md index 4a0cfd4..7913819 100644 --- a/SKILL.md +++ b/SKILL.md @@ -1,255 +1,105 @@ --- name: marimo-pair description: >- - Work inside a running marimo notebook's kernel — execute code, create cells, - and build a notebook as an artifact. Use when the user wants to start a - marimo notebook or work in an active marimo session. + Execute code in a running marimo notebook via HTTP API. Use ONLY when the + user explicitly asks to work with a marimo notebook or marimo session. allowed-tools: Bash(bash **/scripts/discover-servers.sh *), Bash(bash **/scripts/execute-code.sh *), Read --- # marimo Pair Programming Protocol -This skill gives you full access to a running marimo notebook. You can read -cell code, create and edit cells, install packages, run cells, and inspect -the reactive graph — all programmatically. The user sees results live in their -browser while you work through bundled scripts or MCP. - -## Philosophy - -marimo notebooks are a dataflow graph — cells are the fundamental unit of -computation, connected by the variables they define and reference. When a cell -runs, marimo automatically re-executes downstream cells. You have full access -to the running notebook. - -- **Cells are your main lever.** Use them to break up work and choose how and - when to bring the human into the loop. Not every cell needs rich output — - sometimes the object itself is enough, sometimes a summary is better. - Match the presentation to the intent. -- **Understand intent first.** When clear, act. When ambiguous, clarify. -- **Follow existing signal.** Check imports, `pyproject.toml`, existing cells, - and `dir(ctx)` before reaching for external tools. -- **Stay focused.** Build first, polish later — cell names, layout, and styling - can wait. - -## Prerequisites - -### How to invoke marimo - -Only servers started with `--no-token` register in the local server registry -and are auto-discoverable — starting without a token makes discovery easier. -If a server has a token, set the `MARIMO_TOKEN` environment variable before -calling the execute script (avoids leaking the token in process listings). The -right way to invoke marimo depends on context (project -tooling, global install, sandbox mode). See -[finding-marimo.md](reference/finding-marimo.md) for the full decision tree. - -**Do NOT use `--headless` unless the user asks for it.** Omitting it lets -marimo auto-open the browser, which is the expected pairing experience. If the -user explicitly requests headless, offer to open `http://localhost:` -in their browser (`open` on macOS, `xdg-open` on Linux, `start` on Windows). - -## Troubleshooting - -### `SyntaxError` or `ImportError` from `execute-code.sh` - -Code runs **inside the running marimo kernel** — `execute-code.sh` POSTs it -over HTTP and never invokes a local Python. So errors here are not caused by -the local Python version, missing venv, or `uv` vs `pip` — they're problems -with the code being sent. Fix the code (use a heredoc for anything -multiline; don't try to one-line compound statements with `;`). - -### User keeps getting prompted to allow Bash commands - -The skill declares `allowed-tools` in its frontmatter, but Claude Code may -still prompt for each Bash call. To fix this, the user should add the absolute -paths to the scripts to their `.claude/settings.json` (project-level) or -`~/.claude/settings.json` (global): - -```json -{ - "permissions": { - "allow": [ - "Bash(bash /absolute/path/to/skills/marimo-pair/scripts/discover-servers.sh *)", - "Bash(bash /absolute/path/to/skills/marimo-pair/scripts/execute-code.sh *)" - ] - } -} -``` - -## How to Discover Servers and Execute Code - -Two operations: **discover servers** and **execute code**. - -| Operation | Script | MCP | -|-----------|--------|-----| -| Discover servers | `bash scripts/discover-servers.sh` | `list_sessions()` tool | -| Execute code | `bash scripts/execute-code.sh -c "code"` | `execute_code(code=..., session_id=...)` tool | -| Execute code (multiline) | `bash scripts/execute-code.sh <<'EOF'` | same | -| Execute code (by URL) | `bash scripts/execute-code.sh --url http://localhost:2718 -c "code"` | same (with `url` param) | - -Scripts auto-discover sessions from the local server registry. Use -`--port` to target a specific server when multiple are running, -`--session` to target a specific session when multiple notebooks are -open on the same server, or `--url` to skip discovery and connect to a -server by URL (e.g. `--url http://localhost:2718`). **On Windows, prefer -direct `--url` when registry discovery is empty** — see the next section -for why. Set the `MARIMO_TOKEN` env var to authenticate when the server -has token auth enabled (`--token` flag also works but exposes the token -in process listings). If the server was started with `--mcp`, you'll -have MCP tools available as an alternative. - -### Discovery finds nothing but the user has a server running? - -Only `--no-token` servers are in the registry. If discovery comes up empty, -the server likely has token auth — ask the user for the token and set it as -the `MARIMO_TOKEN` environment variable. - -On **Windows (Git Bash / MSYS2)**, discovery can also come up empty even for -a running `--no-token` server. If the user confirms marimo is reachable -locally, fall back to `--url http://127.0.0.1:` (ask for the port). - -### No servers running? +Pair-program inside a running marimo notebook. You execute code via bundled +scripts that talk to marimo's HTTP API — no marimo install needed on your side. -**Always discover before starting.** Background task "completed" notifications -do not mean the server died — check the output or run discover first. +## Discover and Execute -If no servers are found, read the user's intent — if they want a notebook, -start one. **Always start marimo as a background task** (using -`run_in_background` on the Bash tool) so the server automatically gets cleaned -up when the session ends and doesn't block the conversation. See -[finding-marimo.md](reference/finding-marimo.md). - -If there's no `.py` file yet, pick a descriptive filename based on context -(e.g., `exploration.py`, `analysis.py`, `dashboard.py`). Don't ask — just -pick something reasonable. +```bash +# find running servers +bash scripts/discover-servers.sh -**Avoid shell escaping issues.** `-c` works for simple one-liners, but for -multiline code or code with quotes/backticks/`${}`, use a heredoc or a file: +# execute code (one-liner) +bash scripts/execute-code.sh -c "1 + 1" -```bash -# heredoc (single-quoted delimiter prevents shell interpolation) +# execute code (multiline — use heredoc, NOT -c with semicolons) bash scripts/execute-code.sh <<'EOF' import marimo._code_mode as cm - async with cm.get_context() as ctx: ctx.create_cell("x = 1") EOF -# file -bash scripts/execute-code.sh /tmp/code.py - -# target a specific port (skips auto-selection when multiple servers run) -bash scripts/execute-code.sh --port 2718 -c "1 + 1" +# target specific server +bash scripts/execute-code.sh --port 2718 -c "print('hello')" +bash scripts/execute-code.sh --url http://localhost:2718 -c "print('hello')" ``` -## Executing Code +Use `--session ID` to target a specific notebook when multiple are open +on the same server. -Every execute-code call runs inside the notebook's kernel. All cell variables -are in scope — `print(df.head())` just works. Nothing you define persists -between calls (variables, imports, side-effects all reset), but you can freely -introspect the notebook: inspect variables, test code snippets, check types -and shapes. Use this to explore, prototype, and validate before committing -anything to the notebook — then create cells to persist state and make results -visible to the user. +Auth: set `MARIMO_TOKEN` env var if the server has token auth. +Only `--no-token` servers are auto-discoverable in the registry. -To mutate the notebook's dataflow graph — create, edit, and delete cells, -install packages, and run cells — use `marimo._code_mode`: +## Starting marimo -```python -import marimo._code_mode as cm +**Always discover before starting.** If no server is running, start one +as a **background task** (use `run_in_background` on the Bash tool): -async with cm.get_context() as ctx: - cid = ctx.create_cell("x = 1") - ctx.packages.add("pandas") - ctx.run_cell(cid) +```bash +# inside a uv project with marimo in deps +uv run marimo edit notebook.py --no-token +# outside a project +uvx marimo@latest edit notebook.py --no-token --sandbox ``` -You **must** use `async with` — without it, operations silently do nothing. -All `ctx.*` methods are **synchronous** — they queue operations and the -context manager flushes them on exit. Do **not** `await` them. - -The kernel supports top-level `await`, so use `async with` directly. Do -**not** wrap calls in `async def main(): ...` + `asyncio.run(main())` — it's -unnecessary and easy to get wrong (compound statements like `async with` -can't follow `def name():` on the same line, so cramming it into a `-c` -one-liner produces a `SyntaxError`). - -**Cells are not auto-executed.** `create_cell` and `edit_cell` are structural -changes only — use `run_cell` to queue execution. - -`code_mode` is a tested, safe API for notebook mutations — prefer it for all -structural changes. You also have access to marimo internals from the kernel, -but treat that as a last resort and only with high confidence after exploration. +Do NOT use `--headless` unless the user asks. -**Edit cells through `code_mode`, never the file system. Direct file writes -are silently lost.** It is tempting to reach for `Edit`/`Write` for a small -tweak since `edit_cell` requires the full new cell body. Don't — without -`--watch` (off by default) the kernel never sees those edits and overwrites -them on its next save, so the user sees nothing. (`Read` on the `.py` is -okay, but content may lag the live kernel; prefer `ctx.cells[target].code`.) - -**UI state lives outside the reactive graph.** Anywidget traitlets can be read -or set directly (e.g., `slider.value = 5`). For `mo.ui.*` elements, use -`ctx.set_ui_value(element, new_value)` inside `code_mode`. +## Executing Code -### First Step: Explore the API +Code runs in the notebook kernel. Variables from executed cells are in scope +(cells that haven't been run yet in this session are not available). Nothing +persists between calls (variables, imports reset), but you can inspect state. -The `code_mode` API can change between marimo versions. Explore it at the -start of each session — dig deeper into anything you're unsure about. +To mutate the notebook (create/edit/delete cells, install packages): ```python import marimo._code_mode as cm -help(cm) +async with cm.get_context() as ctx: + cid = ctx.create_cell("x = 1") + ctx.packages.add("pandas") + ctx.run_cell(cid) + ctx.edit_cell(cid, code="x = 2") ``` -## Guard Rails +- **`async with` is required** — without it, operations silently do nothing. + Use it directly (kernel supports top-level await). Do NOT wrap in + `async def main()` + `asyncio.run()`. +- `ctx.*` methods are synchronous — they queue; the context manager flushes. + Do NOT `await` them. +- `create_cell`/`edit_cell` are structural — use `run_cell` to execute. +- Explore the API with `help(cm)` at the start of each session. -Skip these and the UI breaks: +## Critical Rules +- **NEVER `Edit`/`Write` the `.py` file while a session is running.** Direct + writes are silently destroyed. Use `ctx.edit_cell()` for all changes. + (`Read` is okay but may lag — prefer `ctx.cells[target].code`.) - **Install packages via `ctx.packages.add()`, not `uv add` or `pip`.** - The code API handles kernel restarts and dependency resolution correctly. - Only fall back to external CLIs if the API is unavailable or fails. -- **Custom widget = anywidget.** For bespoke visual components, use anywidget - with HTML/CSS/JS. Composed `mo.ui` is fine for simple forms and controls. - See [rich-representations.md](reference/rich-representations.md). -- **NEVER `Edit`, `Write`, or `NotebookEdit` the notebook `.py` file while a - session is running. Direct writes are silently destroyed and never reach the - user.** marimo only watches the file with `--watch`, which is off by - default. Without it, the kernel doesn't pick up file edits — and on its - next save, the kernel writes its own state and clobbers yours. The user sees - no change, you think the work landed, and the bug is invisible. Always use - `ctx.edit_cell(target, code=...)` with the full new cell body — even for a - one-character change. (`Read` is allowed, but disk content may lag the live - kernel; for the current truth prefer `ctx.cells[target].code`.) -- **No temp-file deps in cells.** `pathlib.Path("/tmp/...")` in cell code is a bug. -- **Avoid empty cells.** Prefer `edit_cell` into existing empty cells rather - than creating new ones. Clean up any cells that end up empty after edits. -- **Don't worry about cell names.** Most cells don't need explicit names — - see [notebook-improvements.md](reference/notebook-improvements.md#cell-names). - -## Widgets and Reactivity - -Anywidget state (traitlets) lives outside marimo's reactive graph. To hook a -widget trait into the graph, pick one strategy per widget — never mix them: - -- **`mo.state` + `.observe()`** — you pick specific traits to bridge. Default choice. -- **`mo.ui.anywidget()`** — wraps all synced traits into one reactive `.value`. Convenient but coarser. - -Read [rich-representations.md](reference/rich-representations.md) before wiring either. - -## Keep in Mind - -- **The user is editing too.** The notebook can change between your calls — - re-inspect notebook state if it's been a while since you last looked. +- **No temp-file deps in cells** (`/tmp/...` paths break on restart). +- **Variables with `_` prefix are cell-private** (can't reference from other cells). +- **Duplicate public imports across cells** cause `Multiply-defined names` errors. - **Deletions are destructive.** Deleting a cell removes its variables from - kernel memory — restoring means recreating the cell and re-running it and - its dependents. If intent seems ambiguous, ask first. -- **Installing packages changes the project.** `ctx.packages.add()` adds - real dependencies — confirm when it's not obvious from context. + kernel memory. If intent is ambiguous, ask first. +- **Installing packages changes the project** — confirm when not obvious. +- **The user is editing too** — re-inspect notebook state if it's been a while. + +## Widgets + +For `mo.ui.*` elements, use `ctx.set_ui_value(element, new_value)` in code_mode. +For anywidgets, set traitlets directly: `widget.value = 5`. -## References +## Reference docs (read on demand) -- [finding-marimo.md](reference/finding-marimo.md) — how to find and invoke the right marimo -- [gotchas.md](reference/gotchas.md) — cached module proxies and other traps -- [rich-representations.md](reference/rich-representations.md) — custom widgets and visualizations -- [notebook-improvements.md](reference/notebook-improvements.md) — improving existing notebooks +Detailed guides are in `reference/` — read them when you need specifics: +- `reference/finding-marimo.md` — invocation decision tree (uv, pixi, global, sandbox) +- `reference/gotchas.md` — cached module proxies, polars+pyarrow workaround +- `reference/rich-representations.md` — anywidget, `_display_()`, reactive widgets +- `reference/notebook-improvements.md` — setup cells, `mo.persistent_cache`