
[WIP] Reduce SKILL.md token consumption #46

Draft
atemate wants to merge 2 commits into marimo-team:main from atemate:reduce-token-consumption
atemate commented on May 4, 2026

This pull request was authored by a coding agent.

Summary

SKILL.md is injected into every agent conversation's system prompt. At 12KB it's the single largest contributor to per-conversation token cost. The 4 linked reference docs add another 17KB when the agent eagerly Reads them (which it does often, given the broad trigger description).

This PR reduces the skill body from 12KB to 4KB (a 67% reduction) while preserving all critical rules, safety warnings, and on-demand access to detailed guides.
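
The reduction figure follows directly from the two sizes quoted above. As a rough sketch, the arithmetic can be checked in a shell. The function names are hypothetical, and the 4-bytes-per-token ratio is an assumption for illustration only (a common rule of thumb, not a number measured in this PR); real tokenizer counts will differ.

```shell
#!/usr/bin/env bash

# Percentage reduction between two sizes (same unit), rounded to a whole number.
reduction_pct() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}

# Rough token estimate from a byte count.
# ASSUMPTION: about 4 bytes per token; not a figure from this PR.
approx_tokens() {
  echo $(( $1 / 4 ))
}

reduction_pct 12 4               # SKILL.md body, 12KB -> 4KB: prints 67
approx_tokens $(( 8 * 1024 ))    # 8KB saved per conversation: prints 2048
```

Under that assumption the saving is on the order of two thousand tokens per conversation, before counting any reference docs that are no longer read eagerly.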

What changed

  • Tighten trigger description — from "Work inside a running marimo notebook" (fires broadly) to "Use ONLY when the user explicitly asks to work with a marimo notebook" (fewer false activations)
  • Collapse verbose sections to essential commands and rules
  • Inline key gotchas (private vars, duplicate imports, destructive deletions) instead of linking to reference docs
  • Reference docs listed as plain-text paths — agent knows they exist and can read them on demand, but is no longer prompted to eagerly load 17KB of anywidget tutorials, finding-marimo decision trees, etc.
  • Cut content that doesn't affect agent behavior — philosophy section, Windows-specific notes, troubleshooting for one-time setup issues, MCP table (redundant with script --help)
  • Fix inaccuracy: clarified that cell variables are only in scope after cells have been executed, not just defined (found during sub-agent testing)

What's preserved

All critical rules and behavioral guidance:

  • Never edit .py files directly — use ctx.edit_cell()
  • async with is required — without it, operations silently do nothing
  • Use ctx.packages.add() instead of pip/uv add
  • Heredoc for multiline code (not -c with semicolons)
  • create_cell/edit_cell are structural — use run_cell to execute
  • Private variable scoping (_ prefix), duplicate import errors
  • Deletions are destructive — ask first if ambiguous
  • Installing packages changes the project — confirm when not obvious
  • The user is editing too — re-inspect state
  • Always discover before starting; start as background task
  • --session ID for multi-notebook servers
  • Reference docs in reference/ for widgets, gotchas, invocation, improvements
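
The heredoc rule in the list above has a concrete reason: Python compound statements such as `for`, `with`, and `async with` cannot follow a semicolon on a single line, so multiline logic passed through `-c` breaks easily. The sketch below uses plain `python3` as a stand-in for the skill's execute script, an assumption made so the example runs without a marimo server. How `scripts/execute-code.sh` itself reads a heredoc is not shown in this PR.

```shell
#!/usr/bin/env bash
# Stand-in runner: plain python3 instead of scripts/execute-code.sh (assumption).

# One-liner with semicolons: a compound statement after ';' is a SyntaxError.
if python3 -c "total = 0; for i in range(3): total += i; print(total)" 2>/dev/null; then
  echo "one-liner: ok"
else
  echo "one-liner: failed"
fi

# Heredoc: the same logic written as ordinary multiline Python.
# Quoting 'EOF' stops the shell from expanding $variables or backticks
# inside the Python source.
run_heredoc() {
  python3 - <<'EOF'
total = 0
for i in range(3):
    total += i
print(total)
EOF
}

run_heredoc
```

The one-liner fails to parse while the heredoc version prints the sum, which is why the rule is worth keeping in the trimmed skill body.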

What's removed

| Section | Size | Why safe to cut |
| --- | --- | --- |
| Philosophy | ~500B | Abstract guidance; doesn't change agent actions |
| Troubleshooting (SyntaxError, permissions) | ~1KB | One-time setup issues, not needed every conversation |
| MCP/script comparison table | ~500B | Scripts have --help; agents don't use MCP via this skill |
| Windows-specific discovery notes | ~500B | Platform-specific edge case |
| Verbose discover/execute prose | ~2KB | Replaced with concise examples |
| finding-marimo.md inline content | ~500B | Collapsed to 2-line "Starting marimo" section; full guide still in reference/ |
| Widgets/Reactivity detail | ~1.5KB | Kept 2-line summary; full docs still in reference/rich-representations.md |
| Markdown links to reference docs | ~200B | Key change: replaced with plain-text paths so agent can read on demand without being nudged to eagerly load them |

Context

Related discussion in marimo-team/marimo#8177 (MCP agent friction points) — token overhead from the bash-based skill architecture is a compounding cost. This PR addresses the most controllable part: the static prompt size.

The reference docs (reference/*.md) are not deleted — they remain available for on-demand Read access. The change is that the agent is no longer nudged to read all of them on every session.
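
One way to confirm the link change is to count Markdown links that still point into `reference/`. This is a minimal sketch: it assumes the old links used the usual `[label](reference/<name>.md)` form, which this PR does not show, and the function name is hypothetical.

```shell
#!/usr/bin/env bash

# Count Markdown links in the given file that point at reference/*.md.
# ASSUMPTION: links look like [label](reference/name.md).
count_reference_links() {
  grep -oE '\]\(reference/[^)]+\.md\)' "$1" | wc -l | tr -d ' '
}

# Demo on a throwaway file: one Markdown link, one plain-text path.
demo_file=$(mktemp)
printf '%s\n' \
  'See [widgets](reference/rich-representations.md) for details.' \
  'Widgets: reference/rich-representations.md' > "$demo_file"
count_reference_links "$demo_file"   # prints 1: only the first line is a link
rm -f "$demo_file"
```

Run against the patched `SKILL.md`, a count of 0 would indicate that only plain-text paths remain.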

Sub-agent test results

Tested locally with a Claude sub-agent that was given ONLY the patched SKILL.md (no prior context). The agent was tasked with: discover servers → execute code → create a cell via code_mode → verify.

Results:

  • All 4 steps completed successfully
  • Agent correctly used async with cm.get_context(), ctx.create_cell(), ctx.run_cell()
  • Agent did NOT try to Edit/Write the .py file
  • Agent did NOT eagerly read any reference docs
  • Agent found a pre-existing doc inaccuracy ("All cell variables are in scope" is misleading — variables require cell execution first), which is fixed in the second commit

Test plan

Install the patched skill

# Option A: clone the branch
git clone -b reduce-token-consumption \
  https://github.com/atemate/marimo-pair.git \
  ~/.claude/skills/marimo-pair

# Option B: if already installed via /plugin, replace in-place
find ~/.claude -name "SKILL.md" -path "*marimo-pair*"
# then curl the patched version over the found path:
curl -sL https://raw.githubusercontent.com/atemate/marimo-pair/reduce-token-consumption/SKILL.md \
  -o <path>/SKILL.md

Functional checks

  • bash scripts/discover-servers.sh — finds running servers
  • bash scripts/execute-code.sh --port 2718 -c "1+1" — returns result
  • Multiline heredoc execution — works
  • code_mode API (create_cell, run_cell, edit_cell) — works
  • Agent does NOT edit .py file directly
  • Agent does NOT eagerly read reference docs
  • Reference docs (reference/*.md) still readable when explicitly needed
  • Compare conversation token usage against original SKILL.md (qualitative — should be noticeably lower)
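
The last check is qualitative. A byte count gives a crude quantitative proxy that can be run on both the original and the patched checkout. This is a minimal sketch, assuming it is run from the skill's root directory where `SKILL.md` and `reference/` live. The function name is hypothetical, and bytes are not tokens, so treat the numbers as a sanity check only.

```shell
#!/usr/bin/env bash

# Total size in bytes of all files passed as arguments.
total_bytes() {
  cat "$@" | wc -c | tr -d ' '
}

# Static prompt size: injected into every conversation.
if [ -f SKILL.md ]; then
  echo "SKILL.md: $(total_bytes SKILL.md) bytes"
fi

# Reference docs: only paid for when the agent reads them.
if [ -d reference ] && ls reference/*.md >/dev/null 2>&1; then
  echo "reference/*.md: $(total_bytes reference/*.md) bytes"
fi
```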

atemate marked this pull request as draft on May 4, 2026 11:29
atemate changed the title from "Reduce SKILL.md token consumption by 75%" to "[WIP] Reduce SKILL.md token consumption" on May 4, 2026
The full SKILL.md body is injected into every agent conversation's
system prompt. At 12KB it was the single largest contributor to
per-conversation token cost, and the linked reference docs added
another 17KB when eagerly read.

Changes:
- Tighten trigger description to reduce false activations
- Collapse verbose sections to essential commands and rules
- Inline key gotchas instead of linking to reference docs
- List reference docs as plain-text paths (on-demand, not eager)
- Cut platform-specific notes, philosophy, troubleshooting
- Preserve all critical rules and safety warnings
atemate force-pushed the reduce-token-consumption branch from 344b5ca to 1c706b9 on May 4, 2026 11:35
The sub-agent test revealed that "All cell variables are in scope"
is misleading — variables are only available from cells that have
been run in the current session, not just defined.