diff --git a/docs/project/research/current/research-2026-06-13-skill-distribution-landscape.md b/docs/project/research/current/research-2026-06-13-skill-distribution-landscape.md new file mode 100644 index 00000000..a72bc880 --- /dev/null +++ b/docs/project/research/current/research-2026-06-13-skill-distribution-landscape.md @@ -0,0 +1,470 @@ +# Research Brief: Agent-Skill Distribution Landscape and Popular-Skill Patterns + +**Last Updated**: 2026-06-13 + +**Status**: Complete + +**Related**: + +- [CLI as Agent Skill](./research-cli-as-agent-skill.md) (foundational research behind + the guideline) +- [Agent Skills Standard Paths](./research-agent-skills-standard-paths.md) +- [Skills vs Meta-Skill Architecture](./research-skills-vs-meta-skill-architecture.md) +- Guideline this informs: `tbd guidelines cli-agent-skill-patterns` + ([source](../../../../packages/tbd/docs/guidelines/cli-agent-skill-patterns.md)) +- Issue motivating this pass: [jlevy/tbd#173](https://github.com/jlevy/tbd/issues/173) + (five gaps surfaced by the `pprose` L2-ish CLI) + +* * * + +## Executive Summary + +`cli-agent-skill-patterns` is a strong, current guideline, but it was shaped mostly by +two reference points: `tbd` itself (a full L3 platform) and `qmd` (an L2 +self-installer). This brief widens the evidence base by auditing how a spread of +**actually popular and actually shipped** skills are distributed, then maps the +resulting decision framework. +It exists to make sure our research reflects mid-2026 practice and to ground the five +edits proposed in issue #173. + +Two headline findings. +First, **the most-installed skills in the ecosystem are simple L0/L1 knowledge skills**, +not platforms: the skills.sh all-time leaderboard is led by Vercel’s `find-skills` +(~2.0M installs), Anthropic’s `frontend-design` (~540K), and Microsoft’s Azure skills +(~5.8M across the set). +Distribution complexity does not correlate with adoption; the opposite, if anything. +Second, there is a **real, populated rung between the guideline’s L2 and L3** that we +currently under-name: a self-installing tool that writes a compact managed `AGENTS.md` +block (with format stamping and a forward-compat guard) but takes on none of the L3 +platform machinery (no hooks, no `prime`/`setup`, no DocCache). +`pprose` is a clean reference for it, and its install code already solves three problems +our guideline either ties to L3 or leaves unspecified. + +The practical output is a distribution decision matrix (which channel/rung fits which +tool), a set of current-practice corrections (native per-agent skill dirs are +proliferating; Cursor now scans `.agents/skills/` natively), and concrete backing for +the issue #173 edits. + +**Research Questions**: + +1. How are popular, real-world skills actually distributed, and does distribution + complexity track adoption? + +2. What distribution channels exist in mid-2026, and when is each the right choice? + +3. Is the L0–L3 ladder complete, or is there a missing rung between L2 and L3? + +4. Which install-time concerns (format stamping, dev-build version pinning, marker + collapse, redundant in-file tags) are universal vs rung-specific? + +5. What has changed in the ecosystem since the guideline’s last substantive update + (native per-agent skill directories, Cursor’s discovery behavior)? + +* * * + +## Research Methodology + +### Approach + +Source-level audit of cloned repositories (via the `checkout-third-party-repo` shortcut, +into `attic/`), cross-checked against the skills.sh leaderboard and curated catalogs. +Where the issue cited `pprose` source lines, each claim was verified against the working +code rather than the prose. +Ecosystem-wide claims (per-agent skill paths, Cursor discovery) were checked against +vendor docs and flagged where only community catalogs corroborate them. + +### Sources + +- **Cloned and read at source** (`attic/`): `Leonxlnx/taste-skill`, + `jlevy/practical-prose` (pprose), `tobi/qmd`, `anthropics/skills`, + `VoltAgent/awesome-agent-skills`, `heilcheng/awesome-agent-skills`. +- **Leaderboard / discovery**: [skills.sh](https://skills.sh) all-time install rankings. +- **Vendor docs**: Cursor Agent Skills docs (native `.agents/skills/` discovery), Codex + skills loader scopes (already cited in the guideline). +- **Internal**: the guideline, the foundational research brief, the `pprose` install + spec (`plan-2026-05-30-pprose-install-scopes-and-surfaces.md` in the pprose repo). + +* * * + +## Research Findings + +### 1. What is actually popular (and what shape is it?) + +**Status**: ✅ Complete + +The skills.sh all-time leaderboard (mid-2026) is dominated by knowledge/best-practice +skills published by vendors, not by self-installing platforms: + +| Rank | Skill | Publisher | Installs | Rung | +| --- | --- | --- | --- | --- | +| 1 | `find-skills` | vercel-labs/skills | ~2.0M | L0 (meta: finds other skills) | +| 2 | `frontend-design` | anthropics/skills | ~540K | L0 (prompt and scripts) | +| 3 | `vercel-react-best-practices` | vercel-labs | ~474K | L0/L1 | +| 4 | `agent-browser` | vercel-labs | ~447K | L1 (points at a tool) | +| 5 | `microsoft-foundry` | microsoft/azure-skills | ~390K | L1 (vendor SDK/CLI) | + +Microsoft’s Azure skill set alone reports ~5.8M installs. +The leaders are vendor knowledge skills (React, Azure, Remotion best-practices) and +design skills. + +**Assessment**: Adoption is driven by *usefulness and reach*, not by install machinery. +The highest-leverage distribution move is shipping a spec-compliant `SKILL.md` that +skills.sh and the scrapers can find. +This validates the guideline’s “stop as low as you can” stance and argues the simple +baseline (§0) should stay the loud default. +It also means our L3-heavy reference set (`tbd`) is the *exception*, and the guideline +should not read as if self-installing platforms are the norm. + +* * * + +### 2. The distribution channels, and when each is right + +**Status**: ✅ Complete + +Distinct, composable channels observed in the wild (a single public repo can satisfy +several at once): + +| Channel | Mechanism | Best when | Seen in | +| --- | --- | --- | --- | +| **Repo-root `skills//SKILL.md`** | `npx skills add owner/repo` (skills.sh) scans it; scrapers index it | Always—the universal, zero-infra baseline | taste-skill, anthropics/skills, qmd | +| **Claude/Codex plugin marketplace** | `.claude-plugin/marketplace.json` (plus `.agents/plugins/...`); `/plugin marketplace add` | You want a *bundle* (skills, MCP, and hooks) installable as a unit, or in-Claude discovery | anthropics/skills, qmd, taste-skill | +| **Native per-agent skill dir** | Commit `.claude/skills/`, `.cursor/skills/`, `.github/skills/`, … | Targeting one agent that reads only its own dir (Claude Code) | All (Claude mirror) | +| **Self-installing CLI (`tool install`)** | Tool writes its own skill into discovery dirs | The skill ships *with* a binary and should track its version | qmd (L2), pprose (L2b), tbd (L3) | +| **Managed `AGENTS.md` block** | Marker-bounded, format-stamped block | You want an always-on bootstrap pointer that every agent reads | pprose (L2b), tbd (L3) | +| **Agent-specific instruction file** | `.github/copilot-instructions.md`, `GEMINI.md` | A target agent auto-reads a bespoke file | taste-skill (Copilot) | +| **`llms.txt` manifest** | Plain catalog of skills and one-line descriptions | Cheap machine-readable index for crawlers/agents | taste-skill | +| **MCP server** | `npx`/`uvx` server in marketplace or agent config | No CLI fits, or you need OAuth/multi-tenant/remote | qmd (alongside CLI) | + +**Assessment**: The guideline covers most of these but frames the repo-root and +skills.sh path as one option among many. +The evidence says it is *the* default and everything else is additive. +Two channels are under-documented: agent-specific instruction files +(`copilot-instructions.md`) and `llms.txt`, both of which taste-skill uses to widen +reach at near-zero cost. + +* * * + +### 3. Case studies across the ladder + +**Status**: ✅ Complete + +- **taste-skill (L0, 13-skill collection)**—pure prompt skills, no binary. + Distributed through *five* channels simultaneously: `npx skills add`, a Claude plugin + marketplace (`.claude-plugin/marketplace.json` and `plugin.json`), + `.github/copilot-instructions.md`, a `skills/llms.txt` manifest, and manual copy. + **Versioning is by stable install-name, not format stamps**: the v2 rewrite keeps the + same `name: design-taste-frontend` (so re-running `npx skills add` overwrites in + place) and preserves v1 under a separate install-name `design-taste-frontend-v1` for + callers that pinned to its behavior. + No `format=fNN`, no `DO NOT EDIT`—none is needed, because nothing writes a managed + file it must later migrate. + The folder name and the frontmatter `name` deliberately differ (e.g. + `brutalist-skill/` → `name: industrial-brutalist-ui`); the install-name is the + contract. +- **anthropics/skills (L0 and bundled scripts)**—the official examples. + Repo-root `skills//SKILL.md`, some with `scripts/` (xlsx, + web-artifacts-builder). + Primary distribution is a **Claude plugin marketplace** that groups the 18 skills into + three plugins (`document-skills`, `example-skills`, `claude-api`). No self-installing + CLI, no AGENTS.md. This is the canonical “skill collection as a plugin bundle” shape. +- **qmd (L2)**—Rust binary and embedded skills. + Self-installs via `qmd skill install` (`--global`), writes discovery dirs only, + **never** touches `AGENTS.md`. Ships a `.claude-plugin/marketplace.json` that points + at `./skills/` *and* declares an MCP server. + Its `SKILL.md` carries **no `format`/`DO NOT EDIT` stamp**—consistent with the + guideline’s claim that an L2 discovery-dir installer is a clean overwrite that needs + no migration apparatus. + Offers dual project/global scope because it is a general-purpose utility with no + per-project config. +- **pprose (L2b—the missing rung)**—Python CLI that self-installs into + `.agents/skills/`, `.claude/skills/`, **and a marker-bounded `AGENTS.md` block**, with + `format=f01` stamps and a forward-compat guard—but **no hooks, no `prime`/`setup` + context machinery, no DocCache**. See §4. +- **tbd (L3)**—the full platform: managed `AGENTS.md` block, `SessionStart`/`PreCompact` + hooks, `prime`/`setup`, format-versioned migration, knowledge-injection meta-skill + over a path-ordered DocCache. + Project-pinned, single-scope by design. + +**Assessment**: Five real tools cleanly populate L0, L0+bundle, L2, L2b, and L3. The +ladder is a good model; its gap is that L2b is real and common enough to name. + +* * * + +### 4. The missing rung: L2b (self-install and managed `AGENTS.md`, no platform) + +**Status**: ✅ Complete (issue #173 gap 1) + +The guideline’s L2 stops at discovery dirs and “**no managed `AGENTS.md` block**”; L3 +adds the block *plus* hooks, `prime`/`setup`, format migration, and a meta-skill. +pprose sits squarely between: it writes a ~13-line marker-bounded `AGENTS.md` block +alongside its skills, but has none of the L3 platform. +Its own AGENTS.md, verified: + +``` + +…compact bootstrap: what pprose is, when to use it, the pinned uvx runner… + +``` + +A small always-on bootstrap block (the tool exists, here is its trigger, here is the +pinned fallback runner) is a worthwhile, low-cost lift that does not require the +platform. + +**Assessment**: Name the rung. +Two shapes are possible: split L2 into **L2a** (discovery-dirs only—`qmd`) and **L2b** +(discovery-dirs and managed `AGENTS.md` block—`pprose`), or add a named sentence +acknowledging the variant. +**Adopted (per owner): the single named sentence**—L2b is named as a variant within L2, +with no L2a/L2b renumber. +The key correction holds either way: **a managed `AGENTS.md` block is separable from the +rest of the L3 platform**—it is the *one* L3 surface a small tool can adopt on its own. + +* * * + +### 5. Format-versioning is artifact-driven, not rung-driven + +**Status**: ✅ Complete (issue #173 gap 2) + +The guideline currently scopes its `format=fNN` discipline to L3 ("The self-upgrade and +format-versioning rules in §6.6 apply **only to L3**"). The evidence contradicts the +sharpness of that line. +pprose, well short of L3, stamps `format=f01` on **both** its generated `SKILL.md` files +(`DO NOT EDIT: generated by pprose install (format=f01)`) and its `AGENTS.md` block, and +runs the **same forward-compat guard** tbd uses—refusing to clobber any artifact stamped +newer than it understands (`install.py` `_write_skill_file` returns `blocked-newer` when +`existing > _format_num()`; `_update_agents_md` does likewise on the begin-marker +stamp). + +The distinguishing factor is not the ladder rung; it is **whether the tool writes a +generated artifact it may need to upgrade or refuse-to-clobber later**: + +- taste-skill / qmd write skills they always overwrite whole and never migrate → **no + stamp needed**. +- pprose writes a managed `AGENTS.md` block (merged into user content, not overwritten + whole) and wants cross-version safety → **stamp needed**, despite being L2-ish. + +**Assessment**: Reframe §6.6’s format-versioning as applying to **any rung that writes a +generated artifact it must upgrade or guard** (most concretely: a managed `AGENTS.md` +block, or any committed generated skill shared across tool versions). +The *hooks/`prime`/`setup`/DocCache* machinery is what stays L3-specific; the format +stamp is universal hygiene the moment a managed/merged artifact exists. + +* * * + +### 6. Dev-build generator pinning: what version does a dev checkout bake? + +**Status**: ✅ Complete (issue #173 gap 3) + +§6.7 is thorough on *pinned zero-install runners* but assumes the generator can pin to +its own running version. +A developer running `mytool install` from an editable/dev checkout has a running version +like `0.1.1.dev49+abc1234` that **was never published**, so +`uvx mytool@0.1.1.dev49+abc1234` cannot resolve. +Baking that pin ships a skill that fails the moment a teammate runs it—a single +works-on-my-machine install poisons the clone. + +pprose solves this with two pieces, both verified in `install.py`: + +- A `DISCOVERY_VERSION` constant—“the last real PyPI release”—used as the fallback pin + when the running version is not publishable (`DISCOVERY_VERSION = "0.2.0"`, with a + release-time guard `devtools/check_release_version.py` that fails publish unless it + equals the release tag). +- `is_pypi_release(version_str)`—a PEP 440 release-form check + (`^\d+(?:\.\d+)*(?:\.post\d+)?$`) that rejects dev (`.devN`), pre-release + (`aN`/`bN`/`rcN`), and local (`+hash`) versions. + `pinned_version()` returns the installed version when it is a real release, else + `DISCOVERY_VERSION`. + +**Assessment**: Worth a paragraph in §6.7. The rule generalizes across ecosystems: +**bake the running version only if it is a real, resolvable published release; otherwise +fall back to a known-good published pin.** npm has the same trap (`0.0.0-dev.` / +`x.y.z-canary` from `npm version`/CI never resolves on a teammate’s machine). +This is the inverse of the §6.7 consumer-side rule—it is the *generator-side* +pin-selection rule. + +* * * + +### 7. The `surface=` in-file tag is redundant + +**Status**: ✅ Complete (issue #173 gap 4) + +The guideline’s §2 and §6.6 examples show +``, but the surrounding +prose (§6.6, §6.6.2) argues the artifact’s identity is clear from its **location**, “so +no in-file `surface=` tag is needed beyond the load-bearing `format=fNN`.” The examples +and the reasoning contradict each other. + +Field evidence resolves it. +In one real `AGENTS.md` (pprose’s repo, which hosts three tools’ blocks): + +``` + # no surface= + # has surface= + # has surface= +``` + +pprose deliberately **dropped** `surface=` (its install spec’s “One namespace for +surface” locked decision: the in-file marker carries only `format=fNN`; the artifact +type is identified by location, which also keeps the portable and Claude `SKILL.md` +copies byte-identical). +tbd (`setup.ts:167`) and flowmark still emit `surface=agents-md`. + +**Assessment**: Drop `surface=` from the guideline examples to match the prose and the +cleaner reference. It is not load-bearing for detection—tbd’s own detection anchors on +the `BEGIN TBD INTEGRATION` prefix and reads `format=fNN`; `surface=agents-md` is +decorative. **Implication for tbd itself**: tbd’s generator emits a tag the guideline +would now call unnecessary. +Dropping it from tbd is a safe, tiny follow-on (the block is always rewritten whole and +detection ignores the field), but it is a *generator* change, not a guideline change, +and can be tracked separately. + +* * * + +### 8. Multi-block collapse: stale blocks accumulate across marker renames + +**Status**: ✅ Complete (issue #173 gap 5) + +The guideline covers “`AGENTS.md` exists without your markers → append, preserve user +content” but is silent on **multiple stale blocks**—e.g. a tool that shipped +`BEGIN PPROSE BLOCK` in v0.0.x and switched to `BEGIN PPROSE INTEGRATION` in v0.1.x +leaves users with both until upgraded. +pprose’s `_update_agents_md` (`install.py` L401–433) handles it: find every match of the +begin/end regex, replace **all** of them with one current block at the position of the +**first** stale block, discard the rest, and preserve everything outside the markers. + +**Assessment**: Worth a sentence in §2 or §6.6: on install, collapse all matching +managed blocks to a single current block at the first match’s position. +Note the practical caveat the issue does not: collapse only works for blocks that still +share the stable begin-prefix; a *renamed* prefix (`BLOCK` → `INTEGRATION`) needs the +installer to match a small set of known legacy prefixes, not just the current one. +This is the same “keep the marker name stable / match on prefix” rule the guideline +already states for `format=`, extended to prefix renames. + +* * * + +### 9. Current-practice corrections (ecosystem moved since last update) + +**Status**: ⏳ Verify per-agent before asserting in the guideline + +- **Cursor now scans `.agents/skills/` natively.** Per Cursor’s mid-2026 Agent Skills + docs, Cursor auto-discovers skills from `.agents/skills/`, `.cursor/skills/`, + `~/.agents/skills/`, and `~/.cursor/skills/` (including nested monorepo dirs). + The guideline currently states Cursor reaches `.agents/skills/` “via the skills.sh + installer … not natively”—that is now **stale** for Cursor. + (Re-verify against current Cursor docs before editing, per the guideline’s own “verify + native scanning” rule.) +- **Native per-agent skill directories are proliferating.** Community catalogs + (VoltAgent) report native project/global skill dirs across agents: Claude + `.claude/skills/`, Codex `.agents/skills/`, Cursor `.cursor/skills/`, Copilot + `.github/skills/`, Gemini `.gemini/skills/`, OpenCode `.opencode/skills/`, Windsurf + `.windsurf/skills/`, plus Google’s new **Antigravity** (`.agent/skills/`). Treat as + directional (community-sourced) and verify per agent, but the trend is real: the + ecosystem is growing per-agent native dirs *in addition to* the portable + `.agents/skills/`, which raises the value of the `npx skills add` symlink-fan-out and + of a tool emitting only the portable and Claude surfaces and letting the installer + mirror the rest. + +**Assessment**: The “who reads `.agents/skills/` natively vs via the installer” table in +§6.6 needs a refresh, with Cursor moved to native (pending re-verification) and a note +that native per-agent dirs are multiplying. + +* * * + +## Comparative Analysis: The Distribution Decision Matrix + +| If your tool is… | Distribute via | Format stamp? | Scope | +| --- | --- | --- | --- | +| **Prompt-only (L0)** | repo-root `skills/` and `npx skills add` (plus a plugin marketplace for a bundle, and `copilot-instructions.md` or `llms.txt` for extra reach) | No | n/a | +| **Prompt-only collection needing versions** | same, **version by stable install-name**; keep old names alive for pinned callers (taste-skill) | No | n/a | +| **Skill that points at a CLI (L1)** | repo-root `skills/` whose body uses a **pinned** `npx`/`uvx pkg@ver` runner | No | n/a | +| **CLI that self-installs, discovery dirs only (L2)** | `tool install [--global]`, overwrite skills; optional plugin marketplace and MCP (qmd) | Optional (clean overwrite) | dual if general-purpose | +| **CLI that self-installs and managed `AGENTS.md` block (L2b)** | as L2 **plus** a marker-bounded, **format-stamped** `AGENTS.md` block with a forward-compat guard and multi-block collapse (pprose) | **Yes** (the block is merged, not overwritten) | usually project | +| **Full platform (L3)** | L2b plus hooks, `prime`/`setup`, the meta-skill and DocCache, and format migration (tbd) | **Yes** (shared format code across surfaces) | project-pinned, single-scope | +| **No CLI, or OAuth/multi-tenant/remote** | MCP server (optionally bundled in a plugin marketplace) | n/a | per server | + +**Strengths/weaknesses**: + +- **repo-root and skills.sh**: maximum reach, zero infra, no version control over the + consumer—the right default and what the leaderboard rewards. +- **plugin marketplace**: best versioning (SHA pin, install preview, bundles MCP and + hooks); costs a manifest and is not centrally indexed unless you also list on the + community marketplace. +- **self-installing CLI**: tracks the binary’s version and customizes per project; only + worth it once the skill genuinely ships with a tool. +- **managed `AGENTS.md` block**: an always-on bootstrap pointer for *every* agent; the + one L3 surface adoptable alone, but the moment you write it you owe format stamping + + collapse and a forward-compat guard. + +* * * + +## Best Practices (Distilled) + +1. **Default to the simple baseline.** A spec-compliant repo-root + `skills//SKILL.md` plus `npx skills add` is what the most-installed skills do. + Stop there unless a binary or cross-session state forces you up the ladder. +2. **Version L0 collections by stable install-name**, not format stamps; keep superseded + names alive for callers who pinned to old behavior (taste-skill). +3. **Stamp `format=fNN` the moment you write a managed/merged artifact** (an `AGENTS.md` + block, or a committed generated skill shared across tool versions)—independent of + ladder rung—and pair it with a forward-compat guard. +4. **Pin generator-baked versions to a publishable release.** Never bake a + dev/pre-release/ local version into a generated skill; fall back to a known-good + published pin and guard it at release time. +5. **Identify artifacts by location, not in-file tags.** Carry only `format=fNN` in + markers; skip `surface=`. Keep portable and Claude `SKILL.md` copies byte-identical. +6. **Collapse stale managed blocks on install** to one current block at the first match, + matching a small set of known legacy begin-prefixes, preserving user content outside + the markers. +7. **Widen reach cheaply** with additive channels (plugin marketplace bundle, + `copilot-instructions.md`, `llms.txt`) without changing the canonical skill. + +* * * + +## Recommendations + +### Summary + +Fold the eight findings above into `cli-agent-skill-patterns` as targeted edits (mapped +gap-by-gap in +[plan-2026-06-13-cli-skill-guideline-pprose-gaps.md](../../specs/active/plan-2026-06-13-cli-skill-guideline-pprose-gaps.md)), +keep the simple baseline loud, and refresh the stale Cursor/native-dir facts. +Track the tbd-generator `surface=` cleanup (finding 7) as a separate small bead. + +### Recommended Approach + +Surgical edits, not a rewrite: name L2b (gap 1), reframe format-versioning as +artifact-driven (gap 2), add the generator-side dev-build pin rule to §6.7 (gap 3), drop +`surface=` from examples (gap 4), add multi-block collapse to §2/§6.6 (gap 5), and +update the §6.6 native-scanning table (finding 9). Each tagged with its ladder rung so +§0 stays untouched. + +### Alternative Approaches + +The full L2a/L2b ladder split is the heavier alternative; it was considered and **not** +adopted. The chosen form is a single named sentence ("L2 may optionally add a managed +`AGENTS.md` block—the one L3 surface adoptable without the platform; `pprose` is the +reference"), which captures most of the value with no renumbering. + +* * * + +## References + +- Issue: https://github.com/jlevy/tbd/issues/173 +- pprose install source (verified): + `attic/practical-prose/tools/pprose/src/pprose/install.py` (`DISCOVERY_VERSION` L56, + `is_pypi_release` L96, `pinned_version` L101–112, `_update_agents_md` collapse + L401–433, forward-compat guards L391–393 / L409–411) +- pprose install spec: `plan-2026-05-30-pprose-install-scopes-and-surfaces.md` (pprose + repo)—“One namespace for surface” locked decision +- taste-skill: https://github.com/Leonxlnx/taste-skill (skills.sh, a plugin marketplace, + copilot-instructions, and llms.txt; install-name versioning) +- qmd: https://github.com/tobi/qmd (L2 reference) +- anthropics/skills: https://github.com/anthropics/skills (plugin-bundle reference) +- skills.sh leaderboard: https://skills.sh +- catalogs: https://github.com/VoltAgent/awesome-agent-skills, + https://github.com/heilcheng/awesome-agent-skills +- Cursor Agent Skills docs (native `.agents/skills/` discovery; mid-2026) + + diff --git a/docs/project/research/current/research-agent-skills-standard-paths.md b/docs/project/research/current/research-agent-skills-standard-paths.md index d9feec94..ce303310 100644 --- a/docs/project/research/current/research-agent-skills-standard-paths.md +++ b/docs/project/research/current/research-agent-skills-standard-paths.md @@ -134,6 +134,26 @@ No refreshed source changed the main recommendation: `.agents/skills/` should be portable project skill path, `.claude/skills/` should remain a Claude Code mirror, and `AGENTS.md` should remain a separate always-on instruction surface. +### 2026-06-13 Refresh Notes (native per-agent skill dirs) + +A mid-2026 distribution audit +([research-2026-06-13-skill-distribution-landscape.md](./research-2026-06-13-skill-distribution-landscape.md)) +found the ecosystem growing **native per-agent skill directories** alongside the +portable `.agents/skills/`, which shifts two facts the guideline currently states: + +- **Cursor now scans `.agents/skills/` natively** (Cursor Agent Skills docs, verified + 2026-06-13: native recursive discovery of `.agents/skills/` and `.cursor/skills/` plus + the `~/` user-level variants, including nested monorepo dirs, and compatibility + loading from `.claude/skills/` and `.codex/skills/`). The guideline’s older “Cursor + reaches `.agents/skills/` only via the skills.sh installer, not natively” is now + stale; it has been corrected to native (verified). +- **Per-agent native dirs are multiplying** (community catalogs, directional): Cursor + `.cursor/skills/`, Copilot `.github/skills/`, Gemini `.gemini/skills/`, OpenCode + `.opencode/skills/`, Windsurf `.windsurf/skills/`, Google Antigravity + `.agent/skills/`. This does not change the recommendation (portable `.agents/skills/` + \+ Claude mirror), but it raises the value of letting the `npx skills add` installer + fan out per-agent mirrors rather than a tool writing each one. + * * * ## Research Findings @@ -180,7 +200,7 @@ project-local path. Native paths should be mirrors or compatibility targets. | --- | --- | --- | --- | | Claude Code | `.claude/skills//SKILL.md` | `~/.claude/skills//SKILL.md` | Official docs still center `.claude/skills/`. | | Codex | `.agents/skills/` from CWD through repo root (`SkillScope::Repo`) | `~/.agents/skills/` (`User`); admin; plugin roots; `$CODEX_HOME/skills` | **Source-verified** (not just docs): `codex-rs/core-skills/src/loader.rs` `repo_agents_skill_roots()` reads a bare repo-root `.agents/skills/` directly — no manifest. See 2026-05-26 note. | -| Cursor | `.agents/skills/` per skills CLI support | `~/.cursor/skills/` | Changelog and best-practices docs support Agent Skills; path docs are thinner than Claude/Gemini. | +| Cursor | `.agents/skills/` and `.cursor/skills/` (native recursive scan) | `~/.agents/skills/` and `~/.cursor/skills/` | **Docs-verified** (Cursor Agent Skills docs, 2026-06-13): native scan, plus compatibility loading from `.claude/skills/` and `.codex/skills/`. | | Gemini CLI | `.gemini/skills/` or `.agents/skills/` alias | `~/.gemini/skills/` or `~/.agents/skills/` alias | Official Gemini docs explicitly mention `.agents/skills` alias. | | GitHub Copilot | `.agents/skills/` per skills CLI support | `~/.copilot/skills/` | Included in Vercel skills supported-agent table. | | Amp | `.agents/skills/` per skills CLI support | `~/.config/agents/skills/` | Included in Vercel skills supported-agent table. | @@ -192,16 +212,20 @@ Skills as capability packages. **Native scan vs. installer reach (be precise).** The `.agents/skills/` rows above mix two different mechanisms, and conflating them caused a downstream confusion (see -2026-05-26 note): - -- **Scans repo-root `.agents/skills/` natively**: Codex (verified at source) and Gemini - CLI (documents the alias). - pi/OpenCode scan project Agent Skills dirs. -- **Reached via the `npx skills add` installer**: for Cursor, Copilot, Cline, Amp, - Windsurf, the installer copies `SKILL.md` into `.agents/skills/` and **symlinks it - into each agent’s own dir** — the *installer* binds the path, not the agent. - “Works with Cursor/Copilot” means “via skills.sh”, not “Cursor scans `.agents/skills/` - itself.” +2026-05-26 note). The bar for the first bucket is a **verified vendor-native scan** +(source or vendor docs)—not mere presence in Vercel’s `skills` supported-agent table, +which is evidence of installer reach, not proof a vendor’s own loader reads the path: + +- **Scans `.agents/skills/` natively (verified)**: Codex (verified at source), Gemini + CLI (documents the alias), and **Cursor** (Cursor Agent Skills docs, 2026-06-13: + native recursive scan of `.agents/skills/` and `.cursor/skills/`, plus + `.claude`/`.codex` compatibility loading). +- **Reached via the `npx skills add` installer** (the *installer* binds the path, not + the agent): Copilot, Cline, Amp, Windsurf, OpenCode—the installer copies `SKILL.md` + into `.agents/skills/` and **symlinks it into each agent’s own dir**. “Works with + Copilot” means “via skills.sh,” not “Copilot scans `.agents/skills/` itself.” + Treat Vercel’s supported-agent table as installer-reach evidence only, not native + verification. - **Claude Code does not scan `.agents/` at all** — only `.claude/skills/` (confirmed; see claude-code#31005), which is why the mirror is mandatory. diff --git a/docs/project/research/current/research-cli-as-agent-skill.md b/docs/project/research/current/research-cli-as-agent-skill.md index be613e7b..700900cc 100644 --- a/docs/project/research/current/research-cli-as-agent-skill.md +++ b/docs/project/research/current/research-cli-as-agent-skill.md @@ -1,6 +1,16 @@ # Research Brief: CLI as Agent Skill - Best Practices for TypeScript CLIs in Claude Code -**Last Updated**: 2026-05-24 +**Last Updated**: 2026-05-24 (addendum 2026-06-13) + +> **2026-06-13 addendum.** This brief is foundational but now lags the shipped guideline +> (`cli-agent-skill-patterns`), which has since added the L0–L3 integration ladder, +> format-versioned migration, and the §6.6.2 scope mechanics. +> For the current distribution landscape (skills.sh leaderboard reality, the L2b +> variant, dev-build pin selection, the `surface=` drop, multi-block collapse, and +> native per-agent skill dirs), see +> [research-2026-06-13-skill-distribution-landscape.md](./research-2026-06-13-skill-distribution-landscape.md) +> and +> [plan-2026-06-13-cli-skill-guideline-pprose-gaps.md](../../specs/active/plan-2026-06-13-cli-skill-guideline-pprose-gaps.md). **Related**: @@ -195,7 +205,8 @@ letting agents use progressive disclosure. #### 1.3 Pinned CLI Invocation Fallbacks -**Status**: Planned +**Status**: ✅ Complete (shipped in guideline §6.7; generator-side rule open—see +addendum) **Details**: @@ -218,6 +229,18 @@ The pprose installer is a useful reference implementation: it injects a generate **Assessment**: tbd’s current guidance is too npm-global-specific. The guideline should teach pinned runner fallbacks as a supply-chain hardening pattern for all CLI-as-skill projects. +(Shipped: guideline §6.7 now covers the consumer-side pinned `npx`/`uvx`/`pipx`/`go run` +chain.) + +**2026-06-13 follow-up (generator-side pin selection).** A gap remains on the +*generator* side: a tool installing from an editable/dev checkout has an unpublishable +running version (`0.1.1.dev49+abc1234`, npm `0.0.0-dev.`) that `uvx pkg@` +cannot resolve, so the baked pin ships a broken skill. +pprose solves it with a `DISCOVERY_VERSION` constant (last real PyPI release) gated by a +PEP 440 release check `is_pypi_release` and a release-time guard (`install.py` +L56/L96/L101–112). The rule: bake the running version only if it is a real, resolvable +published release; otherwise fall back to a known-good published pin. +This is issue #173 gap 3 — see the distribution-landscape brief. * * * diff --git a/docs/project/research/current/research-skills-vs-meta-skill-architecture.md b/docs/project/research/current/research-skills-vs-meta-skill-architecture.md index b24bb6a3..d427241d 100644 --- a/docs/project/research/current/research-skills-vs-meta-skill-architecture.md +++ b/docs/project/research/current/research-skills-vs-meta-skill-architecture.md @@ -42,6 +42,18 @@ The single tbd meta-skill should now be installed as a portable project skill at Code, and exposed as `skills/tbd/SKILL.md` for distribution. This keeps the meta-skill architecture while making the install path multi-agent. +**2026-06-13 update**: Two corrections from the mid-2026 distribution audit +([research-2026-06-13-skill-distribution-landscape.md](./research-2026-06-13-skill-distribution-landscape.md)). +First, the “15K default limit” below is superseded: Claude Code now budgets the skill +listing at roughly 1 percent of the model’s context window +(`skillListingBudgetFraction`), truncating per-skill text at 1,536 characters, and drops +least-recently-used descriptions on overflow. +The meta-skill argument is unchanged and arguably stronger, since the budget is now a +shared, contested pool. +Second, the leaderboard supports the thesis directly: the single most-installed skill in +the ecosystem is Vercel’s `find-skills` (a meta-skill that routes to other skills), and +vendor knowledge skills dominate the top ranks over many-skill platforms. + **Research Questions**: 1. Should each tbd shortcut/guideline be a separate Claude Code skill? diff --git a/docs/project/specs/active/plan-2026-06-13-cli-skill-guideline-pprose-gaps.md b/docs/project/specs/active/plan-2026-06-13-cli-skill-guideline-pprose-gaps.md new file mode 100644 index 00000000..1c0a9a5c --- /dev/null +++ b/docs/project/specs/active/plan-2026-06-13-cli-skill-guideline-pprose-gaps.md @@ -0,0 +1,248 @@ +--- +title: cli-agent-skill-patterns—Five pprose-Surfaced Gaps (issue #173) +description: Plan to fold the five small gaps from issue #173 (surfaced by the pprose L2-ish CLI) into the cli-agent-skill-patterns guideline, plus the ecosystem current-practice corrections from the mid-2026 distribution audit +author: Joshua Levy (github.com/jlevy) with LLM assistance +--- +# Feature: cli-agent-skill-patterns—Five pprose-Surfaced Gaps (issue #173) + +**Date:** 2026-06-13 + +**Status:** Implemented—guideline edits applied and under review in +[PR #175](https://github.com/jlevy/tbd/pull/175). The decisions below are settled (see +the resolved Open Questions); this spec is a record, not an open proposal. + +## Overview + +[Issue #173](https://github.com/jlevy/tbd/issues/173) re-reads +`cli-agent-skill-patterns` against +[`pprose`](https://github.com/jlevy/practical-prose)—a Python CLI that self-installs +into `.agents/skills/`, `.claude/skills/`, and a marker-bounded `AGENTS.md` block—and +surfaces five small, self-contained gaps. +All five were verified against pprose’s working `install.py` (file:line cites in the +research brief). The verdict in the issue is correct: the guideline is in great shape; +these are precision fixes, not a rewrite. + +This plan maps each gap to a concrete edit, tagged with the ladder rung it touches so +the simple baseline (§0) stays untouched, and adds the ecosystem current-practice +corrections from the wider distribution audit. +Evidence: +[research-2026-06-13-skill-distribution-landscape.md](../../research/current/research-2026-06-13-skill-distribution-landscape.md). + +Canonical edit target: +[packages/tbd/docs/guidelines/cli-agent-skill-patterns.md](../../../../packages/tbd/docs/guidelines/cli-agent-skill-patterns.md) +(ships via `tbd guidelines`). The `.tbd/docs/` copy is a regenerated cache—never edit it +directly. + +## Goals + +- Fold the five issue #173 gaps into the guideline as surgical, evidenced edits. +- Resolve the guideline’s one internal contradiction (examples emit `surface=`; prose + says it is unneeded). +- Refresh stale ecosystem facts (Cursor native `.agents/skills/` discovery; native + per-agent skill dirs). +- Keep the guideline non-dogmatic and ladder-shaped: every addition states its rung. + +## Non-Goals + +- Rewriting the guideline or its §0 baseline. +- Changing tbd runtime behavior, except to track the small `surface=` generator cleanup + (gap 4) as a *separate* follow-on bead, not part of this doc edit. +- Re-litigating the Clerk/Render/Convex edits already planned in + `plan-2026-06-03-tbd-agent-cli-guideline-improvements.md` (independent; can land in + the same editing pass). + +## Background + +The guideline was shaped mainly by `tbd` (L3) and `qmd` (L2). pprose is the first +well-built **L2b** reference in our evidence base—self-install and a managed `AGENTS.md` +block, but no hooks/`prime`/`setup`/DocCache—and it exercises exactly the seams the +guideline draws too sharply at the L2/L3 boundary. +Separately, the wider audit (skills.sh leaderboard, vendor catalogs, taste-skill, +anthropics/skills) confirms the simple baseline is the dominant real-world pattern and +surfaces two stale facts. + +## Design: The Five Gaps and Their Edits + +### Gap 1—Name the rung between L2 and L3 (§6.0) + +**Finding**: pprose writes a compact managed `AGENTS.md` block but none of the L3 +platform. +The ladder forces such tools to either drop the block (L2) or appear to take on +the whole L3 burden. + +**Edit** (rung: L2/L3 boundary): in §6.0, either split L2 into **L2a** (discovery-dirs +only—`qmd`) and **L2b** (discovery-dirs and managed `AGENTS.md` block—`pprose`), keeping +L3 (`tbd`) as-is; **or** add one named sentence: “L2 may optionally add a managed +`AGENTS.md` block—the one L3 surface adoptable without the rest of the platform; pprose +is the reference.” Recommended: the single-sentence acknowledgment first (no +renumbering), upgraded to a full L2a/L2b split only if the §0.3 decision guide and the +§6.0 prose read cleanly with it. +Add pprose to the worked-examples set wherever `qmd`/`tbd` are cited. + +### Gap 2—Format-versioning is artifact-driven, not L3-only (§6.0, §6.6) + +**Finding**: pprose (sub-L3) stamps `format=f01` on its `SKILL.md` files *and* its +`AGENTS.md` block and runs the same forward-compat guard. +The distinguishing factor is whether a tool writes a generated artifact it must +upgrade/guard, not its rung. + +**Edit** (rung: any rung that writes a managed/merged artifact): soften the §6.0 closer +("§6.6 applies **only to L3**") and reframe §6.6’s “Upgrade existing installs +deliberately (L3 only)” as **“any rung that writes a generated artifact it may need to +upgrade or refuse-to-clobber”**—most concretely a managed `AGENTS.md` block or a +committed generated skill shared across tool versions. +Keep the *hooks/`prime`/`setup`/DocCache* machinery explicitly L3-specific. +State the test: overwrite-whole-and-never-migrate (taste-skill, qmd) needs no stamp; +merged-or-shared-and-guarded (pprose) does. + +### Gap 3—Generator-side dev-build pin selection (§6.7) + +**Finding**: §6.7 covers consumer-side pinned runners but assumes the generator can pin +to its own running version. +A dev/editable checkout reports an unpublishable version (`0.1.1.dev49+abc1234`) that +`uvx pkg@` cannot resolve, so the baked pin ships a broken skill. + +**Edit** (rung: any self-installer that bakes a pin, L1+): add one paragraph to §6.7: +bake the running version only if it is a **real, resolvable published release**; +otherwise fall back to a known-good published constant (pprose’s `DISCOVERY_VERSION`, +gated by a PEP 440 release check `is_pypi_release` and a release-time guard). +Note the npm analog (`0.0.0-dev.`/`-canary` from CI). Frame it as the inverse of +the existing consumer-side rule: the generator-side pin-selection rule. + +### Gap 4—Drop the redundant `surface=` in-file tag (§2, §6.6) + +**Finding**: §2 (line ~206) and §6.6 (line ~697) examples show +`format=f02 surface=agents-md`, but §6.6/§6.6.2 prose says the artifact’s identity is +its location, “so no in-file `surface=` tag is needed.” +pprose dropped `surface=` deliberately; tbd (`setup.ts:167`) and flowmark still emit it. +It is not load-bearing for detection. + +**Edit** (rung: any tool with a managed `AGENTS.md` block): drop `surface=agents-md` +from both examples so they carry only `format=fNN`, matching the prose and the cleaner +reference. Add one sentence: `surface=` is optional and decorative—detection anchors on +the stable begin-prefix and reads `format=fNN`; identify artifacts by location, which +also keeps portable and Claude `SKILL.md` copies byte-identical. + +**tbd-code implication (separate bead, not this edit)**: tbd’s generator emits a tag the +guideline would now call unnecessary. +Dropping it from tbd is safe (block is rewritten whole; detection ignores the field) but +is a generator change. +File a small follow-on bead; do not block the guideline edit on it. + +### Gap 5—Collapse multiple stale managed blocks (§2 or §6.6) + +**Finding**: the guideline covers “no marker block yet → append” but not **multiple +stale blocks** (e.g. a begin-prefix rename across versions leaves two). +pprose’s `_update_agents_md` replaces all matches with one current block at the first +match’s position, preserving content outside markers. + +**Edit** (rung: any tool with a managed `AGENTS.md` block): add a sentence—on install, +collapse all matching managed blocks to a single current block at the first match’s +position. Extend the existing “keep the begin/end marker names stable / match on prefix” +rule to cover prefix *renames*: match a small set of known legacy begin-prefixes, not +just the current one (collapse only works if the installer recognizes the old prefix). + +### Ecosystem current-practice corrections (§6.6 native-scanning table) + +(Rung: reference material, all rungs.) +From the wider audit: + +- **Cursor now scans `.agents/skills/` natively** (Cursor mid-2026 docs: + `.agents/skills/`, `.cursor/skills/`, and `~/` variants). + The guideline currently lists Cursor as reached “via the skills.sh installer … not + natively”—move it to native. + **Re-verify against current Cursor docs before editing**, per the guideline’s own + rule. +- **Native per-agent skill dirs are proliferating** (Claude `.claude/skills/`, Cursor + `.cursor/skills/`, Copilot `.github/skills/`, Gemini `.gemini/skills/`, OpenCode + `.opencode/skills/`, Windsurf `.windsurf/skills/`, Google Antigravity + `.agent/skills/`). Add a directional note (community-sourced; verify per agent) that + the ecosystem is growing per-agent native dirs alongside portable `.agents/skills/`, + which raises the value of letting the `npx skills add` installer fan out the mirrors. + +### §11 / §12 updates + +Reflect gaps 1–5 as new bullets/checklist items, each tagged with its rung. +Refresh the “Last Updated” line and add pprose (and, where useful, +taste-skill/anthropics-skills) to the worked-examples and References lists. + +## Implementation Plan + +### Phase 1—Guideline edits (done; owner chose the single named-variant form for gap 1) + +- [x] §6.0: name **L2b** as a single named variant within L2 (no L2a/L2b renumber)—the + owner chose the lighter form over a full ladder split. +- [x] §6.0 and §6.6: reframe format-versioning as artifact-driven, not L3-only; keep + hooks/`prime`/`setup`/DocCache L3-specific. +- [x] §6.7: add the generator-side dev-build pin-selection paragraph (DISCOVERY_VERSION + / is_pypi_release, with the npm analog). +- [x] §2 and §6.6: drop `surface=agents-md` from both examples; add the one-sentence + “identify by location” note. +- [x] §2 or §6.6: add the multi-block-collapse sentence and the legacy-prefix-rename + note. +- [x] §6.6 native-scanning table: move Cursor to native (re-verified); add the + per-agent-native-dirs directional note. +- [x] §11 Summary and §12 Checklist: new bullets, each tagged with its rung; refresh + “Last Updated”; add pprose to worked examples/References. +- [x] Format with flowmark (never Prettier); unspaced em dashes; keep the doc footer; + check all internal cross-references resolve. + +### Phase 2—Follow-on (separate beads) + +- [ ] Bead: drop `surface=agents-md` from tbd’s generated `AGENTS.md` begin line + (`setup.ts`), keeping detection prefix-anchored; emit the cleaner marker going + forward, accept old `surface=...` markers, collapse duplicates, and guard newer + `format=fNN` blocks; update goldens/drift tests. +- [ ] Optional: have tbd’s generated artifacts demonstrate any newly recommended + pattern, so the reference implementation stays ahead of the guideline. + +From the [PR #175](https://github.com/jlevy/tbd/pull/175) senior review (dogfooding +roadmap—tracked here, intentionally not in this pure-docs PR): + +- [ ] Make tbd’s distribution contract explicit (docs): state that tbd is intentionally + L3, not the baseline for ordinary CLIs; list the surfaces tbd owns (generated skills, + managed `AGENTS.md` block, hooks, setup/prime, DocCache/meta-skill); name which lower + rungs simpler tools should prefer and why. +- [ ] Generate and test tbd’s canonical skill from one source: ensure repo-local + `.agents/skills/tbd/SKILL.md` and the `.claude/skills/tbd/` mirror come from a single + source of truth, keep the drift test, and keep the skill small (route to `tbd prime`, + `tbd shortcut`, `tbd guidelines`—don’t embed the manual). +- [ ] Make format migration artifact-driven in code and tests: test the managed + `AGENTS.md` block independently from hook and skill files; treat each generated + artifact as having its own format/migration contract; don’t imply stamps exist only + because a tool is L3. +- [ ] Improve `tbd setup`/`doctor` output around rungs and surfaces: explain which + surfaces are installed and why tbd is L3, and reinforce that simpler tools usually + want L1, plain L2, or the L2b variant. + +## Testing Strategy + +Documentation change—review-based. +Re-read end-to-end: every new rule states its rung; nothing contradicts §0; the +`surface=` example/prose contradiction is gone. +Confirm each claim maps to a real, cited example (the research brief is the evidence +index). Confirm cross-references resolve and flowmark is a no-op. + +## Open Questions (resolved) + +1. **L2a/L2b split vs single sentence (gap 1)**—RESOLVED: the owner chose the single + named sentence. L2b is named as a variant within L2; the four-rung ladder (L0–L3) is + unchanged, with no L2a/L2b renumber. +2. **tbd `surface=` cleanup (gap 4)**—RESOLVED: tracked as a Phase 2 follow-on; the + guideline edit stays pure-docs. +3. **Cursor native-scanning**—RESOLVED: verified against current Cursor docs before + asserting native `.agents/skills/` scanning in §6.6. + +## References + +- Issue: https://github.com/jlevy/tbd/issues/173 +- Evidence: + [research-2026-06-13-skill-distribution-landscape.md](../../research/current/research-2026-06-13-skill-distribution-landscape.md) +- Guideline: + [packages/tbd/docs/guidelines/cli-agent-skill-patterns.md](../../../../packages/tbd/docs/guidelines/cli-agent-skill-patterns.md) +- Related plan: `plan-2026-06-03-tbd-agent-cli-guideline-improvements.md` (Clerk/Render/ + Convex; independent, can co-land) + + diff --git a/packages/tbd/docs/guidelines/cli-agent-skill-patterns.md b/packages/tbd/docs/guidelines/cli-agent-skill-patterns.md index 5a95865d..109b25d0 100644 --- a/packages/tbd/docs/guidelines/cli-agent-skill-patterns.md +++ b/packages/tbd/docs/guidelines/cli-agent-skill-patterns.md @@ -6,7 +6,13 @@ category: general --- # Agent Skills and CLI Integration Patterns -**Last Updated**: 2026-06-04 (§4.6 adds the “attention routing” framing—the slopdocs vs. +**Last Updated**: 2026-06-13 (issue #173: named the L2b variant—an L2 self-installer +that also writes a managed `AGENTS.md` block, `pprose` reference—reframed +format-versioning as artifact-driven rather than L3-only, added the §6.7 generator-side +dev-build pin rule, dropped the redundant `surface=` tag from the marker examples, added +multi-block collapse in §2, and refreshed §6.6 for Cursor’s native `.agents/skills/` +scanning and the spread of native per-agent skill dirs. +Earlier 2026-06-04: §4.6 adds the “attention routing” framing—the slopdocs vs. hiding-details failure modes—as a contained section, and the §3.1 mechanism callback points at it. Earlier 2026-06-02: §6.8 now covers Anthropic’s official and community plugin marketplaces—the reviewed, gated submission channel—and the `/plugin` install @@ -127,9 +133,10 @@ So: - **Maximum reach across many agents** → layer them: AGENTS.md, SKILL.md, CLI, and MCP. (§1) - **Self-installs into agents?** → climb the integration ladder (§6.0): L0 pure skill → - L1 skill + pinned `npx`/`uvx` → L2 self-install into skill dirs only (`qmd`) → L3 full - platform with managed `AGENTS.md`, hooks, and format versioning (`tbd`). Most tools - are L1; stop as low as you can. + L1 skill and pinned `npx`/`uvx` → L2 self-install into skill dirs (`qmd`; its **L2b** + variant also writes a managed `AGENTS.md` block, `pprose`) → L3 full platform with + hooks, `prime`/`setup`, and format-versioned migration (`tbd`). Most tools are L1; + stop as low as you can. Everything below is reference material. You do not need most of it for most tools. @@ -203,7 +210,7 @@ Version the managed block by carrying the format on the **begin marker line itse generated content without touching user-authored text: ```markdown - + ## mycli - Run `mycli prime` for current project context. @@ -216,6 +223,17 @@ Keep the begin/end marker *names* stable (` + ## mycli - Run `mycli prime` for current project context. @@ -771,10 +813,14 @@ Do not make Codex hooks call scripts stored under `.claude/`—that couples Code Claude setup. If a script must move out of `.claude/scripts/`, update the tbd-owned hook commands (or leave a wrapper) so existing Claude hooks keep working. -**Upgrade existing installs deliberately (L3 only).** A self-installing tool whose skill -content evolves *will* leave older generated files in users’ repos. -(An L2 tool that only writes discovery-dir skills can skip all of this: a re-install is -a clean full overwrite, with no managed `AGENTS.md` block or hooks to migrate.) +**Upgrade managed generated artifacts deliberately.** A self-installing tool whose +generated content evolves *will* leave older artifacts in users’ repos. +(A tool that only writes fully-overwritten discovery-dir skills—plain L2 or below—can +skip all of this: a re-install is a clean full overwrite, with no managed `AGENTS.md` +block or hooks to migrate. +The L2b variant, which merges a managed `AGENTS.md` block into a human-authored file, +**does** need the format stamp and forward-compatibility guard even though it is not +L3—this is the artifact-driven rule from §6.0, not an L3-only one.) Treat generated integration files like config migrations: Reserve an `fNN` **format bump** for changes big enough to need an explicit migration: a @@ -816,7 +862,8 @@ it; it does not license the tool to mutate the repo on its own. `AGENTS.md` block, generated skills, tool-owned hooks, `.codex/` config), re-running cleanly with no change when already current. - Treat old marked `AGENTS.md` blocks with no metadata as legacy generated content and - replace only the managed region. + replace only the managed region, collapsing any duplicate stale blocks to one current + block at the first block’s position (§2). - Detect tool-owned hook entries by command/path/signature, replace only those entries, and preserve unrelated user hooks. - **Forward-compatibility guard.** When the tool finds a generated artifact whose @@ -1063,6 +1110,20 @@ This is a supply-chain control, not just ergonomics: an unpinned runner re-resol the latest published version on every run and bypasses any cool-off window. See `tbd guidelines supply-chain-hardening` for the cross-ecosystem policy. +**Bake a pin the generator can actually resolve.** The rules above are consumer-side: +they assume the generating tool can pin to its own running version. +A tool installing from an editable or dev checkout cannot. +Its running version is an unpublished dev, pre-release, or local build +(`0.1.1.dev49+abc1234`, npm `0.0.0-dev.` or `x.y.z-canary`) that `uvx pkg@` +or `npx pkg@` will never resolve, so baking it ships a skill that fails the moment +a teammate runs it—one works-on-my-machine install poisons the clone. +Bake the running version only when it is a real, resolvable published release; otherwise +fall back to a known-good published pin. +`pprose` does this with a `DISCOVERY_VERSION` constant (its last real PyPI release) +gated by a PEP 440 release check (`is_pypi_release`), and enforces at release time that +the constant equals the release tag. +This is the generator-side mirror of the consumer-side pin rule. + **Current tooling (May 2026)** - **Node / TypeScript**: zero-install via `npx @` (`-y` to skip the prompt), @@ -1355,6 +1416,10 @@ going: This is tbd’s validated approach. - Path-ordered resource cache for project/user shadowing; generate `--list` dynamically. - Context-injection loop with explicit `cli command arg` references; depth ≤ 3. +- Self-installing tools climb the §6.0 ladder and stop at the lowest rung. + A managed `AGENTS.md` block (the L2b variant) is the one L3 surface worth adopting + alone, but any rung that writes one must format-stamp it and refuse to clobber a newer + block. **Reach and surface** @@ -1371,6 +1436,8 @@ going: - Scope `allowed-tools` tightly; gate destructive skills; design for sandboxes. - Idempotent multi-agent install with marker-bounded sections; version source files, not fully generated install artifacts; mark generated files “DO NOT EDIT.” +- Pin a generated runner to a resolvable published release; never bake a dev or + pre-release version a teammate’s `uvx`/`npx` cannot resolve (§6.7). * * * @@ -1393,16 +1460,26 @@ going: - [ ] `AGENTS.md` with build/test/style/conventions (concise) - [ ] Managed `AGENTS.md` block uses a stable begin/end marker with a `format=fNN` field on the begin line +- [ ] Marker carries only `format=fNN` (artifact type is clear from location, no + `surface=` tag); duplicate stale blocks collapsed to one on install (the L2b variant + and L3) - [ ] `CLAUDE.md` strategy decided (symlink to `AGENTS.md`, copy, or separate) **CLI tool (if applicable)** - [ ] `--json` on all commands; `--brief`/`--quiet`; actionable errors + - [ ] Idempotent `setup --auto`; `init` for surgical config + - [ ] Help epilog with `IMPORTANT:` + Getting Started one-liner + - [ ] `prime` (status/context) and `skill` (pure docs) commands + - [ ] Invocation strategy chosen (§6.7): local-first plus pinned zero-install fallback by default, or global install + `SessionStart` bootstrap for cloud/ephemeral agents +- [ ] Generated skill bakes a resolvable published pin, not the running dev or + pre-release version (§6.7) + **Advanced (many subcommands / knowledge library)** - [ ] Meta-skill composition (header + baseline + dynamic directory) - [ ] Informational commands (`guidelines`/`shortcut`/`template`) with `--list` @@ -1487,8 +1564,16 @@ going: https://clau.de/plugin-directory-submission) - gstack: https://github.com/garrytan/gstack - Beads (bd): https://github.com/gastownhall/beads -- qmd (L2 reference: self-installing skill, discovery-dirs only, CLI + MCP + plugin): +- qmd (L2 reference: self-installing skill, discovery-dirs only, CLI, MCP, and plugin): https://github.com/tobi/qmd +- pprose (L2b variant reference: self-installing skill plus a format-stamped, + marker-bounded `AGENTS.md` block, no hooks/prime/setup): + https://github.com/jlevy/practical-prose +- taste-skill (L0 reference: pure-prompt skill collection distributed via skills.sh, a + Claude plugin marketplace, `copilot-instructions.md`, and `llms.txt`): + https://github.com/Leonxlnx/taste-skill +- anthropics/skills (L0 reference: skill collection bundled as Claude plugins): + https://github.com/anthropics/skills ### Security