Han Feedback: plan-a-feature + plan-implementation (2026-05-29) #40

# Han Feedback — 2026-05-29

**Skills used:** `han:plan-a-feature` → `han:plan-implementation`
**Context:** Planning and implementing the contribution of `/han-feedback` to `testdouble/han` as a pull request — speccing what the skill does, then producing the PR workflow and SKILL.md adaptation checklist
**Outcome:** Feature specification, implementation plan, and a live PR at https://github.com/testdouble/han/pull/39

---

# han:plan-a-feature

## What worked well

- **Codebase-first discovery before any interview.** The skill fetched real SKILL.md examples, the CONTRIBUTING.md requirements, the skills index format, writing voice guide, and the long-form doc template from the Han repo before surfacing a single question. No placeholder decisions. The interview started with real evidence, not assumptions.
- **Review agents caught 16 genuine findings.** The fish shell glob failure on empty directories (in fish, `ls -t dir/*.md` raises an error when nothing matches rather than returning empty output) would have produced mysterious first-run behavior. The partial write leaving a file that satisfies the dedup check, the ambiguous-response handling gaps, and the step-order difference between the original skill and the spec were all real, and all were caught before the SKILL.md was written.
- **Only 2 questions reached the user.** Output folder and skills index group. Everything else resolved from the Han repo's own codebase and writing standards. The "what genuinely needs human judgment" filter held.
- **Project-manager synthesis found orphaned cross-references** the earlier passes missed. Artifacts came out internally consistent — every F# had a matching D# and vice versa.
- **The spec survived "what does the gh CLI actually output?"** The 4-branch gh failure model in the original skill collapsed to 3 in the spec after the on-call-engineer verified the CLI signals. That correction happened at spec time, not at implementation time.
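The cross-reference consistency described above (every F# matched by a D# and vice versa) could in principle be checked mechanically. The following is a minimal sketch under stated assumptions: it supposes a hypothetical layout in which a line beginning with `F8:` or `D9:` defines that ID and any other `F<n>`/`D<n>` token is a reference. The line format and the name `find_orphans` are illustrative, not the skill's actual artifact format.

```python
import re

# Assumption: IDs look like F8 or D9; a line starting with "F8:" defines F8.
ID_TOKEN = re.compile(r"\b[FD]\d+\b")
DEFINITION = re.compile(r"\s*([FD]\d+):")


def find_orphans(text: str) -> dict[str, set[str]]:
    """Report IDs referenced but never defined, and defined but never referenced."""
    defined: set[str] = set()
    referenced: set[str] = set()
    for line in text.splitlines():
        head = DEFINITION.match(line)
        if head:
            defined.add(head.group(1))
            rest = line[head.end():]  # skip the defining token itself
        else:
            rest = line
        referenced.update(ID_TOKEN.findall(rest))
    return {
        "undefined": referenced - defined,
        "unreferenced": defined - referenced,
    }
```

An empty result in both sets would correspond to the "internally consistent" state the synthesis pass reached; anything else points at an orphaned finding or decision.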

## What didn't work

- **Protocol overhead is sized for software features, not skill contributions.** The spec produces Outcome, Actors, Primary Flow, Alternate Flows, Edge Cases, User Interactions, Coordinations, Out of Scope, Open Items. For a markdown instruction file where the full behavioral surface fits in 10 steps, the 70-line spec is more scaffolding than the content warrants. The useful artifact for this use case turned out to be the implementation checklist (in plan-implementation), not the spec itself.
- **Step-count drift between phases.** The original skill had 8 steps. Review agent findings added directory creation and moved the emptiness check, producing a 10-step spec. The implementation plan then had to renumber them carefully. That left three numbering schemes in three files, all of which needed to stay in sync. A direct adaptation checklist against the original would have been cleaner for this use case.
- **Rating dimensions were underspecified in the spec ("adapt to skill type").** This triggered a finding (F8) and a new decision (D9). A more prescriptive template for the rating table format would have resolved this at template time, not spec time.
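The numbering drift described above is the kind of thing a small consistency check could surface early. A rough sketch, assuming each artifact lists its steps as Markdown numbered items (`1. ...`); the artifact names used in the usage note and the helper names `step_numbers` and `check_step_sync` are hypothetical.

```python
import re

# Assumption: a step is a line that starts with "<number>. " followed by text.
STEP_LINE = re.compile(r"^[ \t]*(\d+)\.[ \t]+\S", re.MULTILINE)


def step_numbers(markdown: str) -> list[int]:
    """Return the step numbers in the order they appear."""
    return [int(n) for n in STEP_LINE.findall(markdown)]


def check_step_sync(artifacts: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the artifacts agree."""
    problems: list[str] = []
    counts: dict[str, int] = {}
    for name, body in artifacts.items():
        numbers = step_numbers(body)
        counts[name] = len(numbers)
        # Steps should run 1, 2, 3, ... with no gaps or repeats.
        if numbers != list(range(1, len(numbers) + 1)):
            problems.append(f"{name}: steps are not numbered 1..{len(numbers)} in order")
    if len(set(counts.values())) > 1:
        problems.append(f"step counts differ: {counts}")
    return problems
```

Called with something like `{"original": ..., "spec": ..., "plan": ...}`, it would flag an 8-step original against a 10-step spec as a count mismatch, which is expected here but worth seeing explicitly.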

## Overall

`plan-a-feature` produced a specification that is genuinely accurate and internally consistent, and the review agents added real value. The friction is in protocol fit: the skill is designed for speccing software features with coordinations across systems, actors, and failure modes. A skill contribution to an external repo is closer to a documentation PR with a well-defined checklist. The spec framework is not wrong here, but it generates more scaffolding than the decision count justifies. Worth considering a lighter-weight "contribution spec" path for this class of work.

## Rating

| Dimension | Score |
|---|---|
| Spec completeness | 5/5 |
| Evidence-first discipline | 5/5 |
| Review agent signal quality | 5/5 |
| Output length vs. decision count | 2/5 |
| Protocol fit for documentation contributions | 3/5 |

---

# han:plan-implementation

## What worked well

- **The adaptation checklist is the right artifact.** 13 numbered changes against the original SKILL.md, each citing the decision that drives it. A developer can diff the original and the checklist and know exactly what changed and why. That's more useful than re-reading the spec.
- **On-call-engineer was the right specialist for a skill file.** The failure path instructions are load-bearing in a way that structural or behavioral analysis wouldn't catch. The 4-to-3 gh branch collapse, the fish glob failure, the partial-file dedup trap — these are the kinds of errors that ship silently and confuse users. Exactly what the on-call-engineer is there for.
- **`.discovery-notes.md` as shared context worked.** The file prevented re-grepping across specialists. Admin access confirmation, skill count, and writing voice rules were all in one place, read once, referenced everywhere.
- **1-round convergence was the right outcome.** All 16 findings were plan-level, none spec-level. The gate correctly went straight to synthesis.
- **YAGNI discipline: partial-file detection was correctly deferred.** The Write tool can't distinguish a partial file from an absent one. The simpler instruction (tell the user the path, have them check) satisfies the same behavioral commitment. That's the rule working correctly.
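The fish glob failure mentioned in the list above arises because fish treats an unmatched wildcard such as `dir/*.md` as an error instead of running `ls` with no files. One way to sidestep the issue is to list files without relying on shell globbing at all. A minimal sketch in Python, offered as an illustration rather than what the skill does; the function name `newest_markdown_files` is an assumption.

```python
from pathlib import Path


def newest_markdown_files(directory: str) -> list[Path]:
    """Return *.md files in `directory`, newest first.

    An empty or missing directory yields an empty list instead of an error,
    which is the first-run case the shell glob handled badly.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    files = [p for p in root.glob("*.md") if p.is_file()]
    # Sort by modification time, most recent first (mirrors `ls -t`).
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```

The design choice is simply to make "no files yet" an ordinary empty result, so the first run and later runs follow the same code path.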

## What didn't work

- **Full plan-implementation protocol for a documentation PR.** The RAID log, security posture, and testing strategy sections are appropriately scoped (security = the gate is in the skill itself; testing = install locally and run) but still written at full template length. Two specialists, two template sections, one iteration round, project-manager synthesis — the process produced the right plan, but 80% of the output is infrastructure for decisions that were obvious from the start.
- **CONTRIBUTING.md checklist items were found by specialist agents.** The Evidence breadcrumb, the em-dash in the issue title, the step-order difference — a pre-flight read of the source SKILL.md against CONTRIBUTING.md would have caught these in 5 minutes. Dispatching agents to find them added turns without adding judgment. The pre-flight read is a better match for the discovery pattern here.
- **Plan section for CHANGELOG required a live skill count** that wasn't in the discovery notes. Correct call (count from disk), but it surfaced as a finding (C1) that could have been in the standard pre-flight.
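Counting skills from disk, as described above, might look like the following sketch. It assumes a hypothetical layout in which each skill lives in its own directory containing a `SKILL.md` file; the root path passed in and the name `count_skills` are assumptions, not confirmed details of the Han repo.

```python
from pathlib import Path


def count_skills(skills_root: str) -> int:
    """Count immediate subdirectories of `skills_root` that contain a SKILL.md.

    Assumption: one skill per directory, identified by its SKILL.md file.
    Returns 0 if the root does not exist.
    """
    root = Path(skills_root)
    if not root.is_dir():
        return 0
    return sum(
        1
        for child in root.iterdir()
        if child.is_dir() and (child / "SKILL.md").is_file()
    )
```

Recording the result in the discovery notes during pre-flight would make the live count available to every later section without a separate finding.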

## Overall

`plan-implementation` produced an actionable plan with the right adaptation checklist at its center. The specialist combination was correct. The gap is the same as in `plan-a-feature`: the protocol is sized for code changes, and applying it to a documentation contribution generates scaffolding that outweighs the decision content. The framework is not wrong (the findings were real), but the planning time spent per decision favors a lighter checklist-based approach for this use case. The skill contribution path would benefit from a dedicated mode or a reduced-template option for documentation-only changes.

## Rating

| Dimension | Score |
|---|---|
| Adaptation checklist quality | 5/5 |
| Specialist selection | 5/5 |
| Finding quality (signal-to-noise) | 4/5 |
| YAGNI discipline | 5/5 |
| Output length vs. decision count | 2/5 |
| Protocol fit for documentation contributions | 3/5 |
| Round efficiency | 5/5 |

