Replies: 1 comment
Please deliver this as a hosted extension, as described at https://github.com/github/spec-kit/tree/main/extensions. If you have any questions, feel free to ask. This way the core process does not need to change, but you can still add steps. If you want to override what each command does, a preset would be more appropriate; see https://github.com/github/spec-kit/tree/main/presets. Our documentation site also links to community-hosted extensions and presets that you can look at: https://github.github.com/spec-kit/community/overview.html
Context
We've been using spec-kit on a real subsystem: an LLM agent tool dispatcher with authz, rate-limit, approval flow, idempotency, and audit requirements. The `/specify → /clarify → /plan → /tasks` flow worked well for user-story-driven features. During code review, we found about 14 categories of non-functional concerns that didn't have a clear home in spec-kit's artifacts. They ended up in a separate single-file engineering design document.

Before considering PRs, we'd like to understand maintainers' position on scope so we know whether to propose upstream additions or build externally.
What spec-kit covers well (verified by reading the templates)
- `spec.md`
- `[NEEDS CLARIFICATION]` flagging (`/clarify` flow)
- `plan.md` Technical Context
- `plan.md`
- `research.md`
- `data-model.md`
- `contracts/`
- `[P]` parallelism (`tasks.md`)

This is genuinely a lot, and we relied on all of it.
What we couldn't place
Below are concrete content types from our real design document that did not fit any spec-kit template. Each has a one-line description of the failure mode if it's missing.
1. Error-code contract: a table of `code × trigger × counts-toward-rate-limit × caller-visible message`. Without it, each handler invents its own error structure and clients can't uniformly handle failures.
2. State machine: typically a `stateDiagram-v2` with legal transitions and an explicit illegal-transition policy. Approval flows with TTL-based expiry are the common case.
3. Cross-cutting execution order: the precise order of authz, schema validation, rate-limit, approval gate, planning, and timeout-wrapped execution. Order is a contract; getting it wrong leaks "this resource exists" via schema errors before authz denies, or charges rate-limit budget against denied users.
4. Authorization invariants (pseudocode + invariants): `tenants AND roles`, no implicit super-admin bypass, enumerated `deny_reason` for forensics. Plain-English "the feature must check authz" routinely results in `tenants OR roles` in practice.
5. Failure-mode scheduling rules: timeout, approval persistence + TTL, idempotency-key derivation, per-run hard cap (calls / tokens / duration), cancel propagation across transport layers.
6. Trust labels and prompt-injection mitigation: `trusted` / `partially_trusted` / `untrusted` labeling for content flowing into LLM context, plus the sanitization pipeline (truncation → control-char escape → boundary tagging → system-prompt hardening).
7. Network egress policy: application-layer allowlist for outbound HTTP from tools / handlers.
8. Observability schema: OTel span hierarchy with required attributes, metric names with label sets, alert thresholds. We treat span names and metric names as part of the interface contract.
9. Audit event schema: JSON event with required fields (`traceId`, `runId`, `callId`, `principal`, `result.code`, `result.durationMs`) and retention policy.
10. Framework adapter pattern: isolating third-party framework types behind an adapter so major-version upgrades don't propagate through business code.
11. Alternatives considered: options rejected, with reasons. Code review repeatedly asks "why not X?"; writing it down once saves cycles.
12. Rollout / canary / rollback: milestones, feature flags, rollback paths with RTO targets.
13. Risk register: known risks with mitigations, owners, and trigger conditions.
14. Anti-pattern citations: pointers to specific lines in reference codebases that motivated each principle (e.g., "rejected because reference framework Foo `Bar.java:NN-NN` scattered cross-cutting led to a tenant-leak incident").

Where these landed in practice
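To make categories (3) and (4) concrete, here is a minimal Python sketch of the kind of content involved. It is an illustration only, not code from the design document: `Principal`, `ToolPolicy`, `RateLimiter`, `dispatch`, the result codes, and the two `DenyReason` values are all hypothetical names, and the approval gate, planning, and timeout-wrapped execution steps are elided.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DenyReason(Enum):
    # Hypothetical enumerated deny reasons, recorded for forensics.
    TENANT_MISMATCH = "tenant_mismatch"
    ROLE_MISSING = "role_missing"


@dataclass(frozen=True)
class Principal:
    tenant: str
    roles: frozenset


@dataclass(frozen=True)
class ToolPolicy:
    allowed_tenants: frozenset
    required_roles: frozenset


@dataclass(frozen=True)
class Result:
    code: str
    deny_reason: Optional[DenyReason] = None


class RateLimiter:
    """Toy fixed-budget limiter; a real one would be time-windowed."""

    def __init__(self, budget: int) -> None:
        self.remaining = budget

    def try_consume(self) -> bool:
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True


def authorize(principal: Principal, policy: ToolPolicy) -> Optional[DenyReason]:
    # Invariant: tenant AND role must both match. No role skips the tenant check.
    if principal.tenant not in policy.allowed_tenants:
        return DenyReason.TENANT_MISMATCH
    if not (principal.roles & policy.required_roles):
        return DenyReason.ROLE_MISSING
    return None


def dispatch(principal: Principal, policy: ToolPolicy, args: dict,
             required_args: frozenset, limiter: RateLimiter) -> Result:
    # 1. Authz first, so schema errors cannot reveal that a tool exists.
    reason = authorize(principal, policy)
    if reason is not None:
        return Result("DENIED", reason)
    # 2. Schema validation (reduced here to a required-keys check).
    if required_args - set(args):
        return Result("INVALID_ARGS")
    # 3. Rate limit only after authz, so denied callers consume no budget.
    if not limiter.try_consume():
        return Result("RATE_LIMITED")
    # 4. Approval gate, planning, and timeout-wrapped execution would follow.
    return Result("OK")
```

In this sketch a caller holding an `admin` role in the wrong tenant is still denied with `TENANT_MISMATCH`, and the limiter's budget is untouched by that call. Properties like these are what a prose requirement such as "the feature must check authz" tends to leave unspecified.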
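Category (6) can be sketched the same way. The snippet below is a rough illustration of the first three pipeline stages (truncation, control-char escape, boundary tagging), again not taken from the design document. The `tool_output` tag name, the `MAX_CHARS` budget, and the `\uXXXX` escape format are assumptions chosen for the example; system-prompt hardening happens in the prompt itself and is not shown.

```python
from enum import Enum


class Trust(Enum):
    TRUSTED = "trusted"
    PARTIALLY_TRUSTED = "partially_trusted"
    UNTRUSTED = "untrusted"


MAX_CHARS = 4000  # assumed per-item budget; tune per tool


def truncate(text: str, limit: int = MAX_CHARS) -> str:
    # Stage 1: bound how much external content can enter the context.
    return text[:limit]


def escape_control_chars(text: str) -> str:
    # Stage 2: keep newlines and tabs, render other non-printables visibly.
    return "".join(
        ch if ch in "\n\t" or ch.isprintable() else f"\\u{ord(ch):04x}"
        for ch in text
    )


def tag_boundaries(text: str, trust: Trust) -> str:
    # Stage 3: wrap the content in a labeled boundary and neutralize any
    # embedded closing tag so the content cannot end its own boundary.
    safe = text.replace("</tool_output", "&lt;/tool_output")
    return f'<tool_output trust="{trust.value}">\n{safe}\n</tool_output>'


def sanitize(text: str, trust: Trust) -> str:
    # Stages applied in the order listed: truncate, escape, then tag.
    return tag_boundaries(escape_control_chars(truncate(text)), trust)
```

Writing the stage order down matters for the same reason as in category (3): truncating after tagging, for instance, could cut off the closing boundary.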
We used `contracts/` for (1) and (8). Everything else went into a single sibling file we called `engineering-design.md` that sits between `plan.md` and `tasks.md`. Tasks in `tasks.md` reference section anchors in that file. This works, but it's outside spec-kit's standard layout.

Question
Are any of these in scope for upstream spec-kit? We see three possible answers:
`alternatives.md`, `risks.md`, an "Alternatives Considered" section in `plan.md`.
`contracts/`).

What we've prepared
We've published a complementary repo with templates and a worked example for the categories above, anonymized to the LLM tool dispatcher domain:
- `templates/engineering-design.md`
- `examples/llm-tool-dispatcher/design.md`
- `playbooks/spec-kit-integration.md` (positions our document between `plan.md` and `tasks.md`)
- `docs/why.md` (the 14 categories enumerated above with failure modes)

The repo is positioned as a complement (Apache 2.0, same license as spec-kit, "we complement, we do not fork"). If maintainers signal that some of the categories are in-scope upstream, we'll migrate those over to PRs and link from our repo so users find the canonical location.
Why we're asking now
We'd rather contribute small focused PRs that you'd accept than maintain a parallel project forever. But before opening 6 PRs, we want to know which ones land in your "yes," "no," or "maybe" buckets.
Thanks for spec-kit — the underlying flow is exactly what we needed for everything that fit, and we're trying to extend the same discipline to what didn't fit.