From 60d424ba753c8e9fffea072d20749a456079e096 Mon Sep 17 00:00:00 2001 From: oscar marina Date: Tue, 10 Mar 2026 14:40:21 +0100 Subject: [PATCH 1/4] chore: initial example --- .github/copilot-instructions.md | 103 +++++++ examples/logistics-control-tower/PROMPT.md | 105 +++++++ examples/mcp-task-widget/PROMPT.md | 42 +++ .../mcp-task-widget/mcp-task-widget-design.md | 124 ++++++++ .../mcp-task-widget/mcp-task-widget-intent.md | 70 +++++ .../mcp-task-widget-verification.md | 275 ++++++++++++++++++ 6 files changed, 719 insertions(+) create mode 100644 .github/copilot-instructions.md create mode 100644 examples/logistics-control-tower/PROMPT.md create mode 100644 examples/mcp-task-widget/PROMPT.md create mode 100644 examples/mcp-task-widget/mcp-task-widget-design.md create mode 100644 examples/mcp-task-widget/mcp-task-widget-intent.md create mode 100644 examples/mcp-task-widget/mcp-task-widget-verification.md diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md new file mode 100644 index 0000000..a104d25 --- /dev/null +++ b/.github/copilot-instructions.md @@ -0,0 +1,103 @@ +# Agent Kit — LLM Context + +This repository **is** the Agent Kit framework. It is not a project built with the framework — it is the framework itself. + +## What this repository contains + +Agent Kit is a process framework for LLM-assisted software development. It provides a structured process (lenses, gates, artifacts) and a learning mechanism (domain profiles) that accumulates knowledge across projects. + +When someone copies `agent-kit/` into their own repository and points `AGENTS.md` at `BUILDER.md`, any LLM can follow the process to build software with verification and domain-specific knowledge. + +## Repository structure + +``` +. 
+├── AGENTS.md # Example entry point (5 lines, points to BUILDER.md) +├── README.md # Explanation: what the framework is and why it exists +├── GUIDE.md # Tutorial: step-by-step first project walkthrough +├── LICENSE # MIT +│ +├── agent-kit/ # THE FRAMEWORK (this is what gets copied to target repos) +│ ├── BUILDER.md # Process contract — the LLM reads and follows this +│ ├── README.md # Technical reference — gates, artifacts, contracts +│ ├── domains/ +│ │ ├── _template.md # Template for creating new domain profiles +│ │ ├── README.md # How domain profiles work +│ │ └── apps-sdk-mcp-lit-vite.md # Real example profile (11 pitfalls, 7 adversary Qs) +│ └── templates/ +│ ├── INTENT.md # Template: what and why +│ ├── DESIGN.md # Template: how (architecture, decisions, risks) +│ └── VERIFICATION_LOG-template.md # Template: proof (gate output, progress) +└── docs/ # Would hold generated artifacts in a real project + └── .gitkeep +``` + +## Framework concepts + +- **Domain profiles** accumulate stack knowledge (pitfalls, adversary questions, checks, decisions) across projects +- **Gates** (0-4) are verification checkpoints with real command output — "assumed to pass" is never valid +- **Four lenses**: User, Architecture, Adversary, Domain — thinking modes, not sequential phases +- **Anti-Loop Rule**: produce the Intent document before continuing to investigate +- **Pre-Implementation Checkpoint**: 4 questions before writing any code +- **Resume**: verification log Progress section enables continuing interrupted work + +### The process (BUILDER.md) + +The LLM determines project size (Quick / Standard / Full), then follows a structured process: + +1. **Intent** — Capture what and why before doing anything (`docs/[project]-intent.md`) +2. **Domain profile** — Load accumulated stack knowledge, read every pitfall and adversary question +3. **Design** — Architecture, decisions, risks, pitfalls applied, adversary questions answered (`docs/[project]-design.md`) +4. 
**Pre-Implementation Checkpoint** — 4 mental questions before writing code +5. **Gated build** — Gates 0-4 with real command output recorded +6. **Self-review** — Adversary lens + domain checklist +7. **Domain learning** — Update the profile with new discoveries + +### Domain profiles (the differentiator) + +Domain profiles are living documents in `agent-kit/domains/`. They accumulate stack-specific knowledge — pitfalls, adversary questions, automated checks, decision history. Every gate failure becomes a new pitfall. Every project makes the next one better. + +A profile contains: Selection Metadata, Terminology Mapping, Verification Commands, Common Pitfalls, Adversary Questions, Integration Rules, Automated Checks, Decision History, Review Checklist. + +### Verification gates + +Gates are mandatory checkpoints with real command output. "Assumed to pass" is never valid. Gates 0 (deps) → 1 (scaffold) → 2 (features) → 3 (tests) → 4 (clean build). + +### Artifacts + +- **Intent** — Scope anchor. Given/when/then behaviors, MUST/MUST NOT constraints, IN/OUT scope. +- **Design** — Single document replacing PRD + tech spec + implementation plan. Includes Adversary Questions Applied and Domain Pitfalls Applied as separate mandatory sections. +- **Verification Log** — Gate evidence + Progress section for resuming interrupted work. + +### Anti-Loop Rule + +The LLM produces the Intent before continuing to investigate. Unclear decisions become open questions asked to the human — not reasons to keep researching. + +### Resume mechanism + +Each verification log has a Progress table at the top. When a session is interrupted, the next session reads Progress and continues from the last completed step. 
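The resume step can be illustrated with a short sketch. This is not part of the framework: the helper names `parseProgress` and `nextStep` are hypothetical, and the code assumes the Progress table uses a two-column `| Step | Status |` layout in which `PASS` marks a completed step and any other status (here `PENDING`, an invented value) marks unfinished work.

```typescript
// Hypothetical helpers, not part of Agent Kit. Assumes a two-column
// "| Step | Status |" Markdown table where "PASS" marks a completed step.
interface ProgressRow {
  step: string;
  status: string;
}

function parseProgress(markdown: string): ProgressRow[] {
  const rows: ProgressRow[] = [];
  for (const line of markdown.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("|")) continue;
    // Drop the empty cells produced by the leading and trailing pipes.
    const cells = trimmed.split("|").slice(1, -1).map((cell) => cell.trim());
    if (cells.length !== 2) continue;
    const [step, status] = cells;
    // Skip the header row and the "|------|--------|" separator row.
    if (step === "Step" || /^-+$/.test(step)) continue;
    rows.push({ step, status });
  }
  return rows;
}

// Returns the first step that is not yet PASS, or null when everything passed.
function nextStep(rows: ProgressRow[]): string | null {
  const pending = rows.find((row) => row.status !== "PASS");
  return pending ? pending.step : null;
}

// "PENDING" is an assumed status value used only for this illustration.
const log = [
  "| Step | Status |",
  "|------|--------|",
  "| Intent | PASS |",
  "| Design | PASS |",
  "| Gate 0: Dependencies | PENDING |",
  "| Gate 1: Scaffold | PENDING |",
].join("\n");

console.log(nextStep(parseProgress(log))); // Gate 0: Dependencies
```

A resuming session would run something equivalent by reading the table itself; the point is only that the first non-`PASS` row identifies where work continues.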
+ +## Documentation follows Diátaxis + +| Document | Type | Serves | +|----------|------|--------| +| `README.md` | Explanation | Understanding — what and why | +| `GUIDE.md` | Tutorial | Learning — step-by-step first project | +| `agent-kit/README.md` | Reference | Information — specs, contracts, definitions | +| `BUILDER.md` | Reference | Information — the process contract (for LLMs) | + +## When modifying the framework + +- `BUILDER.md` is the source of truth for the process. Changes here affect how every LLM behaves. +- Domain profile `_template.md` defines what new profiles look like. Changes propagate to all future profiles. +- Template changes (`templates/*.md`) affect artifact structure for all future projects. +- `README.md`, `GUIDE.md`, and `agent-kit/README.md` must stay aligned with `BUILDER.md`. If the process changes, the docs must reflect it. +- Examples in `examples/` are historical artifacts — do not modify them to match framework changes. + +## Conventions + +- All framework documentation is in English. +- Domain profiles use a specific structure (see `_template.md`). Do not deviate. +- Verification logs are per-project: `docs/[project]-verification.md`, not a shared file. +- Project code goes in its own directory, never at the repo root. +- The AGENTS.md entry point is intentionally minimal (5 lines). Process logic lives in BUILDER.md. diff --git a/examples/logistics-control-tower/PROMPT.md b/examples/logistics-control-tower/PROMPT.md new file mode 100644 index 0000000..a31ddd8 --- /dev/null +++ b/examples/logistics-control-tower/PROMPT.md @@ -0,0 +1,105 @@ +# Example: Global Logistics Control Tower + +## Context + +This example reverse-engineers an existing project built with an earlier version of Agent Kit. The original was a multi-tenant cold-chain monitoring system with a Lit dashboard, risk engine, incident management, and audit trail. + +The prompt below is designed for the current version of the framework. 
It describes the same functional requirements but pushes for a bolder visual direction — moving from the original's corporate blue/teal palette to something more distinctive. + +No domain profile exists for this stack yet. The LLM will create one from `_template.md` during the Design phase. + +## The prompt + +``` +Read AGENTS.md. Build a Global Logistics Control Tower — a multi-tenant, +event-driven backend for real-time monitoring of temperature-sensitive +shipments (cold-chain logistics). + +Stack: Node.js (>=22), TypeScript, Lit 3 (CDN via esm.sh), node:http, +node:test. No external frameworks (no Express, no Vitest, no React). + +Core domain: +- Shipments with lifecycle: planned → in_transit → at_risk → incident → delivered +- IoT telemetry ingest (temperature, humidity, door open, battery, location) +- Risk engine: score 0-100 based on sensor thresholds + delivery delay +- Automatic incident opening when risk score crosses threshold (default: 70) +- Incident resolution with notes and audit trail +- Multi-tenant isolation on every read/write path +- Idempotent ingest (dedup by event ID and idempotency key) +- Out-of-order safe (timeline sorted by sensor timestamp, not arrival) +- Immutable audit trail for every lifecycle action +- Structured error envelope on all API responses (code, message, traceId) + +REST API: +- POST /api/shipments (create, idempotent) +- POST /api/telemetry (ingest, triggers risk recalc + status transition) +- POST /api/incidents/resolve +- GET /api/dashboard/{tenantId} (shipments, open incidents, KPIs) +- GET /api/audit/{tenantId} +- GET /api/alerts/{tenantId} (in-app + email outbox) +- GET /api/metrics (ingestCount, duplicateCount, p95 latency) + +Dashboard KPIs: on-time rate %, cold-chain breaches count, MTTR minutes. + +Frontend: Lit custom element served as static HTML. +Responsive 12-column CSS grid. Forms for creating shipments and ingesting +telemetry. Live lists for shipments, incidents, audit trail, and alerts. 
+Auto-refresh after every action. + +Visual direction: I want something visually bold and modern — not the +typical corporate dashboard with safe blues and grays. Think dark mode +with high-contrast accent colors, glassmorphism cards, gradient mesh +backgrounds, and strong typographic hierarchy. The UI should feel like +a mission control center, not a spreadsheet. Surprise me with the +palette but keep it readable and professional. + +Storage: in-memory maps (tenant-scoped). Include a PostgreSQL schema +artifact in db/schema.sql for future persistence. Include a +docker-compose.yml with PostgreSQL 16 and RabbitMQ 3.13 for +future event streaming. + +Testing: node:test runner. Cover risk scoring, status transitions, +full ingest flow with incident opening, error envelope contracts, +duplicate/out-of-order resilience, and a frontend smoke test. +Target: minimum 9 tests, all passing. + +Constraints: +- MUST: tenant isolation on every path +- MUST: idempotent ingest (no state mutation on duplicates) +- MUST: immutable audit trail +- MUST NOT: use Express, Fastify, or any HTTP framework +- MUST NOT: use jsdom or external test runners +- SHOULD: keep ingest p95 < 500ms +- SHOULD: keep architecture ready for RabbitMQ/PostgreSQL adapters +``` + +## What the framework should produce + +- **Size:** Full (new project, major architecture) +- **Domain profile:** New — no existing profile matches this backend-only Node.js stack. The LLM creates one from `_template.md` +- **Artifacts:** Intent, Design, Verification Log (with all gates), new domain profile + +## What to look for when reviewing the output + +1. **Domain profile creation** — The LLM should create a new profile (e.g., `backend-node-ts-event-driven.md`) with pitfalls specific to this stack: in-memory map tenant isolation, node:test quirks, ESM/CJS boundaries, idempotency edge cases. + +2. 
**Risk engine design** — The Design document should show scoring rules, threshold logic, and status state machine before any code is written. + +3. **Adversary Questions** — Even without an existing profile, the LLM should generate adversary questions in the Design: "What happens if two telemetry events arrive with the same timestamp?", "What happens if a tenant ID is missing from a request?", etc. + +4. **Gate failures becoming pitfalls** — If any gate fails (it likely will on first pass), watch for the LLM adding the root cause to the new domain profile. This is the flywheel starting. + +5. **Visual execution** — The prompt asks for a bold departure from typical dashboards. The Design should document the visual direction as an architectural decision with rationale. + +## Original project reference + +The original project (built with an earlier framework version) used: +- 438-line ControlTowerService with in-memory tenant-scoped maps +- 68-line risk engine with threshold-based scoring +- 206-line HTTP server with structured error handling +- 364-line Lit component with 12-column grid layout +- 9 tests covering unit, integration, contract, resilience, and smoke +- Corporate blue/teal palette (professional but conventional) +- All 4 verification gates passing + +The prompt above preserves all functional requirements while pushing for a more distinctive visual identity and letting the current framework version guide the process. diff --git a/examples/mcp-task-widget/PROMPT.md b/examples/mcp-task-widget/PROMPT.md new file mode 100644 index 0000000..1b013f5 --- /dev/null +++ b/examples/mcp-task-widget/PROMPT.md @@ -0,0 +1,42 @@ +# Example: MCP Task Widget + +## Context + +This example was generated by GPT 5.4 (ChatGPT) using Agent Kit. The goal was to build an MCP App for ChatGPT — a task management widget embedded in the conversation canvas. + +The domain profile `apps-sdk-mcp-lit-vite.md` already existed from a previous project. 
GPT 5.4 loaded it, followed the full process (Intent → Design → Gated Build → Self-Review), discovered a new pitfall during implementation (CSS asset reference), and updated the domain profile — demonstrating the learning cycle in action. + +## The prompt + +``` +Read AGENTS.md. Build an MCP App for ChatGPT that turns this existing +Lit todo application into an embedded widget with MCP tools for task +management. Use the components and patterns from this repo as the UI +base: https://github.com/oscarmarina/lit-signals-material — it uses +Lit 3 + signals + Material Web. The MCP server should expose tools for +CRUD operations on tasks, and the widget should communicate through the +MCP Apps bridge. Stack: MCP Apps, Lit, Vite, TypeScript. +``` + +## What Agent Kit determined + +- **Size:** Full (new project) +- **Domain profile:** `apps-sdk-mcp-lit-vite` (keyword score 6, unique match) +- **Artifacts produced:** Intent, Design, Verification Log + +## What happened during the build + +1. Gate 0 failed — package version was wrong (`@modelcontextprotocol/ext-apps@1.27.1` doesn't exist). GPT root-caused it (stale terminal cache), fixed the version, and re-ran. +2. All subsequent gates passed (Gate 1 through Gate 4). +3. During self-review, GPT discovered that the resource HTML referenced `assets/index.css` but Vite might not emit a CSS file when all styles live in Lit's `static styles`. It added **Pitfall 11** to the domain profile. +4. Final result: a working MCP App with 5 tests (3 server, 2 widget in Playwright), clean build from scratch passing. 
+ +## Generated artifacts + +The files in this directory are the actual artifacts GPT 5.4 produced: + +- [`mcp-task-widget-intent.md`](mcp-task-widget-intent.md) — What and why +- [`mcp-task-widget-design.md`](mcp-task-widget-design.md) — How (architecture, decisions, risks, pitfalls applied) +- [`mcp-task-widget-verification.md`](mcp-task-widget-verification.md) — Proof (real gate output, failure history, domain profile updates) + +These show the complete output of the framework for a Full-sized project. No code is included — the artifacts are the interesting part. diff --git a/examples/mcp-task-widget/mcp-task-widget-design.md b/examples/mcp-task-widget/mcp-task-widget-design.md new file mode 100644 index 0000000..a2ac94f --- /dev/null +++ b/examples/mcp-task-widget/mcp-task-widget-design.md @@ -0,0 +1,124 @@ +# Design: MCP Task Widget + +**Intent:** docs/mcp-task-widget-intent.md +**Domain Profile:** agent-kit/domains/mcp-apps-lit-vite.md +**Date:** 2026-03-09 + +## Domain Profile Selection Rationale + +| Candidate Profile | Keyword Score | Excluded? 
| Reason | +|-------------------|---------------|-----------|--------| +| `mcp-apps-lit-vite` | 6 | No | Matches `mcp apps`, `lit`, `vite`, `widget`, `iframe`, and the ChatGPT embedded UI requirement | + +**Selected Profile:** `mcp-apps-lit-vite` +**Selection Basis:** Unique highest score `>= 2` + +## Architecture + +### Stack + +| Technology | Version | Verified Via | Purpose | +|-----------|---------|-------------|---------| +| `lit` | `3.3.2` | `npm view lit version` | Widget component model | +| `vite` | `7.3.1` | `npm view vite version --json` | Widget bundling | +| `typescript` | `5.9.3` | `npm view typescript version` | Shared typing and compilation | +| `@modelcontextprotocol/sdk` | `1.27.1` | `npm view @modelcontextprotocol/sdk version --registry=https://registry.npmjs.org` | MCP server transports and core server APIs | +| `@modelcontextprotocol/ext-apps` | `1.2.0` | `npm view @modelcontextprotocol/ext-apps version --registry=https://registry.npmjs.org` and the published package source | Widget bridge and app resource registration | +| `@material/web` | `2.4.1` | Referenced UI base package manifest | Material Web UI components | +| `@lit-labs/signals` | `0.2.0` | `npm view @lit-labs/signals version` | Signals-based widget state | +| `zod` | `4.x` | Package install lock verification during Gate 0 | Tool and persistence validation | + +### Structure + +- `src/server/` contains the MCP server entrypoint, HTTP bootstrap, app resource registration, task repository, and Zod schemas. +- `src/widget/` contains the Lit widget entrypoint, signal-based controller, Material Web custom elements, and CSS. +- `src/shared/` contains task schemas and types shared across server and widget boundaries. +- `docs/` contains intent, design, and verification artifacts. +- `data/tasks.json` stores persisted task state for local development and runtime. + +### Data Flow + +1. 
The HTTP server receives a request on `/mcp` and creates a stateless `StreamableHTTPServerTransport` plus a fresh `McpServer` instance. +2. MCP tools validate input with Zod, mutate the file-backed task repository, and return an authoritative `{ tasks, summary }` snapshot in `structuredContent`. +3. The widget initializes `App` from `@modelcontextprotocol/ext-apps`, receives tool results, and calls additional server tools with `app.callServerTool(...)`. +4. A widget controller stores authoritative task data in signals and derives counts, progress, and filtered lists via computed signals. +5. UI-only state such as draft text, edit mode, and active filter remains in widget-local signals and never crosses into server payloads. + +### Initialization Chain + +1. `src/server/main.ts` starts either stdio or the HTTP server. +2. The HTTP server mounts `/mcp` and `/widget/*`, enabling CORS on both routes. +3. The MCP tool registration links task tools to a `ui://tasks/task-board.html` resource. +4. ChatGPT fetches the resource HTML, which references externally served widget assets under `/widget/assets/*`. +5. `src/widget/main.ts` creates an `App`, registers lifecycle callbacks, applies host theme variables, connects to the host, and triggers the initial `list_tasks` call. +6. The Lit root component renders from signals and dispatches CRUD actions back through the bridge client. 
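Step 2 of the data flow returns an authoritative `{ tasks, summary }` snapshot. The sketch below shows one way such a snapshot could be built. The `Task` fields, the shape of `summary`, and the `buildSnapshot` name are assumptions made for illustration, since this design does not spell them out.

```typescript
// Hypothetical types: the real task schema lives in src/shared/ and is not shown here.
interface Task {
  id: string;
  title: string;
  completed: boolean;
}

interface TaskSnapshot {
  tasks: Task[];
  summary: { total: number; completed: number; remaining: number };
}

// Builds the snapshot from server-side task entities only, so UI-only state
// (draft text, edit mode, active filter) has no path into the payload.
function buildSnapshot(tasks: readonly Task[]): TaskSnapshot {
  const completed = tasks.filter((task) => task.completed).length;
  return {
    // Copy each task field by field so any extra properties are dropped.
    tasks: tasks.map(({ id, title, completed: done }) => ({ id, title, completed: done })),
    summary: { total: tasks.length, completed, remaining: tasks.length - completed },
  };
}

const snapshot = buildSnapshot([
  { id: "1", title: "Write intent", completed: true },
  { id: "2", title: "Run Gate 0", completed: false },
]);

console.log(snapshot.summary); // { total: 2, completed: 1, remaining: 1 }
```

Keeping snapshot construction in a single server-side function is one way to support the design's goal of never letting widget-local state cross into `structuredContent`.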
+ +### Dependencies + +**Production:** +- `@lit-labs/signals` — signal and computed state for the widget controller +- `@material/web` — Material Web inputs, buttons, list, checkbox, divider, and progress components +- `@modelcontextprotocol/ext-apps` — widget `App`, host theming helpers, and server-side app registration helpers +- `@modelcontextprotocol/sdk` — MCP server implementation and transports +- `node:http` — HTTP server for `/mcp` and `/widget/*` +- `zod` — schema validation at MCP and persistence boundaries + +**Development:** +- `@types/node` — TypeScript types for runtime APIs +- `@vitest/browser-playwright` — browser-mode widget tests +- `playwright` — real browser runtime for widget verification +- `vitest` — server and widget tests +- `typescript` — compilation +- `vite` — widget build + +## Decisions + +| # | Decision | Choice | Alternatives Considered | Rationale | +|---|----------|--------|------------------------|-----------| +| 1 | Server framework | `node:http` with manual route handling | Express, Fastify | Matches the project decision from the previous session and keeps the HTTP surface minimal for this embedded MCP server | +| 2 | Tool surface | `list_tasks`, `create_task`, `update_task`, `delete_task` | Single monolithic task mutation tool | Explicit CRUD tools are easier for both models and widget code to reason about | +| 3 | Persistence | JSON file repository | In-memory only, SQLite | JSON is sufficient for a local MCP sample and avoids infrastructure overhead | +| 4 | Widget composition | Lit custom element plus signals controller | Monolithic imperative DOM app | Keeps the architecture close to the referenced `lit-signals-material` project | +| 5 | Styling | Material Web components plus custom CSS variables and gradient surfaces | Plain HTML controls | Preserves the requested UI base and provides a stronger embedded-widget presentation | +| 6 | Asset resolution | Stable Vite output names and server-generated resource HTML | Parsing 
Vite manifest at runtime | Stable names reduce moving parts and align with the selected domain profile | + +## Risks (Adversary Lens) + +| Risk | Impact | Likelihood | Mitigation | +|------|--------|------------|------------| +| Public origin is wrong behind a tunnel or proxy | Widget assets fail to load in ChatGPT | Medium | Resolve origin from env first, then forwarded headers, and include it in CSP and asset URLs | +| Persisted JSON is malformed or manually edited | Tool calls can crash or return invalid data | Medium | Parse storage through Zod and fall back to an empty list on invalid content | +| Widget attempts calls before bridge connection finishes | Initial render stalls or errors | Medium | Controller tracks connection state and defers actions until `app.connect()` resolves | +| Cross-origin module asset requests are blocked | Widget stays blank | Medium | Add `Access-Control-Allow-Origin: *` to `/widget/*` responses | +| Future changes leak UI-only fields into tool payloads | Model-visible data becomes polluted | Low | Centralize snapshot construction on the server and keep UI state in the controller only | + +## Domain Pitfalls Applied + +| Pitfall | Applies? 
| How Addressed | +|---------|----------|---------------| +| Official SDK bridge only | Yes | Widget uses `App` from `@modelcontextprotocol/ext-apps`; no custom bridge code is written | +| UI state must not enter `structuredContent` | Yes | Server snapshots contain only task entities and summary counts | +| Relative/portable widget assets | Yes | Vite uses `base: './'` and stable output names | +| No inline bundled scripts | Yes | Resource HTML points to `/widget/assets/index.js` and `/widget/assets/index.css` | +| CSP must include server domain | Yes | Resource metadata includes the resolved public origin in `connectDomains` and `resourceDomains` | +| Widget fallback background | Yes | Resource HTML includes an explicit body background style | +| Stateless transport | Yes | `/mcp` creates `StreamableHTTPServerTransport({ sessionIdGenerator: undefined })` | +| Widget tests must use a real browser | Yes | Vitest uses a browser project backed by Playwright | +| Widget assets need CORS headers | Yes | `/widget/*` responses include `Access-Control-Allow-Origin: *` | + +## Verification + +| Gate | Command | Pass Criteria | +|------|---------|---------------| +| 0 | `npm install` | Exit 0 and no unresolved dependency errors | +| 1 | `npm run build` | Server build succeeds and widget assets are emitted under `dist/widget/` | +| 2 | `npm run build && npm test` | Build succeeds and all tests pass | +| 3 | `npm test` | All server and widget tests pass | +| 4 | `rm -rf dist node_modules && npm install && npm run build && npm test` | Clean build and tests pass from scratch | + +## Test Strategy + +- **What to test:** task repository behavior, CRUD tool behavior, resource HTML generation, widget initial load, and widget CRUD interactions against a mocked bridge client +- **How:** Vitest workspace with a Node project for server tests and a Playwright browser project for widget tests +- **Coverage target:** At least core CRUD paths and the widget’s create/update/delete flow +- 
**What NOT to test:** Material Web internals, MCP SDK internals, or browser rendering details already owned by dependencies \ No newline at end of file diff --git a/examples/mcp-task-widget/mcp-task-widget-intent.md b/examples/mcp-task-widget/mcp-task-widget-intent.md new file mode 100644 index 0000000..1bfa863 --- /dev/null +++ b/examples/mcp-task-widget/mcp-task-widget-intent.md @@ -0,0 +1,70 @@ +# Intent: MCP Task Widget + +**Date:** 2026-03-09 +**Size:** Full +**Domain Profile:** agent-kit/domains/mcp-apps-lit-vite.md +**Supersedes:** — + +## Goal + +Build a ChatGPT-compatible MCP App that turns the Lit todo UI into an embedded widget backed by MCP tools for task CRUD operations. The widget should use Lit 3, signals, Material Web, and the MCP Apps bridge so task management works directly inside the conversation UI. + +## Behavior + +- **Given** ChatGPT renders the widget resource for the task tool, **when** the widget loads, **then** it connects through the official MCP Apps bridge and renders the authoritative task list returned by the server. +- **Given** a user enters a new task in the widget, **when** they submit it, **then** the widget calls an MCP server tool and re-renders from the returned task snapshot. +- **Given** a rendered task, **when** the user edits its title or completion state, **then** the widget updates the task through an MCP tool and reflects the updated server state. +- **Given** a rendered task, **when** the user deletes it, **then** the widget calls the delete MCP tool and removes the task from the rendered list using the returned snapshot. +- **Given** the MCP server is exposed over HTTP, **when** ChatGPT fetches the widget resource and assets, **then** the resource HTML and asset routes load correctly under the sandbox CSP and cross-origin rules. 
+ +## Decisions + +| Decision | Choice | Rejected | Why | +|----------|--------|----------|-----| +| MCP UI bridge | `App` from `@modelcontextprotocol/ext-apps` | Custom `postMessage` bridge | The domain profile explicitly requires the official SDK and it handles host lifecycle safely | +| Server transport | Streamable HTTP plus stdio entrypoint | HTTP-only or stdio-only | HTTP is needed for ChatGPT embedding and stdio remains useful for local MCP testing | +| HTTP runtime | `node:http` | Express | This project already chose the native HTTP server and the reference implementation style fits the required routes | +| Task storage | File-backed JSON repository with Zod validation | Pure in-memory state | File-backed storage preserves tasks across server restarts without adding external infrastructure | +| Widget state model | Signals-based controller feeding Lit components | Ad hoc element-local state only | Matches the referenced Lit repo patterns and keeps server data authoritative | +| Widget asset delivery | External Vite-built assets served from `/widget/*` | Inline bundled HTML/JS | ChatGPT iframe CSP blocks inline bundled scripts | + +## Constraints + +**MUST:** +- Use MCP Apps, Lit, Vite, and TypeScript. +- Use components and patterns from the `lit-signals-material` repo as the UI base. +- Expose MCP tools for task CRUD operations. +- Have the widget communicate through the MCP Apps bridge. +- Produce project artifacts in `docs/`. + +**MUST NOT:** +- Implement a custom iframe bridge instead of the official MCP Apps SDK. +- Inline bundled widget JavaScript into the MCP resource HTML. +- Leak UI-only transient state into tool `structuredContent`. + +**SHOULD:** +- Keep the widget visually close to the referenced Lit + signals + Material style. +- Preserve deterministic build output for widget asset URLs. +- Include automated tests for both server and widget flows. 
+ +## Scope + +**IN:** +- MCP server with task list, create, update, and delete tools +- Embedded Lit widget for creating, editing, completing, filtering, and deleting tasks +- File-backed task persistence and schema validation +- Vite widget build, TypeScript server build, and browser-based widget tests +- Verification log entries for all required gates + +**OUT:** +- Multi-user synchronization +- Authentication and per-user task separation +- Rich due dates, labels, reminders, or notifications +- Remote database integration + +## Acceptance + +- `npm install`, `npm run build`, and `npm test` pass. +- The MCP server serves `/mcp` and widget assets with a resource HTML document that ChatGPT can load. +- The widget uses MCP tool calls to create, read, update, and delete tasks. +- The widget styling and component approach reflect Lit 3 + signals + Material Web patterns from the referenced repo. \ No newline at end of file diff --git a/examples/mcp-task-widget/mcp-task-widget-verification.md b/examples/mcp-task-widget/mcp-task-widget-verification.md new file mode 100644 index 0000000..1e3b0c1 --- /dev/null +++ b/examples/mcp-task-widget/mcp-task-widget-verification.md @@ -0,0 +1,275 @@ +# Verification Log: MCP Task Widget + +This log captures the actual output of every verification gate. It is the source of truth for project completion status. + +**Rule:** No entry may be written without executing the command and pasting real output. "Assumed to pass" is not an entry. + +--- + +## Progress + +**Current phase:** Complete +**Last updated:** 2026-03-09 16:16 local + +| Step | Status | +|------|--------| +| Intent | PASS | +| Design | PASS | +| Gate 0: Dependencies | PASS | +| Gate 1: Scaffold | PASS | +| Gate 2: Feature | PASS | +| Gate 3: Tests | PASS | +| Gate 4: Clean build | PASS | +| Self-Review | PASS | +| Domain update | PASS | + +**Update this section after every gate or phase transition. 
When resuming interrupted work, read this section first.** + +--- + +## Gate 0: Dependencies +**Executed:** 2026-03-09 16:15 local +**Command:** `npm install` +**Exit code:** 0 +**Status:** PASS + +
+Output + +```text +removed 11 packages, and audited 159 packages in 632ms + +44 packages are looking for funding + run `npm fund` for details + +found 0 vulnerabilities +``` + +
+ +**Notes:** This gate was re-run after migrating the HTTP runtime from Express to `node:http`; the install removed direct Express packages from the root manifest while keeping the rest of the dependency graph healthy. + +--- + +## Gate 1: Scaffold Verification +**Executed:** 2026-03-09 16:15 local +**Command:** `npm run build` +**Exit code:** 0 +**Status:** PASS + +
+Output + +```text +> mcp-task-widget@1.0.0 build +> tsc --noEmit && vite build && tsc -p tsconfig.server.json + +vite v7.3.1 building client environment for production... +✓ 243 modules transformed. +dist/widget/index.html 0.45 kB │ gzip: 0.27 kB +dist/widget/assets/index.css 0.21 kB │ gzip: 0.16 kB +dist/widget/assets/index.js 546.29 kB │ gzip: 131.15 kB + +(!) Some chunks are larger than 500 kB after minification. Consider: +- Using dynamic import() to code-split the application +- Use build.rollupOptions.output.manualChunks to improve chunking: https://rollupjs.org/configuration-options/#output-manualchunks +- Adjust chunk size limit for this warning via build.chunkSizeWarningLimit. +✓ built in 555ms +``` + +
+ +**Notes:** Revalidated after replacing Express with `node:http`. Build output remains deterministic and still emits both `assets/index.js` and `assets/index.css` for the MCP resource HTML. + +--- + +## Gate 2: Feature Verification +**Executed:** 2026-03-09 16:16 local +**Command:** `npm run build && npm test` +**Exit code:** 0 +**Status:** PASS + +
+Output + +```text +> mcp-task-widget@1.0.0 build +> tsc --noEmit && vite build && tsc -p tsconfig.server.json + +vite v7.3.1 building client environment for production... +✓ 243 modules transformed. +dist/widget/index.html 0.45 kB │ gzip: 0.27 kB +dist/widget/assets/index.css 0.21 kB │ gzip: 0.16 kB +dist/widget/assets/index.js 546.29 kB │ gzip: 131.15 kB + +(!) Some chunks are larger than 500 kB after minification. Consider: +- Using dynamic import() to code-split the application +- Use build.rollupOptions.output.manualChunks to improve chunking: https://rollupjs.org/configuration-options/#output-manualchunks +- Adjust chunk size limit for this warning via build.chunkSizeWarningLimit. +✓ built in 552ms + +> mcp-task-widget@1.0.0 test +> vitest run + + + RUN v4.0.18 /Users/oscarmarina/Projects/AGENTS/openai-apps-sdk + + ✓ server src/server/task-service.test.ts (1 test) 7ms + ✓ server src/server/widget-resource.test.ts (2 tests) 1ms +stderr | unknown test +Lit is in dev mode. Not recommended for production! See https://lit.dev/msg/dev-mode for more information. + ✓ widget (chromium) src/widget/task-widget-app.test.ts (2 tests) 133ms + + Test Files 3 passed (3) + Tests 5 passed (5) + Start at 16:16:23 + Duration 1.39s (transform 47ms, setup 0ms, import 833ms, tests 141ms, environment 0ms) +``` + +
+ +**Notes:** Revalidated on the `node:http` implementation. The Lit dev-mode stderr is expected in test runs and did not affect correctness. + +--- + +## Gate 3: Test Verification +**Executed:** 2026-03-09 16:15 local +**Command:** `npm test` +**Exit code:** 0 +**Status:** PASS +**Tests passed:** 5/5 +**Coverage:** not measured + +
+Output + +```text +> mcp-task-widget@1.0.0 test +> vitest run + + + RUN v4.0.18 /Users/oscarmarina/Projects/AGENTS/openai-apps-sdk + +4:15:35 PM [vite] (client) Re-optimizing dependencies because lockfile has changed + ✓ server src/server/task-service.test.ts (1 test) 16ms + ✓ server src/server/widget-resource.test.ts (2 tests) 1ms +stderr | unknown test +Lit is in dev mode. Not recommended for production! See https://lit.dev/msg/dev-mode for more information. + ✓ widget (chromium) src/widget/task-widget-app.test.ts (2 tests) 149ms + + Test Files 3 passed (3) + Tests 5 passed (5) + Start at 16:15:35 + Duration 1.56s (transform 84ms, setup 0ms, import 1.11s, tests 166ms, environment 0ms) +``` + +
+ +**Notes:** Browser-mode widget tests executed in Chromium through `@vitest/browser-playwright` after the lockfile and runtime migration. + +--- + +## Gate 4: Final Verification (Clean Build) +**Executed:** 2026-03-09 16:16 local +**Clean command:** `rm -rf dist node_modules && npm install && npm run build && npm test` +**Exit code:** 0 +**Status:** PASS +**Tests passed:** 5/5 + +
+Output + +```text +added 158 packages, and audited 159 packages in 1s + +44 packages are looking for funding + run `npm fund` for details + +found 0 vulnerabilities + +> mcp-task-widget@1.0.0 build +> tsc --noEmit && vite build && tsc -p tsconfig.server.json + +vite v7.3.1 building client environment for production... +✓ 243 modules transformed. +dist/widget/index.html 0.45 kB │ gzip: 0.27 kB +dist/widget/assets/index.css 0.21 kB │ gzip: 0.16 kB +dist/widget/assets/index.js 546.29 kB │ gzip: 131.15 kB + +(!) Some chunks are larger than 500 kB after minification. Consider: +- Using dynamic import() to code-split the application +- Use build.rollupOptions.output.manualChunks to improve chunking: https://rollupjs.org/configuration-options/#output-manualchunks +- Adjust chunk size limit for this warning via build.chunkSizeWarningLimit. +✓ built in 602ms + +> mcp-task-widget@1.0.0 test +> vitest run + + + RUN v4.0.18 /Users/oscarmarina/Projects/AGENTS/openai-apps-sdk + + ✓ server src/server/task-service.test.ts (1 test) 25ms + ✓ server src/server/widget-resource.test.ts (2 tests) 5ms +stderr | unknown test +Lit is in dev mode. Not recommended for production! See https://lit.dev/msg/dev-mode for more information. + ✓ widget (chromium) src/widget/task-widget-app.test.ts (2 tests) 118ms + + Test Files 3 passed (3) + Tests 5 passed (5) + Start at 16:16:31 + Duration 1.58s (transform 81ms, setup 0ms, import 1.05s, tests 148ms, environment 0ms) +``` + +
+ +**Notes:** Clean verification confirms the `node:http` server bootstrap, start script target, and full reinstall path all work from scratch. + +--- + +## Self-Review (Full projects only) + +### Domain Checklist Results + +| Check | Command | Result | Pass? | +|-------|---------|--------|-------| +| Uses official SDK bridge | `grep -R "@modelcontextprotocol/ext-apps" -n src/widget ; grep -R "document\.referrer" -n src/widget ; true` | `src/widget/mcp-task-bridge.ts:7` imports the SDK; `document.referrer` had no matches | YES | +| No UI-only fields in structured content | `grep -R "selected\|expanded\|draft" -n src/server ; true` | No matches in `src/server` | YES | +| Relative Vite assets configured | `grep -n "base:\|entryFileNames" vite.config.ts` | Found `base: './'` and `entryFileNames: 'assets/index.js'` | YES | +| CSP includes server domain | `grep -R "connectDomains\|resourceDomains\|sessionIdGenerator\|access-control-allow-origin" -n src/server` | Found CSP domains in `src/server/widget-resource.ts`, CORS headers in `src/server/main.ts`, and stateless transport config in `src/server/main.ts` | YES | +| Widget tests use browser env | `grep -n "browser:" vitest.config.ts && grep -n "jsdom" vitest.config.ts` | Browser config present at `vitest.config.ts`; `jsdom` had no matches | YES | +| External script in resource HTML | `grep -R "widget/assets/index.js\|widget/assets/index.css" -n src/server` | Absolute asset references present in `src/server/widget-resource.ts` | YES | +| CSS asset exists when linked | `test -f dist/widget/assets/index.css && test -f dist/widget/assets/index.js` | Both files exist after build | YES | +| Startup path matches build output | `test -f dist/server/server/main.js && node -e "console.log(require('./package.json').scripts.start)"` | Built entry exists and script is `node dist/server/server/main.js` | YES | + +### Devil's Advocate +1. 
**What happens when:** `PUBLIC_ORIGIN` is not set in a remote deployment, ChatGPT loads the widget over a slower network with the current 546 kB JS bundle, or multiple users share the same server-side JSON task file. +2. **The weakest link is:** runtime deployment configuration. The local fallback origin is correct for development, but remote ChatGPT usage still depends on the operator setting a public HTTPS origin correctly. +3. **If I had to break this, I would:** deploy the server behind a tunnel without setting `PUBLIC_ORIGIN`, then let the resource HTML point back to `http://localhost:3001`, which would make widget asset loading fail in the host iframe. + +### Findings + +| # | Severity | Finding | Impact | +|---|----------|---------|--------| +| 1 | P2 | The built widget bundle is `546.29 kB` minified, which triggers Vite's chunk-size warning. | Embedded widget startup may be slower than necessary in ChatGPT and other MCP hosts. | +| 2 | P2 | Remote deployments still require `PUBLIC_ORIGIN` to be set correctly; the default fallback is development-only. | A misconfigured deployment can return valid MCP responses while the iframe fails to load widget assets. | +| 3 | P3 | Task persistence is global and file-backed rather than user-scoped. | A multi-user deployment would share one task list across all users unless a per-user storage layer is added. | + +--- + +## Failure History + +### 2026-03-09 Gate 0 FAILED +**Error:** `npm install` failed with `ETARGET` for `@modelcontextprotocol/ext-apps@1.27.1`. +**Root Cause:** Initial version verification was contaminated by a shared terminal that had drifted into a cloned `/tmp` repo, so the package versions recorded in `package.json` were not the actual npm registry versions. +**Fix:** Re-verified package versions in fresh shells against `https://registry.npmjs.org`, updated the manifest to `@modelcontextprotocol/ext-apps@1.2.0` and `@modelcontextprotocol/sdk@1.27.1`, and re-ran `npm install` successfully. 
+**Re-run result:** PASS — see Gate 0 above for passing output. + +--- + +## Domain Profile Updates + +| What Changed | Section Updated | Trigger | +|---|---|---| +| Added Pitfall 11 for fixed stylesheet links without an emitted CSS asset | Common Pitfalls | Resource HTML referenced `assets/index.css` before the build emitted that file | +| Added a CSS asset existence automated check | Automated Checks | Needed a repeatable way to verify that a fixed stylesheet link is valid after build | +| Added a decision to import a real global stylesheet when linking `/widget/assets/index.css` | Decision History / Review Checklist | Final widget resource design now depends on a stable emitted CSS asset | \ No newline at end of file From 5b92dc9fc4b987e32aa4364ec2391a6454a0d66f Mon Sep 17 00:00:00 2001 From: oscar marina Date: Fri, 20 Mar 2026 18:35:06 +0100 Subject: [PATCH 2/4] feat: Introduce Builder framework with comprehensive documentation and templates - Added BUILDER.md to outline the design and implementation process using multiple perspectives. - Created README.md for framework overview and setup instructions. - Established VERSION file to track framework versioning. - Introduced domain profiles with README.md and template for creating new profiles. - Developed templates for Intent, Design, and Verification Log documentation. - Implemented structured guidelines for verification gates and self-review protocols. - Enhanced domain profiles with operational matching contracts and common pitfalls. 
--- .github/copilot-instructions.md | 29 +++-- AGENTS.md | 2 +- GUIDE.md | 16 +-- README.md | 9 +- catalog/README.md | 51 ++++++++ .../apps-sdk-mcp-lit-vite.md | 0 catalog/web-kinu-preact-vite.md | 122 ++++++++++++++++++ examples/test-habit-tracker/PROMPT.md | 112 ++++++++++++++++ {agent-kit => framework}/BUILDER.md | 18 +-- {agent-kit => framework}/README.md | 11 +- framework/VERSION | 1 + {agent-kit => framework}/domains/README.md | 0 {agent-kit => framework}/domains/_template.md | 0 {agent-kit => framework}/templates/DESIGN.md | 2 +- {agent-kit => framework}/templates/INTENT.md | 2 +- .../templates/VERIFICATION_LOG-template.md | 0 16 files changed, 340 insertions(+), 35 deletions(-) create mode 100644 catalog/README.md rename {agent-kit/domains => catalog}/apps-sdk-mcp-lit-vite.md (100%) create mode 100644 catalog/web-kinu-preact-vite.md create mode 100644 examples/test-habit-tracker/PROMPT.md rename {agent-kit => framework}/BUILDER.md (96%) rename {agent-kit => framework}/README.md (96%) create mode 100644 framework/VERSION rename {agent-kit => framework}/domains/README.md (100%) rename {agent-kit => framework}/domains/_template.md (100%) rename {agent-kit => framework}/templates/DESIGN.md (98%) rename {agent-kit => framework}/templates/INTENT.md (97%) rename {agent-kit => framework}/templates/VERIFICATION_LOG-template.md (100%) diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index a104d25..6344f33 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -6,7 +6,7 @@ This repository **is** the Agent Kit framework. It is not a project built with t Agent Kit is a process framework for LLM-assisted software development. It provides a structured process (lenses, gates, artifacts) and a learning mechanism (domain profiles) that accumulates knowledge across projects. 
-When someone copies `agent-kit/` into their own repository and points `AGENTS.md` at `BUILDER.md`, any LLM can follow the process to build software with verification and domain-specific knowledge. +When someone copies `framework/` into their own repository and points `AGENTS.md` at `BUILDER.md`, any LLM can follow the process to build software with verification and domain-specific knowledge. ## Repository structure @@ -17,19 +17,29 @@ When someone copies `agent-kit/` into their own repository and points `AGENTS.md ├── GUIDE.md # Tutorial: step-by-step first project walkthrough ├── LICENSE # MIT │ -├── agent-kit/ # THE FRAMEWORK (this is what gets copied to target repos) +├── framework/ # THE FRAMEWORK (this is what gets copied to target repos) │ ├── BUILDER.md # Process contract — the LLM reads and follows this │ ├── README.md # Technical reference — gates, artifacts, contracts +│ ├── VERSION # Framework version for tracking updates │ ├── domains/ │ │ ├── _template.md # Template for creating new domain profiles -│ │ ├── README.md # How domain profiles work -│ │ └── apps-sdk-mcp-lit-vite.md # Real example profile (11 pitfalls, 7 adversary Qs) +│ │ └── README.md # How domain profiles work │ └── templates/ │ ├── INTENT.md # Template: what and why │ ├── DESIGN.md # Template: how (architecture, decisions, risks) │ └── VERIFICATION_LOG-template.md # Template: proof (gate output, progress) +│ +├── catalog/ # Community-contributed domain profiles +│ ├── README.md # How to use and contribute profiles +│ ├── apps-sdk-mcp-lit-vite.md # Real profile (11 pitfalls, 7 adversary Qs) +│ └── web-kinu-preact-vite.md # Real profile (4 pitfalls, 4 adversary Qs) +│ +├── examples/ # Real project artifacts showing the framework in action +│ ├── mcp-task-widget/ # Complete example with intent, design, verification +│ ├── logistics-control-tower/ # Full-sized project prompt +│ └── test-habit-tracker/ # Spanish-language prompt example +│ └── docs/ # Would hold generated artifacts in a real 
project - └── .gitkeep ``` ## Framework concepts @@ -55,7 +65,9 @@ The LLM determines project size (Quick / Standard / Full), then follows a struct ### Domain profiles (the differentiator) -Domain profiles are living documents in `agent-kit/domains/`. They accumulate stack-specific knowledge — pitfalls, adversary questions, automated checks, decision history. Every gate failure becomes a new pitfall. Every project makes the next one better. +Domain profiles are living documents in `framework/domains/`. They accumulate stack-specific knowledge — pitfalls, adversary questions, automated checks, decision history. Every gate failure becomes a new pitfall. Every project makes the next one better. + +Community-contributed profiles live in `catalog/`. Copy relevant ones into your project's `framework/domains/`. A profile contains: Selection Metadata, Terminology Mapping, Verification Commands, Common Pitfalls, Adversary Questions, Integration Rules, Automated Checks, Decision History, Review Checklist. @@ -83,7 +95,7 @@ Each verification log has a Progress table at the top. When a session is interru |----------|------|--------| | `README.md` | Explanation | Understanding — what and why | | `GUIDE.md` | Tutorial | Learning — step-by-step first project | -| `agent-kit/README.md` | Reference | Information — specs, contracts, definitions | +| `framework/README.md` | Reference | Information — specs, contracts, definitions | | `BUILDER.md` | Reference | Information — the process contract (for LLMs) | ## When modifying the framework @@ -91,8 +103,9 @@ Each verification log has a Progress table at the top. When a session is interru - `BUILDER.md` is the source of truth for the process. Changes here affect how every LLM behaves. - Domain profile `_template.md` defines what new profiles look like. Changes propagate to all future profiles. - Template changes (`templates/*.md`) affect artifact structure for all future projects. 
-- `README.md`, `GUIDE.md`, and `agent-kit/README.md` must stay aligned with `BUILDER.md`. If the process changes, the docs must reflect it. +- `README.md`, `GUIDE.md`, and `framework/README.md` must stay aligned with `BUILDER.md`. If the process changes, the docs must reflect it. - Examples in `examples/` are historical artifacts — do not modify them to match framework changes. +- Catalog profiles in `catalog/` are contributed by the community — review for quality but preserve the contributor's learnings. ## Conventions diff --git a/AGENTS.md b/AGENTS.md index cd10175..c2aa8ba 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,5 +1,5 @@ # Agent Instructions -Read and follow `agent-kit/BUILDER.md` for all tasks. +Read and follow `framework/BUILDER.md` for all tasks. Project artifacts (intent, design, verification) go in `docs/`. diff --git a/GUIDE.md b/GUIDE.md index c7add1a..e488271 100644 --- a/GUIDE.md +++ b/GUIDE.md @@ -13,7 +13,7 @@ In this tutorial, we will set up Agent Kit in a new repository and use it to bui Copy three things into your repository's root: ```bash -cp -R agent-kit/ your-repo/agent-kit/ +cp -R framework/ your-repo/framework/ cp AGENTS.md your-repo/ mkdir -p your-repo/docs ``` @@ -23,7 +23,7 @@ Your repo should now look like this: ``` your-repo/ ├── AGENTS.md -├── agent-kit/ +├── framework/ │ ├── BUILDER.md │ ├── domains/ │ │ └── _template.md @@ -39,7 +39,7 @@ Open `AGENTS.md` and verify it contains: ```markdown # Agent Instructions -Read and follow `agent-kit/BUILDER.md` for all tasks. +Read and follow `framework/BUILDER.md` for all tasks. Project artifacts (intent, design, verification) go in `docs/`. ``` @@ -96,9 +96,9 @@ This is your chance to course-correct. If a decision is wrong, say so now. If a ## Step 5: Watch the domain profile load (or get created) -If a domain profile exists for your stack in `agent-kit/domains/`, the LLM loads it and reads every pitfall and adversary question before continuing. 
+If a domain profile exists for your stack in `framework/domains/`, the LLM loads it and reads every pitfall and adversary question before continuing. -If no profile exists, the LLM creates one from `agent-kit/domains/_template.md`. The first version will be minimal — terminology mapping, verification commands, a couple of pitfalls. That's fine. It will grow. +If no profile exists, the LLM creates one from `framework/domains/_template.md`. The first version will be minimal — terminology mapping, verification commands, a couple of pitfalls. That's fine. It will grow. ## Step 6: Review the Design @@ -159,7 +159,7 @@ If Gate 4 passed, the project builds and tests from a clean state. That's the pr ## Step 9: Check the domain profile -Open `agent-kit/domains/[your-profile].md`. Compare it to how it looked before the project. You may see: +Open `framework/domains/[your-profile].md`. Compare it to how it looked before the project. You may see: - New entries in **Common Pitfalls** — things the LLM discovered during implementation - New **Adversary Questions** — traps specific to this stack @@ -184,8 +184,8 @@ The LLM reads the verification log's Progress section, finds the last completed **Add constraints as you discover preferences.** Every time you say "always do X" or "never do Y", the framework captures it — in the Intent for this project, in the domain profile for all future projects on this stack. -**Bring profiles to new repos.** When you start a new repository with the same stack, copy the domain profile along with `agent-kit/`. All accumulated knowledge travels with it. +**Bring profiles to new repos.** When you start a new repository with the same stack, copy the domain profile along with `framework/`. All accumulated knowledge travels with it. -For the full technical reference — file descriptions, gate definitions, domain profile contract, and artifact specs — see [`agent-kit/README.md`](agent-kit/README.md). 
+For the full technical reference — file descriptions, gate definitions, domain profile contract, and artifact specs — see [`framework/README.md`](framework/README.md). For the concepts behind the framework — why it works, how the learning cycle operates, what makes domain profiles different — see the [project README](README.md). diff --git a/README.md b/README.md index 98e5815..a716ffc 100644 --- a/README.md +++ b/README.md @@ -164,11 +164,16 @@ The framework is LLM-agnostic. Any model that can read markdown and follow instr See [**GUIDE.md**](GUIDE.md) for a step-by-step tutorial on setting up and using Agent Kit. -See [**agent-kit/README.md**](agent-kit/README.md) for the technical reference — file descriptions, gate definitions, artifact specs, and the domain profile contract. +See [**framework/README.md**](framework/README.md) for the technical reference — file descriptions, gate definitions, artifact specs, and the domain profile contract. ## Included examples -**Domain profile:** [`agent-kit/domains/apps-sdk-mcp-lit-vite.md`](agent-kit/domains/apps-sdk-mcp-lit-vite.md) — A real domain profile built across multiple projects with [Apps SDK](https://developers.openai.com/apps-sdk/quickstart) + Lit + Vite. 11 pitfalls, 7 adversary questions, automated checks, and decision history — all learned from real bugs. Shows what a mature profile looks like after the flywheel has turned a few times. +**Domain profiles:** The [`catalog/`](catalog/) directory contains community-contributed domain profiles built from real projects: + +- [`apps-sdk-mcp-lit-vite.md`](catalog/apps-sdk-mcp-lit-vite.md) — MCP Apps + Lit + Vite. 11 pitfalls, 7 adversary questions. Shows what a mature profile looks like after the flywheel has turned. +- [`web-kinu-preact-vite.md`](catalog/web-kinu-preact-vite.md) — Kinu + Preact + Vite. 4 pitfalls, 4 adversary questions. + +Copy any relevant profile from `catalog/` into your project's `framework/domains/` to start with accumulated knowledge. 
## License diff --git a/catalog/README.md b/catalog/README.md new file mode 100644 index 0000000..e8e8500 --- /dev/null +++ b/catalog/README.md @@ -0,0 +1,51 @@ +# Domain Profile Catalog + +Community-contributed domain profiles for Agent Kit. Each profile captures stack-specific knowledge — pitfalls, adversary questions, verification commands, and decision history — learned from real projects. + +## Available Profiles + +| Profile | Stack | Pitfalls | Adversary Qs | +|---------|-------|----------|--------------| +| [apps-sdk-mcp-lit-vite](apps-sdk-mcp-lit-vite.md) | MCP Apps + Lit + Vite + TypeScript | 11 | 7 | +| [web-kinu-preact-vite](web-kinu-preact-vite.md) | Kinu + Preact + Vite + TypeScript | 4 | 4 | + +## Using a profile + +Copy the profile you need into your project's `framework/domains/` directory: + +```bash +cp catalog/web-kinu-preact-vite.md your-repo/framework/domains/ +``` + +The Builder will automatically detect and load it based on keyword matching when you describe your stack in a prompt. + +## Contributing a profile + +Domain profiles grow from real project experience. If you've built projects with a stack that isn't represented here, consider contributing your profile. + +### Requirements + +1. Use the template at `framework/domains/_template.md` +2. Follow the naming convention: `[domain]-[stack].md` (e.g., `web-react-nextjs.md`, `backend-python-fastapi.md`) +3. Include at minimum: + - **Selection Metadata** — Profile ID, Match Keywords, Use When, Do Not Use When + - **Terminology Mapping** — Stack-specific command translations + - **Verification Commands** — Exact commands for Gates 0-4 + - **2+ Common Pitfalls** — With What/Correct/Detection pattern +4. 
+4. Every pitfall and adversary question should come from a real bug or failure — not hypotheticals + +### What makes a good profile + +- **Specific:** "Kinu uses `--k-` CSS variable prefix, not `--p-`" beats "check your CSS variables" +- **Actionable:** Each pitfall has a Detection command that can be run mechanically +- **Honest:** If a pitfall was discovered by making the mistake, say so +- **Growing:** A profile with 3 real pitfalls is more valuable than one with 10 speculative ones + +### Submitting + +1. Fork the repository +2. Add your profile to `catalog/` +3. Open a pull request with: + - The stack and domain your profile covers + - How many projects informed the profile + - At least one example of a pitfall that prevented a real bug diff --git a/agent-kit/domains/apps-sdk-mcp-lit-vite.md b/catalog/apps-sdk-mcp-lit-vite.md similarity index 100% rename from agent-kit/domains/apps-sdk-mcp-lit-vite.md rename to catalog/apps-sdk-mcp-lit-vite.md diff --git a/catalog/web-kinu-preact-vite.md b/catalog/web-kinu-preact-vite.md new file mode 100644 index 0000000..ab20bd4 --- /dev/null +++ b/catalog/web-kinu-preact-vite.md @@ -0,0 +1,122 @@ +# Domain Profile: Web App — Kinu + Preact + Vite + +**Domain:** Web Frontend +**Stack:** Kinu UI toolkit + Preact + Vite + TypeScript +**Standards:** Semantic HTML, CSS custom properties, WCAG 2.1 basics + +## Selection Metadata (Operational Contract) + +**Profile ID:** web-kinu-preact-vite +**Match Keywords:** kinu, preact, vite, dashboard, ui toolkit, css variables, attribute selectors +**Use When:** Select this profile for web apps built with kinu (Preact UI toolkit) and bundled with Vite. +**Do Not Use When:** Do not use for React, Angular, Lit, or projects not using the kinu component library.
+ +## Terminology Mapping + +| Framework Term | Domain Term | Notes | +|---|---|---| +| Build/Compile | `pnpm run build` | Vite production build | +| Test suite | `pnpm run build` (type-check + build) | No test framework by default | +| Dev server | `pnpm run dev` | Vite dev server with HMR | +| Package/dependency | npm/pnpm package | kinu is a peer-dep library | +| Import/module | ES module / TypeScript module | Tree-shakeable kinu imports | +| Deployment | Static file hosting | Vite outputs to `dist/` | + +## Verification Commands + +**GATE 0 (Dependencies):** +- Command: `pnpm install` +- Expected output: lockfile created or updated, exit code 0, no unresolved dependency errors + +**GATE 1 (Scaffold):** +- Command: `pnpm run build` +- Expected output: Vite outputs assets to `dist/`, exit code 0 + +**GATE 2 (Feature):** +- Command: `pnpm run build` +- Expected output: build passes, no TypeScript errors, output artifacts exist + +**GATE 3 (Tests):** +- Command: `pnpm run build` (type-check is the minimum verification) +- Expected output: clean build with no errors + +**GATE 4 (Final):** +- Clean command (POSIX): `rm -rf dist node_modules && pnpm install && pnpm run build` +- Expected output: clean install and build pass from scratch + +## Common Pitfalls + +### Pitfall 1: Using `p` attribute instead of `k` attribute +- **What goes wrong:** The AGENTS.md summary mentions `[p="button"]` but kinu actually uses `[k="button"]` as the component identifier attribute. Using `p` produces unstyled elements. +- **Correct approach:** All kinu components use the `k` attribute. CSS selectors target `[k="component-name"]`. +- **Detection:** `rg 'p="' src/` — should have no matches for component identifiers using `p`. + +### Pitfall 2: Using `--p-` CSS variable prefix instead of `--k-` +- **What goes wrong:** CSS variables are prefixed `--k-` (e.g., `--k-primary`, `--k-background`), not `--p-`. +- **Correct approach:** Always use `--k-` prefix: `hsl(var(--k-primary))`. 
+- **Detection:** `rg '\-\-p\-' src/` — should have no matches. + +### Pitfall 3: Importing kinu CSS files incorrectly +- **What goes wrong:** Forgetting to import `kinu/style.css` (which includes variables.css, base.css, and all component styles) results in unstyled components. +- **Correct approach:** Import `kinu/style.css` once in the app entry point. +- **Detection:** `rg "kinu/style" src/` — should find at least one import. + +### Pitfall 4: Using JavaScript for variant defaults instead of CSS +- **What goes wrong:** Destructuring props to add fallback variant values in JS. Kinu's philosophy is CSS-driven: defaults are handled by `:not([variant])` selectors. +- **Correct approach:** Let CSS handle defaults. Pass variant props only when overriding. +- **Detection:** Review component code for `variant = "default"` destructuring patterns. + +## Adversary Questions + +- Does the app import `kinu/style.css` before rendering any kinu components? +- Are CSS custom properties (`--k-*`) used consistently instead of hardcoded colors? +- Are compound components (Dialog, Tabs, etc.) used with their sub-components correctly? +- Does the Vite config include the `@preact/preset-vite` plugin? + +## Integration Rules + +### Kinu Component Usage +- Import components directly: `import { Button, Card } from 'kinu'` +- Components forward all HTML attributes to the underlying element +- Variants are set via props that map to HTML attributes: `