agentspawn

System that engineers production-grade AI agents. A 7-phase pipeline — classify, interview, research, design, build, validate, deliver — with hook-enforced quality gates and automated evaluation suites.

You describe what you need. agentspawn runs the full pipeline: classifies the agent type and complexity tier, interviews for requirements, researches the domain via a background subagent, produces a design brief for your approval, builds the agent files, validates against an eval suite, and delivers with usage instructions and a maintenance model.

Two phases are gate-enforced by shell hooks — the system cannot write agent files without an approved design brief, and cannot deliver without passing evals.

Usage

cd agentspawn
claude

> I need an agent that [your use case]

Pipeline

Phase	What	Your role
Classify	Agent type (Creative, Analytical, Operational, Regulated, Internal-Knowledge) + complexity tier (1-3)	Confirm
Interview	Structured discovery — 2-3 rounds shaped by agent type	Answer
Research	Domain knowledge via researcher subagent (background, read-only)	Review
Design	Design brief: failure taxonomy, skill plan, knowledge architecture	Approve
Build	Agent files — you invoke `/build`	Invoke
Validate	Eval suite via evaluator subagent — hard gates + scored dimensions	Review
Deliver	Usage instructions + maintenance model — you invoke `/deliver`	Invoke

Quality Gates

Shell hooks intercept tool calls and enforce phase ordering at runtime:

Hook	What it enforces
`require-design-brief.sh`	Blocks writes to agent directories unless `design-brief.md` exists
`require-evals-before-delivery.sh`	Blocks delivery artifacts unless eval results exist
`check-user-level-budget.sh`	Warns at 5, blocks at 10 user-level agents (context budget guard)

/build and /deliver require manual invocation — the system cannot auto-invoke them.

Output

The pipeline produces Claude Code native files:

File	Purpose
Agent definition (`.md`)	System prompt — identity, constraints, forbidden outputs, ambiguity strategy
Skills (`SKILL.md`)	Workflows with input specs, output templates, good + bad examples
Knowledge base (`docs/`)	Sourced domain data, checklists, volatile facts with review dates
Eval suite (`test-cases.md`)	Hard gates (binary, zero tolerance) + scored dimensions (1-5)

Deployment targets:

Target	Location	Use case
Standalone workspace	`generated-agents/[name]/`	The agent is the project
Project subagent	`.claude/agents/` + `.claude/skills/`	Agent augments existing work
User-level	`~/.claude/agents/` + `~/.claude/skills/`	Available in every session

Design Approach

Every agent gets a failure taxonomy before any files are written. The design brief ranks the top 3-5 failure modes by cost and maps each to a prevention mechanism — a rule in the system prompt, a template in a skill, an eval hard gate, or a hook.

The evaluation model separates hard gates (binary — any failure blocks release) from scored dimensions (1-5 on correctness, tone, completeness). Release criteria: all hard gates pass, ≥80% of representative cases score ≥3 on all dimensions.

Agent type determines everything downstream:

Type	Research Intensity	Top Risk
Creative/Voice	Low	Tone drift, generic output
Analytical/Research	High	Stale facts, hallucinated citations
Operational/Workflow	Low	Missed edge cases
Regulated/High-Risk	Medium	Non-compliant output
Internal-Knowledge	None	Confidentiality leaks

Architecture

agentspawn/
├── CLAUDE.md                                 # System prompt and operating rules
├── .claude/
│   ├── settings.json                         # Hook wiring
│   ├── skills/                               # One skill per pipeline phase
│   │   ├── classify/    interview/    research/    design/
│   │   └── build/       validate/     deliver/     memory-recall/
│   ├── agents/
│   │   ├── researcher.md                     # Background research (Sonnet, read-only)
│   │   └── evaluator.md                      # Eval execution (autonomous, isolated)
│   └── hooks/
│       ├── require-design-brief.sh           # Gate: design before build
│       ├── require-evals-before-delivery.sh  # Gate: evals before delivery
│       ├── check-user-level-budget.sh        # Context budget guard
│       ├── session-start-memory.sh           # Cross-session memory injection
│       └── pre-compact-backup.sh             # Memory persistence before compaction
├── templates/                                # Scaffolding for generated agent files
├── reference/                                # Failure taxonomy, knowledge architecture
└── generated-agents/                         # Pipeline output

Requirements

Claude Code CLI

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.claude		.claude
.github		.github
docs		docs
generated-agents		generated-agents
reference		reference
templates		templates
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agentspawn

Usage

Pipeline

Quality Gates

Output

Design Approach

Architecture

Requirements

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

agentspawn

Usage

Pipeline

Quality Gates

Output

Design Approach

Architecture

Requirements

License

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages