Clean up agent system: fix orchestrator.yaml schema compliance, unify validation paths by Copilot · Pull Request #28 · peternicholls/PaperKit

Copilot · 2026-01-06T04:40:08Z

Summary

Clean up PaperKit's agent system to establish the canonical two-file split:

YAML files (.paperkit/_cfg/agents/*.yaml) = schema-validated metadata ONLY
MD files (.paperkit/{core,specialist}/agents/*.md) = behavioural instructions ONLY

Changes

orchestrator.yaml

Contains ONLY schema-compliant metadata
All fields match agent-schema.json requirements
path points to .paperkit/core/agents/orchestrator.md

orchestrator.md

Removed YAML frontmatter (metadata belongs in YAML file only)
Contains behavioural instructions for the orchestrator
Includes explicit tie-break rules (capability → keyword → module → example prompt)
Specifies machine-parseable JSON output format
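
The tie-break order and JSON output described above could be sketched roughly as follows. This is an illustration only, not the orchestrator's actual logic: the `Agent` fields, the substring-based scoring, and the JSON keys (`agent`, `reason`, `clarifyingQuestion`) are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Agent:
    # Hypothetical in-memory registry entry; field names are illustrative.
    name: str
    module: str
    capabilities: list = field(default_factory=list)
    keywords: list = field(default_factory=list)
    example_prompts: list = field(default_factory=list)


def score(agent, request, module_hint=None):
    """Return a tuple ordered by tie-break priority:
    capability -> keyword -> module -> example prompt."""
    text = request.lower()
    capability = sum(1 for c in agent.capabilities if c.lower() in text)
    keyword = sum(1 for k in agent.keywords if k.lower() in text)
    module = 1 if module_hint and agent.module == module_hint else 0
    example = sum(1 for p in agent.example_prompts if p.lower() in text)
    return (capability, keyword, module, example)


def route(request, registry, module_hint=None):
    """Pick exactly one agent, or ask a clarifying question, as JSON."""
    # Tuples compare lexicographically, so earlier criteria win ties.
    ranked = sorted(
        registry, key=lambda a: score(a, request, module_hint), reverse=True
    )
    best = ranked[0] if ranked else None
    best_score = score(best, request, module_hint) if best else (0, 0, 0, 0)
    tied = len(ranked) > 1 and score(ranked[1], request, module_hint) == best_score
    if best is None or not any(best_score) or tied:
        return json.dumps(
            {"agent": None, "clarifyingQuestion": "Which task do you need help with?"}
        )
    return json.dumps({"agent": best.name, "reason": list(best_score)})
```

Scoring with a tuple keeps the priority order in one place: changing the order of the tie-break rules only means reordering the tuple.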

Documentation

Updated docs/dev/agent-audit/decision.md with correct canonical design
Updated docs/dev/agent-audit/how-agents-are-structured.md to clarify the split

Code Quality

Removed unused imports (os, Set) from check-agents.py

Validation Results

✓ 11/11 YAML files valid
✓ No duplicate agent names
✓ All path references valid
✓ Manifest matches YAML directory
✓ All MD files exist
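
For readers unfamiliar with the checks, two of them (duplicate names and path references) might look something like the sketch below. It is not the code in check-agents.py; the function names are hypothetical, and the agents are assumed to be already parsed into dictionaries with `name` and `path` keys.

```python
from collections import Counter
from pathlib import Path


def find_duplicate_names(agents):
    """Return agent names that appear more than once."""
    counts = Counter(a["name"] for a in agents)
    return sorted(name for name, n in counts.items() if n > 1)


def find_missing_paths(agents, repo_root):
    """Return the `path` values that do not point at an existing file."""
    root = Path(repo_root)
    return [a["path"] for a in agents if not (root / a["path"]).is_file()]
```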

Original prompt

Copilot Agent Mission Brief: PaperKit .paperkit/ cleanup and de-duplication

Goal

Clean up PaperKit’s agent system so there is one canonical agent definition system that:
• matches the agent-schema.json rules
• is what the runtime actually uses to load/run agents
• has no duplicated “second system” lingering in parallel (or, if backward-compat is required, duplicates are replaced with a deliberate compatibility layer)

Non-negotiables
• Do not break existing PaperKit workflows/CLI behaviour.
• Do not change the schema unless you can prove the schema is wrong (it looks good).
• Preserve academic integrity constraints in agent instructions (don’t accidentally weaken them during migration).
• Every change must be backed by repo evidence (grep + file inspection), not assumptions.

⸻

Phase 0 — Safety + Baseline
1. Create a branch: chore/agent-system-unification.
2. Record a baseline:
• Run existing validation command(s) and save output to docs/dev/agent-audit/baseline.txt.
• Run whatever tests/lints exist (or make test / pytest / etc.), save output too.
3. Make a “do no harm” inventory snapshot:
• Save tree -a .paperkit > docs/dev/agent-audit/paperkit-tree.txt (or equivalent on Windows).

⸻

Phase 1 — Forensic Discovery (find the two systems and who uses what)

1.1 Locate the schema + validator
• Find the schema file in repo:
• rg -n "agent-schema.json|PaperKit Agent Definition Schema" -S .paperkit
• Find the validator script referenced by the manifest comment:
• rg -n "validate-agent-schema|agent-schema" -S .paperkit open-agent-system tools

1.2 Enumerate “agent definition” locations

Collect counts and lists:
• .paperkit/core/agents/*
• .paperkit/specialist/agents/*
• any _cfg agents: .paperkit/_cfg/agents/* (or .paperkit/_cfg/...)
Commands:
• find .paperkit -maxdepth 4 -type f | rg "/agents/"
• find .paperkit -type f -name "*.yaml" | rg "agents"
• find .paperkit -type f -name "*.md" | rg "agents"
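
Where find and rg are not available (for example on Windows), the same inventory could be gathered with a short script. The sketch below is an assumed equivalent, not part of the repository; `list_agent_files` is a hypothetical helper name.

```python
from pathlib import Path


def list_agent_files(root=".paperkit"):
    """Group files that live under any `agents` directory by extension."""
    root = Path(root)
    found = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Only count files whose parent directories include one named "agents".
        if path.is_file() and "agents" in rel.parts[:-1]:
            found.setdefault(path.suffix, []).append(rel.as_posix())
    return found


if __name__ == "__main__":
    for suffix, files in list_agent_files().items():
        print(suffix or "(no extension)", len(files))
        for name in files:
            print("  ", name)
```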

Write a short report to docs/dev/agent-audit/inventory.md listing:
• all agent definition files (path + extension)
• which look canonical (match schema path pattern vs not)
• any “manifest” index files (like your YAML manifest)

1.3 Determine runtime loading path (the key question)

Search code for references to:
• .paperkit/_cfg/agents
• .paperkit/core/agents
• .paperkit/specialist/agents
• “manifest”, “agent registry”, “agent loader”, “open-agent-system”
Commands:
• rg -n "_cfg/agents|core/agents|specialist/agents|agent manifest|agent loader|load.*agent" -S .

Identify:
• The code path(s) that load agents at runtime
• The expected file format (MD? YAML? front matter?)
• Whether the runtime uses the schema validator or bypasses it

Document findings in docs/dev/agent-audit/runtime-paths.md with:
• file + function names
• what directories are scanned
• how agent metadata is parsed

⸻

Phase 2 — Decide Canonical System (based on evidence)

Decision rule

Prefer the system that:
• matches agent-schema.json without hacks
• is already aligned with the repo structure (.paperkit/core and .paperkit/specialist)
• is easiest to keep stable long-term

Likely outcome (verify in repo):
• Canonical agent definitions = .paperkit/{core|specialist}/agents/*.md
• _cfg/agents/*.yaml is legacy OR internal config that got mistaken for canonical

But do not assume — prove by runtime loader.

Write the decision and rationale to docs/dev/agent-audit/decision.md.

⸻

Phase 3 — Unification Plan (choose one of these and implement)

Option A (preferred): Make MD agents canonical, retire YAML agents

Do this if runtime already supports MD agents or can easily be made to.

Steps:
1. Ensure every active agent in the manifest exists as a .paperkit/{core|specialist}/agents/*.md
2. If YAML-only agents exist, migrate them to .md:
• Convert YAML fields → schema fields:
• name, displayName, title, icon, module, path
• identity.role, identity.description, identity.communicationStyle
• capabilities, constraints, principles, examplePrompts
• Put the operational prompt/instructions into the Markdown body.
3. Update the runtime loader to scan only:
• .paperkit/core/agents/*.md
• .paperkit/specialist/agents/*.md
4. Add a compatibility shim ONLY if needed:
• If parts of PaperKit still expect _cfg/agents/*.yaml, replace them with either:
• symlinks (if supported), or
• a generated file step, or
• a tiny adapter that reads YAML and emits in-memory schema objects
• BUT: the adapter should be explicitly marked “legacy” and have a deprecation path.
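
A "tiny adapter" of the kind described above might look like this. It is a sketch under stated assumptions: `AgentMetadata` and `load_legacy_agent` are hypothetical names, only a subset of the schema fields is modelled, and the YAML is assumed to have been parsed into a mapping already (so no YAML library is imported here).

```python
import warnings
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentMetadata:
    # Subset of the fields named in the brief; not the full schema.
    name: str
    display_name: str
    module: str
    path: str


def load_legacy_agent(raw):
    """LEGACY: adapt an already-parsed _cfg/agents YAML mapping.

    Hypothetical shim; it warns on every call so callers can find and
    migrate the remaining YAML-based code paths.
    """
    warnings.warn(
        "load_legacy_agent is a legacy compatibility shim and will be removed",
        DeprecationWarning,
        stacklevel=2,
    )
    return AgentMetadata(
        name=raw["name"],
        display_name=raw["displayName"],
        module=raw["module"],
        path=raw["path"],
    )
```

Emitting a `DeprecationWarning` is one way to give the shim the explicit "legacy" marking and deprecation path the brief asks for.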

Option B: Make YAML canonical and update schema

Only do this if runtime is deeply YAML-based and MD agents don’t actually exist / aren’t used.
If you choose this, you must:
• update the schema’s path regex to match reality
• ensure schema validation still blocks drift
• update repo structure to remove misleading core/agents expectations

Given your screenshot and schema, Option A is probably right — but again: confirm.

⸻

Phase 4 — Remove duplica...



peternicholls · 2026-01-06T05:03:28Z

@copilot
You are working on PR #28 in PaperKit (“cleanup agent system / orchestrator schema compliance”).

Objective:

Make the orchestrator fully consistent with PaperKit’s agent-schema.json and the intended split:
- _cfg/agents/*.yaml = schema-validated metadata ONLY
- core/agents/*.md and specialist/agents/*.md = the actual prompt/instructions content referenced by path

Hard requirements:

Open and read .paperkit/_cfg/schemas/agent-schema.json and treat it as the source of truth.
Open and read .paperkit/_cfg/agents/orchestrator.yaml, .paperkit/_cfg/agent-manifest.yaml, and any loader/runner code that consumes these.
Rewrite orchestrator.yaml to contain ONLY keys permitted by agent-schema.json.
- Remove any non-schema keys such as: schemaVersion, status, instructions, decisionSchema (unless the schema explicitly allows them).
- Ensure required keys exist and are correct: name, displayName, title, icon, identity{role,description}, module, path.
- Ensure there are NO duplicate keys in YAML.
Ensure path points to a real markdown file: .paperkit/core/agents/orchestrator.md
Create/Update .paperkit/core/agents/orchestrator.md to contain the orchestrator’s behaviour:
- The orchestrator MUST NOT do the user’s task.
- It MUST output a machine-parseable routing decision.
- It MUST choose exactly one agent from the registry OR ask a clarifying question.
- Include explicit tie-break rules (capability match > module match > example prompt match).
Fix icon fields that are empty (manifest + orchestrator) to valid emojis.
Run and record:
- python .paperkit/tools/validate-agent-schema.py --ci --schema .paperkit/_cfg/schemas/agent-schema.json --agents-dir .paperkit/_cfg/agents
- python .paperkit/tools/check-agents.py --ci
  Commit any needed changes so those pass.
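
As a quick pre-check before running the validators, the required and flagged keys listed above could be tested against a parsed orchestrator.yaml mapping. This is only a sketch: the authoritative rules live in agent-schema.json, `check_metadata` is a hypothetical helper, and the flagged list contains just the four keys named in this comment.

```python
REQUIRED_KEYS = ("name", "displayName", "title", "icon", "identity", "module", "path")
REQUIRED_IDENTITY_KEYS = ("role", "description")
# Keys called out above as non-schema (unless the schema explicitly allows them).
FLAGGED_KEYS = ("schemaVersion", "status", "instructions", "decisionSchema")


def check_metadata(data):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in data:
            problems.append(f"missing required key: {key}")
    identity = data.get("identity")
    if isinstance(identity, dict):
        for key in REQUIRED_IDENTITY_KEYS:
            if key not in identity:
                problems.append(f"missing required key: identity.{key}")
    elif "identity" in data:
        problems.append("identity must be a mapping")
    for key in FLAGGED_KEYS:
        if key in data:
            problems.append(f"non-schema key present: {key}")
    # An icon that is present but blank is reported separately from a missing one.
    if "icon" in data and not str(data["icon"]).strip():
        problems.append("icon is empty")
    return problems
```

Duplicate YAML keys cannot be detected this way, because most parsers silently keep the last value; that check has to happen at parse time.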

Deliverables:

Updated orchestrator.yaml (schema-clean)
Updated/created orchestrator.md (prompt content)
Any minimal adjustments to check-agents.py / validate-agent-schema.py ONLY if necessary to enforce the schema correctly (do not weaken validation).
Update docs/dev/agent-audit/decision.md if the canonical design changed.

Copilot

Pull request overview

This PR resolves validation inconsistencies in the PaperKit agent system by fixing schema compliance issues, correcting path references, and adding unified validation tooling.

Key changes:

Fixed orchestrator.yaml to comply with agent schema (added required identity and path fields, removed non-standard fields)
Corrected hardcoded .paper/ paths to .paperkit/ across validation scripts
Added unified check-agents.py tool for comprehensive agent system validation

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| `.paperkit/_cfg/agents/orchestrator.yaml` | Completely restructured to match schema requirements with proper identity, capabilities, constraints, and input/output schemas |
| `.paperkit/_cfg/agent-manifest.yaml` | Added orchestrator entry and updated statistics (11 total agents, 7 core) |
| `.paperkit/tools/validate.py` | Fixed all hardcoded paths from `.paper/` to `.paperkit/` for schemas, agents, workflows, tools, and manifests |
| `.paperkit/tools/validate-agent-schema.py` | Updated default path documentation and hardcoded defaults from `.paper/` to `.paperkit/` |
| `.paperkit/tools/check-agents.py` | New unified validation tool that checks YAML schema compliance, duplicate names, path references, manifest consistency, and MD file coverage |
| `.github/workflows/validate-agent-metadata.yml` | Added unified check step and path triggers for core/specialist agent directories |
| `docs/dev/agent-audit/baseline.txt` | Documents pre-fix validation state showing orchestrator as invalid |
| `docs/dev/agent-audit/inventory.md` | Catalogs dual agent system (YAML metadata + MD definitions) |
| `docs/dev/agent-audit/runtime-paths.md` | Analyzes runtime loading paths and identifies path inconsistencies |
| `docs/dev/agent-audit/decision.md` | Documents rationale for keeping dual system and implementation plan |
| `docs/dev/agent-audit/how-agents-are-structured.md` | Comprehensive guide explaining agent structure and validation |
| `docs/dev/agent-audit/paperkit-tree.txt` | Directory tree snapshot for reference |
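
To confirm that no stale `.paper/` paths remain after a fix like this, a small scan along the following lines may help. It is a sketch, not a tool from the repository; `find_stale_paths` is a hypothetical name.

```python
import re

# Matches ".paper/" only; ".paperkit/" is not matched because "k" follows ".paper".
STALE_PATH = re.compile(r"\.paper/")


def find_stale_paths(text):
    """Return (line_number, line) pairs that still reference `.paper/`."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if STALE_PATH.search(line)
    ]
```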


Copilot · 2026-01-06T05:03:30Z

docs/dev/agent-audit/decision.md

+### Phase 1: Fix Schema Compliance
+- [ ] Update `orchestrator.yaml` to be schema-compliant
+- [ ] Verify all YAML files pass validation
+
+### Phase 2: Fix Validation Scripts
+- [ ] Update `validate.py` to use `.paperkit/` paths
+
+### Phase 3: Update Manifest
+- [ ] Add orchestrator to `agent-manifest.yaml`
+
+### Phase 4: Add Unified Check
+- [ ] Create `tools/check-agents.sh` that:
+  - Validates all YAML files against schema
+  - Validates all MD frontmatter against schema
+  - Checks for name mismatches
+  - Ensures path references exist
+  - Detects duplicate agent names
+
+### Phase 5: Documentation
+- [ ] Create "How agents are structured" guide


The implementation plan shows all tasks as unchecked, but based on the PR changes, these tasks appear to have been completed. The checkboxes should be marked as complete to accurately reflect the work done in this PR.

Copilot · 2026-01-06T05:03:30Z

.paperkit/tools/check-agents.py

+"""
+
+import sys
+import os


Import of 'os' is not used.

Suggested change

import os

Copilot · 2026-01-06T05:03:31Z

.paperkit/tools/check-agents.py

+import yaml
+import argparse
+from pathlib import Path
+from typing import Dict, List, Tuple, Optional, Set


Import of 'Set' is not used.

Suggested change

from typing import Dict, List, Tuple, Optional, Set

from typing import Dict, List, Tuple, Optional

peternicholls · 2026-01-06T05:04:03Z

@copilot apply changes based on the comments in this thread


Copilot · 2026-01-06T05:08:21Z

@copilot
You are working on PR #28 in PaperKit (“cleanup agent system / orchestrator schema compliance”).

...

Addressed in commit ead9be9:

orchestrator.yaml - Contains only schema-compliant metadata (no instructions, decisionSchema, schemaVersion, status)
orchestrator.md - Now contains only behavioural instructions (removed YAML frontmatter)
Tie-break rules - Added explicit priority order: capability match → keyword-based routing → module match → example prompt match
All icons - Already valid emojis (🧭 for orchestrator)
Validation passes:
- validate-agent-schema.py --ci: ✓ 11/11 valid
- check-agents.py --ci: ✓ All checks passed
decision.md - Updated to reflect canonical two-file split design

Co-authored-by: peternicholls <519525+peternicholls@users.noreply.github.com>

Copilot · 2026-01-06T05:11:50Z

@copilot apply changes based on the comments in this thread

Fixed in commit d308552:

Removed unused os import
Removed unused Set import from typing

Note: The decision.md checkboxes were already marked as complete in a previous commit (ead9be9).

peternicholls · 2026-01-10T21:38:46Z

@copilot apply changes based on the comments in this thread

Copilot · 2026-01-10T21:40:02Z

@copilot apply changes based on the comments in this thread

These changes were already applied in commit d308552:

Removed unused os import from check-agents.py
Removed unused Set import from typing
Checkboxes in decision.md were marked complete in commit ead9be9

All review comments have been addressed.

Copilot AI assigned Copilot and peternicholls Jan 6, 2026

Initial plan

47e7e2e

Copilot started work on behalf of peternicholls January 6, 2026 04:40

Fix orchestrator.yaml schema compliance, update paths in validation s…

dc6aea6

…cripts, add unified check-agents.py Co-authored-by: peternicholls <519525+peternicholls@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Clean up PaperKit agent system for de-duplication~~ Clean up agent system: fix orchestrator.yaml schema compliance, unify validation paths Jan 6, 2026

Copilot AI requested a review from peternicholls January 6, 2026 04:54

Copilot finished work on behalf of peternicholls January 6, 2026 04:54

peternicholls marked this pull request as ready for review January 6, 2026 04:59

Copilot AI review requested due to automatic review settings January 6, 2026 04:59

Copilot started reviewing on behalf of peternicholls January 6, 2026 04:59

Copilot AI reviewed Jan 6, 2026


Copilot started work on behalf of peternicholls January 6, 2026 05:04

Update orchestrator.md to contain instructions only (no YAML frontmat…

ead9be9

…ter), update decision.md with correct canonical design Co-authored-by: peternicholls <519525+peternicholls@users.noreply.github.com>

Copilot finished work on behalf of peternicholls January 6, 2026 05:09

Copilot started work on behalf of peternicholls January 6, 2026 05:10

Remove unused imports (os, Set) from check-agents.py

d308552

Co-authored-by: peternicholls <519525+peternicholls@users.noreply.github.com>

Copilot finished work on behalf of peternicholls January 6, 2026 05:12

Copilot started work on behalf of peternicholls January 10, 2026 21:39

Copilot finished work on behalf of peternicholls January 10, 2026 21:40

peternicholls merged commit 1537798 into master Jan 10, 2026
2 checks passed

peternicholls deleted the copilot/cleanup-agent-system branch January 10, 2026 21:41

peternicholls merged 4 commits into master from copilot/cleanup-agent-system
