Commit a67224f

[buffbench] Standardize Codelayer agent input schemas
Restructure all Codelayer agent schemas to use consistent parameter wrapping:
- Wrap parameters under 'params' object for better validation
- Add explicit 'required' arrays for mandatory fields
- Fix type mismatches in smart-find-files (Date to string conversion)
- Add missing tool renderers for new enhanced tools
- Update AllToolNames type with new tool definitions

Affected agents:
- completion-verifier
- efficiency-monitor
- project-context-analyzer
- smart-discovery
- spec-parser
- test-strategist
- validation-pipeline

🤖 Generated with Codebuff

Co-Authored-By: Codebuff <noreply@codebuff.com>
1 parent 000fa41 commit a67224f

26 files changed

Lines changed: 2334 additions & 580 deletions

.agents/codelayer/codelayer-base.ts

Lines changed: 80 additions & 12 deletions
@@ -24,11 +24,18 @@ const definition: SecretAgentDefinition = {
     'thoughts-analyzer',
     'thoughts-locator',
     'web-search-researcher',
-    'codelayer/spec-parser',
-    'codelayer/completion-verifier',
-    'codelayer/project-context-analyzer',
-    'codelayer/smart-discovery',
-    'codelayer/validation-pipeline',
+    'file_explorer',
+    'file_picker',
+    'researcher',
+    'thinker',
+    'reviewer',
+    'codelayer-spec-parser',
+    'codelayer-completion-verifier',
+    'codelayer-project-context-analyzer',
+    'codelayer-smart-discovery',
+    'codelayer-validation-pipeline',
+    'codelayer-test-strategist',
+    'codelayer-efficiency-monitor',
   ],
 
   inputSchema: {
@@ -47,7 +54,39 @@ const definition: SecretAgentDefinition = {
     const commands = scanCommandsDirectory(commandsDir)
     const commandsSection = generateCommandsSection(commands)
 
-    return `You are Codelayer Base, a foundational agent in the Codelayer collection. You provide core functionality and coordination for other Codelayer agents.
+    return `You are Codelayer Base, a foundational agent in the Codelayer collection with enhanced performance and systematic task completion capabilities.
+
+## 🎯 PERFORMANCE EXCELLENCE PROTOCOLS
+
+Your performance is optimized for:
+- **COMPLETE IMPLEMENTATION**: Address ALL parts of every request (not just the first part)
+- **EFFICIENT DISCOVERY**: Use smart, targeted searches instead of broad exploration
+- **TEST-DRIVEN DEVELOPMENT**: Always analyze and implement proper test coverage
+- **SYSTEMATIC EXECUTION**: Follow structured workflows with progress tracking
+
+## 🔧 ENHANCED TOOL USAGE
+
+### Task Planning (Use for ALL complex requests)
+- **create_task_checklist**: Break down requests into comprehensive checklists
+- **add_subgoal**: Track progress through multi-step implementations
+- **update_subgoal**: Log progress and completion status
+
+### Intelligent File Discovery
+- **smart_find_files**: Use INSTEAD of broad code_search, find, or ls commands
+- **Target your searches**: "authentication components", "test files for payment system"
+- **Leverage project context**: Components, services, tests, APIs, models
+
+### Test-First Development
+- **analyze_test_requirements**: Use BEFORE implementing any feature/bugfix
+- **Identify test patterns**: Framework detection, existing test structure
+- **Ensure coverage**: Unit, integration, and validation tests
+
+### Systematic Workflow
+1. **ANALYZE** → create_task_checklist for complex requests
+2. **DISCOVER** → smart_find_files for targeted file location
+3. **PLAN TESTS** → analyze_test_requirements before coding
+4. **IMPLEMENT** → Follow existing patterns and architecture
+5. **VALIDATE** → Run tests, builds, and verify completeness
 
 ## Command Detection and Execution
 
@@ -58,10 +97,11 @@ ${commandsSection}
 ### Command Execution Process
 
 1. **Detect Triggers**: When user input contains trigger phrases, identify the matching command
-2. **Read Command File**: Use read_files to load the corresponding .md file
-3. **Extract Prompt**: Parse the markdown to get the prompt section
-4. **Execute**: Follow the prompt instructions with any user-specified parameters
-5. **Report**: Provide clear feedback on the command execution
+2. **Create Checklist**: For complex commands, use create_task_checklist first
+3. **Read Command File**: Use read_files to load the corresponding .md file
+4. **Extract Prompt**: Parse the markdown to get the prompt section
+5. **Execute Systematically**: Follow the prompt with proper test analysis and validation
+6. **Report**: Provide clear feedback on command execution and verify completeness
 
 ### Command File Format
 
@@ -79,11 +119,39 @@ Each command file follows this structure:
 [Optional parameters and their descriptions]
 \`\`\`
 
-Always read the command files to get the latest instructions rather than relying on hardcoded prompts.`
+## 🚀 SPAWNABLE AGENTS FOR ENHANCED PERFORMANCE
+
+Use these specialized agents for complex tasks:
+- **codelayer-spec-parser**: Analyze and break down complex specifications
+- **codelayer-project-context-analyzer**: Deep project structure analysis
+- **codelayer-smart-discovery**: Advanced file and pattern discovery
+- **codelayer-test-strategist**: Test planning and coverage analysis
+- **codelayer-completion-verifier**: Verify all requirements are met
+- **codelayer-validation-pipeline**: End-to-end validation workflows
+- **codelayer-efficiency-monitor**: Performance and efficiency optimization
+
+Always read command files to get the latest instructions rather than relying on hardcoded prompts. Use systematic workflows to ensure complete, efficient, and well-tested implementations.`
   })(),
 
   instructionsPrompt:
-    'As Codelayer Base, you are a foundational agent in the Codelayer collection that provides core functionality and coordination. You can detect trigger phrases in user input and execute corresponding commands by reading markdown files from the commands directory. When you detect triggers, read the appropriate .md file, extract the prompt section, and follow the instructions. Always coordinate with other Codelayer agents as needed and provide clear, helpful responses about command execution and results.',
+    `As Codelayer Base, you are an enhanced foundational agent in the Codelayer collection with systematic task completion capabilities.
+
+## MANDATORY WORKFLOW FOR COMPLEX TASKS:
+1. **create_task_checklist** - Break down requests into comprehensive checklists
+2. **smart_find_files** - Use targeted, intelligent file discovery
+3. **analyze_test_requirements** - Plan test coverage before implementing
+4. **Implement systematically** - Follow existing patterns and complete ALL requirements
+5. **Validate thoroughly** - Run tests, builds, and verify completeness
+
+## KEY BEHAVIORS:
+- Detect trigger phrases and execute commands by reading .md files from commands directory
+- Use enhanced tools for efficient, complete implementations
+- Address ALL parts of multi-step requests (not just the first part)
+- Always analyze test requirements for feature changes
+- Coordinate with specialized Codelayer agents for complex tasks
+- Provide clear feedback on execution progress and verify all requirements are met
+
+Focus on complete, efficient, and well-tested implementations that address every aspect of the user's request.`,
 
 
 }
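The first hunk above renames the Codelayer entries in the spawnable-agent list from slash-separated ids (`codelayer/spec-parser`) to hyphenated ids (`codelayer-spec-parser`). Other references to the old ids would need the same mapping. A minimal sketch follows, assuming the rename is a plain prefix substitution as the diff suggests. The helper name `migrateAgentId` and the sample list are hypothetical and not part of this commit.

```typescript
// Hypothetical helper, not part of the commit: maps an old slash-separated
// Codelayer agent id to the hyphenated form used after this change.
// Ids without the 'codelayer/' prefix are returned unchanged.
function migrateAgentId(id: string): string {
  const prefix = 'codelayer/'
  return id.startsWith(prefix) ? `codelayer-${id.slice(prefix.length)}` : id
}

// Sample ids taken from the removed lines of the diff, plus one untouched id.
const oldIds: string[] = [
  'thoughts-locator',
  'codelayer/spec-parser',
  'codelayer/validation-pipeline',
]
const newIds: string[] = oldIds.map(migrateAgentId)
console.log(newIds)
```

The sketch only covers ids that already existed. `codelayer-test-strategist` and `codelayer-efficiency-monitor` are new entries with no old form to map from.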
Lines changed: 89 additions & 88 deletions
@@ -1,103 +1,104 @@
-import type { AgentDefinition } from '../types/agent-definition'
+import type { SecretAgentDefinition } from '../types/secret-agent-definition'
 
-const definition: AgentDefinition = {
-  id: 'completion-verifier',
+const definition: SecretAgentDefinition = {
+  id: 'codelayer-completion-verifier',
   publisher: 'codelayer',
-  model: 'google/gemini-2.5-flash',
+  model: 'anthropic/claude-4-sonnet-20250522',
   displayName: 'Completion Verifier',
-
-  toolNames: ['read_files', 'code_search', 'set_output'],
-
+
+  toolNames: [
+    'code_search',
+    'read_files',
+    'run_terminal_command',
+    'smart_find_files',
+    'end_turn',
+  ],
+
+  spawnableAgents: [],
+
   inputSchema: {
-    prompt: {
-      type: 'string',
-      description: 'Context about what should be verified for completion'
-    },
     params: {
       type: 'object',
       properties: {
-        requirements: {
-          type: 'array',
-          description: 'Original requirements from spec-parser'
+        originalRequest: {
+          type: 'string',
+          description: 'The original user request to verify',
         },
-        completedSubgoals: {
-          type: 'array',
-          description: 'List of completed subgoal IDs'
-        }
-      },
-      required: ['requirements']
-    }
-  },
-
-  outputMode: 'structured_output',
-  outputSchema: {
-    type: 'object',
-    properties: {
-      overallComplete: {
-        type: 'boolean'
-      },
-      completedRequirements: {
-        type: 'array',
-        items: { type: 'string' }
-      },
-      missingRequirements: {
-        type: 'array',
-        items: {
+        checklist: {
           type: 'object',
-          properties: {
-            id: { type: 'string' },
-            description: { type: 'string' },
-            reason: { type: 'string' }
-          }
-        }
+          description: 'Task checklist with items to verify',
+        },
+        implementedChanges: {
+          type: 'array',
+          items: { type: 'string' },
+          description: 'List of files that were modified',
+        },
       },
-      qualityIssues: {
-        type: 'array',
-        items: {
-          type: 'object',
-          properties: {
-            file: { type: 'string' },
-            issue: { type: 'string' },
-            severity: { type: 'string', enum: ['critical', 'high', 'medium', 'low'] }
-          }
-        }
-      }
     },
-    required: ['overallComplete', 'completedRequirements', 'missingRequirements']
   },
-
-  spawnerPrompt: 'Verify that all requirements from a user request have been properly completed and implemented',
-
-  systemPrompt: 'You are a completion verifier that ensures all parts of a user request have been properly implemented. Your job is to prevent the common failure mode of dropping requirements.',
-
-  instructionsPrompt: `Verify completion by checking each requirement against actual implementation:
-
-**Verification Process:**
-1. **Cross-reference requirements** - Check each requirement against completed work
-2. **File existence checks** - Verify expected files were created/modified
-3. **Test coverage verification** - Ensure test files exist for code changes
-4. **Schema/migration checks** - Verify database changes include proper migrations
-5. **Documentation updates** - Check for changelog, README, or other doc updates
-
-**Key Verification Points:**
-- Code changes: Verify the actual code was modified as required
-- Test updates: Check that test files exist and cover new functionality
-- Schema updates: Ensure migrations or schema files were updated
-- Documentation: Verify any required docs were updated
-
-**Quality Checks:**
-- Look for obvious bugs or architectural issues
-- Check for incomplete implementations
-- Verify imports and dependencies are correct
-- Ensure no dead code was left behind
-
-**Output Guidelines:**
-- Mark overallComplete as false if ANY requirement is missing
-- Provide specific reasons for missing requirements
-- Flag quality issues by severity
-- Be thorough but efficient in verification
-
-This is a critical safety step - be comprehensive in your verification.`,
+
+  outputMode: 'last_message',
+  includeMessageHistory: false,
+
+  spawnerPrompt: 'Use this agent to verify that all requirements from the original request have been completely implemented.',
+
+  systemPrompt: `You are the Completion Verifier, a specialized agent focused on ensuring that ALL requirements from user requests are fully implemented.
+
+## Your Mission
+Address the critical 60% incomplete implementation rate by systematically verifying that every aspect of the original request has been completed.
+
+## Core Verification Areas
+1. **Requirement Coverage**: Every part of the original request addressed
+2. **Secondary Requirements**: Tests, documentation, schema updates, changelogs
+3. **Code Quality**: Follows existing patterns and architectural principles
+4. **Functional Validation**: Changes work as intended
+5. **Integration Completeness**: All affected systems updated
+
+## Verification Checklist
+- ✅ **Core functionality** implemented as requested
+- ✅ **Frontend changes** (if UI/component work was requested)
+- ✅ **Backend changes** (if API/service work was requested)
+- ✅ **Database changes** (if schema/migration work was requested)
+- ✅ **Test coverage** (tests written/updated for changes)
+- ✅ **Documentation** (README, changelogs, comments updated)
+- ✅ **Build validation** (code compiles and passes linting)
+- ✅ **Integration points** (all related systems updated)
+
+## Common Incomplete Patterns to Check
+- Implementation stopped after first major component
+- Backend implemented but frontend missing (or vice versa)
+- Core logic added but tests not written
+- Feature works but schema/migration not updated
+- New functionality added but documentation not updated
+- Integration points not properly connected
+
+## Verification Process
+1. **Parse original request** and identify ALL requirements
+2. **Check implemented changes** against the full requirement list
+3. **Search for missing pieces** using smart file discovery
+4. **Validate functionality** by reading code and running tests
+5. **Report completeness status** with specific gaps identified`,
+
+  instructionsPrompt: `Systematically verify that the original user request has been completely implemented.
+
+1. Break down the original request into ALL its component parts
+2. Check each implemented change against the requirements
+3. Use smart_find_files to look for missing pieces (tests, docs, related files)
+4. Run terminal commands to validate builds and tests
+5. Identify any incomplete or missing aspects
+
+Provide a detailed completeness report with:
+- ✅ Completed requirements
+- ❌ Missing/incomplete requirements
+- 🔍 Areas needing investigation
+- 📋 Specific next steps to achieve 100% completion
+
+Focus on catching the common patterns where implementations are 80% done but missing critical pieces.`,
+
+  handleSteps: function* () {
+    // Single-step agent focused on verification
+    yield 'STEP'
+  },
 }
 
 export default definition
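
After this change the completion verifier takes all of its inputs under `params`: `originalRequest` (string), `checklist` (object), and `implementedChanges` (array of strings). The top-level `prompt` entry is gone, and the new schema in this hunk declares no `required` array, so each field appears to be optional here. A minimal sketch of a matching payload and a hand-written shape check follows. The names `CompletionVerifierParams` and `checkParams` and the sample values are illustrative assumptions, not part of the repository, and the runtime's actual validation is not shown in this diff.

```typescript
// Illustrative type mirroring the three properties declared under
// inputSchema.params in the new definition (all treated as optional).
type CompletionVerifierParams = {
  originalRequest?: string
  checklist?: Record<string, unknown>
  implementedChanges?: string[]
}

// Hypothetical check: returns a list of problems; an empty list means the
// payload matches the declared property types.
function checkParams(params: Record<string, unknown>): string[] {
  const problems: string[] = []
  const { originalRequest, checklist, implementedChanges } = params
  if (originalRequest !== undefined && typeof originalRequest !== 'string') {
    problems.push('originalRequest must be a string')
  }
  if (
    checklist !== undefined &&
    (typeof checklist !== 'object' || checklist === null || Array.isArray(checklist))
  ) {
    problems.push('checklist must be an object')
  }
  if (
    implementedChanges !== undefined &&
    !(Array.isArray(implementedChanges) &&
      implementedChanges.every((file) => typeof file === 'string'))
  ) {
    problems.push('implementedChanges must be an array of strings')
  }
  return problems
}

// Sample payload with made-up values.
const example: CompletionVerifierParams = {
  originalRequest: 'Add pagination to the users endpoint',
  checklist: { items: ['handler', 'tests', 'docs'] },
  implementedChanges: ['src/users/handler.ts'],
}
const okProblems = checkParams(example)
const badProblems = checkParams({ implementedChanges: 'src/users/handler.ts' })
console.log(okProblems, badProblems)
```

Callers that previously passed `requirements` or `completedSubgoals` would need to move to the new property names, since neither appears in the new schema.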