-
Notifications
You must be signed in to change notification settings - Fork 2
tester.ts required fields for open ai models #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
tester.ts required fields for open ai models #6
Conversation
|
CodeAnt AI is reviewing your PR. |
Nitpicks 🔍
|
| suggestion: z.string().describe('Specific next action recommendation'), | ||
| recommendReset: z.boolean().optional().describe('Recommend reset() if persistent failures suggest navigation issues'), | ||
| recommendStop: z.boolean().optional().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), | ||
| recommendReset: z.boolean().describe('Recommend reset() if persistent failures suggest navigation issues'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: The recommendReset field is required in the Zod schema, but the model prompt does not explicitly guarantee it will always be returned, so generateObject can fail validation if the model omits it, causing response to be undefined and silently skipping the progress report logic. [possible bug]
Severity Level: Major ⚠️
- ❌ Progress checkpoint report skipped during test run.
- ⚠️ analyzeProgress() returns empty string instead of report.
- ⚠️ Test guidance (reset suggestion) may be lost.| recommendReset: z.boolean().describe('Recommend reset() if persistent failures suggest navigation issues'), | |
| recommendReset: z.boolean().optional().describe('Recommend reset() if persistent failures suggest navigation issues'), |
Steps of Reproduction ✅
1. Run a test that reaches the progress analysis checkpoint so Tester.analyzeProgress() is
executed. The call site is in src/ai/tester.ts inside test() where the loop calls
this.analyzeProgress(task, currentState) (see the loop block that conditionally calls
analyzeProgress).
2. Inside src/ai/tester.ts, open the analyzeProgress function where the Zod schema is
defined (schema declaration starts at line 546 and includes the recommendReset field at
line 549).
3. The function builds a prompt and then calls the provider to parse the model output:
const model = this.provider.getModelForAgent('tester'); const response = await
this.provider.generateObject(..., schema, model); (the generateObject invocation is the
provider.parse call that immediately follows the schema definition in the same function).
4. If the LLM response omits the boolean field recommendReset, Zod validation will fail
because recommendReset is required (line 549). generateObject will therefore not return a
valid response.object, and analyzeProgress will receive response as undefined or a failed
parse, causing: const result = response?.object; if (!result) return ''; — the progress
checkpoint report is skipped. Reproducing this is achieved by running a test scenario that
triggers analyzeProgress and forcing the model to reply without the recommendReset boolean
(e.g., with a concise textual assessment that lacks that field).Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/ai/tester.ts
**Line:** 549:549
**Comment:**
*Possible Bug: The `recommendReset` field is required in the Zod schema, but the model prompt does not explicitly guarantee it will always be returned, so `generateObject` can fail validation if the model omits it, causing `response` to be undefined and silently skipping the progress report logic.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.| recommendReset: z.boolean().optional().describe('Recommend reset() if persistent failures suggest navigation issues'), | ||
| recommendStop: z.boolean().optional().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), | ||
| recommendReset: z.boolean().describe('Recommend reset() if persistent failures suggest navigation issues'), | ||
| recommendStop: z.boolean().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: The recommendStop field is also required in the Zod schema without the prompt explicitly enforcing its presence, so if the model does not include it, generateObject may fail validation and result handling will be skipped. [possible bug]
Severity Level: Major ⚠️
- ❌ Stop recommendation lost when model omits the flag.
- ⚠️ analyzeProgress() may not record persistent failure actions.
- ⚠️ Tester misses critical stop/reset guidance.| recommendStop: z.boolean().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), | |
| recommendStop: z.boolean().optional().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), |
Steps of Reproduction ✅
1. Trigger the same code path as previous step: execute a test that leads to
Tester.analyzeProgress() being called from the main test loop (the conditional call to
analyzeProgress is in src/ai/tester.ts within the main loop).
2. Inspect the schema in analyzeProgress where recommendStop is declared (schema block at
lines 546-551, with recommendStop at line 550).
3. The function calls the provider to parse the LLM output via
this.provider.generateObject(..., schema, model). If the LLM response includes assessment
and suggestion but omits recommendStop, Zod will reject the object because recommendStop
is required (line 550).
4. When validation fails, generateObject yields no usable response.object, analyzeProgress
treats that as no result (const result = response?.object; if (!result) return ''), and
the function exits without adding the expected progress note or advising stop/reset.
Reproduce by running any test that causes analyzeProgress to run and configuring the LLM
(or using a prompt/seed) so it omits the recommendStop boolean.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/ai/tester.ts
**Line:** 550:550
**Comment:**
*Possible Bug: The `recommendStop` field is also required in the Zod schema without the prompt explicitly enforcing its presence, so if the model does not include it, `generateObject` may fail validation and result handling will be skipped.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.| summary: z.string().describe('Concise overview of the test findings'), | ||
| scenarioAchieved: z.boolean().describe('Indicates if the scenario goal appears satisfied'), | ||
| recommendation: z.string().optional().describe('Follow-up suggestion if needed'), | ||
| recommendation: z.string().describe('Follow-up suggestion if needed'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: In finalReview, the new recommendation field is required in the Zod schema but the prompt never asks the model to provide it, so generateObject can fail schema validation if the field is omitted, preventing summaries from being processed. [possible bug]
Severity Level: Major ⚠️
- ❌ Final summary not recorded in task.summary.
- ❌ finalReview may not call task.finish().
- ⚠️ Test session ends without final notes.| recommendation: z.string().describe('Follow-up suggestion if needed'), | |
| recommendation: z.string().optional().describe('Follow-up suggestion if needed'), |
Steps of Reproduction ✅
1. Execute a test that reaches the end-of-test cleanup so Tester.finalReview(task) is
invoked. finalReview is defined in src/ai/tester.ts (the function signature is private
async finalReview(task: Test): Promise<void; see the finalReview function block).
2. In finalReview, inspect the Zod schema declared for the final summary (schema block at
lines 664-668, where recommendation is currently required at line 667).
3. finalReview then calls the provider to parse the LLM output: const model =
this.provider.getModelForAgent('tester'); const response = await
this.provider.generateObject(..., schema, model); (the generateObject invocation follows
the schema definition in the same function).
4. If the LLM provides summary and scenarioAchieved but does not emit a recommendation
string, Zod validation will fail because recommendation is required (line 667). As a
result response?.object is undefined and finalReview returns early (if (!result) return;),
preventing task.summary assignment and the task.finish() logic from executing. Reproduce
by running a complete test and using an LLM reply that contains summary and
scenarioAchieved but omits recommendation.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/ai/tester.ts
**Line:** 667:667
**Comment:**
*Possible Bug: In `finalReview`, the new `recommendation` field is required in the Zod schema but the prompt never asks the model to provide it, so `generateObject` can fail schema validation if the field is omitted, preventing summaries from being processed.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.|
CodeAnt AI finished reviewing your PR. |
| suggestion: z.string().describe('Specific next action recommendation'), | ||
| recommendReset: z.boolean().optional().describe('Recommend reset() if persistent failures suggest navigation issues'), | ||
| recommendStop: z.boolean().optional().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), | ||
| recommendReset: z.boolean().describe('Recommend reset() if persistent failures suggest navigation issues'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be confusing as it requires LLM to fill all fields and it would provide recommendation which it was previously about to avoid
Probably best way to remove all that recommended* fields alltogether
And keep only suggestion
Also probably we can use enum if this allowed:
recommendedTool: 'reset|stop|continue'
User description
CodeAnt-AI Description
Require recommendation fields in tester outputs
What Changed
Impact
✅ Clearer reset/stop guidance in test reports✅ Fewer missing follow-up recommendations✅ Earlier detection of incomplete model responses💡 Usage Guide
Checking Your Pull Request
Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.
Talking to CodeAnt AI
Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:
This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.
Example
Preserve Org Learnings with CodeAnt
You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:
This helps CodeAnt AI learn and adapt to your team's coding style and standards.
Example
Retrigger review
Ask CodeAnt AI to review the PR again, by typing:
Check Your Repository Health
To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.