fix(evals): require tool assertions before Run in test template editor#1901
fix(evals): require tool assertions before Run in test template editor#1901
Conversation
- Disable Run when no expected tool call is defined on any prompt turn - Add toast guard in handleRunCompare for the same invariant - Avoid duplicate Run tooltip when inline blocked hint is shown - Extend validation tests for no-assertion case Made-with: Cursor
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
Internal previewPreview URL: https://mcp-inspector-pr-1901.up.railway.app |
WalkthroughThis change introduces validation logic to require expected tool assertions before running tests in the TestTemplateEditor component. A new detection mechanism identifies whether any asserted tool calls exist across prompt turns. When valid prompts lack expected tool assertions, the Run button becomes disabled with an updated tooltip message. A new test validates this behavior by confirming the Run button disables and displays the appropriate warning text when assertions are absent. Additionally, a runtime check prevents compare runs from starting without assertions and emits a toast error. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
mcpjam-inspector/client/src/components/evals/test-template-editor.tsx (1)
647-655:⚠️ Potential issue | 🟡 MinorPreserve prompt-validation precedence before the assertion gate.
Line 647 and Line 1197 now report “Add at least one expected tool assertion…” even when the prompt itself is empty. That changes the existing empty-prompt guidance/toast. Validate prompt completeness first, then apply the new assertion requirement.
Proposed ordering fix
- if (!hasRequiredRunAssertions) { - return "Add at least one expected tool assertion before running."; - } if (!arePromptTurnsValid && editForm) { return ( getPromptTurnBlockReason(editForm.promptTurns) ?? "Fix the test configuration before running." ); } + if (!hasRequiredRunAssertions) { + return "Add at least one expected tool assertion before running."; + }- if (!hasAnyAssertedToolCall(editForm.promptTurns)) { - toast.error("Add at least one expected tool assertion before running."); - return; - } - if (!validatePromptTurns(editForm.promptTurns)) { toast.error( getPromptTurnBlockReason(editForm.promptTurns) ?? "Fix the test configuration before running." ); return; } + + if (!hasAnyAssertedToolCall(editForm.promptTurns)) { + toast.error("Add at least one expected tool assertion before running."); + return; + }Also applies to: 1197-1208
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@mcpjam-inspector/client/src/components/evals/test-template-editor.tsx` around lines 647 - 655, The current gate checks assertions before prompt validity causing the "Add at least one expected tool assertion..." message to override empty-prompt guidance; change the logic in the block that uses hasRequiredRunAssertions, arePromptTurnsValid, editForm and getPromptTurnBlockReason so you validate the prompt first: if editForm exists and arePromptTurnsValid is false, return getPromptTurnBlockReason(editForm.promptTurns) ?? "Fix the test configuration before running." before checking hasRequiredRunAssertions; apply the same reordered check in the other occurrence that uses these same symbols (the block around the other start/end range).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@mcpjam-inspector/client/src/components/evals/__tests__/test-template-editor-validation.test.tsx`:
- Around line 192-220: The test mutates shared fixtures caseDoc.query and
baseIteration.testCaseSnapshot.query causing cross-test pollution; fix by using
per-test clones or resetting those fields in beforeEach: locate the test
"disables Run when there are no tool assertions" and either replace direct
assignments to caseDoc and baseIteration with cloned copies ( e.g., clone the
fixture objects before modifying ) or add explicit resets in the test file's
beforeEach to restore caseDoc.query and baseIteration.testCaseSnapshot.query to
their original values so each test gets a fresh, unmodified fixture.
---
Outside diff comments:
In `@mcpjam-inspector/client/src/components/evals/test-template-editor.tsx`:
- Around line 647-655: The current gate checks assertions before prompt validity
causing the "Add at least one expected tool assertion..." message to override
empty-prompt guidance; change the logic in the block that uses
hasRequiredRunAssertions, arePromptTurnsValid, editForm and
getPromptTurnBlockReason so you validate the prompt first: if editForm exists
and arePromptTurnsValid is false, return
getPromptTurnBlockReason(editForm.promptTurns) ?? "Fix the test configuration
before running." before checking hasRequiredRunAssertions; apply the same
reordered check in the other occurrence that uses these same symbols (the block
around the other start/end range).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0544adef-53c6-4e86-9695-1aff95f9cf70
📒 Files selected for processing (2)
mcpjam-inspector/client/src/components/evals/__tests__/test-template-editor-validation.test.tsxmcpjam-inspector/client/src/components/evals/test-template-editor.tsx
| it("disables Run when there are no tool assertions", async () => { | ||
| caseDoc.query = "Hello"; | ||
| baseIteration.testCaseSnapshot.query = "Hello"; | ||
|
|
||
| renderWithProviders( | ||
| <TestTemplateEditor | ||
| suiteId="suite-1" | ||
| selectedTestCaseId="case-1" | ||
| connectedServerNames={new Set(["srv"])} | ||
| workspaceId={null} | ||
| availableModels={[ | ||
| { | ||
| provider: "openai", | ||
| model: "gpt-4", | ||
| label: "GPT-4", | ||
| } as any, | ||
| ]} | ||
| />, | ||
| ); | ||
|
|
||
| await waitFor(() => { | ||
| expect(screen.getByDisplayValue("Hello")).toBeInTheDocument(); | ||
| }); | ||
|
|
||
| expect(screen.getByRole("button", { name: /^Run$/ })).toBeDisabled(); | ||
| expect( | ||
| screen.getByText("Add at least one expected tool assertion before running."), | ||
| ).toBeInTheDocument(); | ||
| }); |
There was a problem hiding this comment.
Avoid leaking prompt fixture mutations across tests.
Line 193 mutates shared caseDoc, and Line 194 mutates shared baseIteration. Since beforeEach does not reset those fields, later tests can inherit "Hello" and become order-dependent. Prefer per-test cloned fixtures or reset these fields in beforeEach.
Minimal reset option
beforeEach(() => {
vi.clearAllMocks();
+ caseDoc.query = "";
+ baseIteration.testCaseSnapshot.query = "";
useMutationMock.mockReturnValue(vi.fn());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@mcpjam-inspector/client/src/components/evals/__tests__/test-template-editor-validation.test.tsx`
around lines 192 - 220, The test mutates shared fixtures caseDoc.query and
baseIteration.testCaseSnapshot.query causing cross-test pollution; fix by using
per-test clones or resetting those fields in beforeEach: locate the test
"disables Run when there are no tool assertions" and either replace direct
assignments to caseDoc and baseIteration with cloned copies ( e.g., clone the
fixture objects before modifying ) or add explicit resets in the test file's
beforeEach to restore caseDoc.query and baseIteration.testCaseSnapshot.query to
their original values so each test gets a fresh, unmodified fixture.
Summary
Users could start a test run with a prompt but no expected tool calls (no assertions), which is not a useful eval and matches product feedback. This enforces a clear invariant: Run requires at least one tool assertion across the prompt flow.
Changes
Runstays disabled when there is no asserted expected tool on any prompt turn. Save validation is unchanged; only the run path is stricter.handleRunComparereturns early with a toast if the invariant is not met, so a programmatic trigger cannot bypass the UI.How to test
Made with Cursor