[feat][evaluation] cozeclaw evaluation solution #517
Implement ShouldSkipEvaluator method to allow evaluators to skip execution based on input data. This includes adding the interface method, default implementations for prompt and code evaluators, and integration into the evaluation workflow. The feature helps optimize evaluation by skipping unnecessary runs.
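The shape of this change can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `EvaluatorInput`, the `Evaluator` interface, `EmptyOutputSkipper`, and `Evaluate` are hypothetical names, and the real signatures in the repository may differ. It shows a skip method on the interface, a default implementation that never skips, and a workflow that consults the method before running.

```go
package main

import "fmt"

// EvaluatorInput is a hypothetical stand-in for the evaluator's input data.
type EvaluatorInput struct {
	Fields map[string]string
}

// Evaluator is a simplified, hypothetical interface; the PR's real
// signatures may differ.
type Evaluator interface {
	ShouldSkip(in EvaluatorInput) bool
	Run(in EvaluatorInput) (float64, error)
}

// PromptEvaluator carries the default behavior: never skip.
type PromptEvaluator struct{}

func (PromptEvaluator) ShouldSkip(EvaluatorInput) bool      { return false }
func (PromptEvaluator) Run(EvaluatorInput) (float64, error) { return 1.0, nil }

// EmptyOutputSkipper overrides the default with an illustrative rule:
// skip when the "output" field is empty.
type EmptyOutputSkipper struct{ PromptEvaluator }

func (EmptyOutputSkipper) ShouldSkip(in EvaluatorInput) bool {
	return in.Fields["output"] == ""
}

// Evaluate consults ShouldSkip first and only calls Run when needed.
// The ran flag reports whether Run was actually invoked.
func Evaluate(e Evaluator, in EvaluatorInput) (score float64, ran bool, err error) {
	if e.ShouldSkip(in) {
		return 0, false, nil
	}
	score, err = e.Run(in)
	return score, true, err
}

func main() {
	empty := EvaluatorInput{Fields: map[string]string{}}
	_, ran, _ := Evaluate(EmptyOutputSkipper{}, empty)
	fmt.Println("ran:", ran)
}
```

Embedding the default type keeps the override small: only the skip rule changes, while `Run` is inherited unchanged.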
… and create record when skipped
Modify ShouldSkip methods to return EvaluatorOutputData instead of EvaluatorRecord.
Update ShouldSkipEvaluator to create a record with output data when skipped.
Adjust tests and mocks to reflect the new behavior.
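A rough sketch of the division of labor this commit describes: the skip check only produces output data, and the caller wraps it in a record. The type names `EvaluatorOutputData` and `EvaluatorRecord` come from the commit message, but their fields, the `skipFunc`/`runFunc` types, and `BuildRecord` are assumptions made for illustration.

```go
package main

import "fmt"

// EvaluatorOutputData and EvaluatorRecord are named in the commit message;
// the fields shown here are assumed for illustration only.
type EvaluatorOutputData struct {
	Score  float64
	Reason string
}

type EvaluatorRecord struct {
	EvaluatorID string
	Skipped     bool
	Output      *EvaluatorOutputData
}

// skipFunc and runFunc are hypothetical function types standing in for
// the evaluator's skip check and normal run.
type skipFunc func(input string) (skip bool, out *EvaluatorOutputData)
type runFunc func(input string) *EvaluatorOutputData

// BuildRecord always returns a record. When the skip check fires, the
// record is built from the output data the check returned, not from Run.
func BuildRecord(id, input string, shouldSkip skipFunc, run runFunc) EvaluatorRecord {
	if skip, out := shouldSkip(input); skip {
		return EvaluatorRecord{EvaluatorID: id, Skipped: true, Output: out}
	}
	return EvaluatorRecord{EvaluatorID: id, Output: run(input)}
}

func main() {
	skipEmpty := func(input string) (bool, *EvaluatorOutputData) {
		if input == "" {
			return true, &EvaluatorOutputData{Score: 0, Reason: "empty input"}
		}
		return false, nil
	}
	run := func(string) *EvaluatorOutputData {
		return &EvaluatorOutputData{Score: 1, Reason: "evaluated"}
	}
	rec := BuildRecord("eval-1", "", skipEmpty, run)
	fmt.Println(rec.Skipped, rec.Output.Reason)
}
```

Returning output data rather than a full record keeps record construction in one place, so a skipped run still shows up downstream like any other result.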
Add mock expectation for ShouldSkipEvaluator in all test cases to ensure proper test coverage of evaluator skipping logic
Add error handling for ShouldSkipEvaluator call and log warning when skip check fails
Codecov Report ❌

@@            Coverage Diff             @@
##             main     #517      +/-   ##
==========================================
+ Coverage   77.47%   77.51%   +0.03%
==========================================
  Files         660      660
  Lines       74230    74418     +188
==========================================
+ Hits        57511    57686     +175
- Misses      13318    13324       +6
- Partials     3401     3408       +7
... and 2 files with indirect coverage changes
…nto feat/skip_evaluator
…522)
* feat(evaluation): generalize evaluator skip rule to intercept rule
  Rename ShouldSkip to ShouldIntercept across interfaces, implementations, and tests. Add RunStatus field to EvaluatorOutputData so that intercept rules can control both score and status of evaluator records.
* refactor(evaluation): rename skip to intercepted in ShouldIntercept semantics
  Align variable names and comments with the intercept semantics instead of the legacy skip terminology.
* refactor: move RunStatus from EvaluatorOutputData to ShouldIntercept return value
  Align ShouldIntercept return style with Run method - status is now an independent return value instead of being embedded in the DO struct.
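The final form described in the last commit, with status returned next to the output data and not inside it, might look roughly like this. `ShouldIntercept`, `RunStatus`, and `EvaluatorOutputData` are named in the commit messages; the constants, fields, exact return order, `FailOnEmpty`, and `Resolve` are assumptions for illustration.

```go
package main

import "fmt"

// RunStatus is named in the commit message; these values are assumed.
type RunStatus int

const (
	RunStatusSuccess RunStatus = iota + 1
	RunStatusFail
)

// EvaluatorOutputData no longer carries the status (fields assumed).
type EvaluatorOutputData struct {
	Score float64
}

// Interceptor is a hypothetical interface: status is a separate return
// value, mirroring how a Run method would report it.
type Interceptor interface {
	ShouldIntercept(input string) (out *EvaluatorOutputData, status RunStatus, intercepted bool)
}

// FailOnEmpty is an illustrative rule: intercept empty inputs and mark
// them failed with a zero score.
type FailOnEmpty struct{}

func (FailOnEmpty) ShouldIntercept(input string) (*EvaluatorOutputData, RunStatus, bool) {
	if input == "" {
		return &EvaluatorOutputData{Score: 0}, RunStatusFail, true
	}
	return nil, 0, false
}

// Resolve uses the intercepted result when present, else the normal run.
func Resolve(i Interceptor, input string,
	run func(string) (*EvaluatorOutputData, RunStatus)) (*EvaluatorOutputData, RunStatus) {
	if out, status, ok := i.ShouldIntercept(input); ok {
		return out, status
	}
	return run(input)
}

func main() {
	run := func(string) (*EvaluatorOutputData, RunStatus) {
		return &EvaluatorOutputData{Score: 1}, RunStatusSuccess
	}
	out, status := Resolve(FailOnEmpty{}, "", run)
	fmt.Println(out.Score, status == RunStatusFail)
}
```

Compared with a plain skip flag, an intercept rule can decide both the score and the status of the resulting record, which is what the rename from skip to intercept is meant to express.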
What type of PR is this?
Check the PR title
(Optional) Translate the PR title into Chinese
(Optional) More detailed description for this PR(en: English/zh: Chinese)
en:
zh(optional):
(Optional) Which issue(s) this PR fixes