**Please make sure you read the contribution guide and file the issues in the right place.**
Contribution guide.
🔴 Required Information
Please ensure all items in this section are completed to allow for efficient
triaging. Requests without complete information may be rejected / deprioritized.
If an item is not applicable to you - please mark it as N/A
Is your feature request related to a specific problem?
Thank you for adding support for custom metrics in v1.23.0.
When running evals via AgentEvaluator.evaluate in tests, custom metrics defined in test_config.json are not registered, even though adk eval (CLI) supports custom metrics.
This makes it hard to use the same eval configs in pytest and adk eval.
For example:
https://github.com/ftnext/agent-practice/tree/1835d834b32796bba36a05aa85c32eb4427ad831/adk-evaluation/tests
% PYTHONPATH=. adk eval home_automation_agent tests/fixtures/home_automation_agent/simple_test.test.json --config_file_path tests/fixtures/home_automation_agent/test_config.json
Eval Run Summary
b305bd06-38c5-4796-b9c7-d9c7454338b9:
Tests passed: 1
Tests failed: 0
But the equivalent pytest test fails:
@pytest.mark.asyncio
async def test_with_single_test_file():
    await AgentEvaluator.evaluate(
        agent_module="home_automation_agent",
        eval_dataset_file_path_or_dir="tests/fixtures/home_automation_agent/simple_test.test.json",
        num_runs=1,
    )
File "/.../adk-evaluation/.venv/lib/python3.13/site-packages/google/adk/evaluation/metric_evaluator_registry.py", line 64, in get_evaluator
raise NotFoundError(f"{eval_metric.metric_name} not found in registry.")
google.adk.errors.not_found_error.NotFoundError: practice_tool_trajectory_metric not found in registry.
Describe the Solution You'd Like
Make AgentEvaluator.evaluate_eval_set register custom metrics from EvalConfig.custom_metrics, matching the behavior in adk eval.
The registration should use a local metric registry (no global side effects) and a shared helper for default MetricInfo construction.
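As a sketch of the desired behavior, using stand-in classes rather than the real ADK types (`CustomMetric`, `LocalMetricRegistry`, and this `evaluate_eval_set` signature are all hypothetical here), the registration could look like:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Stand-in for an entry in EvalConfig.custom_metrics; the field names are
# illustrative only, not the real ADK schema.
@dataclass
class CustomMetric:
    metric_name: str
    evaluate: Callable[[str, str], float]

@dataclass
class LocalMetricRegistry:
    """A per-call registry: registering here has no global side effects."""
    _evaluators: Dict[str, Callable[[str, str], float]] = field(default_factory=dict)

    def register(self, metric: CustomMetric) -> None:
        self._evaluators[metric.metric_name] = metric.evaluate

    def get_evaluator(self, metric_name: str) -> Callable[[str, str], float]:
        if metric_name not in self._evaluators:
            raise KeyError(f"{metric_name} not found in registry.")
        return self._evaluators[metric_name]

def evaluate_eval_set(custom_metrics: List[CustomMetric]) -> LocalMetricRegistry:
    # Desired behavior: build a fresh local registry and register every
    # custom metric from the eval config, mirroring what `adk eval` does,
    # so repeated calls in a test suite never leak state into each other.
    registry = LocalMetricRegistry()
    for metric in custom_metrics:
        registry.register(metric)
    return registry
```

Because the registry is constructed per call, two tests with different `test_config.json` files cannot interfere with each other.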
Impact on your work
I want to reuse the same eval configurations across CLI and pytest without duplicating logic or writing custom registry code in tests.
This affects our local eval workflow and CI coverage.
Willingness to contribute
Are you interested in implementing this feature yourself or submitting a PR?
Yes
🟡 Recommended Information
Describe Alternatives You've Considered
Manually registering custom metrics in each test works, but it is repetitive and diverges from the adk eval behavior.
Example: https://github.com/ftnext/agent-practice/blob/3e845dd8327bdf28716528d4f50cebce91231542/adk-evaluation/tests/test_home_automation_agent.py#L41-L203
Proposed API / Implementation
Use a local MetricEvaluatorRegistry in AgentEvaluator.evaluate_eval_set and register custom_metrics with _CustomMetricEvaluator, similar to the CLI.
Share a common helper for default MetricInfo creation in the evaluation layer.
Additional Context
Keeping CLI and AgentEvaluator behavior consistent reduces confusion and makes it easier to adopt custom metrics in tests.
Related issue: response_match_score (ROUGE-1) is not effective in Japanese (with manual tokenization) #4122
Custom metrics are very promising, as they empower individual developers to handle evaluation issues themselves.
I hope that custom metrics support will continue to improve with ADK version upgrades.