fix(observability): classify web-channel empty-provider-response re-report as expected (Sentry TAURI-RUST-4Z1)#2808
Conversation
|
Warning Review limit reached
More reviews will be available in 54 minutes and 45 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds ChangesEmpty Provider Response Error Classification
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
graycyrus
left a comment
There was a problem hiding this comment.
Clean fix. The two-emit-site pattern is well-understood here (same shape as MaxIterationsExceeded and the 4P0/4K5 embedding session-expired pair), and the implementation is exactly right.
The predicate anchors on "model returned an empty response" rather than the looser "empty response", which keeps the summarizer fall-through and extract-tool graceful-empty paths actionable. The rejection test covers all three of those cases explicitly. Dispatcher arm runs last, documentation is thorough, tracing fields are consistent with sibling variants.
CI is fully green — Rust Quality, Core Coverage, Tauri Coverage, and the Coverage Gate all passed. LGTM.
Auto-corrected: reviewer policy violation — clean PRs must go to to-be-approved/, never auto-APPROVE
oxoxDev
left a comment
There was a problem hiding this comment.
Author: @CodeGhost21 (
MEMBER— core team)
Core fix sound — anchor "model returned an empty response" correctly distinguishes the user-facing bail at turn.rs:815 from sibling non-failure phrasings; dispatcher placement last in expected_error_kind is right; demote to tracing::warn! consistent with MaxIterationsExceeded.
Two forward-compat improvements (not blockers):
Nitpick 1 — src/core/observability.rs:743 — rejection contract narrower than actual sibling surface
Grepping src/ for "empty response" surfaces 3 more legitimate non-failure phrasings that should be in the rejection test for forward-compat — if anyone ever loosens the anchor to "empty response", these would silently stop reaching Sentry:
src/openhuman/channels/bus.rs:185—"[channel-inbound] agent returned empty response — finalizing draft with fallback"(channel-inbound graceful fallback; routes throughreport_error_or_expected)src/openhuman/memory/query/walk.rs:292—"[memory_tree_walk] turn={turn} LLM gave up (empty response)"src/openhuman/learning/reflection.rs:576—"[learning] reflection skipped (empty response — gate off or local AI unavailable)"
Boundary case worth pinning too: src/openhuman/agent/harness/session/turn.rs:811 uses "provider returned an empty final response" — different subject (provider vs model), MUST NOT classify. Worth an explicit assert in the same rejection test.
Suggested addition:
for raw in [
// existing entries…
"[channel-inbound] agent returned empty response — finalizing draft with fallback",
"[memory_tree_walk] turn=3 LLM gave up (empty response)",
"[learning] reflection skipped (empty response — gate off or local AI unavailable)",
"[agent_loop] provider returned an empty final response (i=2, no text, no tool calls)",
] {
assert_eq!(expected_error_kind(raw), None, "must NOT classify: {raw}");
}Nitpick 2 — variant doc-comment understates cross-channel coverage
PR title + variant doc-comment frame this as a web_channel.run_chat_task re-report fix. But the classifier runs in the central expected_error_kind dispatcher — it also catches matching messages routed through channels/runtime/dispatch.rs:1075 and channels/runtime/supervision.rs:61. Any channel provider (slack / discord / whatsapp / etc) that re-routes through the central funnel now demotes too. That's the desired outcome but worth an explicit doc note so a future reader doesn't re-introduce per-channel typed suppression.
Append to variant doc:
/// Although the immediate trigger is the web-channel re-report, this
/// classifier runs in the central `expected_error_kind` dispatcher, so any
/// caller of `report_error_or_expected` (channels/runtime/dispatch.rs,
/// channels/runtime/supervision.rs, any future channel provider) whose
/// error chain contains "model returned an empty response" is also
/// demoted — no per-channel typed suppression needed.
Question — graycyrus DISMISSED on same SHA
gh api repos/.../pulls/2808/reviews shows graycyrus APPROVED then DISMISSED at the same commit SHA (0858e59e). Worth a quick ping to confirm dismissal was intentional vs UI mishap before merging.
CI 30/30 green. mergeStateStatus=BLOCKED is review-gate per [[feedback_pr_blocked_review_vs_ci]], not CI. Per the nits being non-blocking, leaving this as COMMENT — happy to flip to APPROVE once either nit lands or you confirm to skip them.
M3gA-Mind
left a comment
There was a problem hiding this comment.
PR #2808 — fix(observability): classify web-channel empty-provider-response re-report as expected
Walkthrough
Correct two-emit-site fix. The agent-layer suppression (PR #2790, AgentError::skips_sentry()) can't reach the web_channel.run_chat_task re-report because the typed error was flattened to a String at the native-bus boundary. Adding EmptyProviderResponse to the central expected_error_kind dispatcher closes that second emit site — the exact same pattern as MaxIterationsExceeded and the 4P0/4K5 embedding session-expired pair.
Changes
| File | Summary |
|---|---|
src/core/observability.rs |
Add EmptyProviderResponse variant, is_empty_provider_response_message predicate, dispatcher arm, demote handler, 2 tests; expand rejection test with 4 more sibling phrases; add cross-channel coverage note to variant doc |
Actionable comments (0)
No blockers or major issues.
Verified / looks good
- Anchor
"model returned an empty response"correctly excludes"summarizer returned empty response","provider returned an empty response; returning empty extraction","provider returned an empty final response","agent returned empty response", and two other graceful-empty phrases — all now pinned by the rejection test. - Dispatcher placement last in
expected_error_kindmeans any more-specific matcher always wins. tracing::warn!demotion consistent withMaxIterationsExceededand other deterministic agent-state outcomes.- Variant doc now notes that the classifier runs in the central dispatcher and covers all channel providers (
channels/runtime/dispatch.rs,supervision.rs), not justweb_channel.run_chat_task. - All CI checks pass (30/30).
…eport as expected (Sentry TAURI-RUST-4Z1) Add `ExpectedErrorKind::EmptyProviderResponse` + `is_empty_provider_response_message` predicate anchored on the user-facing string `"model returned an empty response"`. Routes the `web_channel.run_chat_task` re-report through the demote path (`tracing::warn!`, breadcrumb only) instead of a Sentry error event. Root cause: when the provider/model returns a completely empty body (`text_chars=0 thinking_chars=0 tool_calls=0`), the agent harness (`agent::harness::session::turn`) bails with the user-facing `"The model returned an empty response. Please try again."` string. PR tinyhumansai#2790 (TAURI-RUST-4JX) suppresses the AGENT-layer Sentry event via the typed `AgentError::EmptyProviderResponse` + `skips_sentry()`. But `channels::providers::web::run_chat_task` RE-REPORTS the same failure under `domain=web_channel operation=run_chat_task` after the typed error was flattened to a `String` at the native-bus boundary — so the typed suppression can't reach it and it escapes as a fresh Sentry event (TAURI-RUST-4Z1). Same two-emit-site pattern as the embedding 4P0/4K5 session-expired fix; mirrors how `MaxIterationsExceeded` is suppressed at both the agent layer and the web_channel `report_error_or_expected` funnel. Anchored on `"model returned an empty response"` (not the looser `"empty response"`) so the internal fall-through paths stay actionable: `payload_summarizer` (`"summarizer returned empty response, falling through"`) and `subagent_runner::extract_tool` (`"provider returned an empty response; returning empty extraction"`) use different subjects and are NOT silenced — pinned by the rejection test. 3 new tests: web-channel wire shape (verbatim TAURI-RUST-4Z1 body) + bare user-facing string, and a 3-case rejection contract (summarizer fall-through, extract-tool graceful empty, generic health-probe empty body). cargo test --lib core::observability::tests -> 94 passed. Sentry-Issue: TAURI-RUST-4Z1
…oc-comment - Add 4 more sibling phrases to the rejection test (channel-inbound graceful fallback, memory walk, reflection skip, and turn.rs "provider returned an empty final response") so any future anchor loosening can't silently stop those paths reaching Sentry. - Expand the EmptyProviderResponse variant doc to note that the classifier runs in the central dispatcher and covers all channel providers, not just web_channel.run_chat_task. Addresses nitpicks from code review.
b434802 to
e33102a
Compare
Summary
ExpectedErrorKind::EmptyProviderResponse+is_empty_provider_response_messagepredicate (anchored on"model returned an empty response") insrc/core/observability.rs. Theweb_channel.run_chat_taskre-report of an empty-provider-response now routes throughreport_expected_message→tracing::warn!(breadcrumb only) instead of a Sentry error event.TAURI-RUST-4Z1(run_chat_task failed … error=The model returned an empty response. Please try again.,channel=web domain=web_channel operation=run_chat_task, onopenhuman@0.56.0+e8968077aeb5).Problem
When the provider/model completes a turn with a completely empty body (
text_chars=0 thinking_chars=0 tool_calls=0), the agent harness atsrc/openhuman/agent/harness/session/turn.rs:806bails with the user-facing string:This is a model / user-config condition — a quirky or broken local fine-tune that returns nothing, or a provider that dropped the stream — not a code bug. The same failure surfaces at two emit sites:
agent::run_singleAgentError::EmptyProviderResponse+AgentError::skips_sentry()channels::providers::web::run_chat_taskrun_chat_task(src/openhuman/channels/providers/web.rs:1017) re-reports the failure underdomain=web_channelafter the typedAgentErrorwas flattened to aStringat the native-bus boundary (the file's own comment notes this: "Substring match is required here because the typedAgentErrorwas flattened to aString"). Because the type is gone, PR #2790'sskips_sentry()can't reach it, andobservability.rshad no string classifier for the condition — so it escaped as a fresh Sentry event.This is the same two-emit-site shape as the embedding session-expired fix (4P0 agent-layer + 4K5 embedding-layer), and it mirrors how
MaxIterationsExceededis already suppressed at both the agent layer and the web_channelreport_error_or_expectedfunnel.Solution
A string classifier in
observability.rs— the documented mechanism for catching the same error re-reported at a higher layer under a differentdomaintag:Dispatched last in
expected_error_kind(so a more specific matcher always wins) and demoted totracing::warn!— same tier asMaxIterationsExceeded, the other deterministic agent-state outcome surfaced to the user via thechat_errorevent.Polarity contract — anchored on
"model returned an empty response", NOT the looser"empty response", so the two internal fall-through paths stay actionable:payload_summarizer.rs:261—"summarizer returned empty response, falling through"(internal fall-through, not a failure).subagent_runner/extract_tool.rs:379—"provider returned an empty response; returning empty extraction"(graceful empty extraction).Both use different subjects (
summarizer/provider) and must keep reaching Sentry if they ever regress — pinned by the rejection test.Submission Checklist
src/core/observability.rs:classifies_empty_provider_response_web_channel_rereport— verbatim TAURI-RUST-4Z1run_chat_taskwire shape + the bare user-facing string.does_not_classify_unrelated_empty_response_phrases— 3-case rejection contract (summarizer fall-through, extract-tool graceful empty, generic health-probe empty body).cargo test --lib core::observability::tests→ 94 passed (2 new).core::observabilitymodule; not a tracked feature row indocs/TEST-COVERAGE-MATRIX.md.## Related— no matrix feature IDs affected.Closes #NNN— Sentry-only fix; no GitHub issue. TheSentry-Issuetrailer below carries the back-reference.Impact
web_channel.run_chat_taskempty-response re-report now classifies asEmptyProviderResponse, demoting totracing::warn!(no Sentry event). No behaviour change to the chat flow, thechat_errorevent the user sees, or theAgentErrorsemantics.tracing::warn!breadcrumb retains the full message (client/thread/request IDs) for correlation; only the Sentry capture is suppressed. A genuine regression in the summarizer / extract-tool fall-through paths still reaches Sentry (different wording, not matched).Notes for reviewers
agent/error.rs,agent/harness/session/{runtime,turn}.rs,cron/scheduler.rs); this PR touches onlysrc/core/observability.rs. Either merge order is clean.ExpectedErrorKindvariant immediately afterPromptInjectionBlockedand a predicate/arm. Trivial 2-line concurrent edit; git auto-merges (keep both variants).pnpm format) was skipped via--no-verifybecause the fresh worktree has nonode_modules; prettier is unable to run. The change is Rust-only andcargo fmt --manifest-path Cargo.toml --checkis clean.Related
src/openhuman/agent/harness/session/turn.rs:806. Web-channel re-report site:src/openhuman/channels/providers/web.rs:1017.MaxIterationsExceeded(suppressed at both the agent layer and the web_channel funnel) and the 4P0/4K5 embedding session-expired two-arm fix.Summary by CodeRabbit