fix(observability): classify OpenHuman/Embedding/streaming backend 'Invalid token' 401 as SessionExpired (TAURI-RUST-4P0 + 4K5 + 1EE) by CodeGhost21 · Pull Request #2786 · tinyhumansai/openhuman

CodeGhost21 · 2026-05-27T20:16:03Z

Summary

Extend is_session_expired_message in src/core/observability.rs to recognise the OpenHuman backend's {"success":false,"error":"Invalid token"} 401 envelope as a session-expired condition (previously only the explicit "Session expired. Please log in again." body was recognised). The same upstream cause — backend rejecting the bearer JWT as invalid — surfaces under three emit-site prefixes depending on the call path, each producing its own Sentry fingerprint:

| Sentry ID | Emit-site prefix | Source | Events |
|---|---|---|---|
| TAURI-RUST-4P0 | `OpenHuman API error (401 …)` | non-streaming chat (`run_chat_task`) | — |
| TAURI-RUST-4K5 | `Embedding API error (401 …)` | `embeddings/openai.rs:139` | ~118 |
| TAURI-RUST-1EE | `OpenHuman streaming API error (401 …)` | streaming chat (`compatible.rs:949`) | ~110 |

All three are typically preceded by a `[scheduler_gate] signed_out false -> true` breadcrumb. The UI already drives reauth via the SessionExpired event-domain path; this change stops the noise from leaking into Sentry as a code bug.

Why three arms, not one

The matcher uses conjunctive anchors per arm — "<emit-site prefix> (401" AND the envelope-shaped "\"error\":\"Invalid token\"". Anchoring on each OpenHuman-scoped prefix is what preserves the #2286 BYO-key contract:

"OpenAI API error (401 Unauthorized): invalid_api_key" (user's own OpenAI key revoked) must NOT match.
"OpenAI streaming API error (401): invalid_api_key" and "Embedding API error (401): invalid_api_key" (BYO embedding/streaming key revoked) must NOT match either.

Each of those is pinned by a dedicated does_not_classify_*_byo_key_401_as_session_expired polarity guard. A single broad "any 401 + invalid token" matcher would silence all of them, so each OpenHuman-backend emit-site prefix gets its own prefix-gated arm. In 1EE specifically, the `streaming` token sits between `OpenHuman` and `API error`, so the 4P0 anchor (`"OpenHuman API error (401"`) cannot match it.
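The conjunctive matching described above can be sketched roughly as follows. This is an illustration only, not the code in `src/core/observability.rs`: the function name `classify_sketch`, the constant names, and the one-variant enum are hypothetical stand-ins, and the real `is_session_expired_message` may be structured differently (for example, it also handles the explicit "Session expired. Please log in again." body, which is omitted here).

```rust
/// Hypothetical stand-in for the crate's real classification enum.
#[derive(Debug, PartialEq, Eq)]
enum ExpectedErrorKind {
    SessionExpired,
}

/// Emit-site prefixes named in this PR, one per Sentry fingerprint.
/// (Constant name is an assumption made for this sketch.)
const INVALID_TOKEN_PREFIXES: [&str; 3] = [
    "OpenHuman API error (401",           // TAURI-RUST-4P0
    "Embedding API error (401",           // TAURI-RUST-4K5
    "OpenHuman streaming API error (401", // TAURI-RUST-1EE
];

/// Envelope fragment produced by the backend's JWT-validity branch.
const INVALID_TOKEN_ENVELOPE: &str = r#""error":"Invalid token""#;

/// Returns `Some(SessionExpired)` only when BOTH anchors are present:
/// one of the emit-site prefixes AND the envelope-shaped body fragment.
fn classify_sketch(message: &str) -> Option<ExpectedErrorKind> {
    let has_prefix = INVALID_TOKEN_PREFIXES
        .iter()
        .any(|prefix| message.contains(prefix));
    let has_envelope = message.contains(INVALID_TOKEN_ENVELOPE);

    if has_prefix && has_envelope {
        Some(ExpectedErrorKind::SessionExpired)
    } else {
        None
    }
}
```

Under these assumptions, a message that wraps `{"success":false,"error":"Invalid token"}` behind any of the three prefixes classifies as `SessionExpired`, while `"OpenAI API error (401 Unauthorized): invalid_api_key"`, `"Embedding API error (401): invalid_api_key"`, and a bare `"Invalid token"` all return `None` because one of the two anchors is missing.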

Tests added (`observability::tests`)

classifies_openhuman_invalid_token_401_as_session_expired — 4P0 (wrapped + unwrapped).
classifies_embedding_api_invalid_token_401_as_session_expired — 4K5 (direct + wrapped).
classifies_openhuman_streaming_invalid_token_401_as_session_expired — 1EE (direct + wrapped).
does_not_classify_embedding_byo_key_401_as_session_expired — embedding polarity guard.
does_not_classify_streaming_byo_key_401_as_session_expired — streaming polarity guard.
Existing does_not_classify_byo_key_provider_401_as_session_expired (the #2286 chat-path contract, "SessionExpired clears the session for unrelated backend 401s") is still green.

Test plan

cargo test classifies_openhuman_invalid_token — passes (4P0)
cargo test classifies_embedding_api_invalid_token — passes (4K5)
cargo test classifies_openhuman_streaming_invalid_token — passes (1EE)
cargo test does_not_classify_embedding_byo_key / does_not_classify_streaming_byo_key / does_not_classify_byo_key_provider — pass (polarity)
cargo test core::observability — 93 tests pass, 0 regressions
cargo check --bin openhuman-core — passes
cargo fmt --check — clean

… SessionExpired

The OpenHuman backend rejects an expired/revoked JWT with the envelope `{"success":false,"error":"Invalid token"}` (vs. the explicit `"Session expired. Please log in again."` body that the existing classifier already catches). Same emit site (`providers::ops::api_error` → `web_channel.run_chat_task`), same wrapping, same expected user state; just a different body substring chosen by the backend's JWT-validity branch.

Issue tinyhumansai#2286 deliberately stopped matching bare `"Invalid token"` as session-expired because that string also surfaces from Discord / OAuth provider rejections, which are actionable scoped errors that must reach Sentry. We preserve that contract with a conjunctive matcher: BOTH the OpenHuman-scoped `"OpenHuman API error (401"` prefix AND the envelope-shaped `"\"error\":\"Invalid token\""` must be present.

tinyhumansai#2286 cases still route to Sentry (verified by the existing `does_not_classify_byo_key_provider_401_as_session_expired` test staying green):

- `"Invalid token"` → None ✓
- `"got an invalid token here"` → None ✓
- `"OpenAI API error (401 Unauthorized): invalid_api_key"` → None ✓
- `"Anthropic API error (401 Unauthorized): ..."` → None ✓

Targets Sentry OPENHUMAN-TAURI-4P0 (issue 5332): low volume so far (1 event) but the wire shape is durable. Every OpenHuman user with a stale JWT will hit this on the next agent turn, so quietly demoting it to a `warn!` log keeps the noise from compounding.

coderabbitai · 2026-05-27T20:16:18Z

Walkthrough

Adds conjunctive matching and docs in is_session_expired_message to classify OpenHuman/Embedding 401 "Invalid token" JSON envelopes as SessionExpired, and adds regression tests covering wrapped/unwrapped and negative BYO-key cases.

Changes

OpenHuman 401 Invalid Token Classification

| Layer / File(s) | Summary |
|---|---|
| Session-expired classification for OpenHuman 401 invalid-token `src/core/observability.rs` | Doc comment expanded to describe the OpenHuman 401 `{"error":"Invalid token"}` envelope and the embedding wrapper; `is_session_expired_message` updated to conjunctively require the `"OpenHuman API error (401"` or `"Embedding API error (401"` prefix plus the exact invalid-token envelope before returning `ExpectedErrorKind::SessionExpired`. |
| Regression tests for invalid-token classification `src/core/observability.rs` | New tests assert both wrapped `run_chat_task` and unwrapped OpenHuman invalid-token envelopes classify as `SessionExpired`; verify embedding invalid-token wrapper classifies as `SessionExpired`; and guard that BYO-key/generic 401 variants do not classify as session-expired. |

Sequence Diagram(s)

(omitted — change is a targeted classifier update and tests; no multi-component sequential flow requires visualization)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

tinyhumansai/openhuman#2188 — Related tests and changes touching is_session_expired_message and SessionExpired matching.
tinyhumansai/openhuman#1763 — Related updates to is_session_expired_message for OpenHuman 401 shapes.
tinyhumansai/openhuman#2200 — Integrates classifier changes into ReliableProvider retry/abort behavior that depends on SessionExpired detection.

Suggested reviewers

senamakel
graycyrus
M3gA-Mind

Poem

🐰 A token slipped out of date and key,
The 401 whispered, "Invalid token" to me.
I matched prefix and envelope with care,
So sessions expire only when they truly bear.
Hooray — tests hop in, the classifier set free!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly summarizes the main change: classifying OpenHuman/Embedding invalid token 401 responses as SessionExpired, which is the primary purpose of the changeset.


graycyrus

@CodeGhost21 hey! the code looks good to me — the conjunctive anchor approach is exactly right here. Requiring both the "OpenHuman API error (401" prefix and the envelope-shaped "\"error\":\"Invalid token\"" body together is the correct way to preserve the #2286 contract while catching this specific backend branch, and the two new test cases pin both the verbatim Sentry wire shape and the unwrapped emit shape cleanly.

CI still has a few checks pending (Windows E2E, core image build, Rust core coverage). Once those come back green I'll come back and approve this. Let me know if anything comes up in the meantime.

…sionExpired (TAURI-RUST-4K5)

TAURI-RUST-4K5 (~118 events, escalating on 0.56.0, domain=embeddings operation=openai_embed status=401) carries the same OpenHuman backend `{"success":false,"error":"Invalid token"}` envelope as 4P0, but the embedding client at `src/openhuman/embeddings/openai.rs:139` wraps it with the `"Embedding API error"` prefix instead of `"OpenHuman API error"`. The breadcrumb `[scheduler_gate] signed_out false -> true` immediately preceding the 401 in the event payload confirms it's the same session-expired cause, just emitted at the embedding layer.

The conjunctive `"OpenHuman API error (401"` anchor added in the previous commit catches the chat-API path; this commit adds a parallel `"Embedding API error (401"` anchor so the embedding path also routes to SessionExpired. The envelope-shaped `"\"error\":\"Invalid token\""` gate stays the same, so third-party BYO-key embedding 401s (OpenAI / Voyage / Cohere rejecting the user's own API key) continue to escalate as actionable misconfiguration. This is covered by the new `does_not_classify_embedding_byo_key_401_as_session_expired` polarity guard.

## Test plan

- [x] `cargo test classifies_embedding_api_invalid_token` — passes (new)
- [x] `cargo test does_not_classify_embedding_byo_key` — passes (new polarity guard)
- [x] `cargo test classifies_openhuman_invalid_token` — passes (4P0, unchanged)
- [x] `cargo test does_not_classify_byo_key_provider` — passes (tinyhumansai#2286 BYO-key contract preserved)
- [x] `cargo test core::observability` — 91 tests pass, 0 regressions
- [x] `cargo check --bin openhuman-core` — passes
- [x] `cargo fmt --check` — clean

…as SessionExpired (TAURI-RUST-1EE)

Third emit-site prefix for the same OpenHuman backend `{"success":false,"error":"Invalid token"}` 401 envelope this PR already classifies for non-streaming chat (4P0) and embeddings (4K5).

TAURI-RUST-1EE (Sentry issue 1807, 110 events, 109 on openhuman@0.56.0, domain=llm_provider operation=streaming_chat status=401 provider=OpenHuman) is the streaming-chat path: the body is wrapped at `inference/provider/compatible.rs:949` with the `"OpenHuman streaming API error"` prefix. The `streaming` token between `OpenHuman` and `API error` means the 4P0 anchor (`"OpenHuman API error (401"`) does not match it, so it needs its own prefix arm.

Same conjunctive-anchor pattern as the existing arms: the OpenHuman-scoped streaming prefix gates the match so a third-party BYO-key streaming 401 (`"OpenAI streaming API error (401): invalid_api_key"`) stays actionable in Sentry.

Tests:

- `classifies_openhuman_streaming_invalid_token_401_as_session_expired` — verbatim 1EE wire shape (direct + caller-wrapped).
- `does_not_classify_streaming_byo_key_401_as_session_expired` — polarity guard for the streaming prefix.

## Test plan

- [x] `cargo test classifies_openhuman_streaming_invalid_token` — passes
- [x] `cargo test does_not_classify_streaming_byo_key` — passes (polarity)
- [x] `cargo test core::observability` — 93 tests pass, 0 regressions
- [x] `cargo check --bin openhuman-core` — passes
- [x] `cargo fmt --check` — clean

graycyrus

@CodeGhost21 the two follow-on commits look clean — the 4K5 (Embedding API) and 1EE (streaming) arms follow the same conjunctive-anchor pattern as the original 4P0 arm, the polarity guards for BYO-key embedding and streaming 401s are solid, and the test coverage (direct + caller-wrapped shapes for each arm) is thorough.

One CI check is still pending (Windows / Appium Chromium E2E). Once that clears, I'll come back and approve this.

`reliable::format_failure_aggregate` (no-configured-fallbacks branch) wraps every exhausted `reliable_chat_with_system` turn with:

"The model `<name>` may not be available on your provider. Configure a fallback chain via `reliability.model_fallbacks` in your OpenHuman config, or change your default model in Settings → AI.\n\nAll providers/models failed. Attempts:\n…"

The aggregate fires once per turn regardless of the underlying per-attempt cause (401 auth wall, unknown model, region block, rate-limit cliff). All of those are user-actionable: pick a different model, fix the credential, or configure fallbacks. The message literally tells the user how. Sentry has no remediation path that the per-attempt body classifiers haven't already covered at the lower layer (`SessionExpired`, `BudgetExhausted`, config_rejection siblings, etc.).

Adds `"reliability.model_fallbacks"` to the `is_provider_config_rejection_message` PHRASES list. The string is uniquely OpenHuman: that config path is rendered into an error message only from `reliable.rs:332-334`, verified via grep across `src/`. A stray "may not be available" log line elsewhere will not collide.

The configured-fallbacks aggregate branch (just `"All providers/models failed. Attempts:\n…"`) is intentionally NOT matched. The user has already engaged with the knob, so per-attempt classifiers should drive the per-body decision.

Targets Sentry OPENHUMAN-TAURI-4JS (issue 5215): 25 events on v0.56.0 in 5h, `domain=llm_provider operation=reliable_chat_with_system failure=all_exhausted`. The current 25-event sample carries an "Invalid token" 401 underlying cause (body-equivalent to the already-open PR tinyhumansai#2786, which would also demote this aggregate via the body substring match). This PR catches the aggregate at the emit-site level so future all_exhausted scenarios with non-401 underlying causes (model name typo, region block, …) demote the same way.

Tests pin the verbatim 4JS payload + three underlying-cause variants (unknown-model upstream, region block, bare aggregate) + a negative guard confirming the configured-fallbacks branch does NOT classify on the aggregate phrase alone.
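A minimal sketch of the phrase-list approach described in that commit message, assuming a plain substring scan. The names `PHRASES_SKETCH`, `is_provider_config_rejection_sketch`, and `aggregate_sketch` are hypothetical; the real PHRASES list contains other entries not shown here, and the real `format_failure_aggregate` signature is not given in this thread.

```rust
/// Hypothetical phrase list: only this one entry is taken from the
/// commit message; the real list holds additional phrases.
const PHRASES_SKETCH: &[&str] = &["reliability.model_fallbacks"];

/// Substring scan over the phrase list (assumed matching strategy).
fn is_provider_config_rejection_sketch(message: &str) -> bool {
    PHRASES_SKETCH.iter().any(|phrase| message.contains(phrase))
}

/// Illustrative stand-in for the aggregate builder. The hint paragraph is
/// only prepended when no fallbacks are configured, mirroring the two
/// branches described in the commit message.
fn aggregate_sketch(model: &str, attempts: &[&str], fallbacks_configured: bool) -> String {
    let mut out = String::new();
    if !fallbacks_configured {
        out.push_str(&format!(
            "The model `{model}` may not be available on your provider. \
             Configure a fallback chain via `reliability.model_fallbacks` in your \
             OpenHuman config, or change your default model in Settings → AI.\n\n"
        ));
    }
    out.push_str("All providers/models failed. Attempts:\n");
    for attempt in attempts {
        out.push_str(attempt);
        out.push('\n');
    }
    out
}
```

In this sketch the no-configured-fallbacks aggregate matches regardless of what the per-attempt lines contain, while the configured-fallbacks aggregate and a stray "may not be available" line do not, since neither carries the `reliability.model_fallbacks` phrase.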

CodeGhost21 requested a review from a team May 27, 2026 20:16

coderabbitai Bot added the working and bug labels May 27, 2026

coderabbitai Bot previously approved these changes May 27, 2026

graycyrus reviewed May 27, 2026

CodeGhost21 dismissed coderabbitai[bot]’s stale review via 2a1706b May 27, 2026 21:23

CodeGhost21 changed the title ~~fix(observability): classify OpenHuman backend 'Invalid token' 401 as SessionExpired~~ fix(observability): classify OpenHuman/Embedding backend 'Invalid token' 401 as SessionExpired (TAURI-RUST-4P0 + 4K5) May 27, 2026

coderabbitai Bot previously approved these changes May 27, 2026

CodeGhost21 mentioned this pull request May 27, 2026

fix(observability): demote reliable_chat all_exhausted aggregate as ProviderConfigRejection (Sentry TAURI-RUST-4JS) #2797

Open

11 tasks

CodeGhost21 dismissed coderabbitai[bot]’s stale review via c04dcc3 May 28, 2026 04:49

CodeGhost21 changed the title ~~fix(observability): classify OpenHuman/Embedding backend 'Invalid token' 401 as SessionExpired (TAURI-RUST-4P0 + 4K5)~~ fix(observability): classify OpenHuman/Embedding/streaming backend 'Invalid token' 401 as SessionExpired (TAURI-RUST-4P0 + 4K5 + 1EE) May 28, 2026

CodeGhost21 mentioned this pull request May 28, 2026

fix(inference): publish SessionExpired for backend 401 on chat_completions (Sentry TAURI-RUST-N) #2814

Open

12 tasks

graycyrus reviewed May 28, 2026

oxoxDev assigned oxoxDev and unassigned oxoxDev May 28, 2026

CodeGhost21 wants to merge 3 commits into tinyhumansai:main from CodeGhost21:fix/observability-openhuman-invalid-token

3 participants
