Follow-up from #768 review (non-blocking; edge case today).
#768 added capability filters in retry_with_fallback_caps so a NEAR failure doesn't fall through to Chutes for operations Chutes can't serve — filter_streaming_capable (streaming) and filter_client_e2ee_capable (client-facing E2EE). Without the filter, Chutes' "unsupported" error overwrites the retryable NEAR error and suppresses retry.
The rerank / embeddings / score paths (crates/services/src/inference_provider_pool/mod.rs ~2700) have the same masking risk: on a NEAR+Chutes canonical id, a retryable NEAR failure would fall through to Chutes, whose "unsupported" error would replace it and suppress the retry.
Not reachable today — Chutes is chat-only and isn't configured as a fallback for rerank/embeddings/score canonical ids. If that ever changes, extend the capability-filter pattern (e.g. a supports_<op>() capability) to those paths.
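
For illustration, here is a minimal standalone sketch of the masking and of what a per-operation filter would change. Nothing here mirrors the real code in inference_provider_pool: Op, Provider, ProviderError, filter_op_capable and try_in_order are all hypothetical names, and Provider::supports stands in for the supports_<op>() capability suggested above.

```rust
// Hypothetical sketch only: these types do not exist in the codebase.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    Rerank,
    Embeddings,
    Score,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum ProviderError {
    /// Transient failure; the caller is expected to retry.
    Retryable(String),
    /// The provider cannot serve this operation at all.
    Unsupported(String),
}

#[derive(Debug, Clone)]
struct Provider {
    name: &'static str,
    supported: Vec<Op>,
    healthy: bool,
}

impl Provider {
    /// Stand-in for a supports_<op>() style capability check.
    fn supports(&self, op: Op) -> bool {
        self.supported.contains(&op)
    }

    /// Simulated call: unsupported ops fail first, then unhealthy providers
    /// fail with a retryable error.
    fn call(&self, op: Op) -> Result<String, ProviderError> {
        if !self.supports(op) {
            return Err(ProviderError::Unsupported(format!(
                "{} cannot serve {:?}",
                self.name, op
            )));
        }
        if !self.healthy {
            return Err(ProviderError::Retryable(format!("{} timed out", self.name)));
        }
        Ok(format!("{} served {:?}", self.name, op))
    }
}

/// Keep only the providers that can serve `op` (assumed analogue of
/// filter_streaming_capable / filter_client_e2ee_capable).
fn filter_op_capable<'a>(providers: &'a [Provider], op: Op) -> Vec<&'a Provider> {
    providers.iter().filter(|p| p.supports(op)).collect()
}

/// Try providers in order and return the first success, else the LAST error.
/// Returning the last error is what lets a fallback mask an earlier one.
fn try_in_order(providers: &[&Provider], op: Op) -> Result<String, ProviderError> {
    let mut last_err = ProviderError::Unsupported("no providers".to_string());
    for p in providers {
        match p.call(op) {
            Ok(v) => return Ok(v),
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}
```

With an unhealthy NEAR that supports embeddings followed by a Chutes that does not, try_in_order over the unfiltered list ends on Chutes' Unsupported error, while running it over filter_op_capable(&pool, Op::Embeddings) preserves NEAR's Retryable error so the caller can still retry.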