Chutes tiered fallback: extend per-op capability filter to rerank/embeddings/score #770

@Evrard-Nil

Description

Follow-up from #768 review (non-blocking; edge case today).

#768 added capability filters in retry_with_fallback_caps so that a NEAR failure doesn't fall through to Chutes for operations Chutes can't serve: filter_streaming_capable (streaming) and filter_client_e2ee_capable (client-facing E2EE). Without the filter, Chutes' "unsupported" error overwrites the retryable NEAR error, so the caller sees a non-retryable failure and the retry is suppressed.
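To make the masking concrete, here is a minimal, self-contained sketch. It is not the code from #768: `Provider`, `ProviderError`, `try_in_order` and `filter_streaming` are hypothetical stand-ins, and the "last error wins" loop is an assumption about how the fallback reports failures. It also assumes the Chutes tier rejects streaming, as the description above implies.

```rust
/// Hypothetical error type: only the retryable/unsupported distinction matters here.
#[derive(Debug, Clone, PartialEq)]
enum ProviderError {
    Retryable(String),
    Unsupported(String),
}

/// Hypothetical provider descriptor. `streaming` stands in for whatever the
/// real streaming-capability filter inspects.
#[derive(Debug, Clone)]
struct Provider {
    name: &'static str,
    streaming: bool,
}

impl Provider {
    /// Simulated streaming call: providers without streaming reject the
    /// request outright; the others fail with a transient error.
    fn stream(&self) -> Result<String, ProviderError> {
        if !self.streaming {
            return Err(ProviderError::Unsupported(format!(
                "{}: streaming unsupported",
                self.name
            )));
        }
        Err(ProviderError::Retryable(format!(
            "{}: upstream timeout",
            self.name
        )))
    }
}

/// Try each provider in order and return the last error seen if none succeeds.
/// This "last error wins" behaviour is what lets a later provider mask an
/// earlier retryable failure.
fn try_in_order(providers: &[Provider]) -> Result<String, ProviderError> {
    let mut last_err = ProviderError::Unsupported("no providers".to_string());
    for p in providers {
        match p.stream() {
            Ok(out) => return Ok(out),
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}

/// Keep only the providers that can serve streaming requests.
fn filter_streaming(providers: &[Provider]) -> Vec<Provider> {
    providers.iter().filter(|p| p.streaming).cloned().collect()
}
```

With a pool of `near` (streaming) followed by `chutes` (non-streaming), `try_in_order` on the unfiltered pool ends with `Unsupported`, while running it on `filter_streaming(&pool)` surfaces the `Retryable` error, which an outer retry loop could then act on.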

rerank / embeddings / score (crates/services/src/inference_provider_pool/mod.rs, around line 2700) have the same masking risk: on a NEAR+Chutes canonical id, a retryable NEAR failure would fall through to Chutes, whose "unsupported" error would then replace it.

This is not reachable today: Chutes is chat-only and isn't configured as a fallback for rerank/embeddings/score canonical ids. If that ever changes, extend the capability-filter pattern (e.g. a supports_<op>() capability) to those paths.
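One possible shape for that extension, sketched under assumptions: a single per-operation check instead of one `supports_<op>()` method per operation. `Op`, `Backend`, `supports` and `filter_op_capable` are hypothetical names, not existing items in the crate, and the capability sets in the usage note are illustrative only.

```rust
use std::collections::HashSet;

/// Hypothetical operation kinds routed through the provider pool.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Op {
    Chat,
    Rerank,
    Embeddings,
    Score,
}

/// Hypothetical backend descriptor carrying the set of operations it can serve.
#[derive(Debug, Clone)]
struct Backend {
    name: &'static str,
    ops: HashSet<Op>,
}

impl Backend {
    /// Build a backend from a name and the operations it supports.
    fn new(name: &'static str, ops: &[Op]) -> Self {
        Backend {
            name,
            ops: ops.iter().copied().collect(),
        }
    }

    /// Generic counterpart of a `supports_<op>()` family of methods.
    fn supports(&self, op: Op) -> bool {
        self.ops.contains(&op)
    }
}

/// Drop fallback candidates that cannot serve `op`, preserving tier order,
/// so their "unsupported" errors can never replace an earlier retryable one.
fn filter_op_capable(candidates: &[Backend], op: Op) -> Vec<&Backend> {
    candidates.iter().filter(|b| b.supports(op)).collect()
}
```

For example, with `Backend::new("near", &[Op::Chat, Op::Rerank, Op::Embeddings, Op::Score])` and `Backend::new("chutes", &[Op::Chat])` in one pool, `filter_op_capable(&pool, Op::Rerank)` keeps only `near`, while `Op::Chat` keeps both. Whether a set-based check or separate boolean methods fits the existing filters better is a design choice for whoever picks this up.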

Labels: enhancement
