Classify Cloudflare AI errors with error codes and retry guidance #61

@whoabuddy

Problem

The Cloudflare AI chat endpoint catches all errors with a single generic handler that returns 500 "Chat completion failed". AI agent callers cannot distinguish transient errors (timeout, rate limit) from permanent ones (model not found, bad request), so they cannot implement intelligent retry logic.

Log evidence (Feb 28 – Mar 2, 2026): multiple "Cloudflare AI chat error" log entries carry different underlying messages (timeout, model not found, rate limit exceeded), yet every one of them reaches the caller as the same 500.

Affected File

src/endpoints/inference/cloudflare/chat.ts, lines 226–233

} catch (error) {
  log.error("Cloudflare AI chat error", {
    model,
    error: error instanceof Error ? error.message : String(error),
  });

  return this.errorResponse(c, "Chat completion failed", 500);
}

The errorResponse method (base.ts, lines 53–69) already accepts an extra parameter:

protected errorResponse(
  c: AppContext,
  error: string,
  status: ContentfulStatusCode,
  extra: Record<string, unknown> = {}
): Response
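
For illustration, here is a minimal sketch of how the extra fields might land in the response body. The body of errorResponse in base.ts is not shown in this issue, so buildErrorBody is a hypothetical stand-in, and spreading extra after ok and error is an assumption.

```typescript
// Hypothetical stand-in for the body-building step of errorResponse.
// Assumption: the real method merges `extra` into the JSON body like this.
function buildErrorBody(
  error: string,
  extra: Record<string, unknown> = {}
): Record<string, unknown> {
  // `ok` and `error` come first so the payload keeps its current shape
  // when `extra` is empty.
  return { ok: false, error, ...extra };
}

const body = buildErrorBody("Cloudflare AI request timed out", {
  error_code: "AI_TIMEOUT",
  retryable: true,
  retry_after_seconds: 5,
});
console.log(JSON.stringify(body, null, 2));
```

With an empty extra, the output matches the current response, so existing callers would be unaffected under this assumption.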

Current Behavior

All Cloudflare AI errors return:

{
  "ok": false,
  "error": "Chat completion failed"
}

The status is always HTTP 500, regardless of the underlying cause.

Expected Behavior

Errors are classified and return actionable fields:

{
  "ok": false,
  "error": "Cloudflare AI request timed out",
  "error_code": "AI_TIMEOUT",
  "retryable": true,
  "retry_after_seconds": 5
}

{
  "ok": false,
  "error": "Cloudflare AI rate limit exceeded",
  "error_code": "AI_RATE_LIMITED",
  "retryable": true,
  "retry_after_seconds": 60
}

{
  "ok": false,
  "error": "Model not found or not available",
  "error_code": "AI_MODEL_NOT_FOUND",
  "retryable": false
}

{
  "ok": false,
  "error": "Cloudflare AI internal error",
  "error_code": "AI_INTERNAL_ERROR",
  "retryable": true,
  "retry_after_seconds": 10
}
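
A caller could use these fields roughly as follows. This is a sketch from the consumer side, not part of the proposed change: ChatErrorBody and nextRetryDelayMs are hypothetical names, and the attempt limit and 1-second fallback are assumptions.

```typescript
// Shape of the error payloads shown above; the new fields are optional
// so the type also accepts today's generic response.
interface ChatErrorBody {
  ok: false;
  error: string;
  error_code?: string;
  retryable?: boolean;
  retry_after_seconds?: number;
}

// Returns the delay in milliseconds before the next attempt,
// or null when the caller should give up.
function nextRetryDelayMs(
  body: ChatErrorBody,
  attempt: number, // 1-based count of attempts made so far
  maxAttempts = 3 // assumption: callers cap their own retries
): number | null {
  if (body.retryable !== true || attempt >= maxAttempts) {
    return null;
  }
  // Assumption: fall back to 1 second when no hint is provided.
  const seconds = body.retry_after_seconds ?? 1;
  return seconds * 1000;
}
```

A response without retryable is treated as non-retryable here, which keeps callers conservative against the current generic 500.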

Error Classification

| Error pattern | HTTP status | error_code | retryable |
| --- | --- | --- | --- |
| Timeout / AbortError | 504 | AI_TIMEOUT | true |
| Rate limit (429 from CF AI) | 429 | AI_RATE_LIMITED | true |
| Model not found | 404 | AI_MODEL_NOT_FOUND | false |
| Other / unknown | 500 | AI_INTERNAL_ERROR | true |
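
One possible shape for the classifier is sketched below. classifyAiError and its types are hypothetical names, and the message patterns are assumptions based on the log messages quoted in this issue, not on a documented Cloudflare AI error format. The retry_after_seconds values are the ones from the examples under Expected Behavior.

```typescript
type AiErrorCode =
  | "AI_TIMEOUT"
  | "AI_RATE_LIMITED"
  | "AI_MODEL_NOT_FOUND"
  | "AI_INTERNAL_ERROR";

interface ClassifiedAiError {
  status: 404 | 429 | 500 | 504;
  message: string; // caller-facing text from the examples in this issue
  extra: {
    error_code: AiErrorCode;
    retryable: boolean;
    retry_after_seconds?: number;
  };
}

// Hypothetical helper: maps a caught value to one row of the table above.
function classifyAiError(error: unknown): ClassifiedAiError {
  const name = error instanceof Error ? error.name : "";
  const text = (
    error instanceof Error ? error.message : String(error)
  ).toLowerCase();

  // Assumption: timeouts surface as AbortError/TimeoutError, or the
  // message mentions "timeout" / "timed out".
  if (name === "AbortError" || name === "TimeoutError" || /timed?\s?out/.test(text)) {
    return {
      status: 504,
      message: "Cloudflare AI request timed out",
      extra: { error_code: "AI_TIMEOUT", retryable: true, retry_after_seconds: 5 },
    };
  }
  // Assumption: rate limiting mentions "rate limit" or a bare 429.
  if (/rate limit|\b429\b/.test(text)) {
    return {
      status: 429,
      message: "Cloudflare AI rate limit exceeded",
      extra: { error_code: "AI_RATE_LIMITED", retryable: true, retry_after_seconds: 60 },
    };
  }
  // Assumption: a missing model is reported with "model ... not found".
  if (/model.*not found/.test(text)) {
    return {
      status: 404,
      message: "Model not found or not available",
      extra: { error_code: "AI_MODEL_NOT_FOUND", retryable: false },
    };
  }
  // Everything else falls through to the retryable internal-error class.
  return {
    status: 500,
    message: "Cloudflare AI internal error",
    extra: { error_code: "AI_INTERNAL_ERROR", retryable: true, retry_after_seconds: 10 },
  };
}
```

Matching on message text is brittle. If the Cloudflare AI binding exposes a structured status or code on its errors, checking that first would be preferable.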

Acceptance Criteria

  • The catch block at chat.ts:226–233 inspects the error message/type to determine classification
  • Response includes error_code (string), retryable (boolean), and optionally retry_after_seconds (number) via the extra parameter of errorResponse
  • HTTP status code reflects the error class (504 for timeout, 429 for rate limit, 404 for model not found, 500 for internal)
  • Internal log still captures the full original error message
  • At least one test covers each error class
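
The last criterion could be covered with a small table-driven check like the sketch below. It is framework-agnostic and takes the classifier as a parameter, since the final helper name is not decided. All names here are hypothetical, and the sample error messages are assumptions modelled on the log messages quoted above.

```typescript
interface ExpectedClass {
  status: number;
  error_code: string;
  retryable: boolean;
}

interface ClassificationCase {
  name: string;
  error: unknown;
  expected: ExpectedClass;
}

// Builds an error shaped like the one an aborted request would throw.
function abortError(): Error {
  const e = new Error("The operation was aborted");
  e.name = "AbortError";
  return e;
}

// One case per row of the Error Classification table.
const classificationCases: ClassificationCase[] = [
  { name: "timeout", error: abortError(),
    expected: { status: 504, error_code: "AI_TIMEOUT", retryable: true } },
  { name: "rate limit", error: new Error("rate limit exceeded"),
    expected: { status: 429, error_code: "AI_RATE_LIMITED", retryable: true } },
  { name: "model not found", error: new Error("model not found"),
    expected: { status: 404, error_code: "AI_MODEL_NOT_FOUND", retryable: false } },
  { name: "unknown", error: new Error("unexpected failure"),
    expected: { status: 500, error_code: "AI_INTERNAL_ERROR", retryable: true } },
];

// Returns the names of the cases the given classifier gets wrong.
function findMismatches(
  classify: (error: unknown) => ExpectedClass
): string[] {
  return classificationCases
    .filter(({ error, expected }) => {
      const actual = classify(error);
      return (
        actual.status !== expected.status ||
        actual.error_code !== expected.error_code ||
        actual.retryable !== expected.retryable
      );
    })
    .map((c) => c.name);
}
```

In a real test file, each case would more likely become its own test so that failures are reported per error class.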
