Run saveUsageCost in background to fix slow proxy responses#4244

Open
jurgenwerk wants to merge 1 commit into main from cs-10501-ai-proxy-request-is-very-slow

@jurgenwerk jurgenwerk commented Mar 25, 2026

Summary

  • The _request-forward handler was awaiting saveUsageCost before returning the response to the client
  • saveUsageCost polls OpenRouter's generation cost API (/api/v1/generation?id=...) with exponential backoff (1s, 2s, 4s, 8s...) because cost data is often not immediately available (the API returns 404 until it is). This polling added 10-15+ seconds of latency to each response
  • Both the non-streaming and streaming code paths now run saveUsageCost in the background so responses return immediately
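
For readers following along, the change described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: handleBlocking, handleInBackground, UsageSaver, and the onError callback are hypothetical names, and the real saveUsageCost takes more arguments than shown here.

```typescript
// Hypothetical stand-in for creditStrategy.saveUsageCost; assumed to be an
// async function, so any failure surfaces as a rejected promise.
type UsageSaver = (responseData: unknown) => Promise<void>;

// Before: the handler waits for billing to finish before replying.
async function handleBlocking(
  responseData: unknown,
  saveUsageCost: UsageSaver,
): Promise<unknown> {
  await saveUsageCost(responseData);
  return responseData;
}

// After: billing is started but not awaited, so the response is returned
// right away. The .catch matters: without it, a failed save would become
// an unhandled promise rejection.
function handleInBackground(
  responseData: unknown,
  saveUsageCost: UsageSaver,
  onError: (err: unknown) => void = (err) =>
    console.error('saveUsageCost failed', err),
): unknown {
  saveUsageCost(responseData).catch(onError);
  return responseData;
}
```

The trade-off of the second form is that the caller can no longer observe whether the save succeeded, so errors have to be logged or reported from inside the background task.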

Test plan

  • Make a non-streaming request via _request-forward to OpenRouter and verify the response returns promptly (~5s instead of ~20s)
  • Make a streaming request and verify the [DONE] event is not delayed
  • Verify credits are still deducted correctly after the response is returned

🤖 Generated with Claude Code

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc12a4d213

// 6. Calculate and deduct credits in the background
// so we don't block the response on OpenRouter's generation cost API
destinationConfig.creditStrategy
.saveUsageCost(dbAdapter, matrixUserId, responseData)

P1: Keep credit charging in the request gating path

Making saveUsageCost fire-and-forget here returns the client response before billing is recorded, but credit enforcement still happens earlier, in validateCredits, so subsequent requests are authorized against stale balances. In this codebase, saveUsageCost can back off and retry for minutes when the generation cost is not yet available. That creates a large window in which a user can issue multiple requests and bypass the minimum-credit guard before deductions land. This is a production under-billing regression, and the same pattern is now used in the streaming [DONE] branch too.
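
The window described in this comment can be illustrated with a small simulation. This is only a sketch under assumed numbers: the Ledger shape, the simulate function, and the in-memory balance are hypothetical, whereas the real validateCredits and saveUsageCost work against a database.

```typescript
// Hypothetical in-memory ledger standing in for the user's credit balance.
interface Ledger {
  balance: number;
  minimum: number; // minimum-credit guard checked before each request
}

// Stand-in for the gating check that runs before a request is forwarded.
function validateCredits(ledger: Ledger): boolean {
  return ledger.balance >= ledger.minimum;
}

// Simulates `requests` back-to-back requests costing `cost` credits each and
// returns how many were authorized. With deferred = true, deductions land
// only after every request has already passed the gating check, which models
// saveUsageCost still retrying in the background.
function simulate(
  ledger: Ledger,
  requests: number,
  cost: number,
  deferred: boolean,
): number {
  const pending: number[] = [];
  let authorized = 0;
  for (let i = 0; i < requests; i++) {
    if (!validateCredits(ledger)) continue; // request rejected
    authorized++;
    if (deferred) {
      pending.push(cost); // billed later, balance stays stale for now
    } else {
      ledger.balance -= cost; // billed before the next request is checked
    }
  }
  for (const c of pending) ledger.balance -= c;
  return authorized;
}
```

With a starting balance of 10, a minimum of 1, and a cost of 5 per request, five requests with inline deduction authorize two and leave a balance of 0. With deferred deduction all five are authorized and the balance ends at -15.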

@jurgenwerk force-pushed the cs-10501-ai-proxy-request-is-very-slow branch 4 times, most recently from ff5274e to dbe6040, on March 25, 2026 09:50

github-actions bot commented Mar 25, 2026

Host Test Results

1 file (±0), 1 suite (±0), duration 3h 46m 38s ⏱️ (+1h 37m 2s)
2 051 tests (±0): 2 036 ✅ (±0), 15 💤 (±0), 0 ❌ (±0)
4 006 runs (+1 940): 3 977 ✅ (+1 926), 29 💤 (+14), 0 ❌ (±0)

Results for commit 26ab140. ± Comparison against base commit 0ce7ece.

♻️ This comment has been updated with latest results.

The request forward handler was awaiting saveUsageCost before returning
the response. This function polls OpenRouter's generation cost API with
exponential backoff (1s, 2s, 4s, 8s...) because cost data is often not
immediately available, adding 10-15+ seconds of latency. Run it in the
background instead so responses return immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
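
To make the latency figure in the commit message concrete, the backoff arithmetic can be sketched as below. This is an assumption-laden sketch: the 1s base and the doubling come from the description above, but backoffDelayMs, totalWaitMs, pollWithBackoff, the maxAttempts parameter, and the convention of returning undefined for "not yet available" are hypothetical and are not taken from the actual saveUsageCost implementation.

```typescript
// Delay before retry number `retry` (0-based), doubling from a 1s base.
function backoffDelayMs(retry: number, baseMs = 1000): number {
  return baseMs * 2 ** retry;
}

// Total time spent sleeping if the first `failedAttempts` polls all miss.
function totalWaitMs(failedAttempts: number, baseMs = 1000): number {
  let total = 0;
  for (let i = 0; i < failedAttempts; i++) {
    total += backoffDelayMs(i, baseMs);
  }
  return total;
}

// Generic poller. `fetchCost` resolves to undefined while the cost is not
// yet available (for example, while the API still answers 404).
// `sleep` is injectable so the loop can be exercised without real delays.
async function pollWithBackoff<T>(
  fetchCost: () => Promise<T | undefined>,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T | undefined> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const cost = await fetchCost();
    if (cost !== undefined) return cost;
    // Sleep only if another attempt will follow.
    if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
  }
  return undefined;
}
```

Under these assumptions, four consecutive misses cost 1 + 2 + 4 + 8 = 15 seconds of sleeping alone, which is consistent with the 10-15+ seconds of added latency reported above.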
@jurgenwerk force-pushed the cs-10501-ai-proxy-request-is-very-slow branch from dbe6040 to 26ab140 on March 25, 2026 10:21