fix(rag): add sse keepalive heartbeat to prevent nginx 60s timeout #434

Merged
AndreLiar merged 1 commit into dev from fix/rag-stream-sse-keepalive
Jun 15, 2026

@AndreLiar

Owner

Root cause

The RAG /stream endpoint uses SSE. The embedding phase (multi-query expansion via CPU-only ollama) takes 3–5 minutes on the current prod infra. Nginx's default proxy_read_timeout is 60 seconds — it kills any SSE connection where no body bytes are written within that window.

Result: the SSE stream was dropped silently 60s after opening. The frontend stayed on "Retrieving context…" indefinitely, then reset to 0 messages. The backend actually completed the pipeline successfully (confirmed in logs: totalTime: 323330ms, messages saved to DB) but no client was connected anymore to receive the events.
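
The failure mode can be sketched as a small simulation. This is a simplified model, not nginx's actual implementation: it assumes the proxy drops the connection as soon as the gap between two successive writes exceeds the timeout, and that before the fix nothing was written until the pipeline finished. The function names `findProxyDropTime` and `heartbeatTimes` are hypothetical; the two constants come from the figures quoted above.

```javascript
// Simplified model of a proxy read timeout: the connection is dropped once
// the gap between two successive upstream writes exceeds `timeoutMs`.
// Returns the time (ms since the stream opened) at which the drop happens,
// or null if the stream survives until `durationMs`.
function findProxyDropTime(writeTimesMs, durationMs, timeoutMs) {
  let lastActivity = 0; // the stream opens at t = 0
  const sorted = [...writeTimesMs].sort((a, b) => a - b);
  for (const t of sorted) {
    if (t > durationMs) break;
    if (t - lastActivity > timeoutMs) return lastActivity + timeoutMs;
    lastActivity = t;
  }
  return durationMs - lastActivity > timeoutMs ? lastActivity + timeoutMs : null;
}

// Write times produced by a heartbeat sent every `intervalMs`.
function heartbeatTimes(intervalMs, durationMs) {
  const times = [];
  for (let t = intervalMs; t <= durationMs; t += intervalMs) times.push(t);
  return times;
}

const PIPELINE_MS = 323330; // totalTime reported in the backend logs
const NGINX_READ_TIMEOUT_MS = 60000; // nginx default proxy_read_timeout

// Before the fix (assumed): the first write only happens when the pipeline ends.
console.log(findProxyDropTime([PIPELINE_MS], PIPELINE_MS, NGINX_READ_TIMEOUT_MS));

// After the fix: a keepalive comment every 20 s.
console.log(
  findProxyDropTime(heartbeatTimes(20000, PIPELINE_MS), PIPELINE_MS, NGINX_READ_TIMEOUT_MS)
);
```

Under this model the first call reports a drop at 60000 ms and the second reports `null`, which matches the behaviour described: the stream dies one minute in without a heartbeat and survives the full run with one.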

Fix

Send an SSE comment (: keepalive\n\n) every 20 seconds from the moment the stream opens. SSE comments are valid per spec, ignored by browsers, and count as body bytes — so nginx resets its idle timer on each one, keeping the connection alive for the full pipeline duration.

const heartbeat = setInterval(() => {
  if (!closed && !res.writableEnded) res.write(': keepalive\n\n');
}, 20000);
// cleared in finally block
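
For readers who want the whole lifecycle in one place, here is a minimal sketch of the same idea wrapped in a helper. It is not the code in ragController.js: `startSseHeartbeat`, its `timers` parameter, and the usage shown in the trailing comment (including `runPipeline`) are hypothetical names introduced for illustration. The injectable timers are an assumption made so the logic can be exercised without waiting 20 seconds.

```javascript
// Starts a keepalive heartbeat on an SSE response and returns a stop function.
// `res` only needs `write()` and `writableEnded`, as on Node's http.ServerResponse.
// `timers` is injectable (hypothetical design choice) so tests can drive ticks manually.
function startSseHeartbeat(
  res,
  intervalMs = 20000,
  timers = {
    setInterval: (fn, ms) => setInterval(fn, ms),
    clearInterval: (handle) => clearInterval(handle),
  }
) {
  let stopped = false;

  const handle = timers.setInterval(() => {
    // Never write after the heartbeat was stopped or the response has ended.
    if (!stopped && !res.writableEnded) res.write(': keepalive\n\n');
  }, intervalMs);

  // Idempotent: safe to call from both a 'close' listener and a finally block.
  return function stop() {
    if (stopped) return;
    stopped = true;
    timers.clearInterval(handle);
  };
}

// Hypothetical usage inside a request handler:
//   const stopHeartbeat = startSseHeartbeat(res);
//   req.on('close', stopHeartbeat);
//   try { await runPipeline(); } finally { stopHeartbeat(); res.end(); }
```

Returning an idempotent stop function keeps the cleanup in one place, so a client disconnect and the normal end of the pipeline cannot leave an interval running.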

Test plan

  • All 1449 backend unit + integration tests pass
  • All 521 frontend tests pass
  • E2E: send a question via the Ask AI UI → response appears within 5–6 minutes (not stuck at "Retrieving context…")
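
The E2E check depends on the client ignoring the keepalive comments. A browser's EventSource does this on its own; the sketch below only illustrates the rule with a small hand-written parser, assuming the whole stream is available as one string. `parseSse`, the `done` event name, and the sample payload are hypothetical and not taken from the frontend code.

```javascript
// Parses a complete SSE text buffer into events using the basic rules of the
// event-stream format: a blank line dispatches the pending event, and lines
// starting with ':' are comments that produce no event.
function parseSse(text) {
  const events = [];
  let eventType = 'message';
  let dataLines = [];

  for (const line of text.split(/\r\n|\n|\r/)) {
    if (line === '') {
      // Blank line: dispatch only if at least one data line was collected.
      if (dataLines.length > 0) {
        events.push({ event: eventType, data: dataLines.join('\n') });
      }
      eventType = 'message';
      dataLines = [];
    } else if (line.startsWith(':')) {
      continue; // comment line, e.g. ": keepalive"
    } else {
      const colon = line.indexOf(':');
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? '' : line.slice(colon + 1);
      if (value.startsWith(' ')) value = value.slice(1); // strip one leading space
      if (field === 'event') eventType = value;
      else if (field === 'data') dataLines.push(value);
    }
  }
  return events;
}

// Two heartbeats followed by one (hypothetical) final event.
const sampleStream =
  ': keepalive\n\n: keepalive\n\nevent: done\ndata: {"answer":"ok"}\n\n';
console.log(parseSse(sampleStream));
```

Only the final event comes out of the parser; the two heartbeats contribute bytes on the wire but nothing the UI has to handle.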

Notes

The underlying performance issue (CPU-only ollama at ~5 min/query) is a separate infra concern. This fix makes the feature functional while that is addressed.

the embedding phase (multi-query expansion via cpu-only ollama) can take
several minutes. nginx's proxy_read_timeout (60s default) was killing the
idle sse connection before any event was flushed, leaving the ui stuck on
"retrieving context…" forever.

send an sse comment (: keepalive) every 20s to keep the connection alive
through the entire pipeline duration.
@AndreLiar AndreLiar merged commit 721b955 into dev Jun 15, 2026
5 checks passed
@AndreLiar AndreLiar deleted the fix/rag-stream-sse-keepalive branch June 15, 2026 07:32
@codecov-commenter

codecov-commenter commented Jun 15, 2026

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| backend/controllers/ragController.js | 0.00% | 3 Missing ⚠️ |

AndreLiar added a commit that referenced this pull request Jun 15, 2026
) (#435)

AndreLiar added a commit that referenced this pull request Jun 15, 2026
* fix(rag): add sse keepalive heartbeat to prevent nginx 60s timeout (#434)

* feat(settings): render qr code in mfa setup flow (#441)

the setup form was showing the raw otpauth:// url as plain text.
users had to manually copy a long secret into their authenticator app.
now shows a scannable qr code image (qrcode.react) + keeps the secret
as a manual-entry fallback below.

also clarifies the step 1 label to mention microsoft authenticator / google
authenticator explicitly.