Translations of the new items added in the latest versions.
…ng (#20547)
Refactored the file processing status streaming endpoint to avoid holding a database connection for the entire stream duration (up to 2 hours).
Changes:
- Each status poll now creates its own short-lived database session instead of capturing the request's session in the generator closure
- Increased poll interval from 0.5s to 1s, halving database queries with negligible UX impact
This prevents a single file status stream from blocking a connection pool slot for hours, which could contribute to pool exhaustion under load.
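The per-poll session pattern described above might look roughly like the sketch below. It is not the project's endpoint code: it uses the standard-library `sqlite3` module as a stand-in for the real session factory, and the `files` table, the `completed`/`failed` status values, and the `read_status`/`stream_status` names are all hypothetical.

```python
import asyncio
import sqlite3
from contextlib import closing

POLL_INTERVAL_S = 1.0  # per the commit: raised from 0.5s to 1s
MAX_POLLS = 7200       # assumption: 2 hours of polling at one poll per second


def read_status(db_path, file_id):
    """Open a connection, read one status value, and close it right away."""
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT status FROM files WHERE id = ?", (file_id,)
        ).fetchone()
    return row[0] if row else None


async def stream_status(db_path, file_id, interval=POLL_INTERVAL_S, max_polls=MAX_POLLS):
    """Yield status updates; no connection is held between polls."""
    for _ in range(max_polls):
        # Each poll gets its own short-lived connection in a worker thread.
        status = await asyncio.to_thread(read_status, db_path, file_id)
        yield status
        if status in ("completed", "failed"):  # hypothetical terminal states
            return
        await asyncio.sleep(interval)
```

The point of the design is that the generator closure captures only `db_path` and `file_id`, never a live session, so an idle stream costs the pool nothing between polls.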
… (#20534)
- Remove incorrect 403 check that blocked STT when ENGINE="" (local whisper)
- Change TTS empty ENGINE check from 403 to 404 for proper semantics
…ection pool exhaustion (#20542)
fix: use efficient COUNT queries in telemetry metrics to prevent connection pool exhaustion
This fixes database connection pool exhaustion issues reported after v0.7.0,
particularly affecting PostgreSQL deployments on high-latency networks (e.g., AWS Aurora).
## The Problem
The telemetry metrics callbacks (running every 10 seconds via OpenTelemetry's
PeriodicExportingMetricReader) were using inefficient queries that loaded entire
database tables into memory just to count records:
len(Users.get_users()["users"]) # Loads ALL user records to count them
On high-latency network-attached databases like AWS Aurora, this would:
1. Hold database connections for hundreds of milliseconds while transferring data
2. Deserialize all records into Python objects
3. Only then count the list length
Under concurrent load, these long-held connections would stack up and drain the
connection pool, resulting in:
sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached,
connection timed out, timeout 30.00
## The Fix
Replace inefficient full-table loads with efficient COUNT(*) queries using
methods that already exist in the codebase:
- `len(Users.get_users()["users"])` → `Users.get_num_users()`
- Similar changes for other telemetry callbacks as needed
COUNT(*) queries run entirely inside the database (often served from an index) and
return a single integer, completing in ~5-10ms even on Aurora, versus potentially
500ms+ for loading all records.
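The difference between the two counting strategies can be illustrated with a minimal sketch. It uses the standard-library `sqlite3` module and a hypothetical `users` table rather than the project's `Users` model, so it shows the query shapes only, not the actual `get_num_users()` implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [(f"user{i}",) for i in range(1000)],
)


def count_by_loading(conn):
    # Transfers and materializes every row just to take len() of the list.
    return len(conn.execute("SELECT * FROM users").fetchall())


def count_in_database(conn):
    # The database computes the count and returns a single one-column row.
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Both functions return the same number; only the second keeps the amount of data crossing the network independent of table size.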
## Why v0.7.1's Session Sharing Disable "Helped"
The v0.7.1 change to disable DATABASE_ENABLE_SESSION_SHARING by default appeared
to fix the issue, but it was masking the root cause. Disabling session sharing
causes connections to be returned to the pool faster (more connection churn),
which reduced the window for pool exhaustion but didn't address the underlying
inefficient queries.
With this fix, session sharing can be safely re-enabled for deployments that
benefit from it (especially PostgreSQL), as telemetry will no longer hold
connections for extended periods.
## Impact
- Telemetry connection usage drops from potentially seconds to ~30ms total per
collection cycle
- Connection pool pressure from telemetry becomes negligible (~0.3% utilization)
- Enterprise PostgreSQL deployments (Aurora, RDS, etc.) should no longer
experience pool exhaustion under normal load
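The ~0.3% figure is consistent with ~30ms of connection time per 10-second collection cycle, assuming utilization means the busy time of one connection divided by the cycle length. That reading is an assumption; the commit message does not spell out the calculation.

```python
EXPORT_INTERVAL_S = 10.0  # telemetry callbacks run every 10 seconds
BUSY_TIME_S = 0.030       # ~30ms of connection time per cycle after the fix


def connection_utilization(busy_s, interval_s):
    """Fraction of each cycle during which one connection is checked out."""
    return busy_s / interval_s


utilization = connection_utilization(BUSY_TIME_S, EXPORT_INTERVAL_S)
print(f"{utilization:.1%}")  # prints 0.3%
```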
…olding during LLM calls (#20545)
fix: release database connections immediately after auth instead of holding during LLM calls
Authentication was using Depends(get_session) which holds a database connection
for the entire request lifecycle. For chat completions, this meant connections
were held for 30-60 seconds while waiting for LLM responses, despite only needing
the connection for ~50ms of actual database work.
With a default pool of 15 connections (pool size 5 plus overflow 10), this limited
concurrent chat users to ~15 before pool exhaustion and timeout errors:
sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached,
connection timed out, timeout 30.00
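A back-of-the-envelope estimate shows why hold time matters so much. If every request keeps one connection for its whole hold time, the sustainable request rate is roughly pool capacity divided by hold time. This is a simplified model that ignores queueing and the 30-second checkout timeout, and the function name is hypothetical.

```python
POOL_CAPACITY = 5 + 10  # pool size 5 plus overflow 10, from the error message


def max_requests_per_second(capacity, hold_seconds):
    """Upper bound on sustained throughput when each request holds one connection."""
    return capacity / hold_seconds


held_during_llm = max_requests_per_second(POOL_CAPACITY, 30.0)  # 30s hold per request
db_work_only = max_requests_per_second(POOL_CAPACITY, 0.050)    # ~50ms of real DB work

print(f"held for 30s:  {held_during_llm:.1f} req/s")
print(f"held for 50ms: {db_work_only:.0f} req/s")
```

Under these assumptions the ceiling moves from about half a request per second to a few hundred, which is why releasing the connection before the LLM call has such a large effect.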
The fix removes Depends(get_session) from get_current_user. Each database
operation now manages its own short-lived session internally:
BEFORE: One session held for entire request
┌────────────────────────────────────────────────┐
│ auth │ queries │     LLM wait (30s)     │ save │
│          CONNECTION HELD ENTIRE TIME           │
└────────────────────────────────────────────────┘
AFTER: Short-lived sessions, released immediately
┌──────┐ ┌───────┐ ┌──────┐
│ auth │ │ query │ LLM (30s) │ save │
│ 10ms │ │ 20ms │ NO CONNECTION │ 20ms │
└──────┘ └───────┘ └──────┘
This is safe because:
- User model has no lazy-loaded relationships (all simple columns)
- Pydantic conversion (UserModel.model_validate) happens while session is open
- Returned object is pure Pydantic with no SQLAlchemy ties
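The pattern behind these three points can be sketched as follows. This is an illustration, not the project's `get_current_user`: it uses `sqlite3` in place of SQLAlchemy and a frozen dataclass in place of the Pydantic `UserModel`, and the `users` table, `get_user_by_id`, and `handle_chat` are hypothetical names.

```python
import sqlite3
from contextlib import closing
from dataclasses import dataclass


@dataclass(frozen=True)
class UserModel:
    """Plain data object standing in for the Pydantic UserModel."""
    id: str
    name: str
    role: str


def get_user_by_id(db_path, user_id):
    """Open a connection, read one user, convert it, and release the connection."""
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT id, name, role FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        # Convert while the connection is still open; the result keeps no DB handle.
        return UserModel(*row) if row else None


def handle_chat(db_path, user_id, call_llm):
    user = get_user_by_id(db_path, user_id)  # connection held only inside this call
    if user is None:
        raise PermissionError("unknown user")
    return call_llm(user)  # no connection is held during the slow call
```

Because the returned object is plain data, it remains usable after the connection is closed, which is the property the list above relies on.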
Combined with the telemetry efficiency fix, this resolves connection pool
exhaustion for high-concurrency deployments, particularly on network-attached
databases like AWS Aurora where connection hold time is more impactful.
# Conflicts:
#   package-lock.json
#   package.json
#   src/lib/i18n/locales/ar-BH/translation.json
#   src/lib/i18n/locales/ar/translation.json
#   src/lib/i18n/locales/bg-BG/translation.json
#   src/lib/i18n/locales/bn-BD/translation.json
#   src/lib/i18n/locales/bo-TB/translation.json
#   src/lib/i18n/locales/bs-BA/translation.json
#   src/lib/i18n/locales/ca-ES/translation.json
#   src/lib/i18n/locales/ceb-PH/translation.json
#   src/lib/i18n/locales/cs-CZ/translation.json
#   src/lib/i18n/locales/da-DK/translation.json
#   src/lib/i18n/locales/de-DE/translation.json
#   src/lib/i18n/locales/dg-DG/translation.json
#   src/lib/i18n/locales/el-GR/translation.json
#   src/lib/i18n/locales/en-GB/translation.json
#   src/lib/i18n/locales/es-ES/translation.json
#   src/lib/i18n/locales/et-EE/translation.json
#   src/lib/i18n/locales/eu-ES/translation.json
#   src/lib/i18n/locales/fa-IR/translation.json
#   src/lib/i18n/locales/fi-FI/translation.json
#   src/lib/i18n/locales/fr-CA/translation.json
#   src/lib/i18n/locales/fr-FR/translation.json
#   src/lib/i18n/locales/gl-ES/translation.json
#   src/lib/i18n/locales/he-IL/translation.json
#   src/lib/i18n/locales/hi-IN/translation.json
#   src/lib/i18n/locales/hr-HR/translation.json
#   src/lib/i18n/locales/hu-HU/translation.json
#   src/lib/i18n/locales/id-ID/translation.json
#   src/lib/i18n/locales/ie-GA/translation.json
#   src/lib/i18n/locales/it-IT/translation.json
#   src/lib/i18n/locales/ja-JP/translation.json
#   src/lib/i18n/locales/ka-GE/translation.json
#   src/lib/i18n/locales/kab-DZ/translation.json
#   src/lib/i18n/locales/ko-KR/translation.json
#   src/lib/i18n/locales/lt-LT/translation.json
#   src/lib/i18n/locales/ms-MY/translation.json
#   src/lib/i18n/locales/nb-NO/translation.json
#   src/lib/i18n/locales/nl-NL/translation.json
#   src/lib/i18n/locales/pa-IN/translation.json
#   src/lib/i18n/locales/pl-PL/translation.json
#   src/lib/i18n/locales/pt-BR/translation.json
#   src/lib/i18n/locales/pt-PT/translation.json
#   src/lib/i18n/locales/ro-RO/translation.json
#   src/lib/i18n/locales/ru-RU/translation.json
#   src/lib/i18n/locales/sk-SK/translation.json
#   src/lib/i18n/locales/sr-RS/translation.json
#   src/lib/i18n/locales/sv-SE/translation.json
#   src/lib/i18n/locales/th-TH/translation.json
#   src/lib/i18n/locales/tk-TM/translation.json
#   src/lib/i18n/locales/tr-TR/translation.json
#   src/lib/i18n/locales/ug-CN/translation.json
#   src/lib/i18n/locales/uk-UA/translation.json
#   src/lib/i18n/locales/ur-PK/translation.json
#   src/lib/i18n/locales/uz-Cyrl-UZ/translation.json
#   src/lib/i18n/locales/uz-Latn-Uz/translation.json
#   src/lib/i18n/locales/vi-VN/translation.json