10 changes: 6 additions & 4 deletions src/trio_core/config.py
@@ -144,12 +144,14 @@ def from_env_file(

     # API-layer concurrency
     vlm_api_concurrency: int = Field(
-        default=1,
+        default=16,
         ge=1,
         description="Max concurrent VLM requests at the FastAPI handler. "
-        "Default 1 protects local GPU backends from contention. "
-        "Raise to 8-16 when remote_vlm_url is set, since the remote service "
-        "handles its own concurrency and the local lock is bypassed.",
+        "Local backends still serialize generation via their own "
+        "BaseBackend._lock, so a higher value here is safe — extra requests "
+        "just wait at the lock. Remote backends use nullcontext(), so this "
+        "value caps the actual number of parallel HTTPS calls. Lower it "
+        "if a remote provider rate-limits aggressively.",
     )

# Cache (Phase 2)
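
The new description implies two layers: an API-level gate sized by `vlm_api_concurrency`, and a per-backend lock that is a real lock for local backends and `nullcontext()` for remote ones. The following is a minimal sketch of that interaction, not the project's code. It uses threads rather than FastAPI handlers, and `PeakCounter`, `run_requests`, and `api_gate` are hypothetical names introduced only for illustration. The diff does not show how `BaseBackend._lock` is implemented, so a plain `threading.Lock` stands in for it.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from contextlib import nullcontext


class PeakCounter:
    """Records the highest number of callers inside a section at once."""

    def __init__(self):
        self._guard = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self._guard:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, *exc):
        with self._guard:
            self.current -= 1
        return False


def run_requests(api_concurrency, backend_lock, n_requests=8):
    """Simulate n_requests arriving at once; return peak parallel generations."""
    # Stand-in for the API-layer cap (vlm_api_concurrency).
    api_gate = threading.Semaphore(api_concurrency)
    in_generation = PeakCounter()

    def handle(_):
        with api_gate:          # at most api_concurrency requests pass
            with backend_lock:  # real lock (local) or nullcontext() (remote)
                with in_generation:
                    time.sleep(0.05)  # pretend to generate

    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        list(pool.map(handle, range(n_requests)))
    return in_generation.peak


# Local-style backend: a high API cap is harmless, the lock serializes work.
local_peak = run_requests(api_concurrency=16, backend_lock=threading.Lock())

# Remote-style backend: no lock, so the API cap is the only limit.
remote_peak = run_requests(api_concurrency=4, backend_lock=nullcontext())

print("local peak:", local_peak)
print("remote peak (cap 4):", remote_peak)
```

In this sketch `local_peak` is always 1 regardless of the API cap, while `remote_peak` never exceeds the cap of 4. That is the reasoning behind raising the default: only the remote path is actually affected by the value.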