diff --git a/CHANGELOG.md b/CHANGELOG.md
index 38e3e2be4d..4d119a1386 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,106 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.6.36] - 2025-11-07
+
+### Added
+
+- 🔐 OAuth group parsing now supports configurable separators via the "OAUTH_GROUPS_SEPARATOR" environment variable, enabling proper handling of semicolon-separated group claims from providers like CILogon. [#18987](https://github.com/open-webui/open-webui/pull/18987), [#18979](https://github.com/open-webui/open-webui/issues/18979)
+
+### Fixed
+
+- 🛠️ Tool calling functionality is restored by correcting asynchronous function handling in tool parameter updates. [#18981](https://github.com/open-webui/open-webui/issues/18981)
+- 🖼️ The ComfyUI image edit workflow editor modal now opens correctly when clicking the Edit button. [#18978](https://github.com/open-webui/open-webui/issues/18978)
+- 🔥 Firecrawl import errors are resolved by implementing lazy loading and using the correct class name. [#18973](https://github.com/open-webui/open-webui/issues/18973)
+- 🔌 Socket.IO CORS warning is resolved by properly configuring CORS origins for Socket.IO connections. [Commit](https://github.com/open-webui/open-webui/commit/639d26252e528c9c37a5f553b11eb94376d8792d)
+
+## [0.6.35] - 2025-11-06
+
+### Added
+
+- 🖼️ The image generation system received a comprehensive overhaul: full image editing support lets users modify existing images with text prompts via the OpenAI, Gemini, or ComfyUI engines; Gemini 2.5 Flash Image (Nano Banana) and Qwen Image Edit are now supported; base64-encoded image display issues are resolved; AUTOMATIC1111 configuration is streamlined by consolidating parameters into a flexible JSON parameters field; and the UI gains a code editor modal for ComfyUI workflow management. [#17434](https://github.com/open-webui/open-webui/pull/17434), [#16976](https://github.com/open-webui/open-webui/issues/16976), [Commit](https://github.com/open-webui/open-webui/commit/8e5690aab4f632a57027e2acf880b8f89a8717c0), [Commit](https://github.com/open-webui/open-webui/commit/72f8539fd2e679fec0762945f22f4b8a6920afa0), [Commit](https://github.com/open-webui/open-webui/commit/8d34fcb586eeee1fac6da2f991518b8a68b00b72), [Commit](https://github.com/open-webui/open-webui/commit/72900cd686de1fa6be84b5a8a2fc857cff7b91b8)
+- 🔒 CORS origin validation was added to WebSocket connections as a defense-in-depth security measure against cross-site WebSocket hijacking attacks. [#18411](https://github.com/open-webui/open-webui/pull/18411), [#18410](https://github.com/open-webui/open-webui/issues/18410)
+- 🔄 Automatic page refresh now occurs when a version update is detected via WebSocket connection, ensuring users always run the latest version without cache issues. [Commit](https://github.com/open-webui/open-webui/commit/989f192c92d2fe55daa31336e7971e21798b96ae)
+- 🐍 Experimental initial preparations for Python 3.13 compatibility were made by updating dependencies with security enhancements and cryptographic improvements. [#18430](https://github.com/open-webui/open-webui/pull/18430), [#18424](https://github.com/open-webui/open-webui/pull/18424)
+- ⚡ Image compression now preserves the original image format instead of converting to PNG, significantly reducing file sizes and improving chat loading performance. [#18506](https://github.com/open-webui/open-webui/pull/18506)
+- 🎤 Mistral Voxtral model support was added for speech-to-text, including voxtral-small and voxtral-mini models with both transcription and chat completion API support. [#18934](https://github.com/open-webui/open-webui/pull/18934)
+- 🔊 Text-to-speech now uses a global audio queue system to prevent overlapping playback, ensuring only one TTS instance plays at a time with proper stop/start controls and automatic cleanup when switching between messages. [#16152](https://github.com/open-webui/open-webui/pull/16152), [#18744](https://github.com/open-webui/open-webui/pull/18744), [#16150](https://github.com/open-webui/open-webui/issues/16150)
+- 🔊 ELEVENLABS_API_BASE_URL environment variable now allows configuration of custom ElevenLabs API endpoints, enabling support for EU residency API requirements. [#18402](https://github.com/open-webui/open-webui/issues/18402)
+- 🔐 OAUTH_ROLES_SEPARATOR environment variable now allows custom role separators for OAuth roles that contain commas, useful for roles specified in LDAP syntax. [#18572](https://github.com/open-webui/open-webui/pull/18572)
+- 📄 External document loaders can now optionally forward user information headers when ENABLE_FORWARD_USER_INFO_HEADERS is enabled, enabling cost tracking, audit logs, and usage analytics for external services. [#18731](https://github.com/open-webui/open-webui/pull/18731)
+- 📄 MISTRAL_OCR_API_BASE_URL environment variable now allows configuration of custom Mistral OCR API endpoints for flexible deployment options. [Commit](https://github.com/open-webui/open-webui/commit/415b93c7c35c2e2db4425e6da1b88b3750f496b0)
+- ⌨️ Keyboard shortcut hints are now displayed on sidebar buttons with a refactored shortcuts modal that accurately reflects all available hotkeys across different keyboard layouts. [#18473](https://github.com/open-webui/open-webui/pull/18473)
+- 🛠️ Tooltips now display tool descriptions when hovering over tool names on the model edit page, improving usability and providing immediate context. [#18707](https://github.com/open-webui/open-webui/pull/18707)
+- 📝 "Create a new note" from the search modal now immediately creates a new private note and opens it in the editor instead of navigating to the generic notes page. [#18255](https://github.com/open-webui/open-webui/pull/18255)
+- 🖨️ Code block output now preserves whitespace formatting with monospace font to accurately reflect terminal behavior. [#18352](https://github.com/open-webui/open-webui/pull/18352)
+- ✏️ Edit button is now available in the three-dot menu of models in the workspace section for quick access to model editing, with the menu reorganized for better user experience and Edit, Clone, Copy Link, and Share options logically grouped. [#18574](https://github.com/open-webui/open-webui/pull/18574)
+- 📌 Sidebar models section is now collapsible, allowing users to expand and collapse the pinned models list for better sidebar organization. [Commit](https://github.com/open-webui/open-webui/commit/82c08a3b5d189f81c96b6548cc872198771015b0)
+- 🌙 Dark mode styles for select elements were added using Tailwind CSS classes, improving consistency across the interface. [#18636](https://github.com/open-webui/open-webui/pull/18636)
+- 🔄 Various improvements were implemented across the frontend and backend to enhance performance, stability, and security.
+- 🌐 Translations for Portuguese (Brazil), Greek, German, Traditional Chinese, Simplified Chinese, Spanish, Georgian, Danish, and Estonian were enhanced and expanded.
+
+### Fixed
+
+- 🔒 Server-Sent Event (SSE) code injection vulnerability in Direct Connections is resolved by blocking event emission from untrusted external model servers; event emitters from direct connected model servers are no longer supported, preventing arbitrary JavaScript execution in user browsers. [Commit](https://github.com/open-webui/open-webui/commit/8af6a4cf21b756a66cd58378a01c60f74c39b7ca)
+- 🛡️ DOM XSS vulnerability in "Insert Prompt as Rich Text" is resolved by sanitizing HTML content with DOMPurify before rendering. [Commit](https://github.com/open-webui/open-webui/commit/eb9c4c0e358c274aea35f21c2856c0a20051e5f1)
+- ⚙️ MCP server cancellation scope corruption is prevented by reversing disconnection order to follow LIFO and properly handling exceptions, resolving 100% CPU usage when resuming chats with expired tokens or using multiple streamable MCP servers. [#18537](https://github.com/open-webui/open-webui/pull/18537)
+- 🔧 UI freeze when querying models with knowledge bases containing inconsistent distance metrics is resolved by properly initializing the distances array in citations. [#18585](https://github.com/open-webui/open-webui/pull/18585)
+- 🤖 Duplicate model IDs from multiple OpenAI endpoints are now automatically deduplicated server-side, preventing frontend crashes for users with unified gateway proxies that aggregate multiple providers. [Commit](https://github.com/open-webui/open-webui/commit/fdf7ca11d4f3cc8fe63e81c98dc0d1e48e52ba36)
+- 🔐 Login failures with passwords longer than 72 bytes are resolved by safely truncating oversized passwords for bcrypt compatibility. [#18157](https://github.com/open-webui/open-webui/issues/18157)
+- 🔐 OAuth 2.1 MCP tool connections now automatically re-register clients when stored client IDs become stale, preventing unauthorized_client errors after editing tool endpoints and providing detailed error messages for callback failures. [#18415](https://github.com/open-webui/open-webui/pull/18415), [#18309](https://github.com/open-webui/open-webui/issues/18309)
+- 🔓 OAuth 2.1 discovery, metadata fetching, and dynamic client registration now correctly use HTTP proxy environment variables when trust_env is enabled. [Commit](https://github.com/open-webui/open-webui/commit/bafeb76c411483bd6b135f0edbcdce048120f264)
+- 🔌 MCP server connection failures now display clear error messages in the chat interface instead of silently failing. [#18892](https://github.com/open-webui/open-webui/pull/18892), [#18889](https://github.com/open-webui/open-webui/issues/18889)
+- 💬 Chat titles are now properly generated even when title auto-generation is disabled in interface settings, fixing an issue where chats would remain labeled as "New chat". [#18761](https://github.com/open-webui/open-webui/pull/18761), [#18717](https://github.com/open-webui/open-webui/issues/18717), [#6478](https://github.com/open-webui/open-webui/issues/6478)
+- 🔍 Chat query errors are prevented by properly validating and handling the "order_by" parameter to ensure requested columns exist. [#18400](https://github.com/open-webui/open-webui/pull/18400), [#18452](https://github.com/open-webui/open-webui/pull/18452)
+- 🔧 Root-level max_tokens parameter is no longer dropped when proxying to Ollama, properly converting to num_predict to limit output token length as intended. [#18618](https://github.com/open-webui/open-webui/issues/18618)
+- 🔑 Self-hosted Marker instances can now be used without requiring an API key, while keeping it optional for datalab Marker service users. [#18617](https://github.com/open-webui/open-webui/issues/18617)
+- 🔧 OpenAPI specification endpoint conflict between "/api/v1/models" and "/api/v1/models/" is resolved by changing the models router endpoint to "/list", preventing duplicate operationId errors when generating TypeScript API clients. [#18758](https://github.com/open-webui/open-webui/issues/18758)
+- 🏷️ Model tags are now de-duplicated case-insensitively in both the model selector and workspace models page, preventing duplicate entries with different capitalization from appearing in filter dropdowns. [#18716](https://github.com/open-webui/open-webui/pull/18716), [#18711](https://github.com/open-webui/open-webui/issues/18711)
+- 📄 Docling RAG parameter configuration is now correctly saved in the admin UI by fixing the typo in the "DOCLING_PARAMS" parameter name. [#18390](https://github.com/open-webui/open-webui/pull/18390)
+- 📃 Tika document processing now automatically detects content types instead of relying on potentially incorrect browser-provided mime-types, improving file handling accuracy for formats like RTF. [#18765](https://github.com/open-webui/open-webui/pull/18765), [#18683](https://github.com/open-webui/open-webui/issues/18683)
+- 🖼️ Image and video uploads to knowledge bases now display proper error messages instead of showing an infinite spinner when the content extraction engine does not support these file types. [#18514](https://github.com/open-webui/open-webui/issues/18514)
+- 📝 Notes PDF export now properly detects and applies dark mode styling consistently across both the notes list and individual note pages, with a shared utility function to eliminate code duplication. [#18526](https://github.com/open-webui/open-webui/issues/18526)
+- 💭 Details tags for reasoning content are now correctly identified and rendered even when the same tag is present in user messages. [#18840](https://github.com/open-webui/open-webui/pull/18840), [#18294](https://github.com/open-webui/open-webui/issues/18294)
+- 📊 Mermaid and Vega rendering errors now display inline with the code instead of showing repetitive toast notifications, improving user experience when models generate invalid diagram syntax. [Commit](https://github.com/open-webui/open-webui/commit/fdc0f04a8b7dd0bc9f9dc0e7e30854f7a0eea3e9)
+- 📈 Mermaid diagram rendering errors no longer cause UI unavailability or display error messages below the input box. [#18493](https://github.com/open-webui/open-webui/pull/18493), [#18340](https://github.com/open-webui/open-webui/issues/18340)
+- 🔗 Web search SSL verification is now asynchronous, preventing the website from hanging during web search operations. [#18714](https://github.com/open-webui/open-webui/pull/18714), [#18699](https://github.com/open-webui/open-webui/issues/18699)
+- 🌍 Web search results now correctly use HTTP proxy environment variables when WEB_SEARCH_TRUST_ENV is enabled. [#18667](https://github.com/open-webui/open-webui/pull/18667), [#7008](https://github.com/open-webui/open-webui/discussions/7008)
+- 🔍 Google Programmable Search Engine now properly includes referer headers, enabling API keys with HTTP referrer restrictions configured in Google Cloud Console. [#18871](https://github.com/open-webui/open-webui/pull/18871), [#18870](https://github.com/open-webui/open-webui/issues/18870)
+- ⚡ YouTube video transcript fetching now works correctly when using a proxy connection. [#18419](https://github.com/open-webui/open-webui/pull/18419)
+- 🎙️ Speech-to-text transcription no longer deletes or replaces existing text in the prompt input field, properly preserving any previously entered content. [#18540](https://github.com/open-webui/open-webui/issues/18540)
+- 🎙️ The "Instant Auto-Send After Voice Transcription" setting now functions correctly and automatically sends transcribed text when enabled. [#18466](https://github.com/open-webui/open-webui/issues/18466)
+- ⚙️ Chat settings now load properly when reopening a tab or starting a new session by initializing defaults when sessionStorage is empty. [#18438](https://github.com/open-webui/open-webui/pull/18438)
+- 🔎 Folder tag search in the sidebar now correctly handles folder names with multiple spaces by replacing all spaces with underscores. [Commit](https://github.com/open-webui/open-webui/commit/a8fe979af68e47e4e4bb3eb76e48d93d60cd2a45)
+- 🛠️ Functions page now updates immediately after deleting a function, removing the need for a manual page reload. [#18912](https://github.com/open-webui/open-webui/pull/18912), [#18908](https://github.com/open-webui/open-webui/issues/18908)
+- 🛠️ Native tool calling now properly supports sequential tool calls with shared context, allowing tools to access images and data from previous tool executions in the same conversation. [#18664](https://github.com/open-webui/open-webui/pull/18664)
+- 🎯 Globally enabled actions in the model editor now correctly apply as global instead of being treated as disabled. [#18577](https://github.com/open-webui/open-webui/pull/18577)
+- 📋 Clipboard images pasted via the "{{CLIPBOARD}}" prompt variable are now correctly converted to base64 format before being sent to the backend, resolving base64 encoding errors. [#18432](https://github.com/open-webui/open-webui/pull/18432), [#18425](https://github.com/open-webui/open-webui/issues/18425)
+- 📋 File list is now cleared when switching to models that do not support file uploads, preventing files from being sent to incompatible models. [#18496](https://github.com/open-webui/open-webui/pull/18496)
+- 📂 Move menu no longer displays when folders are empty. [#18484](https://github.com/open-webui/open-webui/pull/18484)
+- 📁 Folder and channel creation now validates that names are not empty, preventing creation of folders or channels with no name and showing an error toast if attempted. [#18564](https://github.com/open-webui/open-webui/pull/18564)
+- 🖊️ Rich text input no longer removes text between equals signs when pasting code with comparison operators. [#18551](https://github.com/open-webui/open-webui/issues/18551)
+- ⌨️ Keyboard shortcuts now display the correct keys for international and non-QWERTY keyboard layouts by detecting the user's layout using the Keyboard API. [#18533](https://github.com/open-webui/open-webui/pull/18533)
+- 🌐 "Attach Webpage" button now displays with correct disabled styling when a model does not support file uploads. [#18483](https://github.com/open-webui/open-webui/pull/18483)
+- 🎚️ Divider no longer displays in the integrations menu when no integrations are enabled. [#18487](https://github.com/open-webui/open-webui/pull/18487)
+- 📱 Chat controls button is now properly hidden on mobile for users without admin or explicit chat control permissions. [#18641](https://github.com/open-webui/open-webui/pull/18641)
+- 📍 User menu, download submenu, and move submenu are now repositioned to prevent overlap with the Chat Controls sidebar when it is open. [Commit](https://github.com/open-webui/open-webui/commit/414ab51cb6df1ab0d6c85ac6c1f2c5c9a5f8e2aa)
+- 🎯 Artifacts button no longer appears in the chat menu when there are no artifacts to display. [Commit](https://github.com/open-webui/open-webui/commit/ed6449d35f84f68dc75ee5c6b3f4748a3fda0096)
+- 🎨 Artifacts view now automatically displays when opening an existing conversation containing artifacts, improving user experience. [#18215](https://github.com/open-webui/open-webui/pull/18215)
+- 🖌️ Formatting toolbar is no longer hidden under images or code blocks in chat and now displays correctly above all message content.
+- 🎨 Layout shift near system instructions is prevented by properly rendering the chat component when system prompts are empty. [#18594](https://github.com/open-webui/open-webui/pull/18594)
+- 📐 Modal layout shift caused by scrollbar appearance is prevented by adding a stable scrollbar gutter. [#18591](https://github.com/open-webui/open-webui/pull/18591)
+- ✨ Spacing between icon and label in the user menu dropdown items is now consistent. [#18595](https://github.com/open-webui/open-webui/pull/18595)
+- 💬 Duplicate prompt suggestions no longer cause the webpage to freeze or throw JavaScript errors by implementing proper key management with composite keys. [#18841](https://github.com/open-webui/open-webui/pull/18841), [#18566](https://github.com/open-webui/open-webui/issues/18566)
+- 🔍 Chat preview loading in the search modal now works correctly for all search results by fixing an index boundary check that previously caused out-of-bounds errors. [#18911](https://github.com/open-webui/open-webui/pull/18911)
+- ♿ Screen reader support was enhanced by wrapping messages in semantic elements with descriptive aria-labels, adding "Assistant is typing" and "Response complete" announcements for improved accessibility. [#18735](https://github.com/open-webui/open-webui/pull/18735)
+- 🔒 Incorrect await call in the OAuth 2.1 flow is removed, eliminating a logged exception during authentication. [#18236](https://github.com/open-webui/open-webui/pull/18236)
+- 🛡️ Duplicate crossorigin attribute in the manifest file was removed. [#18413](https://github.com/open-webui/open-webui/pull/18413)
+
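The bcrypt fix above exists because bcrypt only considers the first 72 bytes of a password, and some implementations raise on longer input. A minimal sketch of a UTF-8-safe truncation (the `truncate_for_bcrypt` helper is hypothetical, not the project's actual function):

```python
def truncate_for_bcrypt(password: str, limit: int = 72) -> bytes:
    # Encode to bytes, cut at the bcrypt limit, then use a decode/encode
    # round-trip to drop any partial multi-byte UTF-8 sequence at the cut.
    data = password.encode("utf-8")
    if len(data) <= limit:
        return data
    return data[:limit].decode("utf-8", errors="ignore").encode("utf-8")

print(len(truncate_for_bcrypt("x" * 100)))  # 72
```

Cutting mid-character matters: a naive `data[:72]` can leave invalid UTF-8 that some bcrypt bindings reject.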
+### Changed
+
+- 🔄 Firecrawl integration was refactored to use the official Firecrawl SDK instead of direct HTTP requests and langchain_community FireCrawlLoader, improving reliability and performance with batch scraping support and enhanced error handling. [#18635](https://github.com/open-webui/open-webui/pull/18635)
+- 📄 MinerU content extraction engine now only supports PDF files following the upstream removal of LibreOffice document conversion in version 2.0.0; users needing to process office documents should convert them to PDF format first. [#18448](https://github.com/open-webui/open-webui/issues/18448)
+
## [0.6.34] - 2025-10-16
### Added
diff --git a/CHANGELOG_EXTRA.md b/CHANGELOG_EXTRA.md
index a17ebc2566..fc5536c684 100644
--- a/CHANGELOG_EXTRA.md
+++ b/CHANGELOG_EXTRA.md
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.6.36.1] - 2025.11.07
+
+### Changed
+
+- Merged upstream 0.6.36 changes
+
## [0.6.34.1] - 2025.10.17
### Changed
diff --git a/backend/open_webui/config.py b/backend/open_webui/config.py
index a146e2a450..b3a466e63d 100644
--- a/backend/open_webui/config.py
+++ b/backend/open_webui/config.py
@@ -620,25 +620,34 @@ def __getattr__(self, key):
os.environ.get("OAUTH_BLOCKED_GROUPS", "[]"),
)
+OAUTH_GROUPS_SEPARATOR = os.environ.get("OAUTH_GROUPS_SEPARATOR", ";")
+
OAUTH_ROLES_CLAIM = PersistentConfig(
"OAUTH_ROLES_CLAIM",
"oauth.roles_claim",
os.environ.get("OAUTH_ROLES_CLAIM", "roles"),
)
+SEP = os.environ.get("OAUTH_ROLES_SEPARATOR", ",")
+
OAUTH_ALLOWED_ROLES = PersistentConfig(
"OAUTH_ALLOWED_ROLES",
"oauth.allowed_roles",
[
role.strip()
- for role in os.environ.get("OAUTH_ALLOWED_ROLES", "user,admin").split(",")
+ for role in os.environ.get("OAUTH_ALLOWED_ROLES", f"user{SEP}admin").split(SEP)
+ if role
],
)
OAUTH_ADMIN_ROLES = PersistentConfig(
"OAUTH_ADMIN_ROLES",
"oauth.admin_roles",
- [role.strip() for role in os.environ.get("OAUTH_ADMIN_ROLES", "admin").split(",")],
+ [
+ role.strip()
+ for role in os.environ.get("OAUTH_ADMIN_ROLES", "admin").split(SEP)
+ if role
+ ],
)
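The separator-aware role parsing above can be exercised in isolation. This demo reproduces the list-comprehension logic with an LDAP-style role containing commas, hard-coding `SEP = ";"` to stand in for `OAUTH_ROLES_SEPARATOR` being set:

```python
import os

# LDAP-style distinguished names contain commas, so the default "," separator
# would shred them; a ";" separator keeps each DN intact.
os.environ["OAUTH_ALLOWED_ROLES"] = "cn=admins,ou=groups;cn=users,ou=groups"
SEP = ";"  # stands in for os.environ.get("OAUTH_ROLES_SEPARATOR", ",")

allowed = [
    role.strip()
    for role in os.environ.get("OAUTH_ALLOWED_ROLES", f"user{SEP}admin").split(SEP)
    if role
]
print(allowed)  # ['cn=admins,ou=groups', 'cn=users,ou=groups']
```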
OAUTH_ALLOWED_DOMAINS = PersistentConfig(
@@ -2472,6 +2481,12 @@ class BannerModel(BaseModel):
os.getenv("DOCUMENT_INTELLIGENCE_KEY", ""),
)
+MISTRAL_OCR_API_BASE_URL = PersistentConfig(
+ "MISTRAL_OCR_API_BASE_URL",
+ "rag.MISTRAL_OCR_API_BASE_URL",
+ os.getenv("MISTRAL_OCR_API_BASE_URL", "https://api.mistral.ai/v1"),
+)
+
MISTRAL_OCR_API_KEY = PersistentConfig(
"MISTRAL_OCR_API_KEY",
"rag.mistral_ocr_api_key",
@@ -3068,16 +3083,30 @@ class BannerModel(BaseModel):
# Images
####################################
+ENABLE_IMAGE_GENERATION = PersistentConfig(
+ "ENABLE_IMAGE_GENERATION",
+ "image_generation.enable",
+ os.environ.get("ENABLE_IMAGE_GENERATION", "").lower() == "true",
+)
+
IMAGE_GENERATION_ENGINE = PersistentConfig(
"IMAGE_GENERATION_ENGINE",
"image_generation.engine",
os.getenv("IMAGE_GENERATION_ENGINE", "openai"),
)
-ENABLE_IMAGE_GENERATION = PersistentConfig(
- "ENABLE_IMAGE_GENERATION",
- "image_generation.enable",
- os.environ.get("ENABLE_IMAGE_GENERATION", "").lower() == "true",
+IMAGE_GENERATION_MODEL = PersistentConfig(
+ "IMAGE_GENERATION_MODEL",
+ "image_generation.model",
+ os.getenv("IMAGE_GENERATION_MODEL", ""),
+)
+
+IMAGE_SIZE = PersistentConfig(
+ "IMAGE_SIZE", "image_generation.size", os.getenv("IMAGE_SIZE", "512x512")
+)
+
+IMAGE_STEPS = PersistentConfig(
+ "IMAGE_STEPS", "image_generation.steps", int(os.getenv("IMAGE_STEPS", 50))
)
ENABLE_IMAGE_PROMPT_GENERATION = PersistentConfig(
@@ -3097,34 +3126,17 @@ class BannerModel(BaseModel):
os.getenv("AUTOMATIC1111_API_AUTH", ""),
)
-AUTOMATIC1111_CFG_SCALE = PersistentConfig(
- "AUTOMATIC1111_CFG_SCALE",
- "image_generation.automatic1111.cfg_scale",
- (
- float(os.environ.get("AUTOMATIC1111_CFG_SCALE"))
- if os.environ.get("AUTOMATIC1111_CFG_SCALE")
- else None
- ),
-)
+automatic1111_params = os.getenv("AUTOMATIC1111_PARAMS", "")
+try:
+ automatic1111_params = json.loads(automatic1111_params)
+except json.JSONDecodeError:
+ automatic1111_params = {}
-AUTOMATIC1111_SAMPLER = PersistentConfig(
- "AUTOMATIC1111_SAMPLER",
- "image_generation.automatic1111.sampler",
- (
- os.environ.get("AUTOMATIC1111_SAMPLER")
- if os.environ.get("AUTOMATIC1111_SAMPLER")
- else None
- ),
-)
-AUTOMATIC1111_SCHEDULER = PersistentConfig(
- "AUTOMATIC1111_SCHEDULER",
- "image_generation.automatic1111.scheduler",
- (
- os.environ.get("AUTOMATIC1111_SCHEDULER")
- if os.environ.get("AUTOMATIC1111_SCHEDULER")
- else None
- ),
+AUTOMATIC1111_PARAMS = PersistentConfig(
+ "AUTOMATIC1111_PARAMS",
+    "image_generation.automatic1111.params",
+ automatic1111_params,
)
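The lenient JSON parsing above can be wrapped as a small helper for testing. This sketch mirrors the try/except behavior (an empty or unparseable `AUTOMATIC1111_PARAMS` collapses to `{}`) and adds one extra guard the original does not have: non-object JSON such as `[1]` or `null` is also normalized to `{}`.

```python
import json

def load_a1111_params(raw: str) -> dict:
    # json.loads("") raises JSONDecodeError, so the unset default also
    # yields an empty parameter dict.
    try:
        params = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    # Extra guard (not in the original): only accept JSON objects.
    return params if isinstance(params, dict) else {}

print(load_a1111_params('{"cfg_scale": 7.5, "sampler_name": "Euler a"}'))
print(load_a1111_params(""))  # {}
```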
COMFYUI_BASE_URL = PersistentConfig(
@@ -3289,18 +3301,79 @@ class BannerModel(BaseModel):
os.getenv("IMAGES_GEMINI_API_KEY", GEMINI_API_KEY),
)
-IMAGE_SIZE = PersistentConfig(
- "IMAGE_SIZE", "image_generation.size", os.getenv("IMAGE_SIZE", "512x512")
+IMAGES_GEMINI_ENDPOINT_METHOD = PersistentConfig(
+ "IMAGES_GEMINI_ENDPOINT_METHOD",
+ "image_generation.gemini.endpoint_method",
+ os.getenv("IMAGES_GEMINI_ENDPOINT_METHOD", ""),
)
-IMAGE_STEPS = PersistentConfig(
- "IMAGE_STEPS", "image_generation.steps", int(os.getenv("IMAGE_STEPS", 50))
+
+IMAGE_EDIT_ENGINE = PersistentConfig(
+ "IMAGE_EDIT_ENGINE",
+ "images.edit.engine",
+ os.getenv("IMAGE_EDIT_ENGINE", "openai"),
)
-IMAGE_GENERATION_MODEL = PersistentConfig(
- "IMAGE_GENERATION_MODEL",
- "image_generation.model",
- os.getenv("IMAGE_GENERATION_MODEL", ""),
+IMAGE_EDIT_MODEL = PersistentConfig(
+ "IMAGE_EDIT_MODEL",
+ "images.edit.model",
+ os.getenv("IMAGE_EDIT_MODEL", ""),
+)
+
+IMAGE_EDIT_SIZE = PersistentConfig(
+ "IMAGE_EDIT_SIZE", "images.edit.size", os.getenv("IMAGE_EDIT_SIZE", "")
+)
+
+IMAGES_EDIT_OPENAI_API_BASE_URL = PersistentConfig(
+ "IMAGES_EDIT_OPENAI_API_BASE_URL",
+ "images.edit.openai.api_base_url",
+ os.getenv("IMAGES_EDIT_OPENAI_API_BASE_URL", OPENAI_API_BASE_URL),
+)
+IMAGES_EDIT_OPENAI_API_VERSION = PersistentConfig(
+ "IMAGES_EDIT_OPENAI_API_VERSION",
+ "images.edit.openai.api_version",
+ os.getenv("IMAGES_EDIT_OPENAI_API_VERSION", ""),
+)
+
+IMAGES_EDIT_OPENAI_API_KEY = PersistentConfig(
+ "IMAGES_EDIT_OPENAI_API_KEY",
+ "images.edit.openai.api_key",
+ os.getenv("IMAGES_EDIT_OPENAI_API_KEY", OPENAI_API_KEY),
+)
+
+IMAGES_EDIT_GEMINI_API_BASE_URL = PersistentConfig(
+ "IMAGES_EDIT_GEMINI_API_BASE_URL",
+ "images.edit.gemini.api_base_url",
+ os.getenv("IMAGES_EDIT_GEMINI_API_BASE_URL", GEMINI_API_BASE_URL),
+)
+IMAGES_EDIT_GEMINI_API_KEY = PersistentConfig(
+ "IMAGES_EDIT_GEMINI_API_KEY",
+ "images.edit.gemini.api_key",
+ os.getenv("IMAGES_EDIT_GEMINI_API_KEY", GEMINI_API_KEY),
+)
+
+
+IMAGES_EDIT_COMFYUI_BASE_URL = PersistentConfig(
+ "IMAGES_EDIT_COMFYUI_BASE_URL",
+ "images.edit.comfyui.base_url",
+ os.getenv("IMAGES_EDIT_COMFYUI_BASE_URL", ""),
+)
+IMAGES_EDIT_COMFYUI_API_KEY = PersistentConfig(
+ "IMAGES_EDIT_COMFYUI_API_KEY",
+ "images.edit.comfyui.api_key",
+ os.getenv("IMAGES_EDIT_COMFYUI_API_KEY", ""),
+)
+
+IMAGES_EDIT_COMFYUI_WORKFLOW = PersistentConfig(
+ "IMAGES_EDIT_COMFYUI_WORKFLOW",
+ "images.edit.comfyui.workflow",
+ os.getenv("IMAGES_EDIT_COMFYUI_WORKFLOW", ""),
+)
+
+IMAGES_EDIT_COMFYUI_WORKFLOW_NODES = PersistentConfig(
+ "IMAGES_EDIT_COMFYUI_WORKFLOW_NODES",
+ "images.edit.comfyui.nodes",
+ [],
)
####################################
@@ -3335,6 +3408,11 @@ class BannerModel(BaseModel):
os.getenv("DEEPGRAM_API_KEY", ""),
)
+# ElevenLabs configuration
+ELEVENLABS_API_BASE_URL = os.getenv(
+ "ELEVENLABS_API_BASE_URL", "https://api.elevenlabs.io"
+)
+
AUDIO_STT_OPENAI_API_BASE_URL = PersistentConfig(
"AUDIO_STT_OPENAI_API_BASE_URL",
"audio.stt.openai.api_base_url",
@@ -3401,6 +3479,24 @@ class BannerModel(BaseModel):
os.getenv("AUDIO_STT_AZURE_MAX_SPEAKERS", ""),
)
+AUDIO_STT_MISTRAL_API_KEY = PersistentConfig(
+ "AUDIO_STT_MISTRAL_API_KEY",
+ "audio.stt.mistral.api_key",
+ os.getenv("AUDIO_STT_MISTRAL_API_KEY", ""),
+)
+
+AUDIO_STT_MISTRAL_API_BASE_URL = PersistentConfig(
+ "AUDIO_STT_MISTRAL_API_BASE_URL",
+ "audio.stt.mistral.api_base_url",
+ os.getenv("AUDIO_STT_MISTRAL_API_BASE_URL", "https://api.mistral.ai/v1"),
+)
+
+AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS = PersistentConfig(
+ "AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS",
+ "audio.stt.mistral.use_chat_completions",
+ os.getenv("AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS", "false").lower() == "true",
+)
+
AUDIO_TTS_OPENAI_API_BASE_URL = PersistentConfig(
"AUDIO_TTS_OPENAI_API_BASE_URL",
"audio.tts.openai.api_base_url",
diff --git a/backend/open_webui/main.py b/backend/open_webui/main.py
index 98fb0ad1ff..364e9c87b2 100644
--- a/backend/open_webui/main.py
+++ b/backend/open_webui/main.py
@@ -140,9 +140,7 @@
# Image
AUTOMATIC1111_API_AUTH,
AUTOMATIC1111_BASE_URL,
- AUTOMATIC1111_CFG_SCALE,
- AUTOMATIC1111_SAMPLER,
- AUTOMATIC1111_SCHEDULER,
+ AUTOMATIC1111_PARAMS,
COMFYUI_BASE_URL,
COMFYUI_API_KEY,
COMFYUI_WORKFLOW,
@@ -158,6 +156,19 @@
IMAGES_OPENAI_API_KEY,
IMAGES_GEMINI_API_BASE_URL,
IMAGES_GEMINI_API_KEY,
+ IMAGES_GEMINI_ENDPOINT_METHOD,
+ IMAGE_EDIT_ENGINE,
+ IMAGE_EDIT_MODEL,
+ IMAGE_EDIT_SIZE,
+ IMAGES_EDIT_OPENAI_API_BASE_URL,
+ IMAGES_EDIT_OPENAI_API_KEY,
+ IMAGES_EDIT_OPENAI_API_VERSION,
+ IMAGES_EDIT_GEMINI_API_BASE_URL,
+ IMAGES_EDIT_GEMINI_API_KEY,
+ IMAGES_EDIT_COMFYUI_BASE_URL,
+ IMAGES_EDIT_COMFYUI_API_KEY,
+ IMAGES_EDIT_COMFYUI_WORKFLOW,
+ IMAGES_EDIT_COMFYUI_WORKFLOW_NODES,
# Audio
AUDIO_STT_ENGINE,
AUDIO_STT_MODEL,
@@ -169,6 +180,9 @@
AUDIO_STT_AZURE_LOCALES,
AUDIO_STT_AZURE_BASE_URL,
AUDIO_STT_AZURE_MAX_SPEAKERS,
+ AUDIO_STT_MISTRAL_API_KEY,
+ AUDIO_STT_MISTRAL_API_BASE_URL,
+ AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS,
AUDIO_TTS_ENGINE,
AUDIO_TTS_MODEL,
AUDIO_TTS_VOICE,
@@ -256,6 +270,7 @@
DOCLING_PICTURE_DESCRIPTION_API,
DOCUMENT_INTELLIGENCE_ENDPOINT,
DOCUMENT_INTELLIGENCE_KEY,
+ MISTRAL_OCR_API_BASE_URL,
MISTRAL_OCR_API_KEY,
RAG_TEXT_SPLITTER,
TIKTOKEN_ENCODING_NAME,
@@ -501,9 +516,11 @@
)
from open_webui.utils.plugin import install_tool_and_function_dependencies
from open_webui.utils.oauth import (
+ get_oauth_client_info_with_dynamic_client_registration,
+ encrypt_data,
+ decrypt_data,
OAuthManager,
OAuthClientManager,
- decrypt_data,
OAuthClientInformationFull,
)
from open_webui.utils.security_headers import SecurityHeadersMiddleware
@@ -877,6 +894,7 @@ async def lifespan(app: FastAPI):
app.state.config.DOCLING_PICTURE_DESCRIPTION_API = DOCLING_PICTURE_DESCRIPTION_API
app.state.config.DOCUMENT_INTELLIGENCE_ENDPOINT = DOCUMENT_INTELLIGENCE_ENDPOINT
app.state.config.DOCUMENT_INTELLIGENCE_KEY = DOCUMENT_INTELLIGENCE_KEY
+app.state.config.MISTRAL_OCR_API_BASE_URL = MISTRAL_OCR_API_BASE_URL
app.state.config.MISTRAL_OCR_API_KEY = MISTRAL_OCR_API_KEY
app.state.config.MINERU_API_MODE = MINERU_API_MODE
app.state.config.MINERU_API_URL = MINERU_API_URL
@@ -1079,27 +1097,40 @@ async def lifespan(app: FastAPI):
app.state.config.ENABLE_IMAGE_GENERATION = ENABLE_IMAGE_GENERATION
app.state.config.ENABLE_IMAGE_PROMPT_GENERATION = ENABLE_IMAGE_PROMPT_GENERATION
+app.state.config.IMAGE_GENERATION_MODEL = IMAGE_GENERATION_MODEL
+app.state.config.IMAGE_SIZE = IMAGE_SIZE
+app.state.config.IMAGE_STEPS = IMAGE_STEPS
+
app.state.config.IMAGES_OPENAI_API_BASE_URL = IMAGES_OPENAI_API_BASE_URL
app.state.config.IMAGES_OPENAI_API_VERSION = IMAGES_OPENAI_API_VERSION
app.state.config.IMAGES_OPENAI_API_KEY = IMAGES_OPENAI_API_KEY
app.state.config.IMAGES_GEMINI_API_BASE_URL = IMAGES_GEMINI_API_BASE_URL
app.state.config.IMAGES_GEMINI_API_KEY = IMAGES_GEMINI_API_KEY
-
-app.state.config.IMAGE_GENERATION_MODEL = IMAGE_GENERATION_MODEL
+app.state.config.IMAGES_GEMINI_ENDPOINT_METHOD = IMAGES_GEMINI_ENDPOINT_METHOD
app.state.config.AUTOMATIC1111_BASE_URL = AUTOMATIC1111_BASE_URL
app.state.config.AUTOMATIC1111_API_AUTH = AUTOMATIC1111_API_AUTH
-app.state.config.AUTOMATIC1111_CFG_SCALE = AUTOMATIC1111_CFG_SCALE
-app.state.config.AUTOMATIC1111_SAMPLER = AUTOMATIC1111_SAMPLER
-app.state.config.AUTOMATIC1111_SCHEDULER = AUTOMATIC1111_SCHEDULER
+app.state.config.AUTOMATIC1111_PARAMS = AUTOMATIC1111_PARAMS
+
app.state.config.COMFYUI_BASE_URL = COMFYUI_BASE_URL
app.state.config.COMFYUI_API_KEY = COMFYUI_API_KEY
app.state.config.COMFYUI_WORKFLOW = COMFYUI_WORKFLOW
app.state.config.COMFYUI_WORKFLOW_NODES = COMFYUI_WORKFLOW_NODES
-app.state.config.IMAGE_SIZE = IMAGE_SIZE
-app.state.config.IMAGE_STEPS = IMAGE_STEPS
+
+app.state.config.IMAGE_EDIT_ENGINE = IMAGE_EDIT_ENGINE
+app.state.config.IMAGE_EDIT_MODEL = IMAGE_EDIT_MODEL
+app.state.config.IMAGE_EDIT_SIZE = IMAGE_EDIT_SIZE
+app.state.config.IMAGES_EDIT_OPENAI_API_BASE_URL = IMAGES_EDIT_OPENAI_API_BASE_URL
+app.state.config.IMAGES_EDIT_OPENAI_API_KEY = IMAGES_EDIT_OPENAI_API_KEY
+app.state.config.IMAGES_EDIT_OPENAI_API_VERSION = IMAGES_EDIT_OPENAI_API_VERSION
+app.state.config.IMAGES_EDIT_GEMINI_API_BASE_URL = IMAGES_EDIT_GEMINI_API_BASE_URL
+app.state.config.IMAGES_EDIT_GEMINI_API_KEY = IMAGES_EDIT_GEMINI_API_KEY
+app.state.config.IMAGES_EDIT_COMFYUI_BASE_URL = IMAGES_EDIT_COMFYUI_BASE_URL
+app.state.config.IMAGES_EDIT_COMFYUI_API_KEY = IMAGES_EDIT_COMFYUI_API_KEY
+app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW = IMAGES_EDIT_COMFYUI_WORKFLOW
+app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW_NODES = IMAGES_EDIT_COMFYUI_WORKFLOW_NODES
########################################
#
@@ -1124,6 +1155,12 @@ async def lifespan(app: FastAPI):
app.state.config.AUDIO_STT_AZURE_BASE_URL = AUDIO_STT_AZURE_BASE_URL
app.state.config.AUDIO_STT_AZURE_MAX_SPEAKERS = AUDIO_STT_AZURE_MAX_SPEAKERS
+app.state.config.AUDIO_STT_MISTRAL_API_KEY = AUDIO_STT_MISTRAL_API_KEY
+app.state.config.AUDIO_STT_MISTRAL_API_BASE_URL = AUDIO_STT_MISTRAL_API_BASE_URL
+app.state.config.AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS = (
+ AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS
+)
+
app.state.config.TTS_ENGINE = AUDIO_TTS_ENGINE
app.state.config.TTS_MODEL = AUDIO_TTS_MODEL
@@ -1625,11 +1662,15 @@ async def process_chat(request, form_data, user, metadata, model):
log.info("Chat processing was cancelled")
try:
event_emitter = get_event_emitter(metadata)
- await event_emitter(
- {"type": "chat:tasks:cancel"},
+ await asyncio.shield(
+ event_emitter(
+ {"type": "chat:tasks:cancel"},
+ )
)
except Exception as e:
pass
+ finally:
+ raise # re-raise to ensure proper task cancellation handling
except Exception as e:
log.debug(f"Error processing chat payload: {e}")
if metadata.get("chat_id") and metadata.get("message_id"):
@@ -1660,7 +1701,7 @@ async def process_chat(request, form_data, user, metadata, model):
finally:
try:
if mcp_clients := metadata.get("mcp_clients"):
- for client in mcp_clients.values():
+ for client in reversed(mcp_clients.values()):
await client.disconnect()
except Exception as e:
log.debug(f"Error cleaning up: {e}")
@@ -2007,6 +2048,7 @@ async def get_current_usage(user=Depends(get_verified_user)):
if tool_server_connection.get("type", "openapi") == "mcp":
server_id = tool_server_connection.get("info", {}).get("id")
auth_type = tool_server_connection.get("auth_type", "none")
+
if server_id and auth_type == "oauth_2.1":
oauth_client_info = tool_server_connection.get("info", {}).get(
"oauth_client_info", ""
@@ -2052,6 +2094,64 @@ async def get_current_usage(user=Depends(get_verified_user)):
)
+async def register_client(request, client_id: str) -> bool:
+ server_type, server_id = client_id.split(":", 1)
+
+ connection = None
+ connection_idx = None
+
+ for idx, conn in enumerate(request.app.state.config.TOOL_SERVER_CONNECTIONS or []):
+ if conn.get("type", "openapi") == server_type:
+ info = conn.get("info", {})
+ if info.get("id") == server_id:
+ connection = conn
+ connection_idx = idx
+ break
+
+ if connection is None or connection_idx is None:
+ log.warning(
+ f"Unable to locate MCP tool server configuration for client {client_id} during re-registration"
+ )
+ return False
+
+ server_url = connection.get("url")
+ oauth_server_key = (connection.get("config") or {}).get("oauth_server_key")
+
+ try:
+ oauth_client_info = (
+ await get_oauth_client_info_with_dynamic_client_registration(
+ request,
+ client_id,
+ server_url,
+ oauth_server_key,
+ )
+ )
+ except Exception as e:
+ log.error(f"Dynamic client re-registration failed for {client_id}: {e}")
+ return False
+
+ try:
+ request.app.state.config.TOOL_SERVER_CONNECTIONS[connection_idx] = {
+ **connection,
+ "info": {
+ **connection.get("info", {}),
+ "oauth_client_info": encrypt_data(
+ oauth_client_info.model_dump(mode="json")
+ ),
+ },
+ }
+ except Exception as e:
+ log.error(
+ f"Failed to persist updated OAuth client info for tool server {client_id}: {e}"
+ )
+ return False
+
+ oauth_client_manager.remove_client(client_id)
+ oauth_client_manager.add_client(client_id, oauth_client_info)
+ log.info(f"Re-registered OAuth client {client_id} for tool server")
+ return True
+
+
@app.get("/oauth/clients/{client_id}/authorize")
async def oauth_client_authorize(
client_id: str,
@@ -2059,6 +2159,41 @@ async def oauth_client_authorize(
response: Response,
user=Depends(get_verified_user),
):
+ # Ensure the stored client registration is still valid; re-register it if the preflight check fails
+ client = oauth_client_manager.get_client(client_id)
+ client_info = oauth_client_manager.get_client_info(client_id)
+ if client is None or client_info is None:
+ raise HTTPException(status_code=status.HTTP_404_NOT_FOUND)
+
+ if not await oauth_client_manager._preflight_authorization_url(client, client_info):
+ log.info(
+ "Detected invalid OAuth client %s; attempting re-registration",
+ client_id,
+ )
+
+ registered = await register_client(request, client_id)
+ if not registered:
+ raise HTTPException(
+ status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+ detail="Failed to re-register OAuth client",
+ )
+
+ client = oauth_client_manager.get_client(client_id)
+ client_info = oauth_client_manager.get_client_info(client_id)
+ if client is None or client_info is None:
+ raise HTTPException(
+ status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+ detail="OAuth client unavailable after re-registration",
+ )
+
+ if not await oauth_client_manager._preflight_authorization_url(
+ client, client_info
+ ):
+ raise HTTPException(
+ status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+ detail="OAuth client registration is still invalid after re-registration",
+ )
+
return await oauth_client_manager.handle_authorize(request, client_id=client_id)
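The authorize endpoint above implements a preflight → re-register → re-verify sequence. A minimal, framework-free sketch of that control flow (all names here are hypothetical, not the Open WebUI API):

```python
def authorize(registry, client_id, preflight, re_register):
    # registry: mapping of client_id -> client info (illustrative stand-in
    # for oauth_client_manager); preflight/re_register are injected callables.
    info = registry.get(client_id)
    if info is None:
        raise KeyError(client_id)
    if preflight(info):
        return "authorized"
    # Stale registration: attempt exactly one re-registration, then re-check.
    if not re_register(registry, client_id):
        raise RuntimeError("re-registration failed")
    info = registry.get(client_id)
    if info is None or not preflight(info):
        raise RuntimeError("registration still invalid after re-registration")
    return "authorized"
```

The single retry keeps the endpoint from looping on a provider that keeps rejecting the registration; after one re-registration attempt, a still-failing preflight surfaces as an error.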
diff --git a/backend/open_webui/models/chats.py b/backend/open_webui/models/chats.py
index cfcbc004b7..c559932bcd 100644
--- a/backend/open_webui/models/chats.py
+++ b/backend/open_webui/models/chats.py
@@ -440,7 +440,10 @@ def get_archived_chat_list_by_user_id(
order_by = filter.get("order_by")
direction = filter.get("direction")
- if order_by and direction and getattr(Chat, order_by):
+ if order_by and direction:
+ if not getattr(Chat, order_by, None):
+ raise ValueError("Invalid order_by field")
+
if direction.lower() == "asc":
query = query.order_by(getattr(Chat, order_by).asc())
elif direction.lower() == "desc":
@@ -762,15 +765,20 @@ def get_chats_by_user_id_and_search_text(
)
elif dialect_name == "postgresql":
- # PostgreSQL relies on proper JSON query for search
+ # PostgreSQL doesn't allow null bytes in text. We filter those out by checking
+ # the JSON representation for \u0000 before attempting text extraction
postgres_content_sql = (
"EXISTS ("
" SELECT 1 "
" FROM json_array_elements(Chat.chat->'messages') AS message "
- " WHERE LOWER(message->>'content') LIKE '%' || :content_key || '%'"
+ " WHERE message->'content' IS NOT NULL "
+ " AND (message->'content')::text NOT LIKE '%\\u0000%' "
+ " AND LOWER(message->>'content') LIKE '%' || :content_key || '%'"
")"
)
postgres_content_clause = text(postgres_content_sql)
+ # Also filter out chats with null bytes in title
+ query = query.filter(text("Chat.title::text NOT LIKE '%\\x00%'"))
query = query.filter(
or_(
Chat.title.ilike(bindparam("title_key")),
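The `order_by` change earlier in this file replaces a truthiness check that could raise `AttributeError` with an explicit `ValueError` for unknown columns. A minimal reproduction of that guard, with the `Chat` model stubbed:

```python
class Chat:
    # Stub: only the column names matter for the guard below.
    created_at = "created_at"
    updated_at = "updated_at"

def validate_order_by(order_by):
    # getattr with a default turns an unknown column into None instead of
    # raising AttributeError, so the caller gets one well-defined error.
    if not getattr(Chat, order_by, None):
        raise ValueError("Invalid order_by field")
    return order_by
```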
diff --git a/backend/open_webui/models/oauth_sessions.py b/backend/open_webui/models/oauth_sessions.py
index 81ce220384..b0e465dbe7 100644
--- a/backend/open_webui/models/oauth_sessions.py
+++ b/backend/open_webui/models/oauth_sessions.py
@@ -262,5 +262,16 @@ def delete_sessions_by_user_id(self, user_id: str) -> bool:
log.error(f"Error deleting OAuth sessions by user ID: {e}")
return False
+ def delete_sessions_by_provider(self, provider: str) -> bool:
+ """Delete all OAuth sessions for a provider"""
+ try:
+ with get_db() as db:
+ db.query(OAuthSession).filter_by(provider=provider).delete()
+ db.commit()
+ return True
+ except Exception as e:
+ log.error(f"Error deleting OAuth sessions by provider {provider}: {e}")
+ return False
+
OAuthSessions = OAuthSessionTable()
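Reduced to its essentials, the new `delete_sessions_by_provider` helper is a filter-by-provider delete with a commit and error swallowing. A stdlib `sqlite3` stand-in (table and column names illustrative, not the real schema):

```python
import sqlite3

def delete_sessions_by_provider(conn, provider):
    # Mirrors the SQLAlchemy version above: delete every session row for the
    # provider, commit, and report success/failure instead of raising.
    try:
        conn.execute(
            "DELETE FROM oauth_session WHERE provider = ?", (provider,)
        )
        conn.commit()
        return True
    except sqlite3.Error:
        return False
```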
diff --git a/backend/open_webui/retrieval/loaders/external_document.py b/backend/open_webui/retrieval/loaders/external_document.py
index 1be2ca3f24..998afd36f6 100644
--- a/backend/open_webui/retrieval/loaders/external_document.py
+++ b/backend/open_webui/retrieval/loaders/external_document.py
@@ -5,6 +5,7 @@
from langchain_core.document_loaders import BaseLoader
from langchain_core.documents import Document
+from open_webui.utils.headers import include_user_info_headers
from open_webui.env import SRC_LOG_LEVELS
log = logging.getLogger(__name__)
@@ -18,6 +19,7 @@ def __init__(
url: str,
api_key: str,
mime_type=None,
+ user=None,
**kwargs,
) -> None:
self.url = url
@@ -26,6 +28,8 @@ def __init__(
self.file_path = file_path
self.mime_type = mime_type
+ self.user = user
+
def load(self) -> List[Document]:
with open(self.file_path, "rb") as f:
data = f.read()
@@ -42,6 +46,9 @@ def load(self) -> List[Document]:
except:
pass
+ if self.user is not None:
+ headers = include_user_info_headers(headers, self.user)
+
url = self.url
if url.endswith("/"):
url = url[:-1]
diff --git a/backend/open_webui/retrieval/loaders/main.py b/backend/open_webui/retrieval/loaders/main.py
index 2ef1d75e02..bbc3da9bc9 100644
--- a/backend/open_webui/retrieval/loaders/main.py
+++ b/backend/open_webui/retrieval/loaders/main.py
@@ -228,6 +228,7 @@ def load(self) -> list[Document]:
class Loader:
def __init__(self, engine: str = "", **kwargs):
self.engine = engine
+ self.user = kwargs.get("user", None)
self.kwargs = kwargs
def load(
@@ -264,6 +265,7 @@ def _get_loader(self, filename: str, file_content_type: str, file_path: str):
url=self.kwargs.get("EXTERNAL_DOCUMENT_LOADER_URL"),
api_key=self.kwargs.get("EXTERNAL_DOCUMENT_LOADER_API_KEY"),
mime_type=file_content_type,
+ user=self.user,
)
elif self.engine == "tika" and self.kwargs.get("TIKA_SERVER_URL"):
if self._is_text_file(file_ext, file_content_type):
@@ -272,7 +274,6 @@ def _get_loader(self, filename: str, file_content_type: str, file_path: str):
loader = TikaLoader(
url=self.kwargs.get("TIKA_SERVER_URL"),
file_path=file_path,
- mime_type=file_content_type,
extract_images=self.kwargs.get("PDF_EXTRACT_IMAGES"),
)
elif (
@@ -369,14 +370,8 @@ def _get_loader(self, filename: str, file_content_type: str, file_path: str):
azure_credential=DefaultAzureCredential(),
)
elif self.engine == "mineru" and file_ext in [
- "pdf",
- "doc",
- "docx",
- "ppt",
- "pptx",
- "xls",
- "xlsx",
- ]:
+ "pdf"
+ ]: # MinerU currently only supports PDF
loader = MinerULoader(
file_path=file_path,
api_mode=self.kwargs.get("MINERU_API_MODE", "local"),
@@ -391,16 +386,9 @@ def _get_loader(self, filename: str, file_content_type: str, file_path: str):
in ["pdf"] # Mistral OCR currently only supports PDF and images
):
loader = MistralLoader(
- api_key=self.kwargs.get("MISTRAL_OCR_API_KEY"), file_path=file_path
- )
- elif (
- self.engine == "external"
- and self.kwargs.get("MISTRAL_OCR_API_KEY") != ""
- and file_ext
- in ["pdf"] # Mistral OCR currently only supports PDF and images
- ):
- loader = MistralLoader(
- api_key=self.kwargs.get("MISTRAL_OCR_API_KEY"), file_path=file_path
+ base_url=self.kwargs.get("MISTRAL_OCR_API_BASE_URL"),
+ api_key=self.kwargs.get("MISTRAL_OCR_API_KEY"),
+ file_path=file_path,
)
else:
if file_ext == "pdf":
diff --git a/backend/open_webui/retrieval/loaders/mistral.py b/backend/open_webui/retrieval/loaders/mistral.py
index b7f2622f5e..6a2d235559 100644
--- a/backend/open_webui/retrieval/loaders/mistral.py
+++ b/backend/open_webui/retrieval/loaders/mistral.py
@@ -30,10 +30,9 @@ class MistralLoader:
- Enhanced error handling with retryable error classification
"""
- BASE_API_URL = "https://api.mistral.ai/v1"
-
def __init__(
self,
+ base_url: str,
api_key: str,
file_path: str,
timeout: int = 300, # 5 minutes default
@@ -55,6 +54,9 @@ def __init__(
if not os.path.exists(file_path):
raise FileNotFoundError(f"File not found at {file_path}")
+ self.base_url = (
+ base_url.rstrip("/") if base_url else "https://api.mistral.ai/v1"
+ )
self.api_key = api_key
self.file_path = file_path
self.timeout = timeout
@@ -240,7 +242,7 @@ def _upload_file(self) -> str:
in a context manager to minimize memory usage duration.
"""
log.info("Uploading file to Mistral API")
- url = f"{self.BASE_API_URL}/files"
+ url = f"{self.base_url}/files"
def upload_request():
# MEMORY OPTIMIZATION: Use context manager to minimize file handle lifetime
@@ -275,7 +277,7 @@ def upload_request():
async def _upload_file_async(self, session: aiohttp.ClientSession) -> str:
"""Async file upload with streaming for better memory efficiency."""
- url = f"{self.BASE_API_URL}/files"
+ url = f"{self.base_url}/files"
async def upload_request():
# Create multipart writer for streaming upload
@@ -321,7 +323,7 @@ async def upload_request():
def _get_signed_url(self, file_id: str) -> str:
"""Retrieves a temporary signed URL for the uploaded file (sync version)."""
log.info(f"Getting signed URL for file ID: {file_id}")
- url = f"{self.BASE_API_URL}/files/{file_id}/url"
+ url = f"{self.base_url}/files/{file_id}/url"
params = {"expiry": 1}
signed_url_headers = {**self.headers, "Accept": "application/json"}
@@ -346,7 +348,7 @@ async def _get_signed_url_async(
self, session: aiohttp.ClientSession, file_id: str
) -> str:
"""Async signed URL retrieval."""
- url = f"{self.BASE_API_URL}/files/{file_id}/url"
+ url = f"{self.base_url}/files/{file_id}/url"
params = {"expiry": 1}
headers = {**self.headers, "Accept": "application/json"}
@@ -373,7 +375,7 @@ async def url_request():
def _process_ocr(self, signed_url: str) -> Dict[str, Any]:
"""Sends the signed URL to the OCR endpoint for processing (sync version)."""
log.info("Processing OCR via Mistral API")
- url = f"{self.BASE_API_URL}/ocr"
+ url = f"{self.base_url}/ocr"
ocr_headers = {
**self.headers,
"Content-Type": "application/json",
@@ -407,7 +409,7 @@ async def _process_ocr_async(
self, session: aiohttp.ClientSession, signed_url: str
) -> Dict[str, Any]:
"""Async OCR processing with timing metrics."""
- url = f"{self.BASE_API_URL}/ocr"
+ url = f"{self.base_url}/ocr"
headers = {
**self.headers,
@@ -446,7 +448,7 @@ async def ocr_request():
def _delete_file(self, file_id: str) -> None:
"""Deletes the file from Mistral storage (sync version)."""
log.info(f"Deleting uploaded file ID: {file_id}")
- url = f"{self.BASE_API_URL}/files/{file_id}"
+ url = f"{self.base_url}/files/{file_id}"
try:
response = requests.delete(
@@ -467,7 +469,7 @@ async def _delete_file_async(
async def delete_request():
self._debug_log(f"Deleting file ID: {file_id}")
async with session.delete(
- url=f"{self.BASE_API_URL}/files/{file_id}",
+ url=f"{self.base_url}/files/{file_id}",
headers=self.headers,
timeout=aiohttp.ClientTimeout(
total=self.cleanup_timeout
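The `MistralLoader` changes above swap the `BASE_API_URL` class constant for a per-instance `base_url`. The fallback and normalization rule can be isolated on its own (default URL taken from the diff):

```python
DEFAULT_MISTRAL_API_URL = "https://api.mistral.ai/v1"

def normalize_base_url(base_url):
    # None/empty falls back to the public Mistral endpoint; trailing slashes
    # are stripped so f"{base_url}/files" never produces a double slash.
    return base_url.rstrip("/") if base_url else DEFAULT_MISTRAL_API_URL
```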
diff --git a/backend/open_webui/retrieval/loaders/youtube.py b/backend/open_webui/retrieval/loaders/youtube.py
index da17eaef65..cba602ed87 100644
--- a/backend/open_webui/retrieval/loaders/youtube.py
+++ b/backend/open_webui/retrieval/loaders/youtube.py
@@ -83,6 +83,7 @@ def load(self) -> List[Document]:
TranscriptsDisabled,
YouTubeTranscriptApi,
)
+ from youtube_transcript_api.proxies import GenericProxyConfig
except ImportError:
raise ImportError(
'Could not import "youtube_transcript_api" Python package. '
@@ -90,10 +91,9 @@ def load(self) -> List[Document]:
)
if self.proxy_url:
- youtube_proxies = {
- "http": self.proxy_url,
- "https": self.proxy_url,
- }
+ youtube_proxies = GenericProxyConfig(
+ http_url=self.proxy_url, https_url=self.proxy_url
+ )
log.debug(f"Using proxy URL: {self.proxy_url[:14]}...")
else:
youtube_proxies = None
diff --git a/backend/open_webui/retrieval/utils.py b/backend/open_webui/retrieval/utils.py
index 3470ead1d6..7db4208935 100644
--- a/backend/open_webui/retrieval/utils.py
+++ b/backend/open_webui/retrieval/utils.py
@@ -73,6 +73,7 @@ def get_loader(request, url: str):
url,
verify_ssl=request.app.state.config.ENABLE_WEB_LOADER_SSL_VERIFICATION,
requests_per_second=request.app.state.config.WEB_LOADER_CONCURRENT_REQUESTS,
+ trust_env=request.app.state.config.WEB_SEARCH_TRUST_ENV,
)
@@ -670,46 +671,51 @@ def get_sources_from_items(
collection_names.append(f"file-{item['id']}")
elif item.get("type") == "collection":
- if (
- item.get("context") == "full"
- or request.app.state.config.BYPASS_EMBEDDING_AND_RETRIEVAL
- ):
- # Manual Full Mode Toggle for Collection
- knowledge_base = Knowledges.get_knowledge_by_id(item.get("id"))
+ # Manual Full Mode Toggle for Collection
+ knowledge_base = Knowledges.get_knowledge_by_id(item.get("id"))
- if knowledge_base and (
- user.role == "admin"
- or knowledge_base.user_id == user.id
- or has_access(user.id, "read", knowledge_base.access_control)
+ if knowledge_base and (
+ user.role == "admin"
+ or knowledge_base.user_id == user.id
+ or has_access(user.id, "read", knowledge_base.access_control)
+ ):
+ if (
+ item.get("context") == "full"
+ or request.app.state.config.BYPASS_EMBEDDING_AND_RETRIEVAL
):
+ if knowledge_base and (
+ user.role == "admin"
+ or knowledge_base.user_id == user.id
+ or has_access(user.id, "read", knowledge_base.access_control)
+ ):
+
+ file_ids = knowledge_base.data.get("file_ids", [])
+
+ documents = []
+ metadatas = []
+ for file_id in file_ids:
+ file_object = Files.get_file_by_id(file_id)
+
+ if file_object:
+ documents.append(file_object.data.get("content", ""))
+ metadatas.append(
+ {
+ "file_id": file_id,
+ "name": file_object.filename,
+ "source": file_object.filename,
+ }
+ )
- file_ids = knowledge_base.data.get("file_ids", [])
-
- documents = []
- metadatas = []
- for file_id in file_ids:
- file_object = Files.get_file_by_id(file_id)
-
- if file_object:
- documents.append(file_object.data.get("content", ""))
- metadatas.append(
- {
- "file_id": file_id,
- "name": file_object.filename,
- "source": file_object.filename,
- }
- )
-
- query_result = {
- "documents": [documents],
- "metadatas": [metadatas],
- }
- else:
- # Fallback to collection names
- if item.get("legacy"):
- collection_names = item.get("collection_names", [])
+ query_result = {
+ "documents": [documents],
+ "metadatas": [metadatas],
+ }
else:
- collection_names.append(item["id"])
+ # Fallback to collection names
+ if item.get("legacy"):
+ collection_names = item.get("collection_names", [])
+ else:
+ collection_names.append(item["id"])
elif item.get("docs"):
# BYPASS_WEB_SEARCH_EMBEDDING_AND_RETRIEVAL
diff --git a/backend/open_webui/retrieval/web/firecrawl.py b/backend/open_webui/retrieval/web/firecrawl.py
index a85fc51fbd..2d9b104bca 100644
--- a/backend/open_webui/retrieval/web/firecrawl.py
+++ b/backend/open_webui/retrieval/web/firecrawl.py
@@ -1,11 +1,10 @@
import logging
from typing import Optional, List
-from urllib.parse import urljoin
-import requests
from open_webui.retrieval.web.main import SearchResult, get_filtered_results
from open_webui.env import SRC_LOG_LEVELS
+
log = logging.getLogger(__name__)
log.setLevel(SRC_LOG_LEVELS["RAG"])
@@ -18,27 +17,20 @@ def search_firecrawl(
filter_list: Optional[List[str]] = None,
) -> List[SearchResult]:
try:
- firecrawl_search_url = urljoin(firecrawl_url, "/v1/search")
- response = requests.post(
- firecrawl_search_url,
- headers={
- "User-Agent": "Open WebUI (https://github.com/open-webui/open-webui) RAG Bot",
- "Authorization": f"Bearer {firecrawl_api_key}",
- },
- json={
- "query": query,
- "limit": count,
- },
+ from firecrawl import FirecrawlApp
+
+ firecrawl = FirecrawlApp(api_key=firecrawl_api_key, api_url=firecrawl_url)
+ response = firecrawl.search(
+ query=query, limit=count, ignore_invalid_urls=True, timeout=count * 3
)
- response.raise_for_status()
- results = response.json().get("data", [])
+ results = response.web
if filter_list:
results = get_filtered_results(results, filter_list)
results = [
SearchResult(
- link=result.get("url"),
- title=result.get("title"),
- snippet=result.get("description"),
+ link=result.url,
+ title=result.title,
+ snippet=result.description,
)
for result in results[:count]
]
diff --git a/backend/open_webui/retrieval/web/google_pse.py b/backend/open_webui/retrieval/web/google_pse.py
index 2d2b863b42..69de24711a 100644
--- a/backend/open_webui/retrieval/web/google_pse.py
+++ b/backend/open_webui/retrieval/web/google_pse.py
@@ -15,6 +15,7 @@ def search_google_pse(
query: str,
count: int,
filter_list: Optional[list[str]] = None,
+ referer: Optional[str] = None,
) -> list[SearchResult]:
"""Search using Google's Programmable Search Engine API and return the results as a list of SearchResult objects.
Handles pagination for counts greater than 10.
@@ -30,7 +31,11 @@ def search_google_pse(
list[SearchResult]: A list of SearchResult objects.
"""
url = "https://www.googleapis.com/customsearch/v1"
+
headers = {"Content-Type": "application/json"}
+ if referer:
+ headers["Referer"] = referer
+
all_results = []
start_index = 1 # Google PSE start parameter is 1-based
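The new `referer` parameter above only attaches a header when it is set, which matters because Google API keys that are referrer-restricted in the Cloud console reject requests lacking a matching `Referer`. As a standalone sketch:

```python
def build_pse_headers(referer=None):
    # Referer is only attached when configured, matching the diff: keys
    # without referrer restrictions keep working unchanged.
    headers = {"Content-Type": "application/json"}
    if referer:
        headers["Referer"] = referer
    return headers
```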
diff --git a/backend/open_webui/retrieval/web/utils.py b/backend/open_webui/retrieval/web/utils.py
index 61356adb56..91699a157b 100644
--- a/backend/open_webui/retrieval/web/utils.py
+++ b/backend/open_webui/retrieval/web/utils.py
@@ -4,7 +4,6 @@
import ssl
import urllib.parse
import urllib.request
-from collections import defaultdict
from datetime import datetime, time, timedelta
from typing import (
Any,
@@ -17,11 +16,12 @@
Union,
Literal,
)
+
+from fastapi.concurrency import run_in_threadpool
import aiohttp
import certifi
import validators
from langchain_community.document_loaders import PlaywrightURLLoader, WebBaseLoader
-from langchain_community.document_loaders.firecrawl import FireCrawlLoader
from langchain_community.document_loaders.base import BaseLoader
from langchain_core.documents import Document
from open_webui.retrieval.loaders.tavily import TavilyLoader
@@ -39,7 +39,8 @@
EXTERNAL_WEB_LOADER_URL,
EXTERNAL_WEB_LOADER_API_KEY,
)
-from open_webui.env import SRC_LOG_LEVELS, AIOHTTP_CLIENT_SESSION_SSL
+from open_webui.env import SRC_LOG_LEVELS
+
log = logging.getLogger(__name__)
log.setLevel(SRC_LOG_LEVELS["RAG"])
@@ -142,13 +143,13 @@ def _sync_wait_for_rate_limit(self):
class URLProcessingMixin:
- def _verify_ssl_cert(self, url: str) -> bool:
+ async def _verify_ssl_cert(self, url: str) -> bool:
"""Verify SSL certificate for a URL."""
- return verify_ssl_cert(url)
+ return await run_in_threadpool(verify_ssl_cert, url)
async def _safe_process_url(self, url: str) -> bool:
"""Perform safety checks before processing a URL."""
- if self.verify_ssl and not self._verify_ssl_cert(url):
+ if self.verify_ssl and not await self._verify_ssl_cert(url):
raise ValueError(f"SSL certificate verification failed for {url}")
await self._wait_for_rate_limit()
return True
@@ -189,13 +190,12 @@ def __init__(
(uses FIRE_CRAWL_API_KEY environment variable if not provided).
api_url: Base URL for FireCrawl API. Defaults to official API endpoint.
mode: Operation mode selection:
- - 'crawl': Website crawling mode (default)
- - 'scrape': Direct page scraping
+ - 'crawl': Website crawling mode
+ - 'scrape': Direct page scraping (default)
- 'map': Site map generation
proxy: Proxy override settings for the FireCrawl API.
params: The parameters to pass to the Firecrawl API.
- Examples include crawlerOptions.
- For more details, visit: https://github.com/mendableai/firecrawl-py
+ For more details, visit: https://docs.firecrawl.dev/sdks/python#batch-scrape
"""
proxy_server = proxy.get("server") if proxy else None
if trust_env and not proxy_server:
@@ -215,50 +215,88 @@ def __init__(
self.api_key = api_key
self.api_url = api_url
self.mode = mode
- self.params = params
+ self.params = params or {}
def lazy_load(self) -> Iterator[Document]:
- """Load documents concurrently using FireCrawl."""
- for url in self.web_paths:
- try:
- self._safe_process_url_sync(url)
- loader = FireCrawlLoader(
- url=url,
- api_key=self.api_key,
- api_url=self.api_url,
- mode=self.mode,
- params=self.params,
+ """Load documents using FireCrawl batch_scrape."""
+ log.debug(
+ "Starting FireCrawl batch scrape for %d URLs, mode: %s, params: %s",
+ len(self.web_paths),
+ self.mode,
+ self.params,
+ )
+ try:
+ from firecrawl import FirecrawlApp
+
+ firecrawl = FirecrawlApp(api_key=self.api_key, api_url=self.api_url)
+ result = firecrawl.batch_scrape(
+ self.web_paths,
+ formats=["markdown"],
+ skip_tls_verification=not self.verify_ssl,
+ ignore_invalid_urls=True,
+ remove_base64_images=True,
+ max_age=300000, # 5 minutes https://docs.firecrawl.dev/features/fast-scraping#common-maxage-values
+ wait_timeout=len(self.web_paths) * 3,
+ **self.params,
+ )
+
+ if result.status != "completed":
+ raise RuntimeError(
+ f"FireCrawl batch scrape did not complete successfully. result: {result}"
)
- for document in loader.lazy_load():
- if not document.metadata.get("source"):
- document.metadata["source"] = document.metadata.get("sourceURL")
- yield document
- except Exception as e:
- if self.continue_on_failure:
- log.exception(f"Error loading {url}: {e}")
- continue
+
+ for data in result.data:
+ metadata = data.metadata or {}
+ yield Document(
+ page_content=data.markdown or "",
+ metadata={"source": metadata.url or metadata.source_url or ""},
+ )
+
+ except Exception as e:
+ if self.continue_on_failure:
+ log.exception(f"Error extracting content from URLs: {e}")
+ else:
raise e
async def alazy_load(self):
"""Async version of lazy_load."""
- for url in self.web_paths:
- try:
- await self._safe_process_url(url)
- loader = FireCrawlLoader(
- url=url,
- api_key=self.api_key,
- api_url=self.api_url,
- mode=self.mode,
- params=self.params,
+ log.debug(
+ "Starting FireCrawl batch scrape for %d URLs, mode: %s, params: %s",
+ len(self.web_paths),
+ self.mode,
+ self.params,
+ )
+ try:
+ from firecrawl import FirecrawlApp
+
+ firecrawl = FirecrawlApp(api_key=self.api_key, api_url=self.api_url)
+ result = firecrawl.batch_scrape(
+ self.web_paths,
+ formats=["markdown"],
+ skip_tls_verification=not self.verify_ssl,
+ ignore_invalid_urls=True,
+ remove_base64_images=True,
+ max_age=300000, # 5 minutes https://docs.firecrawl.dev/features/fast-scraping#common-maxage-values
+ wait_timeout=len(self.web_paths) * 3,
+ **self.params,
+ )
+
+ if result.status != "completed":
+ raise RuntimeError(
+ f"FireCrawl batch scrape did not complete successfully. result: {result}"
)
- async for document in loader.alazy_load():
- if not document.metadata.get("source"):
- document.metadata["source"] = document.metadata.get("sourceURL")
- yield document
- except Exception as e:
- if self.continue_on_failure:
- log.exception(f"Error loading {url}: {e}")
- continue
+
+ for data in result.data:
+ metadata = data.metadata or {}
+ yield Document(
+ page_content=data.markdown or "",
+ metadata={"source": metadata.url or metadata.source_url or ""},
+ )
+
+ except Exception as e:
+ if self.continue_on_failure:
+ log.exception(f"Error extracting content from URLs: {e}")
+ else:
raise e
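Both `lazy_load` and `alazy_load` above share the same post-processing of the `batch_scrape` result: fail unless `status` is `"completed"`, then map each item to a `Document` with markdown content and a best-effort source URL. With the SDK response stubbed out as plain dicts (field names taken from the diff), the logic is:

```python
def documents_from_batch(result):
    # Raise when the batch did not finish, mirroring the RuntimeError above.
    if result["status"] != "completed":
        raise RuntimeError(f"batch scrape did not complete: {result['status']}")
    docs = []
    for item in result["data"]:
        meta = item.get("metadata") or {}
        docs.append(
            {
                "page_content": item.get("markdown") or "",
                # Prefer the canonical URL, fall back to the requested one.
                "source": meta.get("url") or meta.get("source_url") or "",
            }
        )
    return docs
```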
diff --git a/backend/open_webui/routers/audio.py b/backend/open_webui/routers/audio.py
index cb7a57b5b7..45b4f1e692 100644
--- a/backend/open_webui/routers/audio.py
+++ b/backend/open_webui/routers/audio.py
@@ -4,6 +4,7 @@
import os
import uuid
import html
+import base64
from functools import lru_cache
from pydub import AudioSegment
from pydub.silence import split_on_silence
@@ -39,13 +40,14 @@
WHISPER_MODEL_DIR,
CACHE_DIR,
WHISPER_LANGUAGE,
+ ELEVENLABS_API_BASE_URL,
)
from open_webui.constants import ERROR_MESSAGES
from open_webui.env import (
+ ENV,
AIOHTTP_CLIENT_SESSION_SSL,
AIOHTTP_CLIENT_TIMEOUT,
- ENV,
SRC_LOG_LEVELS,
DEVICE_TYPE,
ENABLE_FORWARD_USER_INFO_HEADERS,
@@ -178,6 +180,9 @@ class STTConfigForm(BaseModel):
AZURE_LOCALES: str
AZURE_BASE_URL: str
AZURE_MAX_SPEAKERS: str
+ MISTRAL_API_KEY: str
+ MISTRAL_API_BASE_URL: str
+ MISTRAL_USE_CHAT_COMPLETIONS: bool
class AudioConfigUpdateForm(BaseModel):
@@ -214,6 +219,9 @@ async def get_audio_config(request: Request, user=Depends(get_admin_user)):
"AZURE_LOCALES": request.app.state.config.AUDIO_STT_AZURE_LOCALES,
"AZURE_BASE_URL": request.app.state.config.AUDIO_STT_AZURE_BASE_URL,
"AZURE_MAX_SPEAKERS": request.app.state.config.AUDIO_STT_AZURE_MAX_SPEAKERS,
+ "MISTRAL_API_KEY": request.app.state.config.AUDIO_STT_MISTRAL_API_KEY,
+ "MISTRAL_API_BASE_URL": request.app.state.config.AUDIO_STT_MISTRAL_API_BASE_URL,
+ "MISTRAL_USE_CHAT_COMPLETIONS": request.app.state.config.AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS,
},
}
@@ -255,6 +263,13 @@ async def update_audio_config(
request.app.state.config.AUDIO_STT_AZURE_MAX_SPEAKERS = (
form_data.stt.AZURE_MAX_SPEAKERS
)
+ request.app.state.config.AUDIO_STT_MISTRAL_API_KEY = form_data.stt.MISTRAL_API_KEY
+ request.app.state.config.AUDIO_STT_MISTRAL_API_BASE_URL = (
+ form_data.stt.MISTRAL_API_BASE_URL
+ )
+ request.app.state.config.AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS = (
+ form_data.stt.MISTRAL_USE_CHAT_COMPLETIONS
+ )
if request.app.state.config.STT_ENGINE == "":
request.app.state.faster_whisper_model = set_faster_whisper_model(
@@ -290,6 +305,9 @@ async def update_audio_config(
"AZURE_LOCALES": request.app.state.config.AUDIO_STT_AZURE_LOCALES,
"AZURE_BASE_URL": request.app.state.config.AUDIO_STT_AZURE_BASE_URL,
"AZURE_MAX_SPEAKERS": request.app.state.config.AUDIO_STT_AZURE_MAX_SPEAKERS,
+ "MISTRAL_API_KEY": request.app.state.config.AUDIO_STT_MISTRAL_API_KEY,
+ "MISTRAL_API_BASE_URL": request.app.state.config.AUDIO_STT_MISTRAL_API_BASE_URL,
+ "MISTRAL_USE_CHAT_COMPLETIONS": request.app.state.config.AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS,
},
}
@@ -413,7 +431,7 @@ async def speech(request: Request, user=Depends(get_verified_user)):
timeout=timeout, trust_env=True
) as session:
async with session.post(
- f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
+ f"{ELEVENLABS_API_BASE_URL}/v1/text-to-speech/{voice_id}",
json={
"text": payload["input"],
"model_id": request.app.state.config.TTS_MODEL,
@@ -828,6 +846,186 @@ def transcription_handler(request, file_path, metadata):
detail=detail if detail else "Open WebUI: Server Connection Error",
)
+ elif request.app.state.config.STT_ENGINE == "mistral":
+ # Check file exists
+ if not os.path.exists(file_path):
+ raise HTTPException(status_code=400, detail="Audio file not found")
+
+ # Check file size
+ file_size = os.path.getsize(file_path)
+ if file_size > MAX_FILE_SIZE:
+ raise HTTPException(
+ status_code=400,
+ detail=f"File size exceeds limit of {MAX_FILE_SIZE_MB}MB",
+ )
+
+ api_key = request.app.state.config.AUDIO_STT_MISTRAL_API_KEY
+ api_base_url = (
+ request.app.state.config.AUDIO_STT_MISTRAL_API_BASE_URL
+ or "https://api.mistral.ai/v1"
+ )
+ use_chat_completions = (
+ request.app.state.config.AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS
+ )
+
+ if not api_key:
+ raise HTTPException(
+ status_code=400,
+ detail="Mistral API key is required for Mistral STT",
+ )
+
+ r = None
+ try:
+ # Use voxtral-mini-latest as the default model for transcription
+ model = request.app.state.config.STT_MODEL or "voxtral-mini-latest"
+
+ log.info(
+ f"Mistral STT - model: {model}, "
+ f"method: {'chat_completions' if use_chat_completions else 'transcriptions'}"
+ )
+
+ if use_chat_completions:
+ # Use chat completions API with audio input
+ # This method requires mp3 or wav format
+ audio_file_to_use = file_path
+
+ if is_audio_conversion_required(file_path):
+ log.debug("Converting audio to mp3 for chat completions API")
+ converted_path = convert_audio_to_mp3(file_path)
+ if converted_path:
+ audio_file_to_use = converted_path
+ else:
+ log.error("Audio conversion failed")
+ raise HTTPException(
+ status_code=500,
+ detail="Audio conversion failed. Chat completions API requires mp3 or wav format.",
+ )
+
+ # Read and encode audio file as base64
+ with open(audio_file_to_use, "rb") as audio_file:
+ audio_base64 = base64.b64encode(audio_file.read()).decode("utf-8")
+
+ # Prepare chat completions request
+ url = f"{api_base_url}/chat/completions"
+
+ # Add language instruction if specified
+ language = metadata.get("language", None) if metadata else None
+ if language:
+ text_instruction = f"Transcribe this audio exactly as spoken in {language}. Do not translate it."
+ else:
+ text_instruction = "Transcribe this audio exactly as spoken in its original language. Do not translate it to another language."
+
+ payload = {
+ "model": model,
+ "messages": [
+ {
+ "role": "user",
+ "content": [
+ {
+ "type": "input_audio",
+ "input_audio": audio_base64,
+ },
+ {"type": "text", "text": text_instruction},
+ ],
+ }
+ ],
+ }
+
+ r = requests.post(
+ url=url,
+ json=payload,
+ headers={
+ "Authorization": f"Bearer {api_key}",
+ "Content-Type": "application/json",
+ },
+ )
+
+ r.raise_for_status()
+ response = r.json()
+
+ # Extract transcript from chat completion response
+ transcript = (
+ response.get("choices", [{}])[0]
+ .get("message", {})
+ .get("content", "")
+ .strip()
+ )
+ if not transcript:
+ raise ValueError("Empty transcript in response")
+
+ data = {"text": transcript}
+
+ else:
+ # Use dedicated transcriptions API
+ url = f"{api_base_url}/audio/transcriptions"
+
+ # Determine the MIME type
+ mime_type, _ = mimetypes.guess_type(file_path)
+ if not mime_type:
+ mime_type = "audio/webm"
+
+ # Use context manager to ensure file is properly closed
+ with open(file_path, "rb") as audio_file:
+ files = {"file": (filename, audio_file, mime_type)}
+ data_form = {"model": model}
+
+ # Add language if specified in metadata
+ language = metadata.get("language", None) if metadata else None
+ if language:
+ data_form["language"] = language
+
+ r = requests.post(
+ url=url,
+ files=files,
+ data=data_form,
+ headers={
+ "Authorization": f"Bearer {api_key}",
+ },
+ )
+
+ r.raise_for_status()
+ response = r.json()
+
+ # Extract transcript from response
+ transcript = response.get("text", "").strip()
+ if not transcript:
+ raise ValueError("Empty transcript in response")
+
+ data = {"text": transcript}
+
+ # Save transcript to json file (consistent with other providers)
+ transcript_file = f"{file_dir}/{id}.json"
+ with open(transcript_file, "w") as f:
+ json.dump(data, f)
+
+ log.debug(data)
+ return data
+
+ except ValueError as e:
+ log.exception("Error parsing Mistral response")
+ raise HTTPException(
+ status_code=500,
+ detail=f"Failed to parse Mistral response: {str(e)}",
+ )
+ except requests.exceptions.RequestException as e:
+ log.exception(e)
+ detail = None
+
+ try:
+ if r is not None and r.status_code != 200:
+ res = r.json()
+ if "error" in res:
+ detail = f"External: {res['error'].get('message', '')}"
+ else:
+ detail = f"External: {r.text}"
+ except Exception:
+ detail = f"External: {e}"
+
+ raise HTTPException(
+ status_code=r.status_code if r is not None else 500,
+ detail=detail if detail else "Open WebUI: Server Connection Error",
+ )
+
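Reviewer note: the chained `.get()` extraction in the chat-completions branch can be exercised in isolation. This is a standalone sketch with a hypothetical helper name and response dict, not Mistral's actual payload schema:

```python
def extract_transcript(response: dict) -> str:
    # Mirrors the chained .get() pattern above: each default
    # ([{}], {}, "") keeps the chain from raising on a missing field.
    # Note the [{}] default only applies when "choices" is absent; an
    # explicit empty list would still raise IndexError.
    transcript = (
        response.get("choices", [{}])[0]
        .get("message", {})
        .get("content", "")
        .strip()
    )
    if not transcript:
        raise ValueError("Empty transcript in response")
    return transcript


ok = {"choices": [{"message": {"content": "  hello world  "}}]}
print(extract_transcript(ok))  # hello world
```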
def transcribe(request: Request, file_path: str, metadata: Optional[dict] = None):
log.info(f"transcribe: {file_path} {metadata}")
@@ -1037,7 +1235,7 @@ def get_available_models(request: Request) -> list[dict]:
elif request.app.state.config.TTS_ENGINE == "elevenlabs":
try:
response = requests.get(
- "https://api.elevenlabs.io/v1/models",
+ f"{ELEVENLABS_API_BASE_URL}/v1/models",
headers={
"xi-api-key": request.app.state.config.TTS_API_KEY,
"Content-Type": "application/json",
@@ -1141,7 +1339,7 @@ def get_elevenlabs_voices(api_key: str) -> dict:
try:
# TODO: Add retries
response = requests.get(
- "https://api.elevenlabs.io/v1/voices",
+ f"{ELEVENLABS_API_BASE_URL}/v1/voices",
headers={
"xi-api-key": api_key,
"Content-Type": "application/json",
diff --git a/backend/open_webui/routers/auths.py b/backend/open_webui/routers/auths.py
index 8e8dd626d0..5846df84b1 100644
--- a/backend/open_webui/routers/auths.py
+++ b/backend/open_webui/routers/auths.py
@@ -524,6 +524,15 @@ async def signin(request: Request, response: Response, form_data: SigninForm):
user = Auths.authenticate_user(admin_email.lower(), admin_password)
else:
+ password_bytes = form_data.password.encode("utf-8")
+ if len(password_bytes) > 72:
+ # TODO: Implement other hashing algorithms that support longer passwords
+ log.info("Password too long, truncating to 72 bytes for bcrypt")
+ password_bytes = password_bytes[:72]
+
+ # Decode safely, ignoring any incomplete UTF-8 sequence left by the byte-level cut
+ form_data.password = password_bytes.decode("utf-8", errors="ignore")
+
user = Auths.authenticate_user(form_data.email.lower(), form_data.password)
if user:
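Reviewer note: the truncation above works around bcrypt's 72-byte input limit; `errors="ignore"` matters because a byte-level cut can land in the middle of a multi-byte character. A minimal sketch, with a hypothetical helper name:

```python
def truncate_for_bcrypt(password: str, limit: int = 72) -> str:
    # bcrypt only hashes the first 72 bytes of its input, so longer
    # passwords are truncated up front to keep the behaviour explicit.
    data = password.encode("utf-8")[:limit]
    # errors="ignore" drops a multi-byte character that the byte cut
    # split in half, instead of raising UnicodeDecodeError.
    return data.decode("utf-8", errors="ignore")


# "a" + 40 x "é" is 81 bytes; the cut at byte 72 splits the 36th "é",
# and the dangling lead byte is silently dropped on decode.
print(len(truncate_for_bcrypt("a" + "é" * 40)))  # 36
```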
diff --git a/backend/open_webui/routers/configs.py b/backend/open_webui/routers/configs.py
index 28bbea8710..683068da0f 100644
--- a/backend/open_webui/routers/configs.py
+++ b/backend/open_webui/routers/configs.py
@@ -1,4 +1,5 @@
import logging
+import copy
from fastapi import APIRouter, Depends, Request, HTTPException
from pydantic import BaseModel, ConfigDict, Field
import aiohttp
@@ -15,6 +16,7 @@
set_tool_servers,
)
from open_webui.utils.mcp.client import MCPClient
+from open_webui.models.oauth_sessions import OAuthSessions
from open_webui.env import SRC_LOG_LEVELS
@@ -165,6 +167,21 @@ async def set_tool_servers_config(
form_data: ToolServersConfigForm,
user=Depends(get_admin_user),
):
+ for connection in request.app.state.config.TOOL_SERVER_CONNECTIONS:
+ server_type = connection.get("type", "openapi")
+ auth_type = connection.get("auth_type", "none")
+
+ if auth_type == "oauth_2.1":
+ # Remove existing OAuth clients for tool servers
+ server_id = connection.get("info", {}).get("id")
+ client_key = f"{server_type}:{server_id}"
+
+ try:
+ request.app.state.oauth_client_manager.remove_client(client_key)
+ except Exception:
+ pass
+
+ # Set new tool server connections
request.app.state.config.TOOL_SERVER_CONNECTIONS = [
connection.model_dump() for connection in form_data.TOOL_SERVER_CONNECTIONS
]
@@ -176,6 +193,7 @@ async def set_tool_servers_config(
if server_type == "mcp":
server_id = connection.get("info", {}).get("id")
auth_type = connection.get("auth_type", "none")
+
if auth_type == "oauth_2.1" and server_id:
try:
oauth_client_info = connection.get("info", {}).get(
@@ -211,7 +229,7 @@ async def verify_tool_servers_config(
log.debug(
f"Trying to fetch OAuth 2.1 discovery document from {discovery_url}"
)
- async with aiohttp.ClientSession() as session:
+ async with aiohttp.ClientSession(trust_env=True) as session:
async with session.get(
discovery_url
) as oauth_server_metadata_response:
diff --git a/backend/open_webui/routers/files.py b/backend/open_webui/routers/files.py
index 84d8f841cf..2a5c3e5bb1 100644
--- a/backend/open_webui/routers/files.py
+++ b/backend/open_webui/routers/files.py
@@ -115,6 +115,10 @@ def process_uploaded_file(request, file, file_path, file_item, file_metadata, us
request.app.state.config.CONTENT_EXTRACTION_ENGINE == "external"
):
process_file(request, ProcessFileForm(file_id=file_item.id), user=user)
+ else:
+ raise Exception(
+ f"File type {file.content_type} is not supported for processing"
+ )
else:
log.info(
f"File type {file.content_type} is not provided, but trying to process anyway"
diff --git a/backend/open_webui/routers/images.py b/backend/open_webui/routers/images.py
index 059b3a23d7..b1b3994968 100644
--- a/backend/open_webui/routers/images.py
+++ b/backend/open_webui/routers/images.py
@@ -1,5 +1,6 @@
import asyncio
import base64
+import uuid
import io
import json
import logging
@@ -10,23 +11,22 @@
from urllib.parse import quote
import requests
-from fastapi import (
- APIRouter,
- Depends,
- HTTPException,
- Request,
- UploadFile,
-)
+from fastapi import APIRouter, Depends, HTTPException, Request, UploadFile
+from fastapi.responses import FileResponse
from open_webui.config import CACHE_DIR
from open_webui.constants import ERROR_MESSAGES
from open_webui.env import ENABLE_FORWARD_USER_INFO_HEADERS, SRC_LOG_LEVELS
-from open_webui.routers.files import upload_file_handler
+from open_webui.routers.files import upload_file_handler, get_file_content_by_id
from open_webui.utils.auth import get_admin_user, get_verified_user
+from open_webui.utils.headers import include_user_info_headers
from open_webui.utils.images.comfyui import (
- ComfyUIGenerateImageForm,
+ ComfyUICreateImageForm,
+ ComfyUIEditImageForm,
ComfyUIWorkflow,
- comfyui_generate_image,
+ comfyui_upload_image,
+ comfyui_create_image,
+ comfyui_edit_image,
)
from pydantic import BaseModel
@@ -36,210 +36,9 @@
IMAGE_CACHE_DIR = CACHE_DIR / "image" / "generations"
IMAGE_CACHE_DIR.mkdir(parents=True, exist_ok=True)
-
router = APIRouter()
-@router.get("/config")
-async def get_config(request: Request, user=Depends(get_admin_user)):
- return {
- "enabled": request.app.state.config.ENABLE_IMAGE_GENERATION,
- "engine": request.app.state.config.IMAGE_GENERATION_ENGINE,
- "prompt_generation": request.app.state.config.ENABLE_IMAGE_PROMPT_GENERATION,
- "openai": {
- "OPENAI_API_BASE_URL": request.app.state.config.IMAGES_OPENAI_API_BASE_URL,
- "OPENAI_API_VERSION": request.app.state.config.IMAGES_OPENAI_API_VERSION,
- "OPENAI_API_KEY": request.app.state.config.IMAGES_OPENAI_API_KEY,
- },
- "automatic1111": {
- "AUTOMATIC1111_BASE_URL": request.app.state.config.AUTOMATIC1111_BASE_URL,
- "AUTOMATIC1111_API_AUTH": request.app.state.config.AUTOMATIC1111_API_AUTH,
- "AUTOMATIC1111_CFG_SCALE": request.app.state.config.AUTOMATIC1111_CFG_SCALE,
- "AUTOMATIC1111_SAMPLER": request.app.state.config.AUTOMATIC1111_SAMPLER,
- "AUTOMATIC1111_SCHEDULER": request.app.state.config.AUTOMATIC1111_SCHEDULER,
- },
- "comfyui": {
- "COMFYUI_BASE_URL": request.app.state.config.COMFYUI_BASE_URL,
- "COMFYUI_API_KEY": request.app.state.config.COMFYUI_API_KEY,
- "COMFYUI_WORKFLOW": request.app.state.config.COMFYUI_WORKFLOW,
- "COMFYUI_WORKFLOW_NODES": request.app.state.config.COMFYUI_WORKFLOW_NODES,
- },
- "gemini": {
- "GEMINI_API_BASE_URL": request.app.state.config.IMAGES_GEMINI_API_BASE_URL,
- "GEMINI_API_KEY": request.app.state.config.IMAGES_GEMINI_API_KEY,
- },
- }
-
-
-class OpenAIConfigForm(BaseModel):
- OPENAI_API_BASE_URL: str
- OPENAI_API_VERSION: str
- OPENAI_API_KEY: str
-
-
-class Automatic1111ConfigForm(BaseModel):
- AUTOMATIC1111_BASE_URL: str
- AUTOMATIC1111_API_AUTH: str
- AUTOMATIC1111_CFG_SCALE: Optional[str | float | int]
- AUTOMATIC1111_SAMPLER: Optional[str]
- AUTOMATIC1111_SCHEDULER: Optional[str]
-
-
-class ComfyUIConfigForm(BaseModel):
- COMFYUI_BASE_URL: str
- COMFYUI_API_KEY: str
- COMFYUI_WORKFLOW: str
- COMFYUI_WORKFLOW_NODES: list[dict]
-
-
-class GeminiConfigForm(BaseModel):
- GEMINI_API_BASE_URL: str
- GEMINI_API_KEY: str
-
-
-class ConfigForm(BaseModel):
- enabled: bool
- engine: str
- prompt_generation: bool
- openai: OpenAIConfigForm
- automatic1111: Automatic1111ConfigForm
- comfyui: ComfyUIConfigForm
- gemini: GeminiConfigForm
-
-
-@router.post("/config/update")
-async def update_config(
- request: Request, form_data: ConfigForm, user=Depends(get_admin_user)
-):
- request.app.state.config.IMAGE_GENERATION_ENGINE = form_data.engine
- request.app.state.config.ENABLE_IMAGE_GENERATION = form_data.enabled
-
- request.app.state.config.ENABLE_IMAGE_PROMPT_GENERATION = (
- form_data.prompt_generation
- )
-
- request.app.state.config.IMAGES_OPENAI_API_BASE_URL = (
- form_data.openai.OPENAI_API_BASE_URL
- )
- request.app.state.config.IMAGES_OPENAI_API_VERSION = (
- form_data.openai.OPENAI_API_VERSION
- )
- request.app.state.config.IMAGES_OPENAI_API_KEY = form_data.openai.OPENAI_API_KEY
-
- request.app.state.config.IMAGES_GEMINI_API_BASE_URL = (
- form_data.gemini.GEMINI_API_BASE_URL
- )
- request.app.state.config.IMAGES_GEMINI_API_KEY = form_data.gemini.GEMINI_API_KEY
-
- request.app.state.config.AUTOMATIC1111_BASE_URL = (
- form_data.automatic1111.AUTOMATIC1111_BASE_URL
- )
- request.app.state.config.AUTOMATIC1111_API_AUTH = (
- form_data.automatic1111.AUTOMATIC1111_API_AUTH
- )
-
- request.app.state.config.AUTOMATIC1111_CFG_SCALE = (
- float(form_data.automatic1111.AUTOMATIC1111_CFG_SCALE)
- if form_data.automatic1111.AUTOMATIC1111_CFG_SCALE
- else None
- )
- request.app.state.config.AUTOMATIC1111_SAMPLER = (
- form_data.automatic1111.AUTOMATIC1111_SAMPLER
- if form_data.automatic1111.AUTOMATIC1111_SAMPLER
- else None
- )
- request.app.state.config.AUTOMATIC1111_SCHEDULER = (
- form_data.automatic1111.AUTOMATIC1111_SCHEDULER
- if form_data.automatic1111.AUTOMATIC1111_SCHEDULER
- else None
- )
-
- request.app.state.config.COMFYUI_BASE_URL = (
- form_data.comfyui.COMFYUI_BASE_URL.strip("/")
- )
- request.app.state.config.COMFYUI_API_KEY = form_data.comfyui.COMFYUI_API_KEY
-
- request.app.state.config.COMFYUI_WORKFLOW = form_data.comfyui.COMFYUI_WORKFLOW
- request.app.state.config.COMFYUI_WORKFLOW_NODES = (
- form_data.comfyui.COMFYUI_WORKFLOW_NODES
- )
-
- return {
- "enabled": request.app.state.config.ENABLE_IMAGE_GENERATION,
- "engine": request.app.state.config.IMAGE_GENERATION_ENGINE,
- "prompt_generation": request.app.state.config.ENABLE_IMAGE_PROMPT_GENERATION,
- "openai": {
- "OPENAI_API_BASE_URL": request.app.state.config.IMAGES_OPENAI_API_BASE_URL,
- "OPENAI_API_VERSION": request.app.state.config.IMAGES_OPENAI_API_VERSION,
- "OPENAI_API_KEY": request.app.state.config.IMAGES_OPENAI_API_KEY,
- },
- "automatic1111": {
- "AUTOMATIC1111_BASE_URL": request.app.state.config.AUTOMATIC1111_BASE_URL,
- "AUTOMATIC1111_API_AUTH": request.app.state.config.AUTOMATIC1111_API_AUTH,
- "AUTOMATIC1111_CFG_SCALE": request.app.state.config.AUTOMATIC1111_CFG_SCALE,
- "AUTOMATIC1111_SAMPLER": request.app.state.config.AUTOMATIC1111_SAMPLER,
- "AUTOMATIC1111_SCHEDULER": request.app.state.config.AUTOMATIC1111_SCHEDULER,
- },
- "comfyui": {
- "COMFYUI_BASE_URL": request.app.state.config.COMFYUI_BASE_URL,
- "COMFYUI_API_KEY": request.app.state.config.COMFYUI_API_KEY,
- "COMFYUI_WORKFLOW": request.app.state.config.COMFYUI_WORKFLOW,
- "COMFYUI_WORKFLOW_NODES": request.app.state.config.COMFYUI_WORKFLOW_NODES,
- },
- "gemini": {
- "GEMINI_API_BASE_URL": request.app.state.config.IMAGES_GEMINI_API_BASE_URL,
- "GEMINI_API_KEY": request.app.state.config.IMAGES_GEMINI_API_KEY,
- },
- }
-
-
-def get_automatic1111_api_auth(request: Request):
- if request.app.state.config.AUTOMATIC1111_API_AUTH is None:
- return ""
- else:
- auth1111_byte_string = request.app.state.config.AUTOMATIC1111_API_AUTH.encode(
- "utf-8"
- )
- auth1111_base64_encoded_bytes = base64.b64encode(auth1111_byte_string)
- auth1111_base64_encoded_string = auth1111_base64_encoded_bytes.decode("utf-8")
- return f"Basic {auth1111_base64_encoded_string}"
-
-
-@router.get("/config/url/verify")
-async def verify_url(request: Request, user=Depends(get_admin_user)):
- if request.app.state.config.IMAGE_GENERATION_ENGINE == "automatic1111":
- try:
- r = requests.get(
- url=f"{request.app.state.config.AUTOMATIC1111_BASE_URL}/sdapi/v1/options",
- headers={"authorization": get_automatic1111_api_auth(request)},
- )
- r.raise_for_status()
- return True
- except Exception:
- request.app.state.config.ENABLE_IMAGE_GENERATION = False
- raise HTTPException(status_code=400, detail=ERROR_MESSAGES.INVALID_URL)
- elif request.app.state.config.IMAGE_GENERATION_ENGINE == "comfyui":
-
- headers = None
- if request.app.state.config.COMFYUI_API_KEY:
- headers = {
- "Authorization": f"Bearer {request.app.state.config.COMFYUI_API_KEY}"
- }
-
- try:
- r = requests.get(
- url=f"{request.app.state.config.COMFYUI_BASE_URL}/object_info",
- headers=headers,
- )
- r.raise_for_status()
- return True
- except Exception:
- request.app.state.config.ENABLE_IMAGE_GENERATION = False
- raise HTTPException(status_code=400, detail=ERROR_MESSAGES.INVALID_URL)
- else:
- return True
-
-
def set_image_model(request: Request, model: str):
log.info(f"Setting image model to {model}")
request.app.state.config.IMAGE_GENERATION_MODEL = model
@@ -295,28 +94,101 @@ def get_image_model(request):
raise HTTPException(status_code=400, detail=ERROR_MESSAGES.DEFAULT(e))
-class ImageConfigForm(BaseModel):
- MODEL: str
- IMAGE_SIZE: str
- IMAGE_STEPS: int
+class ImagesConfig(BaseModel):
+ ENABLE_IMAGE_GENERATION: bool
+ ENABLE_IMAGE_PROMPT_GENERATION: bool
+
+ IMAGE_GENERATION_ENGINE: str
+ IMAGE_GENERATION_MODEL: str
+ IMAGE_SIZE: Optional[str]
+ IMAGE_STEPS: Optional[int]
+
+ IMAGES_OPENAI_API_BASE_URL: str
+ IMAGES_OPENAI_API_KEY: str
+ IMAGES_OPENAI_API_VERSION: str
+
+ AUTOMATIC1111_BASE_URL: str
+ AUTOMATIC1111_API_AUTH: str
+ AUTOMATIC1111_PARAMS: Optional[dict | str]
+
+ COMFYUI_BASE_URL: str
+ COMFYUI_API_KEY: str
+ COMFYUI_WORKFLOW: str
+ COMFYUI_WORKFLOW_NODES: list[dict]
+
+ IMAGES_GEMINI_API_BASE_URL: str
+ IMAGES_GEMINI_API_KEY: str
+ IMAGES_GEMINI_ENDPOINT_METHOD: str
+
+ IMAGE_EDIT_ENGINE: str
+ IMAGE_EDIT_MODEL: str
+ IMAGE_EDIT_SIZE: Optional[str]
+
+ IMAGES_EDIT_OPENAI_API_BASE_URL: str
+ IMAGES_EDIT_OPENAI_API_KEY: str
+ IMAGES_EDIT_OPENAI_API_VERSION: str
+ IMAGES_EDIT_GEMINI_API_BASE_URL: str
+ IMAGES_EDIT_GEMINI_API_KEY: str
+ IMAGES_EDIT_COMFYUI_BASE_URL: str
+ IMAGES_EDIT_COMFYUI_API_KEY: str
+ IMAGES_EDIT_COMFYUI_WORKFLOW: str
+ IMAGES_EDIT_COMFYUI_WORKFLOW_NODES: list[dict]
-@router.get("/image/config")
-async def get_image_config(request: Request, user=Depends(get_admin_user)):
+@router.get("/config", response_model=ImagesConfig)
+async def get_config(request: Request, user=Depends(get_admin_user)):
return {
- "MODEL": request.app.state.config.IMAGE_GENERATION_MODEL,
+ "ENABLE_IMAGE_GENERATION": request.app.state.config.ENABLE_IMAGE_GENERATION,
+ "ENABLE_IMAGE_PROMPT_GENERATION": request.app.state.config.ENABLE_IMAGE_PROMPT_GENERATION,
+ "IMAGE_GENERATION_ENGINE": request.app.state.config.IMAGE_GENERATION_ENGINE,
+ "IMAGE_GENERATION_MODEL": request.app.state.config.IMAGE_GENERATION_MODEL,
"IMAGE_SIZE": request.app.state.config.IMAGE_SIZE,
"IMAGE_STEPS": request.app.state.config.IMAGE_STEPS,
+ "IMAGES_OPENAI_API_BASE_URL": request.app.state.config.IMAGES_OPENAI_API_BASE_URL,
+ "IMAGES_OPENAI_API_KEY": request.app.state.config.IMAGES_OPENAI_API_KEY,
+ "IMAGES_OPENAI_API_VERSION": request.app.state.config.IMAGES_OPENAI_API_VERSION,
+ "AUTOMATIC1111_BASE_URL": request.app.state.config.AUTOMATIC1111_BASE_URL,
+ "AUTOMATIC1111_API_AUTH": request.app.state.config.AUTOMATIC1111_API_AUTH,
+ "AUTOMATIC1111_PARAMS": request.app.state.config.AUTOMATIC1111_PARAMS,
+ "COMFYUI_BASE_URL": request.app.state.config.COMFYUI_BASE_URL,
+ "COMFYUI_API_KEY": request.app.state.config.COMFYUI_API_KEY,
+ "COMFYUI_WORKFLOW": request.app.state.config.COMFYUI_WORKFLOW,
+ "COMFYUI_WORKFLOW_NODES": request.app.state.config.COMFYUI_WORKFLOW_NODES,
+ "IMAGES_GEMINI_API_BASE_URL": request.app.state.config.IMAGES_GEMINI_API_BASE_URL,
+ "IMAGES_GEMINI_API_KEY": request.app.state.config.IMAGES_GEMINI_API_KEY,
+ "IMAGES_GEMINI_ENDPOINT_METHOD": request.app.state.config.IMAGES_GEMINI_ENDPOINT_METHOD,
+ "IMAGE_EDIT_ENGINE": request.app.state.config.IMAGE_EDIT_ENGINE,
+ "IMAGE_EDIT_MODEL": request.app.state.config.IMAGE_EDIT_MODEL,
+ "IMAGE_EDIT_SIZE": request.app.state.config.IMAGE_EDIT_SIZE,
+ "IMAGES_EDIT_OPENAI_API_BASE_URL": request.app.state.config.IMAGES_EDIT_OPENAI_API_BASE_URL,
+ "IMAGES_EDIT_OPENAI_API_KEY": request.app.state.config.IMAGES_EDIT_OPENAI_API_KEY,
+ "IMAGES_EDIT_OPENAI_API_VERSION": request.app.state.config.IMAGES_EDIT_OPENAI_API_VERSION,
+ "IMAGES_EDIT_GEMINI_API_BASE_URL": request.app.state.config.IMAGES_EDIT_GEMINI_API_BASE_URL,
+ "IMAGES_EDIT_GEMINI_API_KEY": request.app.state.config.IMAGES_EDIT_GEMINI_API_KEY,
+ "IMAGES_EDIT_COMFYUI_BASE_URL": request.app.state.config.IMAGES_EDIT_COMFYUI_BASE_URL,
+ "IMAGES_EDIT_COMFYUI_API_KEY": request.app.state.config.IMAGES_EDIT_COMFYUI_API_KEY,
+ "IMAGES_EDIT_COMFYUI_WORKFLOW": request.app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW,
+ "IMAGES_EDIT_COMFYUI_WORKFLOW_NODES": request.app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW_NODES,
}
-@router.post("/image/config/update")
-async def update_image_config(
- request: Request, form_data: ImageConfigForm, user=Depends(get_admin_user)
+@router.post("/config/update")
+async def update_config(
+ request: Request, form_data: ImagesConfig, user=Depends(get_admin_user)
):
- set_image_model(request, form_data.MODEL)
+ request.app.state.config.ENABLE_IMAGE_GENERATION = form_data.ENABLE_IMAGE_GENERATION
+
+ # Create Image
+ request.app.state.config.ENABLE_IMAGE_PROMPT_GENERATION = (
+ form_data.ENABLE_IMAGE_PROMPT_GENERATION
+ )
- if form_data.IMAGE_SIZE == "auto" and form_data.MODEL != "gpt-image-1":
+ request.app.state.config.IMAGE_GENERATION_ENGINE = form_data.IMAGE_GENERATION_ENGINE
+ set_image_model(request, form_data.IMAGE_GENERATION_MODEL)
+ if (
+ form_data.IMAGE_SIZE == "auto"
+ and form_data.IMAGE_GENERATION_MODEL != "gpt-image-1"
+ ):
raise HTTPException(
status_code=400,
detail=ERROR_MESSAGES.INCORRECT_FORMAT(
@@ -325,7 +197,11 @@ async def update_image_config(
)
pattern = r"^\d+x\d+$"
- if form_data.IMAGE_SIZE == "auto" or re.match(pattern, form_data.IMAGE_SIZE):
+ if (
+ form_data.IMAGE_SIZE == "auto"
+ or form_data.IMAGE_SIZE == ""
+ or re.match(pattern, form_data.IMAGE_SIZE)
+ ):
request.app.state.config.IMAGE_SIZE = form_data.IMAGE_SIZE
else:
raise HTTPException(
@@ -341,13 +217,146 @@ async def update_image_config(
detail=ERROR_MESSAGES.INCORRECT_FORMAT(" (e.g., 50)."),
)
+ request.app.state.config.IMAGES_OPENAI_API_BASE_URL = (
+ form_data.IMAGES_OPENAI_API_BASE_URL
+ )
+ request.app.state.config.IMAGES_OPENAI_API_KEY = form_data.IMAGES_OPENAI_API_KEY
+ request.app.state.config.IMAGES_OPENAI_API_VERSION = (
+ form_data.IMAGES_OPENAI_API_VERSION
+ )
+
+ request.app.state.config.AUTOMATIC1111_BASE_URL = form_data.AUTOMATIC1111_BASE_URL
+ request.app.state.config.AUTOMATIC1111_API_AUTH = form_data.AUTOMATIC1111_API_AUTH
+ request.app.state.config.AUTOMATIC1111_PARAMS = form_data.AUTOMATIC1111_PARAMS
+
+ request.app.state.config.COMFYUI_BASE_URL = form_data.COMFYUI_BASE_URL.strip("/")
+ request.app.state.config.COMFYUI_API_KEY = form_data.COMFYUI_API_KEY
+ request.app.state.config.COMFYUI_WORKFLOW = form_data.COMFYUI_WORKFLOW
+ request.app.state.config.COMFYUI_WORKFLOW_NODES = form_data.COMFYUI_WORKFLOW_NODES
+
+ request.app.state.config.IMAGES_GEMINI_API_BASE_URL = (
+ form_data.IMAGES_GEMINI_API_BASE_URL
+ )
+ request.app.state.config.IMAGES_GEMINI_API_KEY = form_data.IMAGES_GEMINI_API_KEY
+ request.app.state.config.IMAGES_GEMINI_ENDPOINT_METHOD = (
+ form_data.IMAGES_GEMINI_ENDPOINT_METHOD
+ )
+
+ # Edit Image
+ request.app.state.config.IMAGE_EDIT_ENGINE = form_data.IMAGE_EDIT_ENGINE
+ request.app.state.config.IMAGE_EDIT_MODEL = form_data.IMAGE_EDIT_MODEL
+ request.app.state.config.IMAGE_EDIT_SIZE = form_data.IMAGE_EDIT_SIZE
+
+ request.app.state.config.IMAGES_EDIT_OPENAI_API_BASE_URL = (
+ form_data.IMAGES_EDIT_OPENAI_API_BASE_URL
+ )
+ request.app.state.config.IMAGES_EDIT_OPENAI_API_KEY = (
+ form_data.IMAGES_EDIT_OPENAI_API_KEY
+ )
+ request.app.state.config.IMAGES_EDIT_OPENAI_API_VERSION = (
+ form_data.IMAGES_EDIT_OPENAI_API_VERSION
+ )
+
+ request.app.state.config.IMAGES_EDIT_GEMINI_API_BASE_URL = (
+ form_data.IMAGES_EDIT_GEMINI_API_BASE_URL
+ )
+ request.app.state.config.IMAGES_EDIT_GEMINI_API_KEY = (
+ form_data.IMAGES_EDIT_GEMINI_API_KEY
+ )
+
+ request.app.state.config.IMAGES_EDIT_COMFYUI_BASE_URL = (
+ form_data.IMAGES_EDIT_COMFYUI_BASE_URL.strip("/")
+ )
+ request.app.state.config.IMAGES_EDIT_COMFYUI_API_KEY = (
+ form_data.IMAGES_EDIT_COMFYUI_API_KEY
+ )
+ request.app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW = (
+ form_data.IMAGES_EDIT_COMFYUI_WORKFLOW
+ )
+ request.app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW_NODES = (
+ form_data.IMAGES_EDIT_COMFYUI_WORKFLOW_NODES
+ )
+
return {
- "MODEL": request.app.state.config.IMAGE_GENERATION_MODEL,
+ "ENABLE_IMAGE_GENERATION": request.app.state.config.ENABLE_IMAGE_GENERATION,
+ "ENABLE_IMAGE_PROMPT_GENERATION": request.app.state.config.ENABLE_IMAGE_PROMPT_GENERATION,
+ "IMAGE_GENERATION_ENGINE": request.app.state.config.IMAGE_GENERATION_ENGINE,
+ "IMAGE_GENERATION_MODEL": request.app.state.config.IMAGE_GENERATION_MODEL,
"IMAGE_SIZE": request.app.state.config.IMAGE_SIZE,
"IMAGE_STEPS": request.app.state.config.IMAGE_STEPS,
+ "IMAGES_OPENAI_API_BASE_URL": request.app.state.config.IMAGES_OPENAI_API_BASE_URL,
+ "IMAGES_OPENAI_API_KEY": request.app.state.config.IMAGES_OPENAI_API_KEY,
+ "IMAGES_OPENAI_API_VERSION": request.app.state.config.IMAGES_OPENAI_API_VERSION,
+ "AUTOMATIC1111_BASE_URL": request.app.state.config.AUTOMATIC1111_BASE_URL,
+ "AUTOMATIC1111_API_AUTH": request.app.state.config.AUTOMATIC1111_API_AUTH,
+ "AUTOMATIC1111_PARAMS": request.app.state.config.AUTOMATIC1111_PARAMS,
+ "COMFYUI_BASE_URL": request.app.state.config.COMFYUI_BASE_URL,
+ "COMFYUI_API_KEY": request.app.state.config.COMFYUI_API_KEY,
+ "COMFYUI_WORKFLOW": request.app.state.config.COMFYUI_WORKFLOW,
+ "COMFYUI_WORKFLOW_NODES": request.app.state.config.COMFYUI_WORKFLOW_NODES,
+ "IMAGES_GEMINI_API_BASE_URL": request.app.state.config.IMAGES_GEMINI_API_BASE_URL,
+ "IMAGES_GEMINI_API_KEY": request.app.state.config.IMAGES_GEMINI_API_KEY,
+ "IMAGES_GEMINI_ENDPOINT_METHOD": request.app.state.config.IMAGES_GEMINI_ENDPOINT_METHOD,
+ "IMAGE_EDIT_ENGINE": request.app.state.config.IMAGE_EDIT_ENGINE,
+ "IMAGE_EDIT_MODEL": request.app.state.config.IMAGE_EDIT_MODEL,
+ "IMAGE_EDIT_SIZE": request.app.state.config.IMAGE_EDIT_SIZE,
+ "IMAGES_EDIT_OPENAI_API_BASE_URL": request.app.state.config.IMAGES_EDIT_OPENAI_API_BASE_URL,
+ "IMAGES_EDIT_OPENAI_API_KEY": request.app.state.config.IMAGES_EDIT_OPENAI_API_KEY,
+ "IMAGES_EDIT_OPENAI_API_VERSION": request.app.state.config.IMAGES_EDIT_OPENAI_API_VERSION,
+ "IMAGES_EDIT_GEMINI_API_BASE_URL": request.app.state.config.IMAGES_EDIT_GEMINI_API_BASE_URL,
+ "IMAGES_EDIT_GEMINI_API_KEY": request.app.state.config.IMAGES_EDIT_GEMINI_API_KEY,
+ "IMAGES_EDIT_COMFYUI_BASE_URL": request.app.state.config.IMAGES_EDIT_COMFYUI_BASE_URL,
+ "IMAGES_EDIT_COMFYUI_API_KEY": request.app.state.config.IMAGES_EDIT_COMFYUI_API_KEY,
+ "IMAGES_EDIT_COMFYUI_WORKFLOW": request.app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW,
+ "IMAGES_EDIT_COMFYUI_WORKFLOW_NODES": request.app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW_NODES,
}
+def get_automatic1111_api_auth(request: Request):
+ if request.app.state.config.AUTOMATIC1111_API_AUTH is None:
+ return ""
+ else:
+ auth1111_byte_string = request.app.state.config.AUTOMATIC1111_API_AUTH.encode(
+ "utf-8"
+ )
+ auth1111_base64_encoded_bytes = base64.b64encode(auth1111_byte_string)
+ auth1111_base64_encoded_string = auth1111_base64_encoded_bytes.decode("utf-8")
+ return f"Basic {auth1111_base64_encoded_string}"
+
+
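Reviewer note: the re-added `get_automatic1111_api_auth` builds a standard HTTP Basic credential per RFC 7617. The encoding step can be sketched standalone, with a hypothetical helper name:

```python
import base64


def basic_auth_header(credentials: str) -> str:
    # "user:pass" -> "Basic dXNlcjpwYXNz", per RFC 7617: the raw
    # "user:pass" string is UTF-8 encoded, then base64 encoded.
    if not credentials:
        return ""
    encoded = base64.b64encode(credentials.encode("utf-8")).decode("utf-8")
    return f"Basic {encoded}"


print(basic_auth_header("user:pass"))  # Basic dXNlcjpwYXNz
```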
+@router.get("/config/url/verify")
+async def verify_url(request: Request, user=Depends(get_admin_user)):
+ if request.app.state.config.IMAGE_GENERATION_ENGINE == "automatic1111":
+ try:
+ r = requests.get(
+ url=f"{request.app.state.config.AUTOMATIC1111_BASE_URL}/sdapi/v1/options",
+ headers={"authorization": get_automatic1111_api_auth(request)},
+ )
+ r.raise_for_status()
+ return True
+ except Exception:
+ request.app.state.config.ENABLE_IMAGE_GENERATION = False
+ raise HTTPException(status_code=400, detail=ERROR_MESSAGES.INVALID_URL)
+ elif request.app.state.config.IMAGE_GENERATION_ENGINE == "comfyui":
+ headers = None
+ if request.app.state.config.COMFYUI_API_KEY:
+ headers = {
+ "Authorization": f"Bearer {request.app.state.config.COMFYUI_API_KEY}"
+ }
+ try:
+ r = requests.get(
+ url=f"{request.app.state.config.COMFYUI_BASE_URL}/object_info",
+ headers=headers,
+ )
+ r.raise_for_status()
+ return True
+ except Exception:
+ request.app.state.config.ENABLE_IMAGE_GENERATION = False
+ raise HTTPException(status_code=400, detail=ERROR_MESSAGES.INVALID_URL)
+ else:
+ return True
+
+
@router.get("/models")
def get_models(request: Request, user=Depends(get_verified_user)):
try:
@@ -430,7 +439,7 @@ def get_models(request: Request, user=Depends(get_verified_user)):
raise HTTPException(status_code=400, detail=ERROR_MESSAGES.DEFAULT(e))
-class GenerateImageForm(BaseModel):
+class CreateImageForm(BaseModel):
model: Optional[str] = None
prompt: str
size: Optional[str] = None
@@ -438,39 +447,36 @@ class GenerateImageForm(BaseModel):
negative_prompt: Optional[str] = None
-def load_b64_image_data(b64_str):
- try:
- if "," in b64_str:
- header, encoded = b64_str.split(",", 1)
- mime_type = header.split(";")[0].lstrip("data:")
- img_data = base64.b64decode(encoded)
- else:
- mime_type = "image/png"
- img_data = base64.b64decode(b64_str)
- return img_data, mime_type
- except Exception as e:
- log.exception(f"Error loading image data: {e}")
- return None, None
+GenerateImageForm = CreateImageForm # Alias for backward compatibility
-def load_url_image_data(url, headers=None):
+def get_image_data(data: str, headers=None):
try:
- if headers:
- r = requests.get(url, headers=headers)
- else:
- r = requests.get(url)
+ if data.startswith("http://") or data.startswith("https://"):
+ if headers:
+ r = requests.get(data, headers=headers)
+ else:
+ r = requests.get(data)
- r.raise_for_status()
- if r.headers["content-type"].split("/")[0] == "image":
- mime_type = r.headers["content-type"]
- return r.content, mime_type
+ r.raise_for_status()
+ if r.headers["content-type"].split("/")[0] == "image":
+ mime_type = r.headers["content-type"]
+ return r.content, mime_type
+ else:
+ log.error("URL does not point to an image.")
+ return None
else:
- log.error("Url does not point to an image.")
- return None
-
+ if "," in data:
+ header, encoded = data.split(",", 1)
+ mime_type = header.split(";")[0].removeprefix("data:")
+ img_data = base64.b64decode(encoded)
+ else:
+ mime_type = "image/png"
+ img_data = base64.b64decode(data)
+ return img_data, mime_type
except Exception as e:
- log.exception(f"Error saving image: {e}")
- return None
+ log.exception(f"Error loading image data: {e}")
+ return None, None
def upload_image(request, image_data, content_type, metadata, user):
@@ -496,7 +502,7 @@ def upload_image(request, image_data, content_type, metadata, user):
@router.post("/generations")
async def image_generations(
request: Request,
- form_data: GenerateImageForm,
+ form_data: CreateImageForm,
user=Depends(get_verified_user),
):
# if IMAGE_SIZE = 'auto', default WidthxHeight to the 512x512 default
@@ -519,17 +525,14 @@ async def image_generations(
r = None
try:
if request.app.state.config.IMAGE_GENERATION_ENGINE == "openai":
- headers = {}
- headers["Authorization"] = (
- f"Bearer {request.app.state.config.IMAGES_OPENAI_API_KEY}"
- )
- headers["Content-Type"] = "application/json"
+
+ headers = {
+ "Authorization": f"Bearer {request.app.state.config.IMAGES_OPENAI_API_KEY}",
+ "Content-Type": "application/json",
+ }
if ENABLE_FORWARD_USER_INFO_HEADERS:
- headers["X-OpenWebUI-User-Name"] = quote(user.name, safe=" ")
- headers["X-OpenWebUI-User-Id"] = user.id
- headers["X-OpenWebUI-User-Email"] = user.email
- headers["X-OpenWebUI-User-Role"] = user.role
+ headers = include_user_info_headers(headers, user)
data = {
"model": model,
@@ -568,31 +571,46 @@ async def image_generations(
for image in res["data"]:
if image_url := image.get("url", None):
- image_data, content_type = load_url_image_data(image_url, headers)
+ image_data, content_type = get_image_data(image_url, headers)
else:
- image_data, content_type = load_b64_image_data(image["b64_json"])
+ image_data, content_type = get_image_data(image["b64_json"])
url = upload_image(request, image_data, content_type, data, user)
images.append({"url": url})
return images
elif request.app.state.config.IMAGE_GENERATION_ENGINE == "gemini":
- headers = {}
- headers["Content-Type"] = "application/json"
- headers["x-goog-api-key"] = request.app.state.config.IMAGES_GEMINI_API_KEY
-
- data = {
- "instances": {"prompt": form_data.prompt},
- "parameters": {
- "sampleCount": form_data.n,
- "outputOptions": {"mimeType": "image/png"},
- },
+ headers = {
+ "Content-Type": "application/json",
+ "x-goog-api-key": request.app.state.config.IMAGES_GEMINI_API_KEY,
}
+ data = {}
+
+ if (
+ request.app.state.config.IMAGES_GEMINI_ENDPOINT_METHOD == ""
+ or request.app.state.config.IMAGES_GEMINI_ENDPOINT_METHOD == "predict"
+ ):
+ model = f"{model}:predict"
+ data = {
+ "instances": {"prompt": form_data.prompt},
+ "parameters": {
+ "sampleCount": form_data.n,
+ "outputOptions": {"mimeType": "image/png"},
+ },
+ }
+
+ elif (
+ request.app.state.config.IMAGES_GEMINI_ENDPOINT_METHOD
+ == "generateContent"
+ ):
+ model = f"{model}:generateContent"
+ data = {"contents": [{"parts": [{"text": form_data.prompt}]}]}
+
# Use asyncio.to_thread for the requests.post call
r = await asyncio.to_thread(
requests.post,
- url=f"{request.app.state.config.IMAGES_GEMINI_API_BASE_URL}/models/{model}:predict",
+ url=f"{request.app.state.config.IMAGES_GEMINI_API_BASE_URL}/models/{model}",
json=data,
headers=headers,
)
@@ -601,12 +619,25 @@ async def image_generations(
res = r.json()
images = []
- for image in res["predictions"]:
- image_data, content_type = load_b64_image_data(
- image["bytesBase64Encoded"]
- )
- url = upload_image(request, image_data, content_type, data, user)
- images.append({"url": url})
+
+ if model.endswith(":predict"):
+ for image in res["predictions"]:
+ image_data, content_type = get_image_data(
+ image["bytesBase64Encoded"]
+ )
+ url = upload_image(request, image_data, content_type, data, user)
+ images.append({"url": url})
+ elif model.endswith(":generateContent"):
+ for image in res["candidates"]:
+ for part in image["content"]["parts"]:
+ if part.get("inlineData", {}).get("data"):
+ image_data, content_type = get_image_data(
+ part["inlineData"]["data"]
+ )
+ url = upload_image(
+ request, image_data, content_type, data, user
+ )
+ images.append({"url": url})
return images
@@ -624,7 +655,7 @@ async def image_generations(
if form_data.negative_prompt is not None:
data["negative_prompt"] = form_data.negative_prompt
- form_data = ComfyUIGenerateImageForm(
+ form_data = ComfyUICreateImageForm(
**{
"workflow": ComfyUIWorkflow(
**{
@@ -635,7 +666,7 @@ async def image_generations(
**data,
}
)
- res = await comfyui_generate_image(
+ res = await comfyui_create_image(
model,
form_data,
user.id,
@@ -653,7 +684,7 @@ async def image_generations(
"Authorization": f"Bearer {request.app.state.config.COMFYUI_API_KEY}"
}
- image_data, content_type = load_url_image_data(image["url"], headers)
+ image_data, content_type = get_image_data(image["url"], headers)
url = upload_image(
request,
image_data,
@@ -683,14 +714,8 @@ async def image_generations(
if form_data.negative_prompt is not None:
data["negative_prompt"] = form_data.negative_prompt
- if request.app.state.config.AUTOMATIC1111_CFG_SCALE:
- data["cfg_scale"] = request.app.state.config.AUTOMATIC1111_CFG_SCALE
-
- if request.app.state.config.AUTOMATIC1111_SAMPLER:
- data["sampler_name"] = request.app.state.config.AUTOMATIC1111_SAMPLER
-
- if request.app.state.config.AUTOMATIC1111_SCHEDULER:
- data["scheduler"] = request.app.state.config.AUTOMATIC1111_SCHEDULER
+ if request.app.state.config.AUTOMATIC1111_PARAMS:
+ data = {**data, **request.app.state.config.AUTOMATIC1111_PARAMS}
# Use asyncio.to_thread for the requests.post call
r = await asyncio.to_thread(
@@ -706,7 +731,7 @@ async def image_generations(
images = []
for image in res["images"]:
- image_data, content_type = load_b64_image_data(image)
+ image_data, content_type = get_image_data(image)
url = upload_image(
request,
image_data,
@@ -723,3 +748,292 @@ async def image_generations(
if "error" in data:
error = data["error"]["message"]
raise HTTPException(status_code=400, detail=ERROR_MESSAGES.DEFAULT(error))
+
+
+class EditImageForm(BaseModel):
+ image: str | list[str] # base64-encoded image(s) or URL(s)
+ prompt: str
+ model: Optional[str] = None
+ size: Optional[str] = None
+ n: Optional[int] = None
+ negative_prompt: Optional[str] = None
+
+
+@router.post("/edit")
+async def image_edits(
+ request: Request,
+ form_data: EditImageForm,
+ user=Depends(get_verified_user),
+):
+ size = None
+ width, height = None, None
+ if (
+ request.app.state.config.IMAGE_EDIT_SIZE
+ and "x" in request.app.state.config.IMAGE_EDIT_SIZE
+ ) or (form_data.size and "x" in form_data.size):
+ size = (
+ form_data.size
+ if form_data.size
+ else request.app.state.config.IMAGE_EDIT_SIZE
+ )
+ width, height = tuple(map(int, size.split("x")))
+
+ model = (
+ request.app.state.config.IMAGE_EDIT_MODEL
+ if form_data.model is None
+ else form_data.model
+ )
+
+ try:
+
+ async def load_url_image(data):
+ if data.startswith("http://") or data.startswith("https://"):
+ r = await asyncio.to_thread(requests.get, data)
+ r.raise_for_status()
+
+ image_data = base64.b64encode(r.content).decode("utf-8")
+ return f"data:{r.headers['content-type']};base64,{image_data}"
+
+ elif data.startswith("/api/v1/files"):
+ file_id = data.split("/api/v1/files/")[1].split("/content")[0]
+ file_response = await get_file_content_by_id(file_id, user)
+
+ if isinstance(file_response, FileResponse):
+ file_path = file_response.path
+
+ with open(file_path, "rb") as f:
+ file_bytes = f.read()
+ image_data = base64.b64encode(file_bytes).decode("utf-8")
+ mime_type, _ = mimetypes.guess_type(file_path)
+
+                    return f"data:{mime_type or 'image/png'};base64,{image_data}"
+
+ return data
+
+ # Load image(s) from URL(s) if necessary
+ if isinstance(form_data.image, str):
+ form_data.image = await load_url_image(form_data.image)
+ elif isinstance(form_data.image, list):
+ form_data.image = [await load_url_image(img) for img in form_data.image]
+ except Exception as e:
+ raise HTTPException(status_code=400, detail=ERROR_MESSAGES.DEFAULT(e))
+
+ def get_image_file_item(base64_string):
+ data = base64_string
+ header, encoded = data.split(",", 1)
+        mime_type = header.split(";")[0].removeprefix("data:")
+ image_data = base64.b64decode(encoded)
+ return (
+ "image",
+ (
+ f"{uuid.uuid4()}.png",
+ io.BytesIO(image_data),
+ mime_type if mime_type else "image/png",
+ ),
+ )
+
+ r = None
+ try:
+ if request.app.state.config.IMAGE_EDIT_ENGINE == "openai":
+ headers = {
+ "Authorization": f"Bearer {request.app.state.config.IMAGES_EDIT_OPENAI_API_KEY}",
+ }
+
+ if ENABLE_FORWARD_USER_INFO_HEADERS:
+ headers = include_user_info_headers(headers, user)
+
+ data = {
+ "model": model,
+ "prompt": form_data.prompt,
+ **({"n": form_data.n} if form_data.n else {}),
+ **({"size": size} if size else {}),
+ **(
+ {}
+ if "gpt-image-1" in request.app.state.config.IMAGE_EDIT_MODEL
+ else {"response_format": "b64_json"}
+ ),
+ }
+
+ files = []
+ if isinstance(form_data.image, str):
+ files = [get_image_file_item(form_data.image)]
+ elif isinstance(form_data.image, list):
+ for img in form_data.image:
+ files.append(get_image_file_item(img))
+
+ url_search_params = ""
+ if request.app.state.config.IMAGES_EDIT_OPENAI_API_VERSION:
+ url_search_params += f"?api-version={request.app.state.config.IMAGES_EDIT_OPENAI_API_VERSION}"
+
+ # Use asyncio.to_thread for the requests.post call
+ r = await asyncio.to_thread(
+ requests.post,
+ url=f"{request.app.state.config.IMAGES_EDIT_OPENAI_API_BASE_URL}/images/edits{url_search_params}",
+ headers=headers,
+ files=files,
+ data=data,
+ )
+
+ r.raise_for_status()
+ res = r.json()
+
+ images = []
+ for image in res["data"]:
+ if image_url := image.get("url", None):
+ image_data, content_type = get_image_data(image_url, headers)
+ else:
+ image_data, content_type = get_image_data(image["b64_json"])
+
+ url = upload_image(request, image_data, content_type, data, user)
+ images.append({"url": url})
+ return images
+
+ elif request.app.state.config.IMAGE_EDIT_ENGINE == "gemini":
+ headers = {
+ "Content-Type": "application/json",
+ "x-goog-api-key": request.app.state.config.IMAGES_EDIT_GEMINI_API_KEY,
+ }
+
+ model = f"{model}:generateContent"
+ data = {"contents": [{"parts": [{"text": form_data.prompt}]}]}
+
+ if isinstance(form_data.image, str):
+ data["contents"][0]["parts"].append(
+ {
+ "inline_data": {
+ "mime_type": "image/png",
+ "data": form_data.image.split(",", 1)[1],
+ }
+ }
+ )
+ elif isinstance(form_data.image, list):
+ data["contents"][0]["parts"].extend(
+ [
+ {
+ "inline_data": {
+ "mime_type": "image/png",
+ "data": image.split(",", 1)[1],
+ }
+ }
+ for image in form_data.image
+ ]
+ )
+
+ # Use asyncio.to_thread for the requests.post call
+ r = await asyncio.to_thread(
+ requests.post,
+ url=f"{request.app.state.config.IMAGES_EDIT_GEMINI_API_BASE_URL}/models/{model}",
+ json=data,
+ headers=headers,
+ )
+
+ r.raise_for_status()
+ res = r.json()
+
+ images = []
+ for image in res["candidates"]:
+ for part in image["content"]["parts"]:
+ if part.get("inlineData", {}).get("data"):
+ image_data, content_type = get_image_data(
+ part["inlineData"]["data"]
+ )
+ url = upload_image(
+ request, image_data, content_type, data, user
+ )
+ images.append({"url": url})
+
+ return images
+
+ elif request.app.state.config.IMAGE_EDIT_ENGINE == "comfyui":
+ try:
+ files = []
+ if isinstance(form_data.image, str):
+ files = [get_image_file_item(form_data.image)]
+ elif isinstance(form_data.image, list):
+ for img in form_data.image:
+ files.append(get_image_file_item(img))
+
+ # Upload images to ComfyUI and get their names
+ comfyui_images = []
+ for file_item in files:
+ res = await comfyui_upload_image(
+ file_item,
+ request.app.state.config.IMAGES_EDIT_COMFYUI_BASE_URL,
+ request.app.state.config.IMAGES_EDIT_COMFYUI_API_KEY,
+ )
+ comfyui_images.append(res.get("name", file_item[1][0]))
+ except Exception as e:
+ log.debug(f"Error uploading images to ComfyUI: {e}")
+ raise Exception("Failed to upload images to ComfyUI.")
+
+ data = {
+ "image": comfyui_images,
+ "prompt": form_data.prompt,
+ **({"width": width} if width is not None else {}),
+ **({"height": height} if height is not None else {}),
+ **({"n": form_data.n} if form_data.n else {}),
+ }
+
+ form_data = ComfyUIEditImageForm(
+ **{
+ "workflow": ComfyUIWorkflow(
+ **{
+ "workflow": request.app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW,
+ "nodes": request.app.state.config.IMAGES_EDIT_COMFYUI_WORKFLOW_NODES,
+ }
+ ),
+ **data,
+ }
+ )
+ res = await comfyui_edit_image(
+ model,
+ form_data,
+ user.id,
+ request.app.state.config.IMAGES_EDIT_COMFYUI_BASE_URL,
+ request.app.state.config.IMAGES_EDIT_COMFYUI_API_KEY,
+ )
+ log.debug(f"res: {res}")
+
+ image_urls = set()
+ for image in res["data"]:
+ image_urls.add(image["url"])
+ image_urls = list(image_urls)
+
+ # Prioritize output type URLs if available
+ output_type_urls = [url for url in image_urls if "type=output" in url]
+ if output_type_urls:
+ image_urls = output_type_urls
+
+ log.debug(f"Image URLs: {image_urls}")
+ images = []
+
+ for image_url in image_urls:
+ headers = None
+ if request.app.state.config.IMAGES_EDIT_COMFYUI_API_KEY:
+ headers = {
+ "Authorization": f"Bearer {request.app.state.config.IMAGES_EDIT_COMFYUI_API_KEY}"
+ }
+
+ image_data, content_type = get_image_data(image_url, headers)
+ url = upload_image(
+ request,
+ image_data,
+ content_type,
+ form_data.model_dump(exclude_none=True),
+ user,
+ )
+ images.append({"url": url})
+
+ return images
+ except Exception as e:
+ error = e
+        if r is not None:
+ data = r.text
+ try:
+ data = json.loads(data)
+ if "error" in data:
+ error = data["error"]["message"]
+ except Exception:
+ error = data
+
+ raise HTTPException(status_code=400, detail=ERROR_MESSAGES.DEFAULT(error))
diff --git a/backend/open_webui/routers/models.py b/backend/open_webui/routers/models.py
index 215cd8426c..d69cd4ee42 100644
--- a/backend/open_webui/routers/models.py
+++ b/backend/open_webui/routers/models.py
@@ -44,7 +44,9 @@ def validate_model_id(model_id: str) -> bool:
###########################
-@router.get("/", response_model=list[ModelUserResponse])
+@router.get(
+ "/list", response_model=list[ModelUserResponse]
+) # do NOT use "/" as path, conflicts with main.py
async def get_models(id: Optional[str] = None, user=Depends(get_verified_user)):
if user.role == "admin" and BYPASS_ADMIN_ACCESS_CONTROL:
return Models.get_models()
diff --git a/backend/open_webui/routers/openai.py b/backend/open_webui/routers/openai.py
index ed6249e526..c5294d2c85 100644
--- a/backend/open_webui/routers/openai.py
+++ b/backend/open_webui/routers/openai.py
@@ -502,49 +502,55 @@ def extract_data(response):
return response
return None
- def merge_models_lists(model_lists):
- log.debug(f"merge_models_lists {model_lists}")
- merged_list = []
+ def is_supported_openai_models(model_id):
+ if any(
+ name in model_id
+ for name in [
+ "babbage",
+ "dall-e",
+ "davinci",
+ "embedding",
+ "tts",
+ "whisper",
+ ]
+ ):
+ return False
+ return True
- for idx, models in enumerate(model_lists):
- if models is not None and "error" not in models:
- merged_list.extend(
- [
- {
+ def get_merged_models(model_lists):
+ log.debug(f"merge_models_lists {model_lists}")
+ models = {}
+
+ for idx, model_list in enumerate(model_lists):
+ if model_list is not None and "error" not in model_list:
+ for model in model_list:
+ model_id = model.get("id") or model.get("name")
+
+ if (
+ "api.openai.com"
+ in request.app.state.config.OPENAI_API_BASE_URLS[idx]
+ and not is_supported_openai_models(model_id)
+ ):
+ # Skip unwanted OpenAI models
+ continue
+
+ if model_id and model_id not in models:
+ models[model_id] = {
**model,
- "name": model.get("name", model["id"]),
+ "name": model.get("name", model_id),
"owned_by": "openai",
"openai": model,
"connection_type": model.get("connection_type", "external"),
"urlIdx": idx,
}
- for model in models
- if (model.get("id") or model.get("name"))
- and (
- "api.openai.com"
- not in request.app.state.config.OPENAI_API_BASE_URLS[idx]
- or not any(
- name in model["id"]
- for name in [
- "babbage",
- "dall-e",
- "davinci",
- "embedding",
- "tts",
- "whisper",
- ]
- )
- )
- ]
- )
- return merged_list
+ return models
- models = {"data": merge_models_lists(map(extract_data, responses))}
+ models = get_merged_models(map(extract_data, responses))
log.debug(f"models: {models}")
- request.app.state.OPENAI_MODELS = {model["id"]: model for model in models["data"]}
- return models
+ request.app.state.OPENAI_MODELS = models
+ return {"data": list(models.values())}
@router.get("/models")
diff --git a/backend/open_webui/routers/retrieval.py b/backend/open_webui/routers/retrieval.py
index cb66e8926e..f8147372fd 100644
--- a/backend/open_webui/routers/retrieval.py
+++ b/backend/open_webui/routers/retrieval.py
@@ -465,6 +465,7 @@ async def get_rag_config(request: Request, user=Depends(get_admin_user)):
"DOCLING_PICTURE_DESCRIPTION_API": request.app.state.config.DOCLING_PICTURE_DESCRIPTION_API,
"DOCUMENT_INTELLIGENCE_ENDPOINT": request.app.state.config.DOCUMENT_INTELLIGENCE_ENDPOINT,
"DOCUMENT_INTELLIGENCE_KEY": request.app.state.config.DOCUMENT_INTELLIGENCE_KEY,
+ "MISTRAL_OCR_API_BASE_URL": request.app.state.config.MISTRAL_OCR_API_BASE_URL,
"MISTRAL_OCR_API_KEY": request.app.state.config.MISTRAL_OCR_API_KEY,
# MinerU settings
"MINERU_API_MODE": request.app.state.config.MINERU_API_MODE,
@@ -650,6 +651,7 @@ class ConfigForm(BaseModel):
DOCLING_PICTURE_DESCRIPTION_API: Optional[dict] = None
DOCUMENT_INTELLIGENCE_ENDPOINT: Optional[str] = None
DOCUMENT_INTELLIGENCE_KEY: Optional[str] = None
+ MISTRAL_OCR_API_BASE_URL: Optional[str] = None
MISTRAL_OCR_API_KEY: Optional[str] = None
# MinerU settings
@@ -891,6 +893,12 @@ async def update_rag_config(
if form_data.DOCUMENT_INTELLIGENCE_KEY is not None
else request.app.state.config.DOCUMENT_INTELLIGENCE_KEY
)
+
+ request.app.state.config.MISTRAL_OCR_API_BASE_URL = (
+ form_data.MISTRAL_OCR_API_BASE_URL
+ if form_data.MISTRAL_OCR_API_BASE_URL is not None
+ else request.app.state.config.MISTRAL_OCR_API_BASE_URL
+ )
request.app.state.config.MISTRAL_OCR_API_KEY = (
form_data.MISTRAL_OCR_API_KEY
if form_data.MISTRAL_OCR_API_KEY is not None
@@ -1182,6 +1190,7 @@ async def update_rag_config(
"DOCLING_PICTURE_DESCRIPTION_API": request.app.state.config.DOCLING_PICTURE_DESCRIPTION_API,
"DOCUMENT_INTELLIGENCE_ENDPOINT": request.app.state.config.DOCUMENT_INTELLIGENCE_ENDPOINT,
"DOCUMENT_INTELLIGENCE_KEY": request.app.state.config.DOCUMENT_INTELLIGENCE_KEY,
+ "MISTRAL_OCR_API_BASE_URL": request.app.state.config.MISTRAL_OCR_API_BASE_URL,
"MISTRAL_OCR_API_KEY": request.app.state.config.MISTRAL_OCR_API_KEY,
# MinerU settings
"MINERU_API_MODE": request.app.state.config.MINERU_API_MODE,
@@ -1565,6 +1574,7 @@ def process_file(
file_path = Storage.get_file(file_path)
loader = Loader(
engine=request.app.state.config.CONTENT_EXTRACTION_ENGINE,
+ user=user,
DATALAB_MARKER_API_KEY=request.app.state.config.DATALAB_MARKER_API_KEY,
DATALAB_MARKER_API_BASE_URL=request.app.state.config.DATALAB_MARKER_API_BASE_URL,
DATALAB_MARKER_ADDITIONAL_CONFIG=request.app.state.config.DATALAB_MARKER_ADDITIONAL_CONFIG,
@@ -1597,6 +1607,7 @@ def process_file(
PDF_EXTRACT_IMAGES=request.app.state.config.PDF_EXTRACT_IMAGES,
DOCUMENT_INTELLIGENCE_ENDPOINT=request.app.state.config.DOCUMENT_INTELLIGENCE_ENDPOINT,
DOCUMENT_INTELLIGENCE_KEY=request.app.state.config.DOCUMENT_INTELLIGENCE_KEY,
+ MISTRAL_OCR_API_BASE_URL=request.app.state.config.MISTRAL_OCR_API_BASE_URL,
MISTRAL_OCR_API_KEY=request.app.state.config.MISTRAL_OCR_API_KEY,
MINERU_API_MODE=request.app.state.config.MINERU_API_MODE,
MINERU_API_URL=request.app.state.config.MINERU_API_URL,
@@ -1875,6 +1886,7 @@ def search_web(request: Request, engine: str, query: str) -> list[SearchResult]:
query,
request.app.state.config.WEB_SEARCH_RESULT_COUNT,
request.app.state.config.WEB_SEARCH_DOMAIN_FILTER_LIST,
+ referer=request.app.state.config.WEBUI_URL,
)
else:
raise Exception(
diff --git a/backend/open_webui/routers/users.py b/backend/open_webui/routers/users.py
index 7a8a60d565..0ba37d9f65 100644
--- a/backend/open_webui/routers/users.py
+++ b/backend/open_webui/routers/users.py
@@ -388,7 +388,7 @@ async def get_user_by_id(user_id: str, user=Depends(get_verified_user)):
)
-@router.get("/{user_id}/oauth/sessions", response_model=Optional[dict])
+@router.get("/{user_id}/oauth/sessions")
async def get_user_oauth_sessions_by_id(user_id: str, user=Depends(get_admin_user)):
sessions = OAuthSessions.get_sessions_by_user_id(user_id)
if sessions and len(sessions) > 0:
diff --git a/backend/open_webui/socket/main.py b/backend/open_webui/socket/main.py
index 47b2c57961..818a57807f 100644
--- a/backend/open_webui/socket/main.py
+++ b/backend/open_webui/socket/main.py
@@ -18,7 +18,12 @@
get_sentinel_url_from_env,
)
+from open_webui.config import (
+ CORS_ALLOW_ORIGIN,
+)
+
from open_webui.env import (
+ VERSION,
ENABLE_WEBSOCKET_SUPPORT,
WEBSOCKET_MANAGER,
WEBSOCKET_REDIS_URL,
@@ -48,6 +53,9 @@
REDIS = None
+# Configure CORS for Socket.IO
+SOCKETIO_CORS_ORIGINS = "*" if CORS_ALLOW_ORIGIN == ["*"] else CORS_ALLOW_ORIGIN
+
if WEBSOCKET_MANAGER == "redis":
if WEBSOCKET_SENTINEL_HOSTS:
mgr = socketio.AsyncRedisManager(
@@ -58,7 +66,7 @@
else:
mgr = socketio.AsyncRedisManager(WEBSOCKET_REDIS_URL)
sio = socketio.AsyncServer(
- cors_allowed_origins=[],
+ cors_allowed_origins=SOCKETIO_CORS_ORIGINS,
async_mode="asgi",
transports=(["websocket"] if ENABLE_WEBSOCKET_SUPPORT else ["polling"]),
allow_upgrades=ENABLE_WEBSOCKET_SUPPORT,
@@ -67,7 +75,7 @@
)
else:
sio = socketio.AsyncServer(
- cors_allowed_origins=[],
+ cors_allowed_origins=SOCKETIO_CORS_ORIGINS,
async_mode="asgi",
transports=(["websocket"] if ENABLE_WEBSOCKET_SUPPORT else ["polling"]),
allow_upgrades=ENABLE_WEBSOCKET_SUPPORT,
diff --git a/backend/open_webui/utils/files.py b/backend/open_webui/utils/files.py
index b410cbab50..29573cab19 100644
--- a/backend/open_webui/utils/files.py
+++ b/backend/open_webui/utils/files.py
@@ -1,5 +1,5 @@
from open_webui.routers.images import (
- load_b64_image_data,
+ get_image_data,
upload_image,
)
@@ -22,7 +22,7 @@ def get_image_url_from_base64(request, base64_image_string, metadata, user):
if "data:image/png;base64" in base64_image_string:
image_url = ""
# Extract base64 image data from the line
- image_data, content_type = load_b64_image_data(base64_image_string)
+ image_data, content_type = get_image_data(base64_image_string)
if image_data is not None:
image_url = upload_image(
request,
diff --git a/backend/open_webui/utils/headers.py b/backend/open_webui/utils/headers.py
new file mode 100644
index 0000000000..3caee50334
--- /dev/null
+++ b/backend/open_webui/utils/headers.py
@@ -0,0 +1,11 @@
+from urllib.parse import quote
+
+
+def include_user_info_headers(headers, user):
+ return {
+ **headers,
+ "X-OpenWebUI-User-Name": quote(user.name, safe=" "),
+ "X-OpenWebUI-User-Id": user.id,
+ "X-OpenWebUI-User-Email": user.email,
+ "X-OpenWebUI-User-Role": user.role,
+ }
diff --git a/backend/open_webui/utils/images/comfyui.py b/backend/open_webui/utils/images/comfyui.py
index b86c257591..506723bc92 100644
--- a/backend/open_webui/utils/images/comfyui.py
+++ b/backend/open_webui/utils/images/comfyui.py
@@ -2,6 +2,8 @@
import json
import logging
import random
+import requests
+import aiohttp
import urllib.parse
import urllib.request
from typing import Optional
@@ -91,6 +93,25 @@ def get_images(ws, prompt, client_id, base_url, api_key):
return {"data": output_images}
+async def comfyui_upload_image(image_file_item, base_url, api_key):
+ url = f"{base_url}/api/upload/image"
+ headers = {}
+
+ if api_key:
+ headers["Authorization"] = f"Bearer {api_key}"
+
+ _, (filename, file_bytes, mime_type) = image_file_item
+
+ form = aiohttp.FormData()
+ form.add_field("image", file_bytes, filename=filename, content_type=mime_type)
+ form.add_field("type", "input") # required by ComfyUI
+
+ async with aiohttp.ClientSession() as session:
+ async with session.post(url, data=form, headers=headers) as resp:
+ resp.raise_for_status()
+ return await resp.json()
+
+
class ComfyUINodeInput(BaseModel):
type: Optional[str] = None
node_ids: list[str] = []
@@ -103,7 +124,7 @@ class ComfyUIWorkflow(BaseModel):
nodes: list[ComfyUINodeInput]
-class ComfyUIGenerateImageForm(BaseModel):
+class ComfyUICreateImageForm(BaseModel):
workflow: ComfyUIWorkflow
prompt: str
@@ -116,8 +137,98 @@ class ComfyUIGenerateImageForm(BaseModel):
seed: Optional[int] = None
-async def comfyui_generate_image(
- model: str, payload: ComfyUIGenerateImageForm, client_id, base_url, api_key
+async def comfyui_create_image(
+ model: str, payload: ComfyUICreateImageForm, client_id, base_url, api_key
+):
+ ws_url = base_url.replace("http://", "ws://").replace("https://", "wss://")
+ workflow = json.loads(payload.workflow.workflow)
+
+ for node in payload.workflow.nodes:
+ if node.type:
+ if node.type == "model":
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][node.key] = model
+ elif node.type == "prompt":
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][
+ node.key if node.key else "text"
+ ] = payload.prompt
+ elif node.type == "negative_prompt":
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][
+ node.key if node.key else "text"
+ ] = payload.negative_prompt
+ elif node.type == "width":
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][
+ node.key if node.key else "width"
+ ] = payload.width
+ elif node.type == "height":
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][
+ node.key if node.key else "height"
+ ] = payload.height
+ elif node.type == "n":
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][
+ node.key if node.key else "batch_size"
+ ] = payload.n
+ elif node.type == "steps":
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][
+ node.key if node.key else "steps"
+ ] = payload.steps
+ elif node.type == "seed":
+                seed = (
+                    payload.seed
+                    if payload.seed is not None
+                    else random.randint(0, 1125899906842624)
+                )
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][node.key] = seed
+ else:
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][node.key] = node.value
+
+ try:
+ ws = websocket.WebSocket()
+ headers = {"Authorization": f"Bearer {api_key}"}
+ ws.connect(f"{ws_url}/ws?clientId={client_id}", header=headers)
+ log.info("WebSocket connection established.")
+ except Exception as e:
+ log.exception(f"Failed to connect to WebSocket server: {e}")
+ return None
+
+ try:
+ log.info("Sending workflow to WebSocket server.")
+ log.info(f"Workflow: {workflow}")
+ images = await asyncio.to_thread(
+ get_images, ws, workflow, client_id, base_url, api_key
+ )
+ except Exception as e:
+ log.exception(f"Error while receiving images: {e}")
+ images = None
+
+ ws.close()
+
+ return images
+
+
+class ComfyUIEditImageForm(BaseModel):
+ workflow: ComfyUIWorkflow
+
+ image: str | list[str]
+ prompt: str
+ width: Optional[int] = None
+ height: Optional[int] = None
+ n: Optional[int] = None
+
+ steps: Optional[int] = None
+ seed: Optional[int] = None
+
+
+async def comfyui_edit_image(
+ model: str, payload: ComfyUIEditImageForm, client_id, base_url, api_key
):
ws_url = base_url.replace("http://", "ws://").replace("https://", "wss://")
workflow = json.loads(payload.workflow.workflow)
@@ -127,6 +238,15 @@ async def comfyui_generate_image(
if node.type == "model":
for node_id in node.node_ids:
workflow[node_id]["inputs"][node.key] = model
+ elif node.type == "image":
+ if isinstance(payload.image, list):
+ # check if multiple images are provided
+ for idx, node_id in enumerate(node.node_ids):
+ if idx < len(payload.image):
+ workflow[node_id]["inputs"][node.key] = payload.image[idx]
+ else:
+ for node_id in node.node_ids:
+ workflow[node_id]["inputs"][node.key] = payload.image
elif node.type == "prompt":
for node_id in node.node_ids:
workflow[node_id]["inputs"][
diff --git a/backend/open_webui/utils/mcp/client.py b/backend/open_webui/utils/mcp/client.py
index 01df38886c..6edfca4f6c 100644
--- a/backend/open_webui/utils/mcp/client.py
+++ b/backend/open_webui/utils/mcp/client.py
@@ -2,6 +2,8 @@
from typing import Optional
from contextlib import AsyncExitStack
+import anyio
+import asyncio
+
from mcp import ClientSession
from mcp.client.auth import OAuthClientProvider, TokenStorage
from mcp.client.streamable_http import streamablehttp_client
@@ -11,26 +13,29 @@
class MCPClient:
def __init__(self):
self.session: Optional[ClientSession] = None
- self.exit_stack = AsyncExitStack()
+ self.exit_stack = None
async def connect(self, url: str, headers: Optional[dict] = None):
- try:
- self._streams_context = streamablehttp_client(url, headers=headers)
-
- transport = await self.exit_stack.enter_async_context(self._streams_context)
- read_stream, write_stream, _ = transport
-
- self._session_context = ClientSession(
- read_stream, write_stream
- ) # pylint: disable=W0201
-
- self.session = await self.exit_stack.enter_async_context(
- self._session_context
- )
- await self.session.initialize()
- except Exception as e:
- await self.disconnect()
- raise e
+ async with AsyncExitStack() as exit_stack:
+ try:
+ self._streams_context = streamablehttp_client(url, headers=headers)
+
+ transport = await exit_stack.enter_async_context(self._streams_context)
+ read_stream, write_stream, _ = transport
+
+ self._session_context = ClientSession(
+ read_stream, write_stream
+ ) # pylint: disable=W0201
+
+ self.session = await exit_stack.enter_async_context(
+ self._session_context
+ )
+ with anyio.fail_after(10):
+ await self.session.initialize()
+ self.exit_stack = exit_stack.pop_all()
+ except Exception as e:
+ await asyncio.shield(self.disconnect())
+ raise e
async def list_tool_specs(self) -> Optional[dict]:
if not self.session:
diff --git a/backend/open_webui/utils/middleware.py b/backend/open_webui/utils/middleware.py
index dd42612eee..e5b84a3d79 100644
--- a/backend/open_webui/utils/middleware.py
+++ b/backend/open_webui/utils/middleware.py
@@ -45,10 +45,10 @@
SearchForm,
)
from open_webui.routers.images import (
- load_b64_image_data,
image_generations,
- GenerateImageForm,
- upload_image,
+ CreateImageForm,
+ image_edits,
+ EditImageForm,
)
from open_webui.routers.pipelines import (
process_pipeline_inlet_filter,
@@ -91,7 +91,7 @@
convert_logit_bias_input_to_json,
get_content_from_message,
)
-from open_webui.utils.tools import get_tools
+from open_webui.utils.tools import get_tools, get_updated_tool_function
from open_webui.utils.plugin import load_function_module_by_id
from open_webui.utils.filter import (
get_sorted_filter_ids,
@@ -718,9 +718,31 @@ async def chat_web_search_handler(
return form_data
+def get_last_images(message_list):
+ images = []
+ for message in reversed(message_list):
+ images_flag = False
+ for file in message.get("files", []):
+ if file.get("type") == "image":
+ images.append(file.get("url"))
+ images_flag = True
+
+ if images_flag:
+ break
+
+ return images
+
+
async def chat_image_generation_handler(
request: Request, form_data: dict, extra_params: dict, user
):
+ metadata = extra_params.get("__metadata__", {})
+ chat_id = metadata.get("chat_id", None)
+ if not chat_id:
+ return form_data
+
+ chat = Chats.get_chat_by_id_and_user_id(chat_id, user.id)
+
__event_emitter__ = extra_params["__event_emitter__"]
await __event_emitter__(
{
@@ -729,87 +751,151 @@ async def chat_image_generation_handler(
}
)
- messages = form_data["messages"]
- user_message = get_last_user_message(messages)
+ messages_map = chat.chat.get("history", {}).get("messages", {})
+ message_id = chat.chat.get("history", {}).get("currentId")
+ message_list = get_message_list(messages_map, message_id)
+ user_message = get_last_user_message(message_list)
prompt = user_message
- negative_prompt = ""
+ input_images = get_last_images(message_list)
- if request.app.state.config.ENABLE_IMAGE_PROMPT_GENERATION:
- try:
- res = await generate_image_prompt(
- request,
- {
- "model": form_data["model"],
- "messages": messages,
- },
- user,
- )
+ system_message_content = ""
+ if len(input_images) == 0:
+ # Create image(s)
+ if request.app.state.config.ENABLE_IMAGE_PROMPT_GENERATION:
+ try:
+ res = await generate_image_prompt(
+ request,
+ {
+ "model": form_data["model"],
+ "messages": form_data["messages"],
+ },
+ user,
+ )
- response = res["choices"][0]["message"]["content"]
+ response = res["choices"][0]["message"]["content"]
- try:
- bracket_start = response.find("{")
- bracket_end = response.rfind("}") + 1
+ try:
+ bracket_start = response.find("{")
+ bracket_end = response.rfind("}") + 1
- if bracket_start == -1 or bracket_end == -1:
- raise Exception("No JSON object found in the response")
+ if bracket_start == -1 or bracket_end == -1:
+ raise Exception("No JSON object found in the response")
+
+ response = response[bracket_start:bracket_end]
+ response = json.loads(response)
+ prompt = response.get("prompt", [])
+ except Exception as e:
+ prompt = user_message
- response = response[bracket_start:bracket_end]
- response = json.loads(response)
- prompt = response.get("prompt", [])
except Exception as e:
+ log.exception(e)
prompt = user_message
+ try:
+ images = await image_generations(
+ request=request,
+ form_data=CreateImageForm(**{"prompt": prompt}),
+ user=user,
+ )
+
+ await __event_emitter__(
+ {
+ "type": "status",
+ "data": {"description": "Image created", "done": True},
+ }
+ )
+
+ await __event_emitter__(
+ {
+ "type": "files",
+ "data": {
+ "files": [
+ {
+ "type": "image",
+ "url": image["url"],
+ }
+ for image in images
+ ]
+ },
+ }
+ )
+
+ system_message_content = "