Problem
Using the gemini_text adapter, text-only prompts work, but prompts with images fail during the Gemini web upload flow.
The upstream caller has already uploaded/fetched the image successfully, and WebAI2API receives the request with images=1. The failure happens after entering the adapter's image upload step, before the file chooser/upload completes.
Environment
- Docker image: foxhui/webai-2api:latest
- Image revision label: 84729fb15a8c86e585658fb52f4b5dd26baef031
- Adapter: gemini_text
- Model: gemini_text/gemini-3.1-pro
- Browser: Camoufox / Playwright
- Gemini URL: https://gemini.google.com/app?hl=en
- headless: false
- temporaryChat: true
Log excerpt
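[INFO] 触发模型: gemini_text/gemini-3.1-pro (Triggering model)
[INFO] 请求入队: 这是什么... | images=1 (Request enqueued: "What is this...")
[INFO] 执行任务 -> gemini_text/gemini-3.1-pro (Executing task)
[INFO] 开启新会话... (Starting new session...)
[INFO] 开始上传 1 张图片... (Starting upload of 1 image...)
[ERRO] 点击操作超时: 点击操作失败 (元素): CLICK_TIMEOUT (Click timed out: click failed (element))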
Suspected cause
In src/backend/adapter/gemini_text.js, the image upload path currently depends on a single accessible-name selector for the upload menu button:
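// Step 1: open the upload menu via a single accessible-name selector.
const uploadMenuBtn = page.getByRole('button', { name: 'Open upload file menu' });
await safeClick(page, uploadMenuBtn, { bias: 'button' });
// Step 2: click the "Upload files" menu item and pass the image paths to the file chooser.
const uploadFilesBtn = page.getByRole('menuitem', { name: /Upload files/ });
await uploadFilesViaChooser(page, uploadFilesBtn, imgPaths, ...);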
On the current Gemini UI, the upload entry appears as a + button in the prompt box, and this role/name selector no longer seems to match reliably. safeClick then times out after the default ELEMENT_CLICK timeout, producing CLICK_TIMEOUT.
Expected behavior
Image prompts through gemini_text should click the current Gemini upload entry, open the file chooser, upload the image, and then send the prompt normally.
Suggested fix
Make the Gemini upload selectors more tolerant, for example by trying multiple upload-entry selectors before failing:
button[aria-label="Open upload file menu"]
button[aria-label="Upload files"]
button.upload-card-button
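A minimal sketch of that fallback, assuming the adapter can work with plain Playwright locators. The names firstVisible, findUploadEntry, and UPLOAD_ENTRY_SELECTORS are hypothetical and do not exist in the repository, the 2000 ms per-candidate budget is arbitrary, and whether each selector matches the live Gemini DOM is unverified:

```javascript
// Hypothetical helper: return the first candidate that becomes visible, or null.
// `candidates` are Playwright Locator objects (anything exposing waitFor()).
async function firstVisible(candidates, perCandidateTimeoutMs = 2000) {
  for (const locator of candidates) {
    try {
      await locator.waitFor({ state: 'visible', timeout: perCandidateTimeoutMs });
      return locator;
    } catch {
      // Not visible within the budget; fall through to the next candidate.
    }
  }
  return null;
}

// Selectors copied from the list above, tried in this order.
const UPLOAD_ENTRY_SELECTORS = [
  'button[aria-label="Open upload file menu"]',
  'button[aria-label="Upload files"]',
  'button.upload-card-button',
];

// Hypothetical wrapper: resolve the upload entry or fail with a descriptive error.
async function findUploadEntry(page) {
  const candidates = UPLOAD_ENTRY_SELECTORS.map((sel) => page.locator(sel).first());
  const found = await firstVisible(candidates);
  if (!found) {
    throw new Error('Gemini upload entry not found with any known selector');
  }
  return found;
}
```

The returned locator could then be passed to safeClick in place of uploadMenuBtn. Probing candidates sequentially keeps the worst-case wait at the per-candidate budget times the number of selectors, so the budget should stay well below the ELEMENT_CLICK timeout.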
The menu item selector may also need text fallbacks for current/localized Gemini UI labels, for example:
/Upload files|Upload from computer|上传文件/
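As a sketch, the menu item lookup could take that pattern as the accessible name. UPLOAD_MENU_ITEM_NAME and uploadMenuItem are hypothetical names, and the three labels are the ones proposed in this issue rather than a verified list of what Gemini renders in each locale:

```javascript
// Labels proposed in this issue; other locales would need their own entries.
const UPLOAD_MENU_ITEM_NAME = /Upload files|Upload from computer|上传文件/;

// Hypothetical helper: build the menu item locator with the tolerant name pattern.
function uploadMenuItem(page) {
  return page.getByRole('menuitem', { name: UPLOAD_MENU_ITEM_NAME });
}
```

The pattern is case-sensitive as written; adding the i flag would also cover labels such as "Upload Files" if Gemini changes capitalization.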
This should only affect the gemini_text image upload path; text-only prompts are working.