Commit 9279a20

feat: add Qwen 3.5 XL (9B) local multimodal model
- Added `textagent/Qwen3.5-9B-Onnx` (~16 GB) as `qwen-local-9b`
- Supports vision (image-text-to-text architecture)
- Marked `requiresHighEnd` due to large model size
- Size progression: 0.8B → 2B → 4B → 9B
- Updated README model table and release notes
Parent: e0dc8e7 · Commit: 9279a20

File tree: 3 files changed, +46 −1 lines


README.md

Lines changed: 3 additions & 1 deletion

@@ -55,13 +55,14 @@
 
 ## 🤖 AI Assistant
 
-TextAgent includes a built-in AI assistant panel with **three local model sizes** and cloud providers:
+TextAgent includes a built-in AI assistant panel with **four local model sizes** and cloud providers:
 
 | Model | Provider | Type | Speed |
 |:------|:---------|:-----|:------|
 | **Qwen 3.5 Small (0.8B)** | Local (WebGPU/WASM) | 🔒 Private — no data leaves browser | ⚡ Fast |
 | **Qwen 3.5 Medium (2B)** | Local (WebGPU/WASM) | 🔒 Private — smarter, ~1.2 GB | ⚡ Fast |
 | **Qwen 3.5 Large (4B)** | Local (WebGPU/WASM) | 🔒 Private — best quality, ~2.5 GB | ⚡ High-end |
+| **Qwen 3.5 XL (9B)** | Local (WebGPU/WASM) | 🔒 Private — multimodal vision, ~16 GB | 🧠 High-end |
 | **Gemini 3.1 Flash Lite** | Google (free tier) | ☁️ Cloud — 1M tokens/min | 🚀 Very Fast |
 | **Llama 3.3 70B** | Groq (free tier) | ☁️ Cloud — ultra-low latency | ⚡ Ultra Fast |
 | **Auto · Best Free** | OpenRouter (free tier) | ☁️ Cloud — multi-model routing | 🧠 Powerful |

@@ -547,6 +548,7 @@ TextAgent has undergone significant evolution since its inception. What started
 
 | Date | Commits | Feature / Update |
 |------|---------|-----------------:|
+| **2026-04-03** | — | 🤖 **Qwen 3.5 XL (9B) Local Model** — added `textagent/Qwen3.5-9B-Onnx` (~16 GB) as the largest local multimodal Qwen model; supports vision (image-text-to-text); marked `requiresHighEnd`; placed after 4B in size progression (0.8B → 2B → 4B → 9B) |
 | **2026-04-03** | — | 🔌 **Connector AI Pipeline** — new "My Connectors" system for plugging third-party data sources into the AI assistant; Hacker News connector fetches top stories with full URLs, author metadata, self-post body text, and top community comments; connector toggle in AI panel header with green active indicator; unified parallel fetch pipeline (`Promise.all`) merges connector + web search context; grounding instruction header ("LIVE DATA...Answer using this data") forces models to use fetched data; **Fixed:** Gemma 4 E4B worker completely discarded `context` parameter — only `userPrompt` was used in the messages array; context now injected as `context + "\n---\nUser question: " + userText`; Gemma 4 system prompt enhanced with "data is real and live" grounding instruction; context trimmed to 6000 chars for WebGPU memory safety; connector label click bug fixed (`e.preventDefault()` stops checkbox toggle via event bubbling); `hasActiveConnectors()` decoupled from DOM — reads `localStorage` directly; auto-repair re-enables connected-but-paused connectors on init; default HN stories 10→5; connector registry extensible for Slack, Notion, GitHub, Confluence |
 | **2026-04-03** | — | 👁️ **Gemma 4 Vision Tag** — new `{{@Vision:}}` DocGen tag backed by Gemma 4 E2B/E4B running locally via WebGPU/WASM; `ai-worker-gemma4.js` Web Worker with `Gemma4Processor` instantiation (bypasses `AutoProcessor` which lacks `image_processor_type`) and system-prompt persona fix; primary `onnx-community/gemma-4-E2B-it-ONNX` with `textagent/gemma-4-E4B-it-ONNX` fallback; cyan-themed Vision card with 📷 camera capture + 📎 omni-modal upload (image/audio/video); **video frame extraction** — `extractVideoFrames()` seeks 4 evenly-spaced timestamps in a hidden `<video>` element, draws each to Canvas at max 1280px, stores as JPEG 0.85; audio stored as direct base64; upload handler detects Vision card type, sets `accept="image/*,audio/*,video/*"`; generation handler maps attachments to typed inputs and calls `switchToModel('gemma4-e2b')` before execution, restores prior model after; 👁️ Vision toolbar button in AI Tags dropdown; Fixed: Vision card double-rendering raw `@upload:` / `@prompt:` lines caused by broken `\\s*` regex (quadruple-escaped) — now correct `\s*`; removed duplicate static text row; `gemma4-e2b` / `gemma4-e4b` entries in `ai-models.js` with `isDocModel: true` + `supportsVision: true` |
 | **2026-04-02** | `55538f3` | 🔧 **DocGen Reject Block Fix** — fixed: rejecting a generated Translate/OCR/TTS/STT/Image/AI block restored a generic hardcoded "AI Generate" card, losing all type-specific UI (language dropdown, mode pills, camera button, step inputs, etc.); reject handler now calls `M.transformDocgenMarkdown(block.fullMatch)` to re-render the exact original typed card with all controls intact; `data-ai-index` patched on restored card and all children; review panel header label+icon now shows correct type for all blocks ("🌐 Translate — Review", "🔍 OCR Scan — Review", etc.) instead of always "✨ AI Generate — Review" |
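The Connector AI Pipeline entry in the changelog above describes a parallel fetch (`Promise.all`), a grounding header, and a 6000-char context trim for WebGPU memory safety. A minimal sketch of that shape, using illustrative names (`gatherContext`, `buildGroundedContext`, `MAX_CONTEXT_CHARS`) rather than the actual TextAgent functions:

```javascript
const MAX_CONTEXT_CHARS = 6000; // WebGPU memory-safety cap noted in the changelog

// Run every connector / web-search fetcher in parallel; a failed fetcher
// contributes an empty string instead of rejecting the whole batch.
async function gatherContext(fetchers) {
  const parts = await Promise.all(fetchers.map((f) => f().catch(() => '')));
  return parts.filter(Boolean).join('\n\n');
}

// Trim the merged context and inject the user question, mirroring the
// `context + "\n---\nUser question: " + userText` fix described above.
function buildGroundedContext(context, userText) {
  const trimmed = context.slice(0, MAX_CONTEXT_CHARS);
  // Grounding header is paraphrased; the exact wording is elided in the changelog.
  return 'LIVE DATA. Answer using this data.\n' + trimmed +
    '\n---\nUser question: ' + userText;
}
```

Appending the question after the context (rather than discarding the context, as the fixed Gemma 4 worker did) is what lets the model treat fetched data as live grounding.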
Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
+# Qwen 3.5 9B Local Model — Add XL Multimodal Option
+
+- Added Qwen 3.5 XL (9B) as a new local multimodal model (`qwen-local-9b`)
+- Model source: `textagent/Qwen3.5-9B-Onnx` on HuggingFace (~16 GB download)
+- Supports vision (image-text-to-text architecture with vision encoder)
+- Marked as `requiresHighEnd` due to large model size
+- Placed after 4B variant to maintain logical size progression (0.8B → 2B → 4B → 9B)
+
+---
+
+## Summary
+Adds the Qwen 3.5 9B ONNX model to the local model roster, giving users with high-end hardware access to the largest and most capable Qwen 3.5 variant directly in-browser.
+
+---
+
+## 1. Qwen 3.5 XL (9B) Model Entry
+**Files:** `js/ai-models.js`
+**What:** Added a new `qwen-local-9b` entry in `AI_MODELS` with model ID `textagent/Qwen3.5-9B-Onnx`, category `local-multimodal`, `supportsVision: true`, `requiresHighEnd: true`, and `~16 GB` download size.
+**Impact:** Users with sufficient hardware (VRAM/RAM) can now select Qwen 3.5 9B from the model dropdown for the highest-quality local multimodal inference, including text and image understanding.
+
+---
+
+## Files Changed (1 total)
+
+| File | Lines Changed | Type |
+|------|:---:|------|
+| `js/ai-models.js` | +16 | New model entry |
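The `requiresHighEnd` flag implies a hardware gate somewhere in the model picker. A hedged sketch of how such a gate could look; the `deviceMemory >= 8` heuristic and both function names are assumptions for illustration, not TextAgent's actual check:

```javascript
// Hypothetical hardware gate for `requiresHighEnd` models. The 8 GB
// threshold is an assumption; navigator.deviceMemory is a real (Chromium)
// API reporting approximate device RAM in GB.
function isHighEndDevice(nav) {
  return typeof nav.deviceMemory === 'number' && nav.deviceMemory >= 8;
}

// Return the model ids a given device is allowed to select.
function selectableModels(models, nav) {
  return Object.entries(models)
    .filter(([, m]) => !m.requiresHighEnd || isHighEndDevice(nav))
    .map(([id]) => id);
}
```

Under this sketch, `qwen-local-9b` would be hidden (or disabled) on machines that cannot plausibly hold a ~16 GB download in memory, while the smaller variants stay available everywhere.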

js/ai-models.js

Lines changed: 16 additions & 0 deletions

@@ -56,6 +56,22 @@
     requiresHighEnd: true,
   },
 
+  // ── Local: Qwen 3.5 XL (9B) ──────────────────────────
+  'qwen-local-9b': {
+    label: 'Qwen 3.5 9B · Local',
+    badge: 'Qwen 3.5 9B · Local',
+    icon: 'bi bi-pc-display',
+    statusReady: 'Qwen 3.5 9B · Local',
+    dropdownName: 'Qwen 3.5 XL (9B)',
+    dropdownDesc: 'Local · Multimodal · ~16 GB · High-end',
+    isLocal: true,
+    category: 'local-multimodal',
+    localModelId: 'textagent/Qwen3.5-9B-Onnx',
+    downloadSize: '~16 GB',
+    requiresHighEnd: true,
+    supportsVision: true,
+  },
+
   // ── Local: Qwen 3 4B Thinking ────────────────────────
   'qwen3-thinking-4b': {
     label: 'Qwen 3 Thinking · Local',