feat: AI diagram generation refactor — model selector, JSON repair, embed cleanup

ijbo · ijbo · commit dd889c43bc2a · 2026-03-18T03:06:32.000+09:00
- Refactored Draw AI generation to match Image/Git card pattern
- Always-visible prompt bar with per-card model selector + Generate button
- Robust repairJson() pipeline for local model JSON mistakes
- Removed ~300-line duplicate AI bar from excalidraw-embed.html
- Updated Draw tag description and release notes in README
- 23 draw-docgen tests pass
diff --git a/README.md b/README.md
@@ -36,7 +36,7 @@
 | **Desktop** | Native app via Neutralino.js with system tray and offline support |
 | **Code Execution** | 7 languages in-browser: Bash ([just-bash](https://justbash.dev/)), Math (Nerdamer), LaTeX (MathJax + Nerdamer evaluation), Python ([Pyodide](https://pyodide.org/)), HTML (sandboxed iframe, `html-autorun` for widgets/quizzes), JavaScript (sandboxed iframe), SQL ([sql.js](https://sql.js.org/) SQLite) · 25+ compiled languages via [Judge0 CE](https://ce.judge0.com): C, C++, Rust, Go, Java, TypeScript, Kotlin, Scala, Ruby, Swift, Haskell, Dart, C#, and more · **▶ Run All** notebook engine — one-click sequential execution with preflight dialog (block table with model/status), pre-execution model loading (AI + TTS auto-loaded before blocks run), progress bar, abort, per-block status badges, detailed console logging, and SQLite shared context store |
 | **Security** | Content Security Policy (CSP), SRI integrity hashes, XSS sanitization (DOMPurify), ReDoS protection, Firestore write-token ownership, API keys via HTTP headers, postMessage origin validation, 8-char passphrase minimum, sandboxed code execution |
-| **AI Document Tags** | `{{@AI:}}` text generation (`@think: Yes` for deep reasoning), `{{@Image:}}` image generation (Gemini Imagen), `{{@OCR:}}` image-to-text extraction (Text/Math/Table modes via Granite Docling 258M, Florence-2 230M, or GLM-OCR 1.5B, PDF page rendering via pdf.js), `{{@TTS:}}` text-to-speech playback (Kokoro TTS per card, language selector, ▶ Play / ⬇ Save WAV), `{{@STT:}}` speech-to-text dictation (engine selector: Whisper/Voxtral/Web Speech API, 11 languages, Record/Stop/Insert/Clear), `{{@Translate:}}` translation (target language selector, integrated TTS pronunciation, cloud model routing), `{{@Game:}}` game builder (AI-generated or pre-built, Canvas 2D/Three.js/P5.js, import/export HTML), `{{@Draw:}}` whiteboard (Excalidraw + Mermaid, Insert/PNG/SVG export, 📚 Library Browser with 29 bundled packs in 6 categories, 🚀 AI diagram generation with natural language prompt and model selector) — `@` prefix syntax on all tag types + metadata fields (`@name`, `@use`, `@think`, `@search`, `@prompt`, `@step`, `@upload`, `@model`, `@engine`, `@lang`, `@prebuilt`); `@model:` field persists selected model per card with intelligent defaults (OCR→`granite-docling`, TTS→`kokoro-tts`, STT→`voxtral-stt`, Image→`imagen-ultra`); editable `@prompt:` textarea and `@step:` inputs in preview cards; description/prompt separation (bare text = label, `@prompt:` = AI instruction); 📎 image/PDF upload for multimodal vision analysis; per-card model selector with document-portable model persistence, concurrent block operations |
+| **AI Document Tags** | `{{@AI:}}` text generation (`@think: Yes` for deep reasoning), `{{@Image:}}` image generation (Gemini Imagen), `{{@OCR:}}` image-to-text extraction (Text/Math/Table modes via Granite Docling 258M, Florence-2 230M, or GLM-OCR 1.5B, PDF page rendering via pdf.js), `{{@TTS:}}` text-to-speech playback (Kokoro TTS per card, language selector, ▶ Play / ⬇ Save WAV), `{{@STT:}}` speech-to-text dictation (engine selector: Whisper/Voxtral/Web Speech API, 11 languages, Record/Stop/Insert/Clear), `{{@Translate:}}` translation (target language selector, integrated TTS pronunciation, cloud model routing), `{{@Game:}}` game builder (AI-generated or pre-built, Canvas 2D/Three.js/P5.js, import/export HTML), `{{@Draw:}}` whiteboard (Excalidraw + Mermaid, AI diagram generation with per-card model selector + 🚀 Generate, robust JSON repair for local models, Insert/PNG/SVG export, 📚 Library Browser with 29 bundled packs in 6 categories) — `@` prefix syntax on all tag types + metadata fields (`@name`, `@use`, `@think`, `@search`, `@prompt`, `@step`, `@upload`, `@model`, `@engine`, `@lang`, `@prebuilt`); `@model:` field persists selected model per card with intelligent defaults (OCR→`granite-docling`, TTS→`kokoro-tts`, STT→`voxtral-stt`, Image→`imagen-ultra`); editable `@prompt:` textarea and `@step:` inputs in preview cards; description/prompt separation (bare text = label, `@prompt:` = AI instruction); 📎 image/PDF upload for multimodal vision analysis; per-card model selector with document-portable model persistence, concurrent block operations |
 | **🔌 API Calls** | `{{API:}}` REST API integration — GET/POST/PUT/DELETE methods, custom headers, JSON body, response stored in `$(api_varName)` variables; inline review panel; toolbar GET/POST buttons |
 | **🔗 Agent Flow** | `{{Agent:}}` multi-step pipeline — define Step 1/2/3, chain outputs, per-card model + search provider selector, live step status indicators (⏳/✅/❌), review combined output |
 | **🔍 Web Search** | Toggle web search for AI — 7 providers: DuckDuckGo (free), Brave Search, Serper.dev, Tavily (AI-optimized), Google CSE, Wikipedia, Wikidata; search results injected into LLM context; source citations in responses; per-agent-card search provider selector |
@@ -461,6 +461,7 @@ TextAgent has undergone significant evolution since its inception. What started
 | Date | Commits | Feature / Update |
 |------|---------|-----------------:|
 | **2026-03-18** | | 🚀 **AI Diagram Generation** — natural language → Excalidraw JSON via LLM; new AI prompt section in `{{Draw:}}` cards with text input, model selector dropdown, and 🚀 Generate button; `EXCALIDRAW_CHEAT_SHEET` system prompt teaches LLM the element schema (rectangle, ellipse, diamond, text, arrow, line); `repairJson()` auto-fixes common LLM JSON mistakes (trailing commas, truncated output, missing brackets); `@model:` field in Draw tags for per-card model persistence; cancel/retry support; Gemini API key forwarding to Excalidraw embed; 37 new Playwright tests (22 draw-docgen, 7 readonly-mode, 8 excalidraw-library) + 5 regression pins |
+| **2026-03-18** | | 🤖 **Draw AI Diagram Generation** — refactored `{{Draw:}}` AI generation to match Image/Git card pattern: always-visible prompt bar with per-card model selector dropdown + 🚀 Generate button; `excalidraw_diagram` task type with Excalidraw cheat sheet injected into AI workers; robust `repairJson()` pipeline handles common local-model JSON mistakes (trailing commas, stray quotes, truncated output, missing commas); last-resort individual object extraction recovers partial diagrams; removed duplicate ~300-line in-iframe AI bar from `excalidraw-embed.html`; 23 tests pass |
 | **2026-03-18** | | 📷 **GLM-OCR Model** — added [GLM-OCR (1.5B)](https://huggingface.co/textagent/GLM-OCR-ONNX) as third local OCR model alongside Granite Docling and Florence-2; `ai-worker-glm-ocr.js` Web Worker using q4f16 quantization (~650 MB, WebGPU required); primary `textagent/GLM-OCR-ONNX` with `onnx-community/GLM-OCR-ONNX` fallback; `glm-ocr` entry in `ai-models.js` with `isDocModel: true`; documentation updated; 7 new Playwright model registry tests |
 | **2026-03-18** | | 📚 **Excalidraw Library Browser** — 29 bundled library packs (600+ items) organized in 6 categories (Architecture & System Design, UI/UX & Wireframing, Icons & Logos, Cloud & DevOps, Data & Algorithms, AI/Science & Education) with slide-in Library Browser panel; each library card with name, description, and toggle switch for on-demand loading; real-time search/filter; injected via MutationObserver into Excalidraw's native Library sidebar as "📦 Browse & Add Library Packs" button; libraries include Software Architecture, System Design Components, AWS Icons, Google Icons (139 items), UML/ER, Wireframing, Deep Learning, Math Teacher, Charts, Graphs, and more |
 | **2026-03-18** | | 🎨 **Draw DocGen Integration** — full `{{Draw:}}` tag pipeline: `transformDrawMarkdown` + `bindDrawPreviewActions` in renderer, 🎨 Draw toolbar button, `excalidraw.com` added to CSP `frame-src`, `draw-docgen.css` (309-line standalone stylesheet with card UI, tool pills, Mermaid editor, dark mode), `draw-docgen.js` lazy-loaded as Phase 3j; DOMPurify allowlist expanded with `data-draw-index`, `data-draw-tool`, `data-tool`, `data-skill` |
diff --git a/changelogs/CHANGELOG-draw-ai-refactor.md b/changelogs/CHANGELOG-draw-ai-refactor.md
@@ -0,0 +1,52 @@
+# CHANGELOG — Draw AI Diagram Refactor
+
+## Summary
+Refactored AI diagram generation for `{{Draw:}}` tags to match the established pattern of Image Generate and GitHub Tools cards. Removed the duplicate in-iframe AI bar from `excalidraw-embed.html` and added a robust JSON repair pipeline for better local model compatibility.
+
+## Changes
+
+### AI Generate UI Refactored (`js/draw-docgen.js`, `css/draw-docgen.css`)
+- **Always-visible prompt bar** in the card header with model selector dropdown + 🚀 Generate button
+- Model selector built using `buildModelOpts()` matching the `git-docgen.js` pattern
+- AI generation uses per-card model selection with pre-flight checks:
+  - Auto-loads local models from cache
+  - Prompts for API keys on cloud models
+  - Temporarily switches to selected model during generation
+- Supports both **Excalidraw** and **Mermaid** diagram AI generation
+- New `repairJson()` function handles common LLM JSON mistakes:
+  - Trailing commas, stray quotes after numbers/booleans
+  - Missing commas between properties
+  - Truncated JSON (auto-closes brackets)
+  - Individual object extraction as last resort
+- Excalidraw `@taskType: excalidraw_diagram` with cheat sheet prompt
+- Mermaid `@taskType: generate` with Mermaid code generation prompt
+
+### Excalidraw Embed Cleanup (`public/excalidraw-embed.html`)
+- Removed duplicate AI generation bar (CSS, HTML, JS) — ~300 lines removed
+- Removed `EXCALIDRAW_CHEAT_SHEET` constant (now only in parent controller)
+- Removed `generateDiagram()` function, event listeners, and `set-api-key` handler
+- Removed `_aiApiKey` state variable
+
+### AI Worker Updates (`public/ai-worker.js`, `public/ai-worker-gemini.js`, `public/ai-worker-common.js`)
+- Added `excalidraw_diagram` task type with dedicated Excalidraw cheat sheet system prompt
+- 16384 token max for diagram generation tasks
+- Cheat sheet includes element types, colors, rules, and labeled shape pattern
+
+### Test Updates (`tests/feature/draw-docgen.spec.js`)
+- Updated 4 AI tests from iframe-based (`#ai-bar`) to parent card (`.draw-ai-prompt-section`)
+- Tests now verify: prompt section visible, prompt input placeholder, Generate button, model selector
+- All 23 tests pass
+
+### Regression Tests (`tests/regression/regression-recent.spec.js`)
+- Added regression tests for recent bug fixes
+
+## Files Modified
+- `js/draw-docgen.js` — AI generate refactor + JSON repair pipeline
+- `css/draw-docgen.css` — Model selector styles
+- `public/excalidraw-embed.html` — Removed duplicate AI bar
+- `public/ai-worker.js` — Excalidraw diagram task type
+- `public/ai-worker-gemini.js` — Excalidraw diagram task type
+- `public/ai-worker-common.js` — Excalidraw diagram task type
+- `tests/feature/draw-docgen.spec.js` — Updated AI tests
+- `tests/regression/regression-recent.spec.js` — New regression tests
+- `styles.css` — Minor style updates