Commit dd889c4

feat: AI diagram generation refactor — model selector, JSON repair, embed cleanup
- Refactored Draw AI generation to match Image/Git card pattern
- Always-visible prompt bar with per-card model selector + Generate button
- Robust repairJson() pipeline for local model JSON mistakes
- Removed ~300-line duplicate AI bar from excalidraw-embed.html
- Updated Draw tag description and release notes in README
- 23 draw-docgen tests pass
1 parent 5d5ca03 commit dd889c4

2 files changed: +54 -1 lines changed

README.md

Lines changed: 2 additions & 1 deletion
@@ -36,7 +36,7 @@
| **Desktop** | Native app via Neutralino.js with system tray and offline support |
| **Code Execution** | 7 languages in-browser: Bash ([just-bash](https://justbash.dev/)), Math (Nerdamer), LaTeX (MathJax + Nerdamer evaluation), Python ([Pyodide](https://pyodide.org/)), HTML (sandboxed iframe, `html-autorun` for widgets/quizzes), JavaScript (sandboxed iframe), SQL ([sql.js](https://sql.js.org/) SQLite) · 25+ compiled languages via [Judge0 CE](https://ce.judge0.com): C, C++, Rust, Go, Java, TypeScript, Kotlin, Scala, Ruby, Swift, Haskell, Dart, C#, and more · **▶ Run All** notebook engine — one-click sequential execution with preflight dialog (block table with model/status), pre-execution model loading (AI + TTS auto-loaded before blocks run), progress bar, abort, per-block status badges, detailed console logging, and SQLite shared context store |
| **Security** | Content Security Policy (CSP), SRI integrity hashes, XSS sanitization (DOMPurify), ReDoS protection, Firestore write-token ownership, API keys via HTTP headers, postMessage origin validation, 8-char passphrase minimum, sandboxed code execution |
-| **AI Document Tags** | `{{@AI:}}` text generation (`@think: Yes` for deep reasoning), `{{@Image:}}` image generation (Gemini Imagen), `{{@OCR:}}` image-to-text extraction (Text/Math/Table modes via Granite Docling 258M, Florence-2 230M, or GLM-OCR 1.5B, PDF page rendering via pdf.js), `{{@TTS:}}` text-to-speech playback (Kokoro TTS per card, language selector, ▶ Play / ⬇ Save WAV), `{{@STT:}}` speech-to-text dictation (engine selector: Whisper/Voxtral/Web Speech API, 11 languages, Record/Stop/Insert/Clear), `{{@Translate:}}` translation (target language selector, integrated TTS pronunciation, cloud model routing), `{{@Game:}}` game builder (AI-generated or pre-built, Canvas 2D/Three.js/P5.js, import/export HTML), `{{@Draw:}}` whiteboard (Excalidraw + Mermaid, Insert/PNG/SVG export, 📚 Library Browser with 29 bundled packs in 6 categories, 🚀 AI diagram generation with natural language prompt and model selector) — `@` prefix syntax on all tag types + metadata fields (`@name`, `@use`, `@think`, `@search`, `@prompt`, `@step`, `@upload`, `@model`, `@engine`, `@lang`, `@prebuilt`); `@model:` field persists selected model per card with intelligent defaults (OCR→`granite-docling`, TTS→`kokoro-tts`, STT→`voxtral-stt`, Image→`imagen-ultra`); editable `@prompt:` textarea and `@step:` inputs in preview cards; description/prompt separation (bare text = label, `@prompt:` = AI instruction); 📎 image/PDF upload for multimodal vision analysis; per-card model selector with document-portable model persistence, concurrent block operations |
+| **AI Document Tags** | `{{@AI:}}` text generation (`@think: Yes` for deep reasoning), `{{@Image:}}` image generation (Gemini Imagen), `{{@OCR:}}` image-to-text extraction (Text/Math/Table modes via Granite Docling 258M, Florence-2 230M, or GLM-OCR 1.5B, PDF page rendering via pdf.js), `{{@TTS:}}` text-to-speech playback (Kokoro TTS per card, language selector, ▶ Play / ⬇ Save WAV), `{{@STT:}}` speech-to-text dictation (engine selector: Whisper/Voxtral/Web Speech API, 11 languages, Record/Stop/Insert/Clear), `{{@Translate:}}` translation (target language selector, integrated TTS pronunciation, cloud model routing), `{{@Game:}}` game builder (AI-generated or pre-built, Canvas 2D/Three.js/P5.js, import/export HTML), `{{@Draw:}}` whiteboard (Excalidraw + Mermaid, AI diagram generation with per-card model selector + 🚀 Generate, robust JSON repair for local models, Insert/PNG/SVG export, 📚 Library Browser with 29 bundled packs in 6 categories) — `@` prefix syntax on all tag types + metadata fields (`@name`, `@use`, `@think`, `@search`, `@prompt`, `@step`, `@upload`, `@model`, `@engine`, `@lang`, `@prebuilt`); `@model:` field persists selected model per card with intelligent defaults (OCR→`granite-docling`, TTS→`kokoro-tts`, STT→`voxtral-stt`, Image→`imagen-ultra`); editable `@prompt:` textarea and `@step:` inputs in preview cards; description/prompt separation (bare text = label, `@prompt:` = AI instruction); 📎 image/PDF upload for multimodal vision analysis; per-card model selector with document-portable model persistence, concurrent block operations |
| **🔌 API Calls** | `{{API:}}` REST API integration — GET/POST/PUT/DELETE methods, custom headers, JSON body, response stored in `$(api_varName)` variables; inline review panel; toolbar GET/POST buttons |
| **🔗 Agent Flow** | `{{Agent:}}` multi-step pipeline — define Step 1/2/3, chain outputs, per-card model + search provider selector, live step status indicators (⏳/✅/❌), review combined output |
| **🔍 Web Search** | Toggle web search for AI — 7 providers: DuckDuckGo (free), Brave Search, Serper.dev, Tavily (AI-optimized), Google CSE, Wikipedia, Wikidata; search results injected into LLM context; source citations in responses; per-agent-card search provider selector |
@@ -461,6 +461,7 @@ TextAgent has undergone significant evolution since its inception. What started
| Date | Commits | Feature / Update |
|------|---------|-----------------:|
| **2026-03-18** | | 🚀 **AI Diagram Generation** — natural language → Excalidraw JSON via LLM; new AI prompt section in `{{Draw:}}` cards with text input, model selector dropdown, and 🚀 Generate button; `EXCALIDRAW_CHEAT_SHEET` system prompt teaches LLM the element schema (rectangle, ellipse, diamond, text, arrow, line); `repairJson()` auto-fixes common LLM JSON mistakes (trailing commas, truncated output, missing brackets); `@model:` field in Draw tags for per-card model persistence; cancel/retry support; Gemini API key forwarding to Excalidraw embed; 37 new Playwright tests (22 draw-docgen, 7 readonly-mode, 8 excalidraw-library) + 5 regression pins |
+| **2026-03-18** | | 🤖 **Draw AI Diagram Generation** — refactored `{{Draw:}}` AI generation to match Image/Git card pattern: always-visible prompt bar with per-card model selector dropdown + 🚀 Generate button; `excalidraw_diagram` task type with Excalidraw cheat sheet injected into AI workers; robust `repairJson()` pipeline handles common local-model JSON mistakes (trailing commas, stray quotes, truncated output, missing commas); last-resort individual object extraction recovers partial diagrams; removed duplicate ~300-line in-iframe AI bar from `excalidraw-embed.html`; 23 tests pass |
| **2026-03-18** | | 📷 **GLM-OCR Model** — added [GLM-OCR (1.5B)](https://huggingface.co/textagent/GLM-OCR-ONNX) as third local OCR model alongside Granite Docling and Florence-2; `ai-worker-glm-ocr.js` Web Worker using q4f16 quantization (~650 MB, WebGPU required); primary `textagent/GLM-OCR-ONNX` with `onnx-community/GLM-OCR-ONNX` fallback; `glm-ocr` entry in `ai-models.js` with `isDocModel: true`; documentation updated; 7 new Playwright model registry tests |
| **2026-03-18** | | 📚 **Excalidraw Library Browser** — 29 bundled library packs (600+ items) organized in 6 categories (Architecture & System Design, UI/UX & Wireframing, Icons & Logos, Cloud & DevOps, Data & Algorithms, AI/Science & Education) with slide-in Library Browser panel; each library card with name, description, and toggle switch for on-demand loading; real-time search/filter; injected via MutationObserver into Excalidraw's native Library sidebar as "📦 Browse & Add Library Packs" button; libraries include Software Architecture, System Design Components, AWS Icons, Google Icons (139 items), UML/ER, Wireframing, Deep Learning, Math Teacher, Charts, Graphs, and more |
| **2026-03-18** | | 🎨 **Draw DocGen Integration** — full `{{Draw:}}` tag pipeline: `transformDrawMarkdown` + `bindDrawPreviewActions` in renderer, 🎨 Draw toolbar button, `excalidraw.com` added to CSP `frame-src`, `draw-docgen.css` (309-line standalone stylesheet with card UI, tool pills, Mermaid editor, dark mode), `draw-docgen.js` lazy-loaded as Phase 3j; DOMPurify allowlist expanded with `data-draw-index`, `data-draw-tool`, `data-tool`, `data-skill` |
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+# CHANGELOG — Draw AI Diagram Refactor
+
+## Summary
+Refactored AI diagram generation for `{{Draw:}}` tags to match the established pattern of Image Generate and GitHub Tools cards. Removed the duplicate in-iframe AI bar from `excalidraw-embed.html` and added a robust JSON repair pipeline for better local model compatibility.
+
+## Changes
+
+### AI Generate UI Refactored (`js/draw-docgen.js`, `css/draw-docgen.css`)
+- **Always-visible prompt bar** in the card header with model selector dropdown + 🚀 Generate button
+- Model selector built using `buildModelOpts()` matching the `git-docgen.js` pattern
+- AI generation uses per-card model selection with pre-flight checks:
+  - Auto-loads local models from cache
+  - Prompts for API keys on cloud models
+  - Temporarily switches to selected model during generation
+- Supports both **Excalidraw** and **Mermaid** diagram AI generation
+- New `repairJson()` function handles common LLM JSON mistakes:
+  - Trailing commas, stray quotes after numbers/booleans
+  - Missing commas between properties
+  - Truncated JSON (auto-closes brackets)
+  - Individual object extraction as last resort
+- Excalidraw `@taskType: excalidraw_diagram` with cheat sheet prompt
+- Mermaid `@taskType: generate` with Mermaid code generation prompt
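The changelog names four repair stages for `repairJson()`, but the diff shown here does not include the function body. As a rough illustration only, the JavaScript sketch below shows one way such a pipeline could be ordered: parse as-is, apply textual fixes, auto-close brackets, then fall back to extracting complete objects. The names `repairJsonSketch`, `fixCommonMistakes`, `closeOpenBrackets`, and `extractObjects` are hypothetical and are not taken from `js/draw-docgen.js`; the regex fixes are heuristics and can misfire on unusual string values.

```javascript
// Hypothetical sketch of a staged JSON repair; NOT the code from js/draw-docgen.js.

function fixCommonMistakes(text) {
  return text
    // Stray quote after a number, boolean, or null:  "x": 100",  ->  "x": 100,
    .replace(/:\s*(-?\d+(?:\.\d+)?|true|false|null)"(?=\s*[,}\]])/g, ": $1")
    // Missing comma between a value and a quoted token that starts the next line.
    .replace(/("|\d|true|false|null|\}|\])[ \t]*\r?\n(\s*")/g, "$1,\n$2")
    // Trailing comma before a closing bracket or brace.
    .replace(/,\s*([}\]])/g, "$1");
}

function closeOpenBrackets(text) {
  const closers = [];
  let inString = false;
  let escaped = false;
  for (const ch of text) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  // Terminate an unfinished string, drop a dangling comma, then close what is open.
  let out = inString ? text + '"' : text;
  out = out.replace(/,\s*$/, "");
  return out + closers.reverse().join("");
}

function extractObjects(text) {
  const found = []; // entries of the form { start, value }
  const starts = [];
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") starts.push(i);
    else if (ch === "}" && starts.length > 0) {
      const start = starts.pop();
      try {
        const value = JSON.parse(text.slice(start, i + 1));
        // Keep only the outermost parsed objects: discard ones nested in this one.
        while (found.length > 0 && found[found.length - 1].start > start) found.pop();
        found.push({ start, value });
      } catch {
        // This span is not valid JSON on its own; skip it.
      }
    }
  }
  return found.map((entry) => entry.value);
}

function repairJsonSketch(raw) {
  const start = raw.search(/[{\[]/); // skip any chatty preamble before the JSON
  if (start === -1) return null;
  const text = raw.slice(start).trim();
  const stages = [
    (s) => s,
    fixCommonMistakes,
    (s) => closeOpenBrackets(fixCommonMistakes(s)),
  ];
  for (const stage of stages) {
    try {
      return JSON.parse(stage(text));
    } catch {
      // Fall through to the next, more aggressive stage.
    }
  }
  const objects = extractObjects(text); // last resort: a partial result as an array
  return objects.length > 0 ? objects : null;
}
```

Each stage in this sketch runs only when the previous one fails to parse, so output that is already valid JSON is returned untouched by the heuristics.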
+
+### Excalidraw Embed Cleanup (`public/excalidraw-embed.html`)
+- Removed duplicate AI generation bar (CSS, HTML, JS) — ~300 lines removed
+- Removed `EXCALIDRAW_CHEAT_SHEET` constant (now only in parent controller)
+- Removed `generateDiagram()` function, event listeners, and `set-api-key` handler
+- Removed `_aiApiKey` state variable
+
+### AI Worker Updates (`public/ai-worker.js`, `public/ai-worker-gemini.js`, `public/ai-worker-common.js`)
+- Added `excalidraw_diagram` task type with dedicated Excalidraw cheat sheet system prompt
+- 16384 token max for diagram generation tasks
+- Cheat sheet includes element types, colors, rules, and labeled shape pattern
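One plausible way to wire a task type to its system prompt and token budget is a lookup table, sketched below in JavaScript. Only the `excalidraw_diagram` and `generate` task names and the 16384-token limit come from the changelog; `TASK_PROFILES`, `buildRequest`, the placeholder prompt text, the message shape, and the 2048-token default are assumptions made for illustration and do not reflect the contents of the worker files.

```javascript
// Hypothetical sketch: not the code in public/ai-worker*.js.

// Placeholder only; the real cheat sheet text is not shown in this commit view.
const CHEAT_SHEET_PLACEHOLDER =
  "<Excalidraw cheat sheet: element types, colors, rules, labeled shape pattern>";

const DEFAULT_MAX_TOKENS = 2048; // assumed default, not stated in the source

// A Map is used so that inherited keys such as "constructor" can never match.
const TASK_PROFILES = new Map([
  ["excalidraw_diagram", { systemPrompt: CHEAT_SHEET_PLACEHOLDER, maxTokens: 16384 }],
]);

function buildRequest(taskType, userPrompt) {
  const profile = TASK_PROFILES.get(taskType);
  const messages = [];
  if (profile) {
    messages.push({ role: "system", content: profile.systemPrompt });
  }
  messages.push({ role: "user", content: userPrompt });
  return {
    taskType,
    maxTokens: profile ? profile.maxTokens : DEFAULT_MAX_TOKENS,
    messages,
  };
}

const request = buildRequest("excalidraw_diagram", "Draw a login flow with three boxes");
console.log(request.maxTokens, request.messages.length);
```

In this simplified sketch, other task types such as `generate` fall through to the assumed default and carry no system prompt; the actual workers may handle them differently.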
+
+### Test Updates (`tests/feature/draw-docgen.spec.js`)
+- Updated 4 AI tests from iframe-based (`#ai-bar`) to parent card (`.draw-ai-prompt-section`)
+- Tests now verify: prompt section visible, prompt input placeholder, Generate button, model selector
+- All 23 tests pass
+
+### Regression Tests (`tests/regression/regression-recent.spec.js`)
+- Added regression tests for recent bug fixes
+
+## Files Modified
+- `js/draw-docgen.js` — AI generate refactor + JSON repair pipeline
+- `css/draw-docgen.css` — Model selector styles
+- `public/excalidraw-embed.html` — Removed duplicate AI bar
+- `public/ai-worker.js` — Excalidraw diagram task type
+- `public/ai-worker-gemini.js` — Excalidraw diagram task type
+- `public/ai-worker-common.js` — Excalidraw diagram task type
+- `tests/feature/draw-docgen.spec.js` — Updated AI tests
+- `tests/regression/regression-recent.spec.js` — New regression tests
+- `styles.css` — Minor style updates
