| **Desktop** | Native app via Neutralino.js with system tray and offline support |
| **Code Execution** | 7 languages in-browser: Bash ([just-bash](https://justbash.dev/)), Math (Nerdamer), LaTeX (MathJax + Nerdamer evaluation), Python ([Pyodide](https://pyodide.org/)), HTML (sandboxed iframe, `html-autorun` for widgets/quizzes), JavaScript (sandboxed iframe), SQL ([sql.js](https://sql.js.org/) SQLite) · 25+ compiled languages via [Judge0 CE](https://ce.judge0.com): C, C++, Rust, Go, Java, TypeScript, Kotlin, Scala, Ruby, Swift, Haskell, Dart, C#, and more · **▶ Run All** notebook engine — one-click sequential execution with preflight dialog (block table with model/status), pre-execution model loading (AI + TTS auto-loaded before blocks run), progress bar, abort, per-block status badges, detailed console logging, and SQLite shared context store |
| **Security** | Content Security Policy (CSP), SRI integrity hashes, XSS sanitization (DOMPurify), ReDoS protection, Firestore write-token ownership, API keys via HTTP headers, postMessage origin validation, 8-char password minimum, sandboxed code execution, Cloudflare Turnstile CAPTCHA on email endpoint, automated security scanner (`scripts/security-check.sh`) with pre-commit integration |
| **AI Document Tags** | `{{@AI:}}` text generation (`@think: Yes` for deep reasoning), `{{@Image:}}` image generation (Gemini Imagen), `{{@OCR:}}` image-to-text extraction (Text/Math/Table modes via Granite Docling 258M, Florence-2 230M, or GLM-OCR 1.5B, 📷 live camera capture + 📎 image/PDF upload, PDF page rendering via pdf.js), `{{@TTS:}}` text-to-speech playback (Kokoro TTS per card, language selector, ▶ Play / ⬇ Save WAV), `{{@STT:}}` speech-to-text dictation (engine selector: Whisper/Voxtral/Web Speech API, 11 languages, Record/Stop/Insert/Clear), `{{@Translate:}}` translation (target language selector, integrated TTS pronunciation, cloud model routing), `{{@Game:}}` game builder (AI-generated or pre-built, Canvas 2D/Three.js/P5.js, import/export HTML), `{{@Draw:}}` whiteboard (Excalidraw + Mermaid, AI diagram generation with per-card model selector + 🚀 Generate, robust JSON repair for local models, Insert/PNG/SVG export, 📚 Library Browser with 29 bundled packs in 6 categories), `{{@Tools:}}` web tools (Jina Reader scrape + Jina Search, multi-URL scraping, API key modal, Accept/Reject/Copy results), `{{@Research:}}` autonomous experiment loop (Pyodide-powered, AI-driven code optimization with live results table, keep/discard metric tracking, configurable @metric/@direction/@max_iterations) — `@` prefix syntax on all tag types + metadata fields (`@name`, `@use`, `@think`, `@search`, `@prompt`, `@step`, `@upload`, `@model`, `@engine`, `@lang`, `@prebuilt`); `@model:` field persists selected model per card with intelligent defaults (OCR→`granite-docling`, TTS→`kokoro-tts`, STT→`voxtral-stt`, Image→`imagen-ultra`); editable `@prompt:` textarea and `@step:` inputs in preview cards; description/prompt separation (bare text = label, `@prompt:` = AI instruction); 📎 image/PDF upload for multimodal vision analysis; per-card model selector with document-portable model persistence, concurrent block operations |
| **🔌 API Calls** | `{{API:}}` REST API integration — GET/POST/PUT/DELETE methods, custom headers, JSON body, response stored in `$(api_varName)` variables; inline review panel; toolbar GET/POST buttons |
| **🔗 Agent Flow** | `{{Agent:}}` multi-step pipeline — define Step 1/2/3, chain outputs, per-card model + search provider selector, live step status indicators (⏳/✅/❌), review combined output; `@cloud: yes/no` with ☁️ Cloud / 🖥️ Local badge; `@agenttype:` dropdown selector (openclaw/openfang) for external agents; Docker-based local execution via `agent-runner/server.js`; Agent Execution Settings UI (Codespaces/Local Docker/Custom endpoint); GitHub Codespaces cloud execution via ☁️ toggle; **📦 Agent Containers panel** — floating toolbar panel showing running Docker containers with live status, uptime, instant stop (`docker rm -f`), badge count, daemon readiness check, startup container recovery, and floating toggle accessible when header is fully hidden |
| **🔍 Web Search** | Toggle web search for AI — 7 providers: DuckDuckGo (free), Brave Search, Serper.dev, Tavily (AI-optimized), Google CSE, Wikipedia, Wikidata; search results injected into LLM context; source citations in responses; per-agent-card search provider selector |
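As a rough illustration, a `{{@Research:}}` card might be authored like the sketch below. Only the fields named above (`@goal`, `@metric`, `@direction`, `@max_iterations`, `@code: |`, `@test: |`) are taken from the feature description; the goal text and code bodies are hypothetical placeholders, and the exact grammar may differ:

```text
{{@Research: Speed up the pairwise distance computation
@goal: Reduce runtime without changing the result
@metric: runtime_ms
@direction: lower
@max_iterations: 5
@code: |
  def pairwise(xs):
      return [abs(a - b) for a in xs for b in xs]
@test: |
  import time
  xs = list(range(200))
  t0 = time.time()
  pairwise(xs)
  print(f"METRIC:{(time.time() - t0) * 1000:.2f}")
}}
```

Here `@code` is the mutable section the AI rewrites each iteration, while `@test` is the fixed harness that prints the `METRIC:` line the loop extracts from stdout.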
|
| Date | Commits | Feature / Update |
|------|---------|-----------------:|
| **2026-04-01** | | 🔬 **Research Loop Tag** — new `{{@Research:}}` tag for autonomous AI-driven experiment optimization (inspired by Karpathy's autoresearch); Propose→Execute→Evaluate→Keep/Discard loop runs entirely in-browser via Pyodide (Python in WASM); configurable `@metric`, `@direction` (lower/higher), `@max_iterations`, `@model`, `@goal`; multiline `@code: \|` (mutable) and `@test: \|` (fixed harness) fields; metric extraction from stdout (`METRIC:xxx`); AI prompt includes experiment history with strategy shift hints after 3 consecutive failures; live results table with status badges (baseline/keep/discard/crash), delta indicators, progress bar; per-card model selector; Start/Stop controls; glass card UI with purple accents and pulse animation; `js/research-loop.js` (~795 lines) + `css/research-loop.css` (~300 lines) |
| **2026-04-01** | | 🎨 **AI Panel UI Redesign** — centered initial chat state (Claude-like welcome with input + model selector clustered in middle, transitions to bottom-pinned on first message via CSS `:has(.ai-welcome-message)`); merged separate Attach File (paperclip) and Screenshot (camera) buttons into single `+` button with unified dropdown menu (Attach File / Capture Page / Capture Screen / Upload Image, `+` rotates to `×` when open); merged header bar and status bar into single compact header (status text inline below title, download progress bar still standalone); fixed download progress bar stuck at 96% after model load |
| **2026-03-31** | | 📷 **Screenshot to AI** — new 📷 camera button in the AI chat input bar with three capture modes: Capture Page (`html2canvas` full-page snapshot, AI panel hidden during capture), Capture Screen (`getDisplayMedia` screen-share with frame extraction from a hidden DOM-attached video element), and Upload Image (file picker); captured image auto-injected into `pendingAttachments` and sent to AI for analysis; fixed black-screen capture bug (video must be in DOM for GPU decoder, wait for `timeupdate` event not just `requestAnimationFrame`); self-healing button injection via `injectButtonIfMissing()` with 2s fallback poll; CSS-independent dropdown via inline `style.display` toggling; vision model warning toast; `js/ai-screenshot.js` new module (~300 lines) + `css/ai-panel.css` styles + `js/modal-templates.js` template update + `src/main.js` registration |
| **2026-03-31** | | 🦀 **OpenClaw Integration Blog Post** — published detailed technical post (`CHANGELOG-openclaw-textagent-integration.md`) documenting how OpenClaw runs natively inside TextAgent's Docker-based Agent Flow; covers `AGENT_CLI_MAP`, native CLI invocation (`openclaw agent --message ... --json`), API key forwarding, structured JSON response parsing, multi-step context chaining, cloud mode via GitHub Codespaces, and security boundaries |
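The core keep/discard step of the research loop described above can be sketched in a few lines of JavaScript, assuming the documented conventions: the harness prints `METRIC:<number>` to stdout, and `@direction` selects whether lower or higher is better. Function names here are illustrative, not the actual `js/research-loop.js` internals:

```javascript
// Extract the metric value from captured stdout. Returns null when the
// run crashed or the harness never printed a METRIC line, so the caller
// can mark the iteration as a crash/discard.
function extractMetric(stdout) {
  const m = /METRIC:\s*(-?\d+(?:\.\d+)?)/.exec(stdout);
  return m ? parseFloat(m[1]) : null;
}

// Decide whether a candidate run beats the best metric so far.
// direction is "lower" or "higher", mirroring the @direction field.
function shouldKeep(candidate, best, direction) {
  if (candidate === null) return false; // crashed run → discard
  if (best === null) return true;       // first successful run → baseline
  return direction === "lower" ? candidate < best : candidate > best;
}
```

A kept iteration's metric becomes the new baseline; a discarded one only contributes to the experiment history that is fed back into the AI prompt.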