hoquanghai · hongphuc5497 · May 19, 2026 · May 20, 2026 · May 20, 2026 · May 20, 2026
diff --git a/.agents/skills/create-news-video/SKILL.md b/.agents/skills/create-news-video/SKILL.md
diff --git a/.claude/skills/create-news-video/SKILL.md b/.claude/skills/create-news-video/SKILL.md
@@ -48,7 +48,7 @@ Single argument: a news article URL (starts with `http://` or `https://`) OR a p
 
 ### Step 4: Generate script.json
 
-Following the schema in `docs/superpowers/specs/2026-04-29-auto-news-video-design.md` Section 4. Key rules:
+Following the schema in `src/render/script-schema.ts` (Zod discriminated union, 6 templates). Key rules:
 
 **Script content (Vietnamese):**
 - Total voiceText: ~150–200 words → ~55–65s spoken at speed 1.0
@@ -129,46 +129,49 @@ RIGHT (natural):
 **Hook (most important — gets first 3 seconds of viewer attention):**
 - Must contain a claim, statistic, or curious question
 - NEVER generic ("Hôm nay chúng ta sẽ nói về..." is wrong)
-- ALWAYS include at least 1 effect: `flash-white-3f` or `particle-burst`
+- When source has og:image, set `bgSrc: "$source.image"` and pick a `kenBurns` effect
+- When no image, omit `bgSrc` — pipeline uses gradient fallback
 
-**Visual rules:**
-- For image scenes: `background.src = "$source.image"` (literal — CLI substitutes)
-- Vary `kenBurns` across scenes (don't use `zoom-in` for every scene)
-- Vary text `animation` (don't use `slide-up` for every line)
-- Each line ≤ 25 characters
-- Each scene 1-3 lines
+**TemplateData rules (6 available templates):**
+
+| Template | When to pick | Fields (`?` = optional) |
+|---|---|---|
+| `hook` | First scene (3-5s) | `headline` (max 40), `subhead?` (max 40), `bgSrc?`, `kenBurns?` |
+| `comparison` | "X vs Y" / "exceeds" / "compared to" | `left: {label, value, color}`, `right: {label, value, color, winner?}` |
+| `stat-hero` | Key number / % stat | `value` (max 20), `label` (max 40), `context?` (max 50) |
+| `feature-list` | Listing features (1-4 bullets) | `title` (max 40), `bullets[]` (max 50 each), `icon?` |
+| `callout` | Statement / warning / quote | `statement` (max 80), `tag?` (max 20) |
+| `outro` | Last scene (3-5s) | `ctaTop` (max 30), `channelName` (max 30), `source` (max 40) |
+
+- Pick templates based on content signal, not arbitrarily — each story beat dictates its template
+- Vary `kenBurns` across hook scenes (values: `zoom-in`, `zoom-out`, `pan-left`, `pan-right`; default `zoom-in`)
 
 **Outro (always fixed format):**
 ```json
 {
   "id": "outro",
   "type": "outro",
   "voiceText": "Theo dõi Công nghệ 24h để xem bản tin mới mỗi ngày.",
-  "visual": {
-    "background": { "type": "gradient", "preset": "outro-purple" },
-    "text": {
-      "position": "center",
-      "style": "outro-card",
-      "lines": [
-        { "content": "Xem bản tin mới mỗi ngày", "emphasis": "primary", "animation": "fade-in" },
-        { "content": "Công nghệ 24h",            "emphasis": "channel", "animation": "scale-pop" },
-        { "content": "Nguồn: <DOMAIN>",          "emphasis": "muted",   "animation": "fade-in-late" }
-      ]
-    }
+  "templateData": {
+    "template": "outro",
+    "ctaTop": "Xem bản tin mới mỗi ngày",
+    "channelName": "Công nghệ 24h",
+    "source": "<DOMAIN>"
   }
 }
 ```
-Replace `<DOMAIN>` with the actual domain string. Note: outro line 1 is shortened to fit 25-char schema rule (full CTA "Theo dõi để xem bản tin mới mỗi ngày" is 36 chars).
+Replace `<DOMAIN>` with the actual domain string (e.g. `"vnexpress.net"`). `ctaTop` max 30 chars — shorten the full CTA if needed.
 
 ### Step 5: Self-validate before writing
 
 Check:
-- Total word count ~150-200
-- Every line.content ≤ 25 chars
-- 5-8 scenes total
-- scenes[0].type === "hook"
-- last scene type === "outro"
-- All enum values valid (see spec Section 4.2)
+- Total voiceText words: ~150-200
+- 5-8 scenes total (1 hook + 3-6 body + 1 outro)
+- scenes[0].type === "hook", scenes[last].type === "outro"
+- Every templateData has required fields for its template (see table above)
+- voiceText: numbers spelled phonetically, no emoji, no URLs, no markdown
+- `voice.provider`: "lucylab" or "elevenlabs"
+- Hook with og:image → set `bgSrc: "$source.image"`; no image → omit `bgSrc`
 
 If invalid, fix yourself silently. Up to 2 self-correction passes. After that, write anyway — the CLI's Zod validation will produce a precise error message that the user can act on.
 
@@ -205,7 +208,7 @@ Tổng thời lượng: XX.Xs
 
 User: `/create-news-video https://vnexpress.net/iphone-17-200mp`
 
-Generated `script.json` (excerpt):
+Generated `script.json`:
 ```json
 {
   "version": "1.0",
@@ -223,20 +226,54 @@ Generated `script.json` (excerpt):
     {
       "id": "hook", "type": "hook",
       "voiceText": "Apple vừa ra mắt iPhone 17 với camera hai trăm megapixel.",
-      "visual": {
-        "background": { "type": "image", "src": "$source.image", "kenBurns": "zoom-in" },
-        "overlay":    { "darkness": 0.4 },
-        "text": {
-          "position": "center", "style": "hook-large",
-          "lines": [
-            { "content": "iPhone 17",     "emphasis": "primary", "animation": "scale-pop" },
-            { "content": "Camera 200MP!", "emphasis": "accent",  "animation": "slide-up-bounce" }
-          ]
-        },
-        "effects": ["flash-white-3f", "particle-burst"]
+      "templateData": {
+        "template": "hook",
+        "headline": "iPhone 17",
+        "subhead": "Camera 200MP!",
+        "bgSrc": "$source.image",
+        "kenBurns": "zoom-in"
+      },
+      "sfx": { "name": "cinematic/impact", "volume": 0.5 }
+    },
+    {
+      "id": "body-1", "type": "body",
+      "voiceText": "Cảm biến hoàn toàn mới cho zoom quang học gấp mười lần, vượt mọi đối thủ Android.",
+      "templateData": {
+        "template": "stat-hero",
+        "value": "200MP",
+        "label": "Cảm biến mới",
+        "context": "Zoom quang học 10x"
+      }
+    },
+    {
+      "id": "body-2", "type": "body",
+      "voiceText": "Pin năm nghìn miliampe giờ, tăng ba mươi phần trăm so với đời cũ. Sạc nhanh sáu mươi lăm watt.",
+      "templateData": {
+        "template": "feature-list",
+        "title": "Nâng cấp lớn",
+        "bullets": ["Pin 5000mAh", "Tăng 30%", "Sạc nhanh 65W"],
+        "icon": "spark"
+      }
+    },
+    {
+      "id": "body-3", "type": "body",
+      "voiceText": "Giá khởi điểm hai mươi mốt triệu đồng, dự kiến mở bán tại Việt Nam vào tháng sau.",
+      "templateData": {
+        "template": "callout",
+        "statement": "Giá từ 21 triệu đồng, mở bán tháng 5.",
+        "tag": "Giá bán"
+      }
+    },
+    {
+      "id": "outro", "type": "outro",
+      "voiceText": "Theo dõi Công nghệ 24h để xem bản tin mới mỗi ngày.",
+      "templateData": {
+        "template": "outro",
+        "ctaTop": "Xem bản tin mới mỗi ngày",
+        "channelName": "Công nghệ 24h",
+        "source": "vnexpress.net"
       }
     }
-    /* ... 3 body scenes + outro ... */
   ]
 }
 ```
@@ -245,35 +282,59 @@ Generated `script.json` (excerpt):
 
 User: `/create-news-video news/agi-update.txt`
 
-Generated `script.json` (excerpt):
+Generated `script.json` (abridged to 4 scenes; a real script needs 5-8):
 ```json
 {
+  "version": "1.0",
   "metadata": {
     "title": "OpenAI công bố mô hình mới với khả năng lập luận",
     "source": { "url": "local", "domain": "local", "image": null },
     "channel": "Công nghệ 24h"
   },
+  "voice": { "provider": "lucylab", "voiceId": "${VIETNAMESE_VOICEID}", "speed": 1.0 },
   "scenes": [
     {
       "id": "hook", "type": "hook",
       "voiceText": "OpenAI vừa công bố mô hình mới có khả năng lập luận như con người.",
-      "visual": {
-        "background": { "type": "gradient", "preset": "news-dark" },
-        "text": {
-          "position": "center", "style": "hook-large",
-          "lines": [
-            { "content": "Mô hình mới", "emphasis": "primary", "animation": "scale-pop" },
-            { "content": "Lập luận!",  "emphasis": "accent",  "animation": "slide-up-bounce" }
-          ]
-        },
-        "effects": ["flash-white-3f"]
+      "templateData": {
+        "template": "hook",
+        "headline": "Mô hình mới",
+        "subhead": "Lập luận như người"
+      }
+    },
+    {
+      "id": "body-1", "type": "body",
+      "voiceText": "Mô hình đạt chín mươi hai phẩy bảy phần trăm trên benchmark, vượt xa phiên bản cũ.",
+      "templateData": {
+        "template": "stat-hero",
+        "value": "92.7%",
+        "label": "Benchmark",
+        "context": "Vượt phiên bản cũ 75.1%"
+      }
+    },
+    {
+      "id": "body-2", "type": "body",
+      "voiceText": "Hệ thống có thể tự suy luận đa bước, kiểm tra logic và sửa sai trước khi trả lời.",
+      "templateData": {
+        "template": "feature-list",
+        "title": "Khả năng mới",
+        "bullets": ["Suy luận đa bước", "Tự kiểm tra logic", "Tự sửa lỗi"]
+      }
+    },
+    {
+      "id": "outro", "type": "outro",
+      "voiceText": "Theo dõi Công nghệ 24h để xem bản tin mới mỗi ngày.",
+      "templateData": {
+        "template": "outro",
+        "ctaTop": "Xem bản tin mới mỗi ngày",
+        "channelName": "Công nghệ 24h",
+        "source": "local"
       }
     }
-    /* ... outro line 3 = "Nguồn: local" ... */
   ]
 }
 ```
-Note: when source has no image, every scene uses `background.type = "gradient"` (no image fallback at composer level needed).
+Note: when source has no image, omit `bgSrc` from the hook — the pipeline uses a gradient fallback automatically.
 
 ## Sound Effects (SFX)
 

diff --git a/.dockerignore b/.dockerignore
@@ -0,0 +1,8 @@
+node_modules
+dist
+output
+.git
+.env
+.env.local
+.DS_Store
+npm-debug.log*
diff --git a/.env.example b/.env.example
@@ -14,7 +14,7 @@ TTS_PROVIDER=lucylab
 # OPTION 1: LucyLab.io  (https://lucylab.io)
 # ════════════════════════════════════════════════════════════════════════════
 # Required when TTS_PROVIDER=lucylab
-VIETNAMESE_API_KEY=sk_live_xxxxxxxxxxxxxxxxxxxx
+VIETNAMESE_API_KEY=your_lucylab_api_key_here
 VIETNAMESE_VOICEID=22charvoiceiduuidhere
 
 # Optional overrides
@@ -41,6 +41,7 @@ ELEVENLABS_ENDPOINT=https://api.elevenlabs.io/v1
 # ════════════════════════════════════════════════════════════════════════════
 # Customize the TikTok-style profile card that appears at the end of every video.
 # All fields optional — defaults work out of the box.
+TIKTOK_ENABLED=true
 TIKTOK_DISPLAY_NAME=Quẹp Làm IT
 TIKTOK_HANDLE=@haiquep
 TIKTOK_FOLLOWERS=11.5k followers
@@ -53,3 +54,13 @@ TIKTOK_FOLLOWERS=11.5k followers
 # ── Pipeline tuning ─────────────────────────────────────────────────────────
 # TTS_CONCURRENCY: 1 for LucyLab (API limit), can increase for ElevenLabs
 TTS_CONCURRENCY=1
+
+# ── LLM Provider ────────────────────────────────────────────────────────────
+# Choose ONE: "anthropic", "openai", or "deepseek"
+# - anthropic : Claude (haiku for cost, sonnet for quality)
+# - openai    : GPT-4o / GPT-4.1
+# - deepseek  : OpenAI-compatible, set LLM_ENDPOINT
+LLM_PROVIDER=anthropic
+LLM_API_KEY=sk-ant-...
+LLM_MODEL=claude-haiku-4-5-20251001
+# LLM_ENDPOINT=https://api.deepseek.com/v1  # only needed for deepseek
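Review note on the new `LLM_*` variables: a minimal sketch of how they might be resolved into one config object. The `resolveLlmConfig` function and the `LlmConfig` shape are hypothetical, and treating `LLM_API_KEY` and `LLM_MODEL` as mandatory is an assumption; only the variable names, the three provider values, and the "endpoint only needed for deepseek" rule come from `.env.example`.

```typescript
// Hypothetical resolver for the LLM_* variables documented in .env.example.
// The function name, config shape, and error messages are illustrative only.
type LlmProvider = "anthropic" | "openai" | "deepseek";

interface LlmConfig {
  provider: LlmProvider;
  apiKey: string;
  model: string;
  endpoint?: string;
}

const PROVIDERS: readonly LlmProvider[] = ["anthropic", "openai", "deepseek"];

function resolveLlmConfig(env: Record<string, string | undefined>): LlmConfig {
  const raw = env["LLM_PROVIDER"] ?? "";
  // Narrow the raw string to one of the three documented provider names.
  const provider = PROVIDERS.find((p) => p === raw);
  if (!provider) {
    throw new Error(`LLM_PROVIDER must be one of ${PROVIDERS.join(", ")}; got "${raw}"`);
  }
  const apiKey = env["LLM_API_KEY"];
  if (!apiKey) throw new Error("LLM_API_KEY is required");
  const model = env["LLM_MODEL"];
  if (!model) throw new Error("LLM_MODEL is required");
  const endpoint = env["LLM_ENDPOINT"];
  // .env.example says the endpoint is only needed for deepseek.
  if (provider === "deepseek" && !endpoint) {
    throw new Error("LLM_ENDPOINT is required when LLM_PROVIDER=deepseek");
  }
  return { provider, apiKey, model, ...(endpoint ? { endpoint } : {}) };
}
```

Passing the environment as a plain record (for example `resolveLlmConfig(process.env)` in Node) keeps the function easy to test without touching real environment variables.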
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,17 @@
+# Changelog
+
+## [2.0.1] - 2026-05-20
+
+### Added
+- Web UI dashboard accessible at `http://localhost:4317` — paste a news URL and generate videos from the browser
+- Real-time job progress via Server-Sent Events (SSE) — see pipeline stages live as they run
+- LLM provider abstraction supporting Anthropic, OpenAI, and DeepSeek (OpenAI-compatible) backends via `LLM_PROVIDER` env var
+- Article content web fetcher with HTML extraction and og:image detection
+- TikTok UI settings panel — configure avatar, handle, and follower count; settings persist across restarts
+- Output listing API with artifact badges (script, video, voice, text) and download links
+
+### Changed
+- Pipeline now respects `TIKTOK_ENABLED` toggle — disables TikTok card rendering and avatar fetching when off
+- Config supports `LLM_PROVIDER`, `LLM_API_KEY`, `LLM_MODEL`, and `LLM_ENDPOINT` environment variables
+- Script schema upgraded to discriminated union templates (6 types: hook, comparison, stat-hero, feature-list, callout, outro)
+- HTML composer conditionally renders TikTok handle and outro card based on settings
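Review note on the SSE changelog entry: for readers unfamiliar with the wire format, a Server-Sent Events frame is plain text with `event:` and `data:` fields, terminated by a blank line. The sketch below shows one way to serialize a progress update; the `formatSseEvent` helper, the `stage` event name, and the payload fields are hypothetical and not taken from `src/server.ts`.

```typescript
// Serializes one Server-Sent Events frame. JSON.stringify emits a single line
// for object payloads, so one `data:` field is sufficient here.
// The helper name and the payload shape are illustrative assumptions.
function formatSseEvent(event: string, data: Record<string, unknown>): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Example frame for a hypothetical pipeline stage update.
const frame = formatSseEvent("stage", { name: "tts", status: "running" });
```

On the server side, a frame like this would be written to a response whose `Content-Type` is `text/event-stream`; in the browser, `EventSource` dispatches it to listeners registered for the `stage` event.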
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,19 @@
+# AutoCreateVideo
+
+## Skill routing
+
+When the user's request matches an available skill, invoke it via the Skill tool. When in doubt, invoke the skill.
+
+Key routing rules:
+- Product ideas/brainstorming → invoke /office-hours
+- Strategy/scope → invoke /plan-ceo-review
+- Architecture → invoke /plan-eng-review
+- Design system/plan review → invoke /design-consultation or /plan-design-review
+- Full review pipeline → invoke /autoplan
+- Bugs/errors → invoke /investigate
+- QA/testing site behavior → invoke /qa or /qa-only
+- Code review/diff check → invoke /review
+- Visual polish → invoke /design-review
+- Ship/deploy/PR → invoke /ship or /land-and-deploy
+- Save progress → invoke /context-save
+- Resume context → invoke /context-restore
diff --git a/CONTEXT.md b/CONTEXT.md
@@ -0,0 +1,32 @@
+# CONTEXT — Auto News Video
+
+## Glossary
+
+**script.json** — The contract between Claude Code (skill) and the Node CLI (pipeline). Claude writes it, the CLI validates it with Zod, then renders a video from it. Contains metadata, voice config, and an array of scenes.
+
+**templateData** — The content payload Claude provides per scene, discriminated by `template` field. Claude picks the template type (creative decision) and fills in the fields (content). The CLI reads `templateData` to compose HTML. NOT the same as `visual` (a stale term from the April 2026 design spec that never shipped in this form).
+
+**Scene** — One segment of the video. Has `id`, `type` (hook|body|outro), `voiceText` (Vietnamese, TTS-safe), `templateData` (the chosen template + content), and optional `sfx` override.
+
+**Skill** — A Claude Code slash command (`.claude/skills/create-news-video/SKILL.md`) that orchestrates: fetch content → analyze → write script.json → run pipeline. The "creative" half of the architecture.
+
+**Pipeline** — The deterministic Node/TS half (`src/pipeline.ts`): validate script.json → TTS per scene → concat voice with SFX → compose HTML → render with HyperFrames → output video.mp4. Same input always produces identical frames.
+
+**voiceText** — Per-scene Vietnamese text for TTS. Dual role: (1) fed verbatim to LucyLab/ElevenLabs for speech synthesis — numbers MUST be spelled out phonetically ("năm phần trăm" not "5%"), and (2) scanned by the 3-tier SFX picker for semantic keywords to auto-select sound effects. This coupling is intentional — news writing naturally uses emotional language that maps to SFX categories.
+
+**SFX picker** — 3-tier per-scene sound effect selection: (1) explicit `scene.sfx` override, (2) semantic keyword match on `voiceText` (Vietnamese + English), (3) template default category. Within a category, files are picked deterministically by hashing `scene.id`. Anti-repetition window (last 2 scenes) prevents back-to-back duplicates.
+
+**Channel** — The brand identity: "Công nghệ 24h". Appears on the outro card and can be customized via `metadata.channel`.
+
+**Doc maintenance** — SKILL.md is the authoritative document (it's what Claude reads). The design spec (`docs/superpowers/specs/`) is a pre-implementation artifact and may drift. Code is the implementation but the skill file defines the contract Claude follows. When they diverge, SKILL.md wins — update it first, then align code to match.
+
+**Template selection** — Claude picks templates per scene based on content signals (the "When it's picked" column in README), not randomly. Hook always first, outro always last. Body templates match the story beat: a stat → `stat-hero`, a comparison → `comparison`, a list → `feature-list`, a warning → `callout`. Following content signals naturally produces variety — no mechanical "don't repeat" rule needed.
+
+**Template count** — 6 templates are implemented (hook, comparison, stat-hero, feature-list, callout, outro). README lists 6 more (quote-card, icon-grid, timeline, big-text, chart-bars, kinetic-quote) that are planned but not yet implemented. SKILL.md should only reference the 6 that actually render.
+
+**Dashboard** — The web UI + HTTP server at `localhost:4317` (`src/server.ts` + `src/ui/`). Browses outputs, triggers video generation from an article URL, streams job progress via SSE. The third architectural component alongside Skill and Pipeline.
+
+**Job** — An async process kicked off by the dashboard. Has an id, status (`running` | `success` | `failed`), logs, and an SSE event stream. Only one job runs at a time (V1). A pipeline job renders a video from an existing script.json; a generate job first produces script.json via the LLM, then chains into the pipeline.
+
+**Generate** — The LLM-powered step that turns an article URL into `script.json`. The creative half (formerly exclusive to the Skill slash command), now callable from the dashboard via `POST /api/generate`. The server reads SKILL.md as the system prompt and provides a `web_fetch` tool so the LLM can fetch the article. Supports Anthropic, OpenAI, and DeepSeek providers via `LLM_PROVIDER` env var.
+
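Review note on the **SFX picker** glossary entry: the three tiers, the hash-based choice, and the two-scene anti-repetition window can be sketched as follows. This is an illustration under stated assumptions, not the project's picker: the catalog entries other than `cinematic/impact`, the keyword lists, the template defaults, the FNV-1a-style hash, and the choice to let explicit overrides bypass the repetition window are all hypothetical.

```typescript
// Illustrative 3-tier SFX picker. Catalog, keywords, and defaults are made up;
// only the tier order, hashing by scene.id, and the 2-scene window follow CONTEXT.md.
type Category = "cinematic" | "ui";

interface SfxScene {
  id: string;
  voiceText: string;
  template: string;
  sfx?: { name: string; volume?: number };
}

const CATALOG: Record<Category, string[]> = {
  cinematic: ["cinematic/impact", "cinematic/rise", "cinematic/boom"],
  ui: ["ui/pop", "ui/click", "ui/swoosh"],
};
const KEYWORDS: Array<[Category, string[]]> = [
  ["cinematic", ["ra mắt", "launch"]],
  ["ui", ["danh sách", "list"]],
];
const TEMPLATE_DEFAULT: Record<string, Category> = {
  hook: "cinematic",
  "stat-hero": "cinematic",
  "feature-list": "ui",
};
const FALLBACK: Category = "ui";

// FNV-1a-style 32-bit hash over UTF-16 code units (an assumed hash choice).
function hashId(id: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < id.length; i++) {
    h ^= id.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

function pickSfx(scene: SfxScene, recent: string[]): string {
  // Tier 1: an explicit override always wins.
  if (scene.sfx) return scene.sfx.name;
  // Tier 2: first category whose keyword appears in voiceText.
  const text = scene.voiceText.toLowerCase();
  const hit = KEYWORDS.find(([, words]) => words.some((w) => text.includes(w)));
  // Tier 3: fall back to the template's default category.
  const category: Category = hit ? hit[0] : (TEMPLATE_DEFAULT[scene.template] ?? FALLBACK);
  const files = CATALOG[category];
  const lastTwo = recent.slice(-2);
  // Start at a position derived from scene.id, then skip recently used files.
  const start = hashId(scene.id) % files.length;
  for (let offset = 0; offset < files.length; offset++) {
    const candidate = files[(start + offset) % files.length];
    if (!lastTwo.includes(candidate)) return candidate;
  }
  return files[start];
}
```

Because the starting index depends only on `scene.id`, the same scene with the same history picks the same file, which is consistent with the pipeline's determinism claim.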
diff --git a/Dockerfile b/Dockerfile
@@ -0,0 +1,36 @@
+# ── Build stage ──────────────────────────────────────────────────────────
+FROM node:26-bookworm-slim AS build
+WORKDIR /app
+COPY package*.json ./
+RUN npm ci
+COPY . .
+RUN npm run build
+RUN npm prune --omit=dev
+
+# ── Runtime stage ─────────────────────────────────────────────────────────
+FROM node:26-bookworm-slim
+
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends ca-certificates ffmpeg \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /app
+
+COPY --from=build /app/package*.json ./
+COPY --from=build /app/node_modules ./node_modules
+COPY --from=build /app/dist ./dist
+COPY --from=build /app/assets ./assets
+COPY --from=build /app/src ./src
+
+RUN groupadd -r appuser && useradd -r -g appuser -d /app appuser \
+    && chown -R appuser:appuser /app
+USER appuser
+
+ENV HOST=0.0.0.0 \
+    PORT=4317 \
+    PUBLIC_BASE_PATH=/news-video-creating \
+    PUBLIC_DEMO_MODE=1
+
+EXPOSE 4317
+
+CMD ["node", "dist/server.js"]
diff --git a/README.md b/README.md
@@ -261,7 +261,7 @@ Open `.env.local` and pick **one of two providers**:
 
 ```env
 TTS_PROVIDER=lucylab
-VIETNAMESE_API_KEY=sk_live_xxxxxxxxxxxxxxxxxxxx
+VIETNAMESE_API_KEY=your_lucylab_api_key_here
 VIETNAMESE_VOICEID=22charvoiceiduuidhere
 ```