diff --git a/config.example.json b/config.example.json index ef2ca498..78a62954 100644 --- a/config.example.json +++ b/config.example.json @@ -1,6 +1,9 @@ { "provider_runtime": { "current_provider": "deepseek", + "default_provider": "deepseek", + "default_model": "deepseek-v4-flash", + "allow_fallback": false, "providers": { "deepseek": { "type": "openai-compatible", @@ -22,6 +25,12 @@ "gpt-5.4" ] } + }, + "health": { + "fail_threshold": 3, + "recover_probe_sec": 30, + "recover_success_threshold": 2, + "window_size": 60 } }, "approval_policy": "on-request", @@ -32,16 +41,5 @@ "warning_ratio": 0.85, "critical_ratio": 0.95, "max_reactive_retry": 1 - }, - "token_usage": { - "storage_type": "file", - "storage_path": "./.bytemind/token_usage.json", - "backup_interval": "1m", - "max_sessions": 10000, - "alert_threshold": 1000000, - "enable_realtime": true, - "retention_days": 30, - "monitor_interval": "30s", - "database_driver": "sqlite3" } } diff --git a/docs/user-stories.md b/docs/user-stories.md new file mode 100644 index 00000000..24239754 --- /dev/null +++ b/docs/user-stories.md @@ -0,0 +1,453 @@ +# ByteMind 用户故事 + +四个场景覆盖 ByteMind 全部功能点。每个故事末尾标注了该故事覆盖的功能模块。 + +--- + +## 故事一:设计阶段 — "为新模块做技术方案" + +> **角色**:后端工程师小张,刚接手一个 Go 微服务项目,需要为"消息推送模块"输出一份技术方案。 + +### 1. 
安装与上手 + +小张在 Windows 上用 PowerShell 一键安装 ByteMind: + +```powershell +iwr -useb https://raw.githubusercontent.com/1024XEngineer/bytemind/main/scripts/install.ps1 | iex +``` + +安装完成后进入项目目录,第一次启动看到了 **启动引导页**,提示他复制示例配置并填入 API Key。他编辑 `.bytemind/config.json`,配置了 OpenAI-compatible provider,顺手加了 Anthropic 和 Gemini 作为备用 provider,并开启 `auto_detect_type` 让系统自动识别 provider 类型。他通过环境变量 `BYTEMIND_HOME` 指定了配置目录。 + +```json +{ + "provider": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "model": "gpt-5.4-mini", + "api_key": "sk-xxx" + }, + "provider_runtime": { + "providers": [ + { "id": "anthropic", "type": "anthropic", "model": "claude-sonnet-4-20250514", "api_key": "sk-ant-xxx" }, + { "id": "gemini", "type": "gemini", "model": "gemini-2.5-pro", "api_key": "xxx" } + ] + }, + "stream": true, + "max_iterations": 64 +} +``` + +### 2. 启动 Plan 模式,探索代码库 + +小张启动 TUI 交互模式: + +```bash +bytemind chat +``` + +进入 TUI 后,他先用 `/new` 新建会话,然后通过 `/models` 命令切换到 Claude Sonnet 4 模型——ByteMind 自动从 Anthropic provider 路由过去。他按 `Tab` 键打开子智能体面板,了解到有 `explorer`、`general`、`review` 三个内置子智能体。 + +他输入 `@explorer 帮我梳理项目中与消息推送相关的所有代码文件和模块依赖`,ByteMind 自动补全子智能体名称,派发 `explorer` 子智能体去搜索代码库。子智能体通过 `list_files`、`search_text`、`read_file` 等工具遍历项目结构,返回了一份完整的模块依赖报告。 + +接着小张用 `web_search` 调研业界消息推送的最佳实践,用 `web_fetch` 抓取了几篇技术文章的详细内容。 + +### 3. Plan 模式:从探索到方案 + +小张输入 `/plan` 进入 Plan 模式,描述需求: + +> "我需要为这个项目设计一个消息推送模块,支持 APNs 和 FCM 双通道,请帮我做技术方案。" + +ByteMind 进入 Plan 模式的阶段流转: +- **explore**:通过 `search_text` 定位现有通知相关代码,`read_file` 了解当前架构风格 +- **clarify**:追问了几个关键问题(推送优先级策略、失败重试机制、是否需要本地消息队列) +- **draft**:生成方案初稿,包含架构图描述、数据流设计、接口定义 +- **converge_ready**:方案待小张确认 + +整个过程展示在 **Plan 面板**中,小张可以看到每个步骤的状态(pending/in_progress/completed/blocked)和风险等级标识(low/medium/high)。TUI 界面同时展示**上下文窗口使用量**,当接近 85% 告警线时触发了 warning 提示。 + +### 4. 
加载 Skill,引入 RFC 模板 + +小张激活 `write-rfc` Skill: + +``` +/skill write-rfc +``` + +Skill 加载后,系统提示词被替换为 RFC 写作模板。他继续对话,ByteMind 按照 RFC 格式输出完整的技术方案文档,小张确认后方案进入 `approved_to_build` 阶段,Plan 模式自动将方案步骤写入执行计划。 + +### 5. 持久化与收尾 + +小张退出前用 `/sessions` 查看历史会话列表,确认方案会话已自动持久化。他配置了桌面通知,关闭终端后收到了审批请求的通知提醒。 + +--- + +**本故事覆盖功能**: + +| 模块 | 功能点 | +|------|--------| +| 运行模式 | `chat`/`tui` 交互模式, `install` 安装 | +| Provider | OpenAI-compatible, Anthropic, Gemini 三适配; 多 Provider 注册路由; 模型动态切换; `auto_detect_type` | +| TUI | Bubble Tea 全功能终端; 启动引导页; `/new` `/models` `/sessions` `/plan` 命令; 子智能体面板; Skill 面板; Plan 面板; 上下文窗口可视化; @mentions 补全; 命令面板 | +| Plan 模式 | explore→clarify→draft→converge_ready→approved_to_build 阶段流转; 步骤状态跟踪; 风险等级; Plan 面板渲染 | +| 工具 | `list_files`, `read_file`, `search_text`, `web_fetch`, `web_search` | +| 子智能体 | `explorer` 代码探索; `delegate_subagent` 委托执行; builtin/user/project 三级管理 | +| Skills | `write-rfc`; 三级 scope (builtin/user/project); Skill 激活/清除 | +| 会话 | JSONL 持久化; 会话列表/恢复; 事件日志 | +| 上下文 | 上下文窗口预算管理; warning/critical 告警 | +| 通知 | 桌面通知 (审批/完成/失败); 通知冷却时间 | +| 配置 | JSON 配置; 环境变量覆盖 (`BYTEMIND_HOME`); `provider_runtime` 多 provider 配置 | + +--- + +## 故事二:开发阶段 — "实现消息推送模块" + +> **角色**:小张确认方案后,切换到 Build 模式开始写代码。 + +### 1. 切换 Build 模式,全自动执行 + +小张在 TUI 中恢复上次 Plan 会话(`/resume `),然后切换到 Build 模式直接开始实现。他通过 `/models` 切换到 `deepseek-v4-pro` 以降低长任务成本。输入: + +> "按照刚才的方案,帮我实现消息推送模块,包括 APNs 和 FCM 两个 provider、消息队列、重试逻辑。" + +Build 模式下 ByteMind 直接开始干活,**流式输出**思考过程和工具调用结果,TUI 界面用不同颜色区分 thinking 和 assistant 内容。 + +### 2. 高强度工具调用 + +ByteMind 自动编排工具调用序列: +- `write_file` 创建 `push/provider.go`、`push/apns.go`、`push/fcm.go`、`push/queue.go`、`push/retry.go` 等文件 +- `replace_in_file` 在现有模块中注入依赖 +- `apply_patch` 修复编译错误 +- `run_shell` 执行 `go mod tidy`、`go build` 等命令 + +TUI 的 **Markdown 渲染器**将工具输出格式化展示,diff 内容带**语法高亮**,`run_shell` 的执行结果实时流式显示在终端中。 + +### 3. 
安全审批与沙箱 + +当 ByteMind 尝试执行 `go build` 时,由于配置了 `approval_policy: "on-request"`,Shell 命令触发了**审批流程**。TUI 弹出审批对话框,显示命令内容和风险评估。小张确认后继续。 + +小张之前配置了: + +```json +{ + "approval_policy": "on-request", + "sandbox_enabled": true, + "system_sandbox_mode": "non-blocking", + "writable_roots": ["/home/user/project"], + "exec_allowlist": [ + { "command": "go", "args_pattern": ["build", "test", "mod", "vet", "fmt"] } + ], + "network_allowlist": [ + { "host": "api.github.com", "port": "443" } + ] +} +``` + +- 文件沙箱保证工具只能读写 `writable_roots` 范围内的文件 +- 命令白名单限制只能执行 `go build/test/mod/vet/fmt` +- 网络沙箱限制只能访问 `api.github.com` + +当 ByteMind 尝试读取 `/etc/passwd` 时,**沙箱**直接 `deny` 并返回 `fs_out_of_scope`;尝试 `curl` 外网地址时被网络沙箱拦截,返回 `network_not_allowed`。 + +### 4. Provider 故障自动切换 + +实现过程中,OpenAI provider 突然返回 503 错误。ByteMind 的 **Provider 路由**检测到主 provider 不健康,自动通过**健康检查**切换到备用 Anthropic provider,任务无缝继续。小张在 TUI 的状态栏看到了 provider 切换提示。 + +### 5. 子智能体并行加速 + +编译时发现缺少 protobuf 定义,小张手动输入: + +> "帮我生成 push.proto 文件,然后用 protoc 编译" + +同时他派发 `general` 子智能体去写单元测试: + +``` +@general 帮我给 push/ 目录下所有文件写单元测试,覆盖正常路径和边界情况 +``` + +子智能体在**后台运行**,通过 `task_output` 查看结果。TUI 底部的状态栏显示后台任务进度,完成后桌面弹出通知。 + +### 6. 预算控制与上下文压缩 + +实现过程中已跑了 50+ 轮工具调用,接近 `max_iterations: 64`。ByteMind 触发了**阶段性总结**(stop summary),归纳已完成的工作和剩余待办项。同时**上下文压缩**自动触发,将较早的对话压缩为摘要,释放上下文窗口空间。**重复调用检测**发现了两次相同的 `go build` 调用并及时终止。 + +### 7. 
Token 用量监控 + +TUI 右下角的 **Token 用量实时监控**组件显示了本轮会话的 token 消耗(输入/输出/总计),小张设置了 `alert_threshold: 100000`,当日总 token 接近阈值时弹出了告警。用量数据自动写入 SQLite 数据库(`database_driver: "sqlite"`)。 + +--- + +**本故事覆盖功能**: + +| 模块 | 功能点 | +|------|--------| +| 运行模式 | Build 模式; `--yolo` 全自动; `/resume` 会话恢复; `run` 单次任务 | +| Provider | 健康检查; 故障自动切换; Provider 路由回退; 模型切换 | +| 对话引擎 | 流式输出; 多轮对话; 工具调用循环; max_iterations 预算控制; stop summary 阶段总结; 重复调用检测; 上下文压缩 | +| 工具 | `write_file`, `replace_in_file`, `apply_patch`, `run_shell`; 工具执行审计 | +| TUI | Markdown 渲染; diff 语法高亮; 审批对话框; 后台任务状态栏; Token 用量实时监控; 桌面通知; 鼠标文本选择; 剪贴板粘贴; 图片输入 | +| 审批安全 | on-request 分级审批; Shell 命令审批; 文件沙箱 (writable_roots); 网络沙箱 (network_allowlist); 命令白名单 (exec_allowlist); 沙箱 escalate/deny/allow 决策 | +| 沙箱 | FS/Exec/Network 三级拦截; 审批通道 | +| 子智能体 | `general` 子智能体; 子智能体并行执行; `task_output` 结果查看 | +| 后台任务 | 后台/前台任务; 任务超时; worktree 隔离执行 | +| Token | 实时监控; SQLite 持久化; 用量告警; 多存储后端 | +| 上下文 | 自动压缩; 窗口预算 | + +--- + +## 故事三:调试阶段 — "排查线上推送失败问题" + +> **角色**:小张实现的推送模块上线后出现间歇性推送失败,需要定位根因。 + +### 1. 快速定位问题代码 + +小张打开终端启动 TUI,恢复之前开发推送模块的会话继续对话: + +```bash +bytemind chat +``` + +``` +/resume push-module +``` + +> "线上出现间歇性推送失败,错误日志显示 'connection timeout after 30s',帮我排查根因。" + +ByteMind 通过 `search_text` 搜索代码中所有 timeout 相关配置,`read_file` 读取关键文件定位到 `push/apns.go` 中的 HTTP Client 超时设置为硬编码的 30s。 + +### 2. 深入排查:Shell + Web 联动 + +ByteMind 用 `run_shell` 执行 `go test -v -run TestAPNsRetry ./push/...` 查看测试覆盖情况,发现重试逻辑的单元测试没有覆盖 timeout 场景。 + +接着用 `web_search` 搜索 "APNs timeout best practice 2026",用 `web_fetch` 读取 Apple 官方文档中关于 connection timeout 的建议。 + +### 3. 
激活 Bug Investigation Skill + +小张激活内置的 `bug-investigation` Skill: + +``` +/skill bug-investigation +``` + +Skill 替换系统提示词为 Bug 调查专用模板,引导 ByteMind 从以下维度系统排查: +- 问题复现条件 +- 影响范围(影响多少用户/设备) +- 代码层面根因 +- 配置/环境因素 +- 修复方案与回归验证 + +ByteMind 自动排查了: +- `push/retry.go` 的重试策略是否对 timeout 场景生效 +- `push/fcm.go` 是否也有同样的硬编码问题 +- `config/config.go` 中是否有可配置的超时参数 + +最终定位到两个问题:HTTP Client 超时硬编码 + 重试逻辑对 context.DeadlineExceeded 未正确捕获。 + +### 4. 修复与验证 + +> "把超时改成可配置的,默认值 60s;修复 retry.go 中对 context.DeadlineExceeded 的处理。" + +ByteMind 用 `replace_in_file` 修改了相关代码,用 `run_shell` 跑了 `go vet ./push/...`、`go test -race ./push/...` 验证。 + +小张在 TUI 中通过鼠标**拖拽选择**了一段 diff 输出,`Ctrl+C` 复制后贴到代码审查文档里。diff 的**语法高亮**让改动一目了然。 + +--- + +**本故事覆盖功能**: + +| 模块 | 功能点 | +|------|--------| +| 运行模式 | `chat` TUI 交互; `/resume` 会话恢复 | +| 工具 | `search_text`, `read_file`, `run_shell`, `replace_in_file`, `web_search`, `web_fetch` | +| Skills | `bug-investigation` Skill 激活/清除; Skill 提示词替换 | +| TUI | diff 语法高亮; 鼠标拖拽选择; 剪贴板复制; 终端流式输出 | +| 对话引擎 | 多轮对话排查; 流式输出 | +| 安全 | `run_shell` 命令白名单审批 | +| 会话 | 历史会话恢复 | + +--- + +## 故事四:代码审查 — "Review 推送模块 PR" + +> **角色**:小张的同事小王负责 Review 这次改动,他用 ByteMind 进行深度代码审查。 + +### 1. 启动审查 + +小王拉取 PR 分支后在项目目录启动 ByteMind: + +```bash +bytemind chat +``` + +> "帮我 review 当前分支相对于 main 的所有改动,重点关注并发安全、错误处理、资源泄漏。" + +### 2. 激活 Review Skill + Review 子智能体 + +小王先激活 `review` Skill: + +``` +/skill review +``` + +Review Skill 提供了结构化的审查框架(安全性、性能、可维护性、测试覆盖等维度)。 + +同时他派发 `review` 内置子智能体: + +``` +@review 审查 push/ 目录下所有文件,检查并发安全问题 +``` + +子智能体通过 `read_file`、`search_text` 检查了 mutex 使用、goroutine 泄漏、channel 关闭等问题。返回结果指出了 `push/queue.go` 中一处 channel 未正确关闭可能导致 goroutine 泄漏的问题。 + +### 3. 
逐文件审查与 MCP 集成 + +小王之前通过 `bytemind mcp add` 接入了团队的代码质量 MCP 服务器: + +```bash +bytemind mcp add my-linter -- node ./linter-mcp-server.js +bytemind mcp list +bytemind mcp health my-linter +``` + +在 TUI 中,MCP 工具自动注册到 ByteMind,审查时额外调用了 MCP 提供的静态分析能力。小王可以在 **MCP 管理面板**中查看所有 MCP 服务器的健康状态。 + +ByteMind 逐文件审查: +- `push/apns.go` — `read_file` 检查 HTTP Client 连接池配置 +- `push/fcm.go` — `search_text` 搜索 error handling 模式 +- `push/queue.go` — 重点审查 channel 生命周期 +- `push/retry.go` — 检查 backoff 策略和 context 取消传播 + +### 4. Diff 预览与总结 + +小王用 diff 预览工具查看改动: + +> "展示当前分支所有改动的 diff 摘要。" + +ByteMind 用 `diff_preview` 工具生成变更摘要。TUI 的 **diff 渲染器**将增删改分别用绿色/红色/黄色高亮展示。 + +最终 ByteMind 输出了一份结构化 Review 报告,包含: +- 严重问题(goroutine 泄漏)→ 风险等级 high +- 建议改进(超时配置应加校验)→ 风险等级 medium +- 测试覆盖分析(timeout 场景已覆盖)→ 通过 + +整个审查会话自动**持久化**为 JSONL,小王用 `/sessions` 可以随时回溯。他想看看这次审查消耗了多少 token,通过 `/session` 查看当前会话的消息统计和 token 消耗。 + +--- + +**本故事覆盖功能**: + +| 模块 | 功能点 | +|------|--------| +| 运行模式 | `bytemind mcp add/list/health` MCP 管理命令 | +| TUI | MCP 管理面板; diff 渲染器 (绿/红/黄); 会话消息统计 | +| 工具 | `read_file`, `search_text`, `diff_preview` | +| Skills | `review` Skill 结构化审查框架 | +| 子智能体 | `review` 子智能体; builtin 内置定义 | +| MCP | MCP 服务器增删查; 健康检查; MCP 工具自动注册; MCP 面板 | +| 扩展 | Extensions 生命周期管理; MCP adapter | +| 会话 | JSONL 持久化; 会话列表; 消息统计 | +| Token | 会话级 token 消耗展示 | + +--- + +## 功能覆盖率总览 + +以下按模块列出所有功能点及其在四个故事中的分布: + +| 模块 | 功能点 | 故事一(设计) | 故事二(开发) | 故事三(Debug) | 故事四(Review) | +|------|--------|:---:|:---:|:---:|:---:| +| **运行模式** | `chat`/`tui` 交互 | ✅ | ✅ | ✅ | ✅ | +| | `run` 单次任务 | | ✅ | | | +| | `worker` 后台进程 | | ✅ | | | +| | `install` 安装 | ✅ | | | | +| | `mcp` MCP 管理 | | | | ✅ | +| | `version` 版本 | | | | | +| | `--yolo` 全自动 | | ✅ | | | +| **Provider** | OpenAI-compatible 适配 | ✅ | | | | +| | Anthropic 适配 | ✅ | | | | +| | Gemini 适配 | ✅ | | | | +| | 多 Provider 注册与路由 | ✅ | ✅ | | | +| | 健康检查 + 故障切换 | | ✅ | | | +| | 模型列表查询 | ✅ | | | | +| | `auto_detect_type` | ✅ | | | | +| **对话引擎** | 多轮对话 | ✅ | ✅ | ✅ | | +| | 流式输出 | ✅ | ✅ | ✅ | | +| | Build 
模式 | | ✅ | | | +| | Plan 模式 | ✅ | | | | +| | 上下文压缩 | | ✅ | | | +| | max_iterations 预算 | | ✅ | | | +| | 重复调用检测 | | ✅ | | | +| | stop summary | | ✅ | | | +| | 子智能体委托 | ✅ | ✅ | | ✅ | +| **工具** | `list_files` | ✅ | | | ✅ | +| | `read_file` | ✅ | | ✅ | ✅ | +| | `search_text` | ✅ | | ✅ | ✅ | +| | `write_file` | | ✅ | | | +| | `replace_in_file` | | ✅ | ✅ | | +| | `apply_patch` | | ✅ | | | +| | `run_shell` | | ✅ | ✅ | | +| | `web_fetch` | ✅ | | ✅ | | +| | `web_search` | ✅ | | ✅ | | +| | `delegate_subagent` | ✅ | ✅ | | | +| | `task_output` / `task_stop` | | ✅ | | | +| | `diff_preview` | | | | ✅ | +| **TUI** | Bubble Tea 终端 UI | ✅ | ✅ | ✅ | ✅ | +| | Markdown 渲染 | | ✅ | | | +| | diff 语法高亮 | | ✅ | ✅ | ✅ | +| | 鼠标支持 (选择/拖拽/滚动) | | ✅ | ✅ | | +| | 剪贴板粘贴 | | ✅ | ✅ | | +| | 图片输入 | | ✅ | | | +| | 会话管理面板 | ✅ | | | ✅ | +| | 模型切换 (`/models`) | ✅ | ✅ | | | +| | 子智能体面板 | ✅ | | | | +| | Skill 面板 | ✅ | | | | +| | MCP 面板 | | | | ✅ | +| | Plan 面板 | ✅ | | | | +| | Token 用量监控 | | ✅ | | | +| | 命令面板/调色板 | ✅ | | | | +| | @mentions 自动补全 | ✅ | | | | +| | `/` 命令补全 | ✅ | | | | +| | 桌面通知 | ✅ | ✅ | | | +| | 启动引导页 | ✅ | | | | +| | 上下文窗口可视化 | ✅ | | | | +| | 审批对话框 | | ✅ | | | +| | 后台任务状态栏 | | ✅ | | | +| | 增强输入框 (多行) | ✅ | | | | +| **审批安全** | on-request / away / full_access | | ✅ | | | +| | Shell 命令审批 | | ✅ | ✅ | | +| | 文件沙箱 (writable_roots) | | ✅ | | | +| | 网络沙箱 (network_allowlist) | | ✅ | | | +| | 命令白名单 (exec_allowlist) | | ✅ | ✅ | | +| | worktree 隔离 | | ✅ | | | +| **扩展系统** | MCP 服务器增删查 | | | | ✅ | +| | MCP 健康检查 | | | | ✅ | +| | Skills (6 个内置) | ✅ | | ✅ | ✅ | +| | Skill 三级 scope | ✅ | | | | +| | 子智能体 (3 个内置) | ✅ | ✅ | | ✅ | +| | 子智能体三级 scope | ✅ | | | | +| **Plan 模式** | 阶段流转 | ✅ | | | | +| | 步骤状态跟踪 | ✅ | | | | +| | 风险等级 | ✅ | | | | +| | Plan 面板渲染 | ✅ | | | | +| **会话** | JSONL 持久化 | ✅ | ✅ | | ✅ | +| | 会话列表/恢复 | ✅ | | ✅ | ✅ | +| | 事件日志 | ✅ | | | | +| | 会话删除/清理 | | | | | +| | 消息统计 | | | | ✅ | +| **上下文** | 窗口预算管理 | ✅ | ✅ | | | +| | warning/critical 告警 | ✅ | | | | +| | 自动压缩 | | ✅ | | | +| **Token** | 用量追踪 | | 
✅ | | ✅ | +| | 多后端 (文件/DB/内存) | | ✅ | | | +| | 用量告警 | | ✅ | | | +| | 实时监控 | | ✅ | | | +| **后台任务** | 后台/前台执行 | | ✅ | | | +| | 超时控制 | | ✅ | | | +| | 重试 | | ✅ | | | +| | worktree 隔离 | | ✅ | | | +| **通知** | 桌面通知 | ✅ | ✅ | | | +| | 审批/完成/失败通知 | ✅ | ✅ | | | +| | 冷却时间 | ✅ | | | | +| **配置** | JSON 配置文件 | ✅ | | | | +| | 环境变量覆盖 | ✅ | | | | +| | `provider_runtime` | ✅ | | | | +| | 更新检查 | | | | | diff --git a/tui/input_paste.go b/tui/input_paste.go index be093561..e276c933 100644 --- a/tui/input_paste.go +++ b/tui/input_paste.go @@ -902,7 +902,7 @@ func (m *model) resolvePromptPastedInput(raw string) (string, error) { } func (m *model) ensurePastedContentState() { - if m == nil || m.pastedStateLoaded { + if m == nil { return } if m.pastedContents == nil { @@ -914,6 +914,9 @@ func (m *model) ensurePastedContentState() { if m.nextPasteID <= 0 { m.nextPasteID = 1 } + if m.pastedStateLoaded { + return + } m.pastedStateLoaded = true if m.sess == nil || m.sess.Conversation.Meta == nil { diff --git a/tui/input_paste_test.go b/tui/input_paste_test.go index 1c9144af..34c32c9f 100644 --- a/tui/input_paste_test.go +++ b/tui/input_paste_test.go @@ -491,6 +491,48 @@ func TestSubmitPromptExpandsPasteReferenceForDisplayedChatBodyAndClearsPasteStat } } +func TestClipboardPasteCaptureAfterSubmittedPasteReinitializesState(t *testing.T) { + m := newImagePipelineModel(t) + _, stored, err := m.compressPastedText("old1\nold2\nold3\nold4\nold5\nold6\nold7\nold8\nold9\nold10\nold11") + if err != nil { + t.Fatalf("compress pasted text: %v", err) + } + got, _ := m.submitPrompt("inspect [Paste #" + stored.ID + " ~11 lines]") + updated := got.(model) + if updated.pastedContents != nil || updated.pastedOrder != nil { + t.Fatalf("expected submit to clear pasted state, got contents=%v order=%v", updated.pastedContents, updated.pastedOrder) + } + + clipboardText := strings.Join([]string{ + "abcd first pasted line", + "second pasted line", + "third pasted line", + "fourth pasted line", + "fifth pasted line", 
+ "sixth pasted line", + "seventh pasted line", + "eighth pasted line", + "ninth pasted line", + "tenth pasted line", + "eleventh pasted line", + "twelfth pasted line", + }, "\n") + updated.clipboardRead = fakeClipboardTextReader{text: clipboardText} + updated.clipboardCaptureArmedUntil = time.Now().Add(time.Second) + updated.input.SetValue("abcd") + + result := updated.handleInputMutation("abc", "abcd", "") + if !regexp.MustCompile(`^\[Paste #\d+ ~\d+ lines\]$`).MatchString(result) { + t.Fatalf("expected clipboard capture to compress into a paste marker, got %q", result) + } + if updated.pastedContents == nil || len(updated.pastedContents) != 1 { + t.Fatalf("expected pasted contents to be reinitialized with one entry, got %#v", updated.pastedContents) + } + if updated.pastedOrder == nil || len(updated.pastedOrder) != 1 { + t.Fatalf("expected pasted order to be reinitialized with one entry, got %#v", updated.pastedOrder) + } +} + func TestStorePastedContentKeepsRecentLimit(t *testing.T) { m := newImagePipelineModel(t) for i := 0; i < maxStoredPastedContents+2; i++ { @@ -1096,3 +1138,265 @@ func TestShouldMergeIntoLatestMarkerRequiresPasteEvidence(t *testing.T) { t.Fatalf("expected stale transaction window to skip merge") } } + +func TestResetPasteBurstTracking(t *testing.T) { + m := newImagePipelineModel(t) + m.inputBurstBaseValue = "test-base" + m.pasteBurstCandidate = pasteBurstCandidateState{ + active: true, + baseInput: "test-base", + startedAt: time.Now(), + } + m.resetPasteBurstTracking() + if m.inputBurstBaseValue != "" { + t.Fatalf("expected inputBurstBaseValue to be cleared, got %q", m.inputBurstBaseValue) + } + if m.pasteBurstCandidate.active { + t.Fatalf("expected pasteBurstCandidate to be cleared after reset") + } +} + +func TestResetPasteBurstTrackingNilModel(t *testing.T) { + var m *model + m.resetPasteBurstTracking() + // should not panic +} + +func TestCaptureImplicitPasteCandidateNilModel(t *testing.T) { + var m *model + msg := tea.KeyMsg{Type: 
tea.KeyRunes, Runes: []rune{'a'}} + cmd := m.captureImplicitPasteCandidate(msg) + if cmd != nil { + t.Fatalf("expected nil command from nil model") + } +} + +func TestCaptureImplicitPasteCandidateNonPromotableKey(t *testing.T) { + m := newImagePipelineModel(t) + m.input.SetValue("hello") + m.lastInputAt = time.Now().Add(-50 * time.Millisecond) + + // Use Escape which implicitPasteCandidateFragment returns ok=false for + msg := tea.KeyMsg{Type: tea.KeyEscape} + cmd := m.captureImplicitPasteCandidate(msg) + if cmd != nil { + t.Fatalf("expected nil command for non-fragment key (Escape)") + } +} + +func TestCaptureImplicitPasteCandidateWithPasteMsg(t *testing.T) { + m := newImagePipelineModel(t) + msg := tea.KeyMsg{Paste: true} + cmd := m.captureImplicitPasteCandidate(msg) + if cmd != nil { + t.Fatalf("expected nil command when msg.Paste is true") + } +} + +func TestCaptureImplicitPasteSpecialKeyNilModel(t *testing.T) { + var m *model + msg := tea.KeyMsg{Type: tea.KeyEnter} + cmd := m.captureImplicitPasteSpecialKey(msg) + if cmd != nil { + t.Fatalf("expected nil command from nil model") + } +} + +func TestCaptureImplicitPasteSpecialKeyEnterStartsSession(t *testing.T) { + m := newImagePipelineModel(t) + msg := tea.KeyMsg{Type: tea.KeyEnter} + cmd := m.captureImplicitPasteSpecialKey(msg) + if cmd == nil { + t.Fatalf("expected non-nil command for Enter key (starts implicit paste session)") + } + if !m.pasteSession.active { + t.Fatalf("expected paste session to be active after implicit special key capture") + } +} + +func TestCaptureImplicitPasteSpecialKeyTabStartsSession(t *testing.T) { + m := newImagePipelineModel(t) + msg := tea.KeyMsg{Type: tea.KeyTab} + cmd := m.captureImplicitPasteSpecialKey(msg) + if cmd == nil { + t.Fatalf("expected non-nil command for Tab key (starts implicit paste session)") + } + if m.pasteSession.sourceKind != "implicit-tab" { + t.Fatalf("expected implicit-tab source kind, got %q", m.pasteSession.sourceKind) + } +} + +func 
TestIsSplitPasteContinuationEmptyInput(t *testing.T) { + if isSplitPasteContinuation(" ", "paste-key", time.Now()) { + t.Fatalf("expected empty trimmed input to not be a split continuation") + } +} + +func TestIsSplitPasteContinuationPathInput(t *testing.T) { + if isSplitPasteContinuation(`C:\Users\test\file`, "paste-key", time.Now()) { + t.Fatalf("expected path-like input to not be a split continuation") + } +} + +func TestIsSplitPasteContinuationNonPasteSource(t *testing.T) { + if isSplitPasteContinuation("some long text that is not paste", "rune", time.Now()) { + t.Fatalf("expected non-paste source to not be a split continuation") + } +} + +func TestIsSplitPasteContinuationContainsMarker(t *testing.T) { + if isSplitPasteContinuation("[Paste #1 ~11 lines]", "paste-key", time.Now()) { + t.Fatalf("expected input containing paste marker to not be a split continuation") + } +} + +func TestIsSplitPasteContinuationWithinWindow(t *testing.T) { + if !isSplitPasteContinuation("some quick text", "paste-key", time.Now().Add(-500*time.Millisecond)) { + t.Fatalf("expected split continuation within paste continuation window") + } +} + +func TestIsSplitPasteContinuationOutsideWindowButMultiLine(t *testing.T) { + if !isSplitPasteContinuation("line1\nline2\nline3", "paste-key", time.Now().Add(-3*time.Second)) { + t.Fatalf("expected multi-line paste to be a split continuation even outside window") + } +} + +func TestIsSplitPasteContinuationOutsideWindowButLong(t *testing.T) { + longText := strings.Repeat("a", pasteQuickCharThreshold) + if !isSplitPasteContinuation(longText, "paste-key", time.Now().Add(-3*time.Second)) { + t.Fatalf("expected long paste to be a split continuation even outside window") + } +} + +func TestIsSplitPasteContinuationZeroLastPasteAt(t *testing.T) { + if isSplitPasteContinuation("short", "paste-key", time.Time{}) { + t.Fatalf("expected short single line with zero lastPasteAt to not be a split continuation") + } +} + +func 
TestLooksLikePastedFragmentWithWhitespace(t *testing.T) { + if !looksLikePastedFragment("text with spaces") { + t.Fatalf("expected text with spaces to look like a pasted fragment") + } + if !looksLikePastedFragment("text\twith\ttabs") { + t.Fatalf("expected text with tabs to look like a pasted fragment") + } +} + +func TestLooksLikePastedFragmentPlain(t *testing.T) { + short := strings.Repeat("x", 63) + if looksLikePastedFragment(short) { + t.Fatalf("expected short text without whitespace under 64 chars to not look like a pasted fragment") + } + long := strings.Repeat("x", 64) + if !looksLikePastedFragment(long) { + t.Fatalf("expected text of 64 chars to look like a pasted fragment") + } +} + +func TestShouldMergeIntoLatestMarkerNilModel(t *testing.T) { + var m *model + if m.shouldMergeIntoLatestMarker("rune") { + t.Fatalf("expected nil model to return false") + } +} + +func TestImplicitPasteCandidateFragmentPasteMsg(t *testing.T) { + m := newImagePipelineModel(t) + msg := tea.KeyMsg{Paste: true} + frag, source, ok := m.implicitPasteCandidateFragment(msg) + if ok { + t.Fatalf("expected paste message to not be a candidate fragment, got frag=%q source=%q", frag, source) + } +} + +func TestImplicitPasteCandidateFragmentEmptyRunes(t *testing.T) { + m := newImagePipelineModel(t) + msg := tea.KeyMsg{Type: tea.KeyRunes, Runes: []rune{}} + _, _, ok := m.implicitPasteCandidateFragment(msg) + if ok { + t.Fatalf("expected empty runes to not be a candidate fragment") + } +} + +func TestImplicitPasteCandidateFragmentUnhandledKey(t *testing.T) { + m := newImagePipelineModel(t) + msg := tea.KeyMsg{Type: tea.KeyEscape} + _, _, ok := m.implicitPasteCandidateFragment(msg) + if ok { + t.Fatalf("expected unhandled key type to not be a candidate fragment") + } +} + +func TestShouldPromoteImplicitPasteCandidateNilModel(t *testing.T) { + var m *model + msg := tea.KeyMsg{Type: tea.KeyEnter} + if m.shouldPromoteImplicitPasteCandidate(msg) { + t.Fatalf("expected nil model to return false 
for promote") + } +} + +func TestShouldCaptureImplicitPasteSpecialKeyNilModel(t *testing.T) { + var m *model + msg := tea.KeyMsg{Type: tea.KeyEnter} + if m.shouldCaptureImplicitPasteSpecialKey(msg) { + t.Fatalf("expected nil model to return false") + } +} + +func TestShouldCaptureImplicitPasteSpecialKeyPasteMsg(t *testing.T) { + m := newImagePipelineModel(t) + msg := tea.KeyMsg{Paste: true} + if m.shouldCaptureImplicitPasteSpecialKey(msg) { + t.Fatalf("expected paste message to not be captured as special key") + } +} + +func TestShouldCaptureImplicitPasteSpecialKeyNonEnterTab(t *testing.T) { + m := newImagePipelineModel(t) + msg := tea.KeyMsg{Type: tea.KeyEscape} + if m.shouldCaptureImplicitPasteSpecialKey(msg) { + t.Fatalf("expected non-enter/tab key to not be captured") + } +} + +func TestCountCompressedMarkersHandlesVariousInputs(t *testing.T) { + if got := countCompressedMarkers(""); got != 0 { + t.Fatalf("expected empty string count to be 0, got %d", got) + } + if got := countCompressedMarkers("[Paste #1 ~11 lines]"); got != 1 { + t.Fatalf("expected single marker count to be 1, got %d", got) + } + if got := countCompressedMarkers("[Pasted #3 ~20 lines]"); got != 1 { + t.Fatalf("expected Pasted marker count to be 1, got %d", got) + } +} + +func TestDropLatestCompressedMarkerNoMarker(t *testing.T) { + result := dropLatestCompressedMarker("plain text with no markers") + if result != "plain text with no markers" { + t.Fatalf("expected plain text unchanged, got %q", result) + } +} + +func TestDropLatestCompressedMarkerSingleMarker(t *testing.T) { + result := dropLatestCompressedMarker("[Paste #1 ~11 lines]") + if result != "" { + t.Fatalf("expected single marker to be dropped to empty, got %q", result) + } +} + +func TestDropLatestCompressedMarkerMultipleMarkers(t *testing.T) { + result := dropLatestCompressedMarker("[Paste #1 ~11 lines] [Paste #2 ~15 lines]") + if result != "[Paste #1 ~11 lines]" { + t.Fatalf("expected only latest marker dropped, got %q", result) 
+ } +} + +func TestDropLatestCompressedMarkerWithTextBefore(t *testing.T) { + result := dropLatestCompressedMarker("before text [Paste #1 ~11 lines]") + if result != "before text" { + t.Fatalf("expected marker removed with surrounding text preserved, got %q", result) + } +} diff --git a/www/.vitepress/config.mts b/www/.vitepress/config.mts index 4a6c311b..2d1b3971 100644 --- a/www/.vitepress/config.mts +++ b/www/.vitepress/config.mts @@ -36,6 +36,7 @@ const enSidebar = [ { text: 'Fix a Bug', link: '/examples/fix-bug' }, { text: 'Refactor Code', link: '/examples/refactor' }, { text: 'Generate Documentation', link: '/examples/doc-generation' }, + { text: 'User Stories', link: '/examples/user-stories' }, ], }, { @@ -87,6 +88,7 @@ const zhSidebar = [ { text: '修复 Bug', link: '/zh/examples/fix-bug' }, { text: '代码重构', link: '/zh/examples/refactor' }, { text: '文档生成', link: '/zh/examples/doc-generation' }, + { text: '用户故事', link: '/zh/examples/user-stories' }, ], }, { diff --git a/www/api-key.md b/www/api-key.md index cc866a9a..3deb2f96 100644 --- a/www/api-key.md +++ b/www/api-key.md @@ -130,6 +130,44 @@ DeepSeek uses the OpenAI-compatible format, so use `openai-compatible`. DeepSeek's official OpenAI-format base URL is `https://api.deepseek.com`. ByteMind appends the default `/chat/completions` path, so do not add `/chat/completions` yourself. +**Can I use `api_key_env` instead of `api_key`?** + +Yes, and it's safer. Replace `"api_key"` with `"api_key_env": "DEEPSEEK_API_KEY"`, then set the environment variable: + + + + +```powershell +# Temporary (current window only): +$env:DEEPSEEK_API_KEY = "sk-..." + +# Permanent (survives reboots): +[Environment]::SetEnvironmentVariable("DEEPSEEK_API_KEY", "sk-...", "User") +``` + + + + + +```bash +export DEEPSEEK_API_KEY="sk-..." +``` + + + + + +```bash +export DEEPSEEK_API_KEY="sk-..." +``` + + + + +:::warning Don't set both `api_key` and `api_key_env` +If both are present, `api_key` takes priority and `api_key_env` is ignored. 
Use one or the other. +::: + **Can I use any model ID?** No. The model ID must exactly match the name in the provider's documentation. For DeepSeek, start with `deepseek-v4-flash`; if you need stronger capability, switch to `deepseek-v4-pro` according to the official docs. diff --git a/www/configuration.md b/www/configuration.md index 69d931c7..1286cfb2 100644 --- a/www/configuration.md +++ b/www/configuration.md @@ -81,6 +81,94 @@ Any endpoint that speaks the OpenAI chat completions format works: Set `"auto_detect_type": true` to let ByteMind infer the provider type from `base_url` automatically. ::: +## Multi-Provider Setup (Model Switching) + +ByteMind supports configuring multiple model providers and switching between them at runtime without restarting. Use `provider_runtime` instead of the legacy `provider` field: + +```json +{ + "provider_runtime": { + "current_provider": "deepseek", + "default_provider": "deepseek", + "default_model": "deepseek-v4-flash", + "allow_fallback": false, + "providers": { + "deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro"] + }, + "openai": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + "model": "gpt-5.4-mini", + "models": ["gpt-5.4-mini", "gpt-5.4"] + } + } + } +} +``` + +### Key points + +- **`providers..models`** is required — this is the list of models you can switch between with `/model`. +- **`providers..model`** is the currently active model for that provider. It gets updated automatically when you switch. +- **`api_key_env`** is preferred over `api_key` — keeps secrets out of your config file. 
+ +### Switching models + +| Command | What it does | +| ------- | ------------ | +| `/model` | Open interactive picker to browse all configured providers and models | +| `/model deepseek/deepseek-v4-pro` | Switch directly to deepseek-v4-pro | +| `/model openai/gpt-5.4` | Switch directly to GPT-5.4 | +| `/models` | Show all discovered models and current active model | + +The config file is updated automatically after switching — no need to edit it manually. + +### Adding a new provider + +Edit your `config.json` and add a new entry under `provider_runtime.providers`: + +```json +"providers": { + "deepseek": { ... }, + "openai": { ... }, + "anthropic": { + "type": "anthropic", + "base_url": "https://api.anthropic.com", + "api_key_env": "ANTHROPIC_API_KEY", + "model": "claude-sonnet-4-20250514", + "models": ["claude-sonnet-4-20250514", "claude-opus-4-20250514"] + } +} +``` + +Restart ByteMind or use `/model` to see the new provider in the picker. + +### Adding a new model to an existing provider + +Edit the `models` array for that provider: + +```json +"deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro", "deepseek-v4-flash-2"] +} +``` + +Save and use `/model` to see the new model in the picker. + +:::tip Migrating from the legacy `provider` field +If your config has `provider` (single), ByteMind auto-converts it to `provider_runtime` on startup. After using `/model` to switch, the config is persisted in `provider_runtime` format. You can also manually restructure your config following the multi-provider example above. 
+::: + ## Approval Policy `approval_policy` controls when high-risk tools (file writes, shell commands) pause for confirmation: diff --git a/www/examples/user-stories.md b/www/examples/user-stories.md new file mode 100644 index 00000000..2f7a70f2 --- /dev/null +++ b/www/examples/user-stories.md @@ -0,0 +1,193 @@ +# User Stories + +Four end-to-end scenarios covering all of ByteMind's functionality. + +--- + +## Story 1: Design — "Creating a Technical Plan for a New Module" + +**Role**: A backend engineer who needs to produce a technical plan for a push notification module. + +### 1. Install and Onboard + +Install ByteMind on Windows: + + + '`' + powershell +iwr -useb https://raw.githubusercontent.com/1024XEngineer/bytemind/main/scripts/install.ps1 | iex + + '`' + + +Copy the example config, fill in your API key, and add backup providers. + +### 2. Launch Plan Mode and Explore + + + '`' + ash +bytemind chat + + '`' + + +Use + '/new' + to create a session, + '/models' + to switch models. Press Tab to open the sub-agent panel: + + + '`' + ext +@explorer Map all code files and module dependencies related to push notifications. + + '`' + + +The sub-agent traverses the project with list_files, search_text, and read_file, returning a dependency report. + +### 3. Plan Mode: From Exploration to Proposal + +Enter /plan: + +> "Design a push notification module supporting both APNs and FCM channels." + +ByteMind walks through the Plan phase pipeline: explore -> clarify -> draft -> converge_ready -> approved_to_build. + +The Plan panel shows step status and risk levels. Context window usage triggers a warning at 85%. + +### 4. Load a Skill + +Activate the write-rfc skill: /skill write-rfc + +### 5. Persistence + +Use /sessions to view the session list. Sessions auto-persist as JSONL. 
+
+**Covered features**: Run modes (chat/tui, install), Providers (OpenAI-compatible, Anthropic, Gemini, routing, model switching), TUI (Bubble Tea, onboarding, panels, context visualization, autocomplete), Plan mode (phase pipeline, step tracking, risk levels), Tools (list_files, read_file, search_text, web_fetch, web_search), Sub-agents (explorer), Skills (write-rfc), Sessions (JSONL persistence, restore), Context (window budget, alerts), Notifications, Config
+
+---
+
+## Story 2: Development — "Building the Push Notification Module"
+
+**Role**: After confirming the plan, switch to Build mode to start coding.
+
+### 1. Switch to Build Mode
+
+Resume with `/resume <session-id>`, then switch to Build mode:
+
+> "Implement the push notification module: APNs and FCM providers, message queue, and retry logic."
+
+Build mode streams thinking and tool calls in different colors.
+
+### 2. High-Intensity Tool Calling
+
+ByteMind orchestrates: `write_file`, `replace_in_file`, `apply_patch`, `run_shell`. The TUI renders tool output with syntax-highlighted diffs.
+
+### 3. Safety Approval and Sandbox
+
+With `approval_policy: "on-request"`, shell commands trigger an approval dialog. Sandbox enforces:
+- File sandbox (writable_roots)
+- Command whitelist (go build/test/mod/vet/fmt)
+- Network sandbox (api.github.com only)
+
+### 4. Provider Failover
+
+When primary provider returns 503, health checks auto-switch to backup provider. Status bar shows the switch.
+
+### 5. Parallel Sub-agent Acceleration
+
+Dispatch general sub-agent in background: `@general Write unit tests for push/`. Check results with `task_output`.
+
+### 6. Budget Control and Context Compression
+
+After 50+ tool calls, stop summary triggers. Context compression auto-compresses earlier turns. Duplicate call detection catches repeats.
+
+### 7. Token Usage Monitoring
+
+Real-time token monitor shows input/output/total. Alert fires near threshold. Data persisted to SQLite.
+
+**Covered features**: Build mode, yolo, Provider health/failover, Streaming, tool loop, max_iterations, stop summary, duplicate detection, context compression, write_file/replace_in_file/apply_patch/run_shell, Markdown rendering, diff highlighting, approval dialog, background tasks, token monitoring, file/network/command sandbox, worktree isolation
+
+---
+
+## Story 3: Debugging — "Investigating Push Failure in Production"
+
+**Role**: Push module has intermittent failures. Find the root cause.
+
+### 1. Quick Problem Location
+
+Resume session: `/resume push-module`
+
+> "Push notifications failing intermittently. Error: 'connection timeout after 30s'."
+
+ByteMind searches for timeout configs, finds hardcoded 30s in `push/apns.go`.
+
+### 2. Deep Investigation: Shell + Web
+
+Runs `go test -v -run TestAPNsRetry ./push/...`, finds no timeout coverage. Web searches "APNs timeout best practice", fetches Apple docs.
+
+### 3. Activate Bug Investigation Skill
+
+`/skill bug-investigation` replaces the prompt with the bug investigation template. Systematically checks:
+- Reproduction conditions
+- Impact scope
+- Code root cause
+- Config/environment factors
+- Fix plan and regression
+
+Finds two root causes: hardcoded timeout + retry logic not catching context.DeadlineExceeded.
+
+### 4. Fix and Verify
+
+> "Make timeout configurable (default 60s), fix retry.go context.DeadlineExceeded handling."
+
+ByteMind uses `replace_in_file`, then `go vet` and `go test -race` to verify.
+
+**Covered features**: chat TUI, /resume, search_text, read_file, run_shell, replace_in_file, web_search, web_fetch, bug-investigation skill, diff syntax highlighting, mouse/clipboard, command whitelist
+
+---
+
+## Story 4: Code Review — "Reviewing the Push Module PR"
+
+**Role**: A teammate reviews the push module changes with deep analysis.
+
+### 1. Start the Review
+
+```bash
+bytemind chat
+```
+
+> "Review this branch against main. Focus on concurrency safety, error handling, resource leaks."
+
+### 2. Activate Review Skill + Review Sub-agent
+
+`/skill review` then `@review Review all files under push/ for concurrency safety issues`.
+
+Sub-agent finds improperly closed channel in `push/queue.go` causing goroutine leak.
+
+### 3. Per-File Review with MCP Integration
+
+```bash
+bytemind mcp add my-linter -- node ./linter-mcp-server.js
+bytemind mcp list
+bytemind mcp health my-linter
+```
+
+MCP tools auto-register. Reviews `push/apns.go`, `push/fcm.go`, `push/queue.go`, `push/retry.go`.
+
+### 4. Diff Preview and Summary
+
+`diff_preview` generates change summary. TUI diff renderer highlights additions/deletions/modifications.
+
+Final report: Critical (goroutine leak, high), Suggestion (timeout config validation, medium), Test coverage (timeout scenarios covered, pass).
+
+Session auto-persisted. Use `/session` for message stats and token consumption.
+
+**Covered features**: MCP management (add/list/health), MCP panel, diff renderer, review skill, review sub-agent, diff_preview, JSONL persistence, token display
+
+---
+
+## Feature Coverage Overview
+
+| Module | Story 1 | Story 2 | Story 3 | Story 4 |
+|--------|:---:|:---:|:---:|:---:|
+| Run Modes (chat/tui, run, install, mcp, yolo) | ✅ | ✅ | ✅ | ✅ |
+| Providers (OpenAI, Anthropic, Gemini, routing, failover) | ✅ | ✅ | | |
+| Engine (conversation, streaming, Build/Plan mode, compression) | ✅ | ✅ | ✅ | |
+| Tools (files, search, shell, web, diff) | ✅ | ✅ | ✅ | ✅ |
+| TUI (Bubble Tea, panels, rendering, notifications) | ✅ | ✅ | ✅ | ✅ |
+| Security (approval, sandbox, whitelist, worktree) | | ✅ | ✅ | |
+| Extensions (MCP, Skills, Sub-agents) | ✅ | ✅ | ✅ | ✅ |
+| Plan Mode (pipeline, tracking, risks) | ✅ | | | |
+| Sessions (persistence, restore) | ✅ | ✅ | ✅ | ✅ |
+| Context & Token (budget, compression, monitoring) | ✅ | ✅ | | ✅ |
+| Background Tasks (parallel, timeout) | | ✅ | | |
+| Notifications (desktop, approval) | ✅ | ✅ | | |
+| Config (JSON, env vars, provider_runtime) | ✅ | | | |
diff --git 
a/www/reference/config-reference.md b/www/reference/config-reference.md index 71ea2941..f5ca8698 100644 --- a/www/reference/config-reference.md +++ b/www/reference/config-reference.md @@ -4,21 +4,170 @@ Full reference for all fields in `~/.bytemind/config.json` and project-level `.b For a working example see [`config.example.json`](https://github.com/1024XEngineer/bytemind/blob/main/config.example.json). -## `provider` +## `provider` (single-provider, legacy) -Model provider configuration. +Single model provider configuration. For configuring multiple providers and switching between them at runtime, prefer `provider_runtime` below. | Field | Type | Description | Default | | ------------------- | ------ | ------------------------------------------- | --------------------------- | | `type` | string | `openai-compatible`, `anthropic`, or `gemini` | `openai-compatible` | | `base_url` | string | API endpoint URL | `https://api.openai.com/v1` | | `model` | string | Model ID to use | `gpt-5.4-mini` | -| `api_key` | string | API key (plain text - prefer `api_key_env`) | - | -| `api_key_env` | string | Env var name to read the key from | `BYTEMIND_API_KEY` | +| `api_key` | string | API key in plain text — convenient but stores secrets in file | - | +| `api_key_env` | string | Env var name to read the key from. **When both `api_key` and `api_key_env` are set, `api_key` takes priority.** | `BYTEMIND_API_KEY` | | `anthropic_version` | string | Anthropic API version header | `2023-06-01` | | `auth_header` | string | Custom auth header name | `Authorization` | | `auth_scheme` | string | Auth scheme prefix (e.g. 
`Bearer`) | `Bearer` | | `auto_detect_type` | bool | Infer provider type from `base_url` | `false` | +| `family` | string | Provider family label (for display) | - | +| `api_path` | string | Custom API path override | - | +| `models` | array | Available model IDs for this provider | - | +| `extra_headers` | object | Additional HTTP headers | - | + +## `provider_runtime` (multi-provider) + +Configure multiple model providers and switch between them at runtime with `/model`. When `provider_runtime` is present, it takes precedence over the legacy `provider` field. + +### Top-level fields + +| Field | Type | Description | Default | +| ------------------ | ------- | ----------------------------------------------------- | ------------------------ | +| `current_provider` | string | The currently active provider ID (e.g. `"deepseek"`) | (first provider in map) | +| `default_provider` | string | Fallback provider ID | same as `current_provider` | +| `default_model` | string | Fallback model ID when a provider has no `model` set | - | +| `allow_fallback` | bool | Allow automatic failover to another provider | `false` | +| `providers` | object | Map of provider ID → provider config (see below) | (required) | +| `health` | object | Health-check settings for provider failover | see below | + +### `providers.` fields + +Each provider entry supports all fields from the legacy `provider` section above, plus: + +| Field | Type | Description | +| ---------- | ------ | ------------------------------------------------------------ | +| `type` | string | `openai-compatible`, `anthropic`, or `gemini` | +| `base_url` | string | API endpoint URL | +| `model` | string | Currently selected model for this provider (updated by `/model`) | +| `models` | array | List of model IDs available for switching. **Required** for `/model` picker to show options. 
| +| `api_key_env` | string | Env var name to read the key from | +| `api_key` | string | API key in plain text (prefer `api_key_env`) | + +### `health` fields + +| Field | Type | Default | Description | +| -------------------------- | ---- | ------- | ------------------------------------------ | +| `fail_threshold` | int | `3` | Consecutive failures before marking unhealthy | +| `recover_probe_sec` | int | `30` | Seconds between recovery probes | +| `recover_success_threshold` | int | `2` | Consecutive successes to mark healthy | +| `window_size` | int | `60` | Rolling window size in seconds for health checks | + +### How model switching works + +1. Define multiple providers under `provider_runtime.providers`, each with a `models` list. +2. Start ByteMind — it uses `current_provider` and that provider's `model`. +3. Type `/model` to open the interactive picker, or `/model /` to switch directly. +4. The config file is updated automatically: `current_provider` and the provider's `model` field are rewritten. + +### Multi-provider example + +```json +{ + "provider_runtime": { + "current_provider": "deepseek", + "default_provider": "deepseek", + "default_model": "deepseek-v4-flash", + "allow_fallback": false, + "providers": { + "deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro"] + }, + "openai": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + "model": "gpt-5.4-mini", + "models": ["gpt-5.4-mini", "gpt-5.4"] + } + }, + "health": { + "fail_threshold": 3, + "recover_probe_sec": 30, + "recover_success_threshold": 2, + "window_size": 60 + } + } +} +``` + +### Adding a new provider + +Edit `config.json` and add a new entry under `provider_runtime.providers`: + +```json +"providers": { + "deepseek": { ... }, + "openai": { ... 
}, + "my-new-provider": { + "type": "openai-compatible", + "base_url": "https://api.my-provider.com/v1", + "api_key_env": "MY_PROVIDER_API_KEY", + "model": "my-model", + "models": ["my-model", "my-other-model"] + } +} +``` + +Then restart ByteMind or use `/model` to see the new provider and its models in the picker. + +:::tip Migrating from legacy `provider` +If your config only has the legacy `provider` field, ByteMind auto-converts it into `provider_runtime` on startup. Switching models with `/model` will persist the selection back to `provider_runtime`. You can also manually restructure your config to the multi-provider format above. +::: + +## Setting API Key via Environment Variables + +Using `api_key_env` is the recommended approach — it keeps secrets out of your config file. However, `export` only sets the variable for the current terminal session and is lost when you close the window. + +### Permanent setup + +**Windows (PowerShell)** — write to user-level registry, survives reboots: +```powershell +[Environment]::SetEnvironmentVariable("DEEPSEEK_API_KEY", "sk-...", "User") +``` +Restart your terminal after running this command. + +**Linux** — add to your shell profile: +```bash +echo 'export DEEPSEEK_API_KEY="sk-..."' >> ~/.bashrc +``` + +**macOS** — add to your shell profile (zsh is the default): +```bash +echo 'export DEEPSEEK_API_KEY="sk-..."' >> ~/.zshrc +``` + +### Temporary setup (current terminal only) + +```bash +# Linux / macOS +export DEEPSEEK_API_KEY="sk-..." + +# Windows PowerShell +$env:DEEPSEEK_API_KEY = "sk-..." +``` + +### Priority when both `api_key` and `api_key_env` are set + +`api_key` (plain text in config) always takes priority over `api_key_env`. The resolution order is: + +1. `api_key` — if non-empty, use it directly +2. `api_key_env` — if set, read from that environment variable +3. 
`BYTEMIND_API_KEY` — final fallback environment variable + +If you have `api_key` in your config and also set `api_key_env`, the environment variable is ignored. Remove `api_key` from the config to use the environment variable instead. ## `approval_policy` @@ -126,13 +275,37 @@ Controls context window management. ## Full Example +### Multi-provider (recommended) + ```json { - "provider": { - "type": "openai-compatible", - "base_url": "https://api.openai.com/v1", - "model": "gpt-4o", - "api_key_env": "OPENAI_API_KEY" + "provider_runtime": { + "current_provider": "deepseek", + "default_provider": "deepseek", + "default_model": "deepseek-v4-flash", + "allow_fallback": false, + "providers": { + "deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro"] + }, + "openai": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + "model": "gpt-5.4-mini", + "models": ["gpt-5.4-mini", "gpt-5.4"] + } + }, + "health": { + "fail_threshold": 3, + "recover_probe_sec": 30, + "recover_success_threshold": 2, + "window_size": 60 + } }, "approval_policy": "on-request", "approval_mode": "interactive", @@ -159,3 +332,26 @@ Controls context window management. 
} } ``` + +### Single provider (legacy) + +```json +{ + "provider": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "model": "gpt-4o", + "api_key_env": "OPENAI_API_KEY" + }, + "approval_policy": "on-request", + "approval_mode": "interactive", + "max_iterations": 32, + "stream": true, + "sandbox_enabled": false, + "context_budget": { + "warning_ratio": 0.85, + "critical_ratio": 0.95, + "max_reactive_retry": 1 + } +} +``` diff --git a/www/usage/provider-setup.md b/www/usage/provider-setup.md index 29201025..fa640765 100644 --- a/www/usage/provider-setup.md +++ b/www/usage/provider-setup.md @@ -2,6 +2,53 @@ ByteMind supports any model provider that exposes an OpenAI-compatible API, plus Anthropic and Gemini native APIs. +## Multi-Provider Setup (Model Switching) + +Configure multiple providers at once and switch between them at runtime with `/model`: + +```json +{ + "provider_runtime": { + "current_provider": "deepseek", + "default_provider": "deepseek", + "default_model": "deepseek-v4-flash", + "providers": { + "deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro"] + }, + "openai": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + "model": "gpt-5.4-mini", + "models": ["gpt-5.4-mini", "gpt-5.4"] + }, + "anthropic": { + "type": "anthropic", + "base_url": "https://api.anthropic.com", + "api_key_env": "ANTHROPIC_API_KEY", + "model": "claude-sonnet-4-20250514", + "models": ["claude-sonnet-4-20250514", "claude-opus-4-20250514"] + } + } + } +} +``` + +| Command | Action | +| ------- | ------ | +| `/model` | Interactive picker with all configured models | +| `/model openai/gpt-5.4` | Switch to GPT-5.4 | +| `/models` | Show current active model and all discovered models | + +The config file is updated automatically after switching. 
See [Config Reference](/reference/config-reference#provider-runtime-multi-provider) for every field. + +## Single Provider Examples (Legacy) + ## OpenAI ```json @@ -48,8 +95,8 @@ ByteMind supports any model provider that exposes an OpenAI-compatible API, plus { "provider": { "type": "openai-compatible", - "base_url": "https://api.deepseek.com/v1", - "model": "deepseek-coder", + "base_url": "https://api.deepseek.com", + "model": "deepseek-v4-flash", "api_key_env": "DEEPSEEK_API_KEY" } } @@ -80,11 +127,55 @@ Always prefer `api_key_env` over a literal `api_key` in config files. This keeps { "provider": { "api_key_env": "MY_API_KEY_VAR" } } ``` +Set the variable **before** starting ByteMind: + + + + +```powershell +# Temporary (current window only): +$env:MY_API_KEY_VAR = "sk-..." + +# Permanent (survives reboots): +[Environment]::SetEnvironmentVariable("MY_API_KEY_VAR", "sk-...", "User") +# Restart terminal after this command. +``` + + + + + +```bash +# Temporary (current window only): +export MY_API_KEY_VAR="sk-..." + +# Permanent: +echo 'export MY_API_KEY_VAR="sk-..."' >> ~/.bashrc +``` + + + + + ```bash +# Temporary (current window only): export MY_API_KEY_VAR="sk-..." + +# Permanent: +echo 'export MY_API_KEY_VAR="sk-..."' >> ~/.zshrc +``` + + + + +```bash bytemind ``` +:::warning `api_key` overrides `api_key_env` +If both `api_key` and `api_key_env` are set, `api_key` (plain text) takes priority. Remove `api_key` from your config to use the environment variable. 
+::: + ## Custom Auth Headers For providers that require non-standard authentication: diff --git a/www/zh/api-key.md b/www/zh/api-key.md index bd58b3e4..12c249aa 100644 --- a/www/zh/api-key.md +++ b/www/zh/api-key.md @@ -130,6 +130,45 @@ DeepSeek 使用 OpenAI 兼容格式,所以填 `openai-compatible`。 DeepSeek 官方文档给出的 OpenAI 格式 Base URL 是 `https://api.deepseek.com`。ByteMind 会在后面拼接默认接口路径 `/chat/completions`,所以这里不要再加 `/chat/completions`。 +**可以用 `api_key_env` 代替 `api_key` 吗?** + +可以,而且更安全。把 `"api_key"` 替换为 `"api_key_env": "DEEPSEEK_API_KEY"`,然后设置环境变量: + + + + +```powershell +# 临时(仅当前窗口有效): +$env:DEEPSEEK_API_KEY = "sk-..." + +# 永久(重启电脑后依然有效): +[Environment]::SetEnvironmentVariable("DEEPSEEK_API_KEY", "sk-...", "User") +# 执行后需重启终端窗口。 +``` + + + + + +```bash +export DEEPSEEK_API_KEY="sk-..." +``` + + + + + +```bash +export DEEPSEEK_API_KEY="sk-..." +``` + + + + +:::warning 不要同时设置 `api_key` 和 `api_key_env` +如果两个都存在,`api_key` 优先,`api_key_env` 会被忽略。二选一即可。 +::: + **模型 ID 可以随便写吗?** 不可以。模型 ID 必须和服务商文档里的名字完全一致。DeepSeek 当前建议从 `deepseek-v4-flash` 开始;如果你需要更高能力,再按官方文档改成 `deepseek-v4-pro`。 diff --git a/www/zh/configuration.md b/www/zh/configuration.md index cb2b651c..b380b2e8 100644 --- a/www/zh/configuration.md +++ b/www/zh/configuration.md @@ -81,6 +81,94 @@ bytemind 设置 `"auto_detect_type": true` 后,ByteMind 会根据 `base_url` 自动推断 Provider 类型,无需手动指定 `type` 字段。 ::: +## 多 Provider 配置(模型切换) + +ByteMind 支持配置多个模型 Provider 并在运行时切换,无需重启。使用 `provider_runtime` 替代旧版 `provider` 字段: + +```json +{ + "provider_runtime": { + "current_provider": "deepseek", + "default_provider": "deepseek", + "default_model": "deepseek-v4-flash", + "allow_fallback": false, + "providers": { + "deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro"] + }, + "openai": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + 
"model": "gpt-5.4-mini", + "models": ["gpt-5.4-mini", "gpt-5.4"] + } + } + } +} +``` + +### 关键点 + +- **`providers..models`** 必须填写 — 这是 `/model` 切换时可选的模型列表。 +- **`providers..model`** 是该 Provider 当前使用的模型,切换后会自动更新。 +- **`api_key_env`** 优于 `api_key` — 避免密钥明文写入配置文件。 + +### 切换模型 + +| 命令 | 作用 | +| ---- | ---- | +| `/model` | 打开交互式选择器,浏览所有已配置的 Provider 和模型 | +| `/model deepseek/deepseek-v4-pro` | 直接切换到 deepseek-v4-pro | +| `/model openai/gpt-5.4` | 直接切换到 GPT-5.4 | +| `/models` | 显示所有已发现的模型和当前使用的模型 | + +切换后配置文件会自动更新 — 无需手动编辑。 + +### 添加新 Provider + +编辑 `config.json`,在 `provider_runtime.providers` 下新增条目: + +```json +"providers": { + "deepseek": { ... }, + "openai": { ... }, + "anthropic": { + "type": "anthropic", + "base_url": "https://api.anthropic.com", + "api_key_env": "ANTHROPIC_API_KEY", + "model": "claude-sonnet-4-20250514", + "models": ["claude-sonnet-4-20250514", "claude-opus-4-20250514"] + } +} +``` + +重启 ByteMind 或使用 `/model` 即可在切换器中看到新增的 Provider。 + +### 为已有 Provider 添加新模型 + +编辑对应 Provider 的 `models` 数组: + +```json +"deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro", "deepseek-v4-flash-2"] +} +``` + +保存后使用 `/model` 即可在切换器中看到新增的模型。 + +:::tip 从旧版 `provider` 迁移 +如果配置文件中只有 `provider`(单 Provider),ByteMind 启动时会自动将其转换为 `provider_runtime`。使用 `/model` 切换后,选择结果会保存为 `provider_runtime` 格式。你也可以手动将配置改为上面的多 Provider 格式。 +::: + ## 审批策略 `approval_policy` 控制高风险工具(写文件、执行 Shell 命令等)何时请求确认: diff --git a/www/zh/examples/user-stories.md b/www/zh/examples/user-stories.md new file mode 100644 index 00000000..24239754 --- /dev/null +++ b/www/zh/examples/user-stories.md @@ -0,0 +1,453 @@ +# ByteMind 用户故事 + +四个场景覆盖 ByteMind 全部功能点。每个故事末尾标注了该故事覆盖的功能模块。 + +--- + +## 故事一:设计阶段 — "为新模块做技术方案" + +> **角色**:后端工程师小张,刚接手一个 Go 微服务项目,需要为"消息推送模块"输出一份技术方案。 + +### 1. 
安装与上手 + +小张在 Windows 上用 PowerShell 一键安装 ByteMind: + +```powershell +iwr -useb https://raw.githubusercontent.com/1024XEngineer/bytemind/main/scripts/install.ps1 | iex +``` + +安装完成后进入项目目录,第一次启动看到了 **启动引导页**,提示他复制示例配置并填入 API Key。他编辑 `.bytemind/config.json`,配置了 OpenAI-compatible provider,顺手加了 Anthropic 和 Gemini 作为备用 provider,并开启 `auto_detect_type` 让系统自动识别 provider 类型。他通过环境变量 `BYTEMIND_HOME` 指定了配置目录。 + +```json +{ + "provider": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "model": "gpt-5.4-mini", + "api_key": "sk-xxx" + }, + "provider_runtime": { + "providers": [ + { "id": "anthropic", "type": "anthropic", "model": "claude-sonnet-4-20250514", "api_key": "sk-ant-xxx" }, + { "id": "gemini", "type": "gemini", "model": "gemini-2.5-pro", "api_key": "xxx" } + ] + }, + "stream": true, + "max_iterations": 64 +} +``` + +### 2. 启动 Plan 模式,探索代码库 + +小张启动 TUI 交互模式: + +```bash +bytemind chat +``` + +进入 TUI 后,他先用 `/new` 新建会话,然后通过 `/models` 命令切换到 Claude Sonnet 4 模型——ByteMind 自动从 Anthropic provider 路由过去。他按 `Tab` 键打开子智能体面板,了解到有 `explorer`、`general`、`review` 三个内置子智能体。 + +他输入 `@explorer 帮我梳理项目中与消息推送相关的所有代码文件和模块依赖`,ByteMind 自动补全子智能体名称,派发 `explorer` 子智能体去搜索代码库。子智能体通过 `list_files`、`search_text`、`read_file` 等工具遍历项目结构,返回了一份完整的模块依赖报告。 + +接着小张用 `web_search` 调研业界消息推送的最佳实践,用 `web_fetch` 抓取了几篇技术文章的详细内容。 + +### 3. Plan 模式:从探索到方案 + +小张输入 `/plan` 进入 Plan 模式,描述需求: + +> "我需要为这个项目设计一个消息推送模块,支持 APNs 和 FCM 双通道,请帮我做技术方案。" + +ByteMind 进入 Plan 模式的阶段流转: +- **explore**:通过 `search_text` 定位现有通知相关代码,`read_file` 了解当前架构风格 +- **clarify**:追问了几个关键问题(推送优先级策略、失败重试机制、是否需要本地消息队列) +- **draft**:生成方案初稿,包含架构图描述、数据流设计、接口定义 +- **converge_ready**:方案待小张确认 + +整个过程展示在 **Plan 面板**中,小张可以看到每个步骤的状态(pending/in_progress/completed/blocked)和风险等级标识(low/medium/high)。TUI 界面同时展示**上下文窗口使用量**,当接近 85% 告警线时触发了 warning 提示。 + +### 4. 
加载 Skill,引入 RFC 模板 + +小张激活 `write-rfc` Skill: + +``` +/skill write-rfc +``` + +Skill 加载后,系统提示词被替换为 RFC 写作模板。他继续对话,ByteMind 按照 RFC 格式输出完整的技术方案文档,小张确认后方案进入 `approved_to_build` 阶段,Plan 模式自动将方案步骤写入执行计划。 + +### 5. 持久化与收尾 + +小张退出前用 `/sessions` 查看历史会话列表,确认方案会话已自动持久化。他配置了桌面通知,关闭终端后收到了审批请求的通知提醒。 + +--- + +**本故事覆盖功能**: + +| 模块 | 功能点 | +|------|--------| +| 运行模式 | `chat`/`tui` 交互模式, `install` 安装 | +| Provider | OpenAI-compatible, Anthropic, Gemini 三适配; 多 Provider 注册路由; 模型动态切换; `auto_detect_type` | +| TUI | Bubble Tea 全功能终端; 启动引导页; `/new` `/models` `/sessions` `/plan` 命令; 子智能体面板; Skill 面板; Plan 面板; 上下文窗口可视化; @mentions 补全; 命令面板 | +| Plan 模式 | explore→clarify→draft→converge_ready→approved_to_build 阶段流转; 步骤状态跟踪; 风险等级; Plan 面板渲染 | +| 工具 | `list_files`, `read_file`, `search_text`, `web_fetch`, `web_search` | +| 子智能体 | `explorer` 代码探索; `delegate_subagent` 委托执行; builtin/user/project 三级管理 | +| Skills | `write-rfc`; 三级 scope (builtin/user/project); Skill 激活/清除 | +| 会话 | JSONL 持久化; 会话列表/恢复; 事件日志 | +| 上下文 | 上下文窗口预算管理; warning/critical 告警 | +| 通知 | 桌面通知 (审批/完成/失败); 通知冷却时间 | +| 配置 | JSON 配置; 环境变量覆盖 (`BYTEMIND_HOME`); `provider_runtime` 多 provider 配置 | + +--- + +## 故事二:开发阶段 — "实现消息推送模块" + +> **角色**:小张确认方案后,切换到 Build 模式开始写代码。 + +### 1. 切换 Build 模式,全自动执行 + +小张在 TUI 中恢复上次 Plan 会话(`/resume `),然后切换到 Build 模式直接开始实现。他通过 `/models` 切换到 `deepseek-v4-pro` 以降低长任务成本。输入: + +> "按照刚才的方案,帮我实现消息推送模块,包括 APNs 和 FCM 两个 provider、消息队列、重试逻辑。" + +Build 模式下 ByteMind 直接开始干活,**流式输出**思考过程和工具调用结果,TUI 界面用不同颜色区分 thinking 和 assistant 内容。 + +### 2. 高强度工具调用 + +ByteMind 自动编排工具调用序列: +- `write_file` 创建 `push/provider.go`、`push/apns.go`、`push/fcm.go`、`push/queue.go`、`push/retry.go` 等文件 +- `replace_in_file` 在现有模块中注入依赖 +- `apply_patch` 修复编译错误 +- `run_shell` 执行 `go mod tidy`、`go build` 等命令 + +TUI 的 **Markdown 渲染器**将工具输出格式化展示,diff 内容带**语法高亮**,`run_shell` 的执行结果实时流式显示在终端中。 + +### 3. 
安全审批与沙箱 + +当 ByteMind 尝试执行 `go build` 时,由于配置了 `approval_policy: "on-request"`,Shell 命令触发了**审批流程**。TUI 弹出审批对话框,显示命令内容和风险评估。小张确认后继续。 + +小张之前配置了: + +```json +{ + "approval_policy": "on-request", + "sandbox_enabled": true, + "system_sandbox_mode": "non-blocking", + "writable_roots": ["/home/user/project"], + "exec_allowlist": [ + { "command": "go", "args_pattern": ["build", "test", "mod", "vet", "fmt"] } + ], + "network_allowlist": [ + { "host": "api.github.com", "port": "443" } + ] +} +``` + +- 文件沙箱保证工具只能读写 `writable_roots` 范围内的文件 +- 命令白名单限制只能执行 `go build/test/mod/vet/fmt` +- 网络沙箱限制只能访问 `api.github.com` + +当 ByteMind 尝试读取 `/etc/passwd` 时,**沙箱**直接 `deny` 并返回 `fs_out_of_scope`;尝试 `curl` 外网地址时被网络沙箱拦截,返回 `network_not_allowed`。 + +### 4. Provider 故障自动切换 + +实现过程中,OpenAI provider 突然返回 503 错误。ByteMind 的 **Provider 路由**检测到主 provider 不健康,自动通过**健康检查**切换到备用 Anthropic provider,任务无缝继续。小张在 TUI 的状态栏看到了 provider 切换提示。 + +### 5. 子智能体并行加速 + +编译时发现缺少 protobuf 定义,小张手动输入: + +> "帮我生成 push.proto 文件,然后用 protoc 编译" + +同时他派发 `general` 子智能体去写单元测试: + +``` +@general 帮我给 push/ 目录下所有文件写单元测试,覆盖正常路径和边界情况 +``` + +子智能体在**后台运行**,通过 `task_output` 查看结果。TUI 底部的状态栏显示后台任务进度,完成后桌面弹出通知。 + +### 6. 预算控制与上下文压缩 + +实现过程中已跑了 50+ 轮工具调用,接近 `max_iterations: 64`。ByteMind 触发了**阶段性总结**(stop summary),归纳已完成的工作和剩余待办项。同时**上下文压缩**自动触发,将较早的对话压缩为摘要,释放上下文窗口空间。**重复调用检测**发现了两次相同的 `go build` 调用并及时终止。 + +### 7. 
Token 用量监控 + +TUI 右下角的 **Token 用量实时监控**组件显示了本轮会话的 token 消耗(输入/输出/总计),小张设置了 `alert_threshold: 100000`,当日总 token 接近阈值时弹出了告警。用量数据自动写入 SQLite 数据库(`database_driver: "sqlite"`)。 + +--- + +**本故事覆盖功能**: + +| 模块 | 功能点 | +|------|--------| +| 运行模式 | Build 模式; `--yolo` 全自动; `/resume` 会话恢复; `run` 单次任务 | +| Provider | 健康检查; 故障自动切换; Provider 路由回退; 模型切换 | +| 对话引擎 | 流式输出; 多轮对话; 工具调用循环; max_iterations 预算控制; stop summary 阶段总结; 重复调用检测; 上下文压缩 | +| 工具 | `write_file`, `replace_in_file`, `apply_patch`, `run_shell`; 工具执行审计 | +| TUI | Markdown 渲染; diff 语法高亮; 审批对话框; 后台任务状态栏; Token 用量实时监控; 桌面通知; 鼠标文本选择; 剪贴板粘贴; 图片输入 | +| 审批安全 | on-request 分级审批; Shell 命令审批; 文件沙箱 (writable_roots); 网络沙箱 (network_allowlist); 命令白名单 (exec_allowlist); 沙箱 escalate/deny/allow 决策 | +| 沙箱 | FS/Exec/Network 三级拦截; 审批通道 | +| 子智能体 | `general` 子智能体; 子智能体并行执行; `task_output` 结果查看 | +| 后台任务 | 后台/前台任务; 任务超时; worktree 隔离执行 | +| Token | 实时监控; SQLite 持久化; 用量告警; 多存储后端 | +| 上下文 | 自动压缩; 窗口预算 | + +--- + +## 故事三:调试阶段 — "排查线上推送失败问题" + +> **角色**:小张实现的推送模块上线后出现间歇性推送失败,需要定位根因。 + +### 1. 快速定位问题代码 + +小张打开终端启动 TUI,恢复之前开发推送模块的会话继续对话: + +```bash +bytemind chat +``` + +``` +/resume push-module +``` + +> "线上出现间歇性推送失败,错误日志显示 'connection timeout after 30s',帮我排查根因。" + +ByteMind 通过 `search_text` 搜索代码中所有 timeout 相关配置,`read_file` 读取关键文件定位到 `push/apns.go` 中的 HTTP Client 超时设置为硬编码的 30s。 + +### 2. 深入排查:Shell + Web 联动 + +ByteMind 用 `run_shell` 执行 `go test -v -run TestAPNsRetry ./push/...` 查看测试覆盖情况,发现重试逻辑的单元测试没有覆盖 timeout 场景。 + +接着用 `web_search` 搜索 "APNs timeout best practice 2026",用 `web_fetch` 读取 Apple 官方文档中关于 connection timeout 的建议。 + +### 3. 
激活 Bug Investigation Skill + +小张激活内置的 `bug-investigation` Skill: + +``` +/skill bug-investigation +``` + +Skill 替换系统提示词为 Bug 调查专用模板,引导 ByteMind 从以下维度系统排查: +- 问题复现条件 +- 影响范围(影响多少用户/设备) +- 代码层面根因 +- 配置/环境因素 +- 修复方案与回归验证 + +ByteMind 自动排查了: +- `push/retry.go` 的重试策略是否对 timeout 场景生效 +- `push/fcm.go` 是否也有同样的硬编码问题 +- `config/config.go` 中是否有可配置的超时参数 + +最终定位到两个问题:HTTP Client 超时硬编码 + 重试逻辑对 context.DeadlineExceeded 未正确捕获。 + +### 4. 修复与验证 + +> "把超时改成可配置的,默认值 60s;修复 retry.go 中对 context.DeadlineExceeded 的处理。" + +ByteMind 用 `replace_in_file` 修改了相关代码,用 `run_shell` 跑了 `go vet ./push/...`、`go test -race ./push/...` 验证。 + +小张在 TUI 中通过鼠标**拖拽选择**了一段 diff 输出,`Ctrl+C` 复制后贴到代码审查文档里。diff 的**语法高亮**让改动一目了然。 + +--- + +**本故事覆盖功能**: + +| 模块 | 功能点 | +|------|--------| +| 运行模式 | `chat` TUI 交互; `/resume` 会话恢复 | +| 工具 | `search_text`, `read_file`, `run_shell`, `replace_in_file`, `web_search`, `web_fetch` | +| Skills | `bug-investigation` Skill 激活/清除; Skill 提示词替换 | +| TUI | diff 语法高亮; 鼠标拖拽选择; 剪贴板复制; 终端流式输出 | +| 对话引擎 | 多轮对话排查; 流式输出 | +| 安全 | `run_shell` 命令白名单审批 | +| 会话 | 历史会话恢复 | + +--- + +## 故事四:代码审查 — "Review 推送模块 PR" + +> **角色**:小张的同事小王负责 Review 这次改动,他用 ByteMind 进行深度代码审查。 + +### 1. 启动审查 + +小王拉取 PR 分支后在项目目录启动 ByteMind: + +```bash +bytemind chat +``` + +> "帮我 review 当前分支相对于 main 的所有改动,重点关注并发安全、错误处理、资源泄漏。" + +### 2. 激活 Review Skill + Review 子智能体 + +小王先激活 `review` Skill: + +``` +/skill review +``` + +Review Skill 提供了结构化的审查框架(安全性、性能、可维护性、测试覆盖等维度)。 + +同时他派发 `review` 内置子智能体: + +``` +@review 审查 push/ 目录下所有文件,检查并发安全问题 +``` + +子智能体通过 `read_file`、`search_text` 检查了 mutex 使用、goroutine 泄漏、channel 关闭等问题。返回结果指出了 `push/queue.go` 中一处 channel 未正确关闭可能导致 goroutine 泄漏的问题。 + +### 3. 
逐文件审查与 MCP 集成 + +小王之前通过 `bytemind mcp add` 接入了团队的代码质量 MCP 服务器: + +```bash +bytemind mcp add my-linter -- node ./linter-mcp-server.js +bytemind mcp list +bytemind mcp health my-linter +``` + +在 TUI 中,MCP 工具自动注册到 ByteMind,审查时额外调用了 MCP 提供的静态分析能力。小王可以在 **MCP 管理面板**中查看所有 MCP 服务器的健康状态。 + +ByteMind 逐文件审查: +- `push/apns.go` — `read_file` 检查 HTTP Client 连接池配置 +- `push/fcm.go` — `search_text` 搜索 error handling 模式 +- `push/queue.go` — 重点审查 channel 生命周期 +- `push/retry.go` — 检查 backoff 策略和 context 取消传播 + +### 4. Diff 预览与总结 + +小王用 diff 预览工具查看改动: + +> "展示当前分支所有改动的 diff 摘要。" + +ByteMind 用 `diff_preview` 工具生成变更摘要。TUI 的 **diff 渲染器**将增删改分别用绿色/红色/黄色高亮展示。 + +最终 ByteMind 输出了一份结构化 Review 报告,包含: +- 严重问题(goroutine 泄漏)→ 风险等级 high +- 建议改进(超时配置应加校验)→ 风险等级 medium +- 测试覆盖分析(timeout 场景已覆盖)→ 通过 + +整个审查会话自动**持久化**为 JSONL,小王用 `/sessions` 可以随时回溯。他想看看这次审查消耗了多少 token,通过 `/session` 查看当前会话的消息统计和 token 消耗。 + +--- + +**本故事覆盖功能**: + +| 模块 | 功能点 | +|------|--------| +| 运行模式 | `bytemind mcp add/list/health` MCP 管理命令 | +| TUI | MCP 管理面板; diff 渲染器 (绿/红/黄); 会话消息统计 | +| 工具 | `read_file`, `search_text`, `diff_preview` | +| Skills | `review` Skill 结构化审查框架 | +| 子智能体 | `review` 子智能体; builtin 内置定义 | +| MCP | MCP 服务器增删查; 健康检查; MCP 工具自动注册; MCP 面板 | +| 扩展 | Extensions 生命周期管理; MCP adapter | +| 会话 | JSONL 持久化; 会话列表; 消息统计 | +| Token | 会话级 token 消耗展示 | + +--- + +## 功能覆盖率总览 + +以下按模块列出所有功能点及其在四个故事中的分布: + +| 模块 | 功能点 | 故事一(设计) | 故事二(开发) | 故事三(Debug) | 故事四(Review) | +|------|--------|:---:|:---:|:---:|:---:| +| **运行模式** | `chat`/`tui` 交互 | ✅ | ✅ | ✅ | ✅ | +| | `run` 单次任务 | | ✅ | | | +| | `worker` 后台进程 | | ✅ | | | +| | `install` 安装 | ✅ | | | | +| | `mcp` MCP 管理 | | | | ✅ | +| | `version` 版本 | | | | | +| | `--yolo` 全自动 | | ✅ | | | +| **Provider** | OpenAI-compatible 适配 | ✅ | | | | +| | Anthropic 适配 | ✅ | | | | +| | Gemini 适配 | ✅ | | | | +| | 多 Provider 注册与路由 | ✅ | ✅ | | | +| | 健康检查 + 故障切换 | | ✅ | | | +| | 模型列表查询 | ✅ | | | | +| | `auto_detect_type` | ✅ | | | | +| **对话引擎** | 多轮对话 | ✅ | ✅ | ✅ | | +| | 流式输出 | ✅ | ✅ | ✅ | | +| | Build 
模式 | | ✅ | | | +| | Plan 模式 | ✅ | | | | +| | 上下文压缩 | | ✅ | | | +| | max_iterations 预算 | | ✅ | | | +| | 重复调用检测 | | ✅ | | | +| | stop summary | | ✅ | | | +| | 子智能体委托 | ✅ | ✅ | | ✅ | +| **工具** | `list_files` | ✅ | | | ✅ | +| | `read_file` | ✅ | | ✅ | ✅ | +| | `search_text` | ✅ | | ✅ | ✅ | +| | `write_file` | | ✅ | | | +| | `replace_in_file` | | ✅ | ✅ | | +| | `apply_patch` | | ✅ | | | +| | `run_shell` | | ✅ | ✅ | | +| | `web_fetch` | ✅ | | ✅ | | +| | `web_search` | ✅ | | ✅ | | +| | `delegate_subagent` | ✅ | ✅ | | | +| | `task_output` / `task_stop` | | ✅ | | | +| | `diff_preview` | | | | ✅ | +| **TUI** | Bubble Tea 终端 UI | ✅ | ✅ | ✅ | ✅ | +| | Markdown 渲染 | | ✅ | | | +| | diff 语法高亮 | | ✅ | ✅ | ✅ | +| | 鼠标支持 (选择/拖拽/滚动) | | ✅ | ✅ | | +| | 剪贴板粘贴 | | ✅ | ✅ | | +| | 图片输入 | | ✅ | | | +| | 会话管理面板 | ✅ | | | ✅ | +| | 模型切换 (`/models`) | ✅ | ✅ | | | +| | 子智能体面板 | ✅ | | | | +| | Skill 面板 | ✅ | | | | +| | MCP 面板 | | | | ✅ | +| | Plan 面板 | ✅ | | | | +| | Token 用量监控 | | ✅ | | | +| | 命令面板/调色板 | ✅ | | | | +| | @mentions 自动补全 | ✅ | | | | +| | `/` 命令补全 | ✅ | | | | +| | 桌面通知 | ✅ | ✅ | | | +| | 启动引导页 | ✅ | | | | +| | 上下文窗口可视化 | ✅ | | | | +| | 审批对话框 | | ✅ | | | +| | 后台任务状态栏 | | ✅ | | | +| | 增强输入框 (多行) | ✅ | | | | +| **审批安全** | on-request / away / full_access | | ✅ | | | +| | Shell 命令审批 | | ✅ | ✅ | | +| | 文件沙箱 (writable_roots) | | ✅ | | | +| | 网络沙箱 (network_allowlist) | | ✅ | | | +| | 命令白名单 (exec_allowlist) | | ✅ | ✅ | | +| | worktree 隔离 | | ✅ | | | +| **扩展系统** | MCP 服务器增删查 | | | | ✅ | +| | MCP 健康检查 | | | | ✅ | +| | Skills (6 个内置) | ✅ | | ✅ | ✅ | +| | Skill 三级 scope | ✅ | | | | +| | 子智能体 (3 个内置) | ✅ | ✅ | | ✅ | +| | 子智能体三级 scope | ✅ | | | | +| **Plan 模式** | 阶段流转 | ✅ | | | | +| | 步骤状态跟踪 | ✅ | | | | +| | 风险等级 | ✅ | | | | +| | Plan 面板渲染 | ✅ | | | | +| **会话** | JSONL 持久化 | ✅ | ✅ | | ✅ | +| | 会话列表/恢复 | ✅ | | ✅ | ✅ | +| | 事件日志 | ✅ | | | | +| | 会话删除/清理 | | | | | +| | 消息统计 | | | | ✅ | +| **上下文** | 窗口预算管理 | ✅ | ✅ | | | +| | warning/critical 告警 | ✅ | | | | +| | 自动压缩 | | ✅ | | | +| **Token** | 用量追踪 | | 
✅ | | ✅ | +| | 多后端 (文件/DB/内存) | | ✅ | | | +| | 用量告警 | | ✅ | | | +| | 实时监控 | | ✅ | | | +| **后台任务** | 后台/前台执行 | | ✅ | | | +| | 超时控制 | | ✅ | | | +| | 重试 | | ✅ | | | +| | worktree 隔离 | | ✅ | | | +| **通知** | 桌面通知 | ✅ | ✅ | | | +| | 审批/完成/失败通知 | ✅ | ✅ | | | +| | 冷却时间 | ✅ | | | | +| **配置** | JSON 配置文件 | ✅ | | | | +| | 环境变量覆盖 | ✅ | | | | +| | `provider_runtime` | ✅ | | | | +| | 更新检查 | | | | | diff --git a/www/zh/reference/config-reference.md b/www/zh/reference/config-reference.md index 85fdb660..ed1ec339 100644 --- a/www/zh/reference/config-reference.md +++ b/www/zh/reference/config-reference.md @@ -4,21 +4,170 @@ 可用示例参考 [`config.example.json`](https://github.com/1024XEngineer/bytemind/blob/main/config.example.json)。 -## `provider` +## `provider`(单 Provider,兼容旧配置) -模型 Provider 配置。 +单一模型 Provider 配置。如需配置多个 Provider 并在运行时切换,建议使用下方的 `provider_runtime`。 | 字段 | 类型 | 说明 | 默认值 | | ------------------- | ------ | ---------------------------------------- | --------------------------- | | `type` | string | `openai-compatible`、`anthropic` 或 `gemini` | `openai-compatible` | | `base_url` | string | API 端点 URL | `https://api.openai.com/v1` | | `model` | string | 使用的模型 ID | `gpt-5.4-mini` | -| `api_key` | string | API 密钥(明文,建议改用 `api_key_env`) | — | -| `api_key_env` | string | 从该环境变量读取 API 密钥 | `BYTEMIND_API_KEY` | +| `api_key` | string | API 密钥明文 — 方便但会把密钥写入文件 | — | +| `api_key_env` | string | 从该环境变量读取 API 密钥。**当 `api_key` 和 `api_key_env` 同时存在时,`api_key` 优先。** | `BYTEMIND_API_KEY` | | `anthropic_version` | string | Anthropic API 版本头 | `2023-06-01` | | `auth_header` | string | 自定义鉴权头名称 | `Authorization` | | `auth_scheme` | string | 鉴权前缀(如 `Bearer`) | `Bearer` | | `auto_detect_type` | bool | 根据 `base_url` 自动推断 Provider 类型 | `false` | +| `family` | string | Provider 系列标签(用于显示) | — | +| `api_path` | string | 自定义 API 路径覆盖 | — | +| `models` | array | 该 Provider 可用的模型 ID 列表 | — | +| `extra_headers` | object | 额外的 HTTP 请求头 | — | + +## `provider_runtime`(多 Provider) + +配置多个模型 Provider,并在运行时通过 
`/model` 命令切换。当 `provider_runtime` 存在时,优先级高于旧版 `provider` 字段。
+
+### 顶层字段
+
+| 字段 | 类型 | 说明 | 默认值 |
+| ----------------- | ------ | ------------------------------------------------------ | ------------------------- |
+| `current_provider` | string | 当前激活的 Provider ID(如 `"deepseek"`) | (providers map 中的第一个) |
+| `default_provider` | string | 兜底 Provider ID | 同 `current_provider` |
+| `default_model` | string | 兜底模型 ID,当 Provider 内未设置 `model` 时使用 | — |
+| `allow_fallback` | bool | 是否允许自动故障转移至其他 Provider | `false` |
+| `providers` | object | Provider ID → Provider 配置的映射表(见下方) | (必填) |
+| `health` | object | 故障转移健康检查配置(见下方) | 见下方 |
+
+### `providers.<id>` 字段
+
+每个 Provider 配置支持上述旧版 `provider` 的全部字段,特别注意:
+
+| 字段 | 类型 | 说明 |
+| ----------- | ------ | ----------------------------------------------------------- |
+| `type` | string | `openai-compatible`、`anthropic` 或 `gemini` |
+| `base_url` | string | API 端点 URL |
+| `model` | string | 该 Provider 当前选用的模型(`/model` 切换时会自动更新) |
+| `models` | array | 可切换的模型 ID 列表。**必须填写**,否则 `/model` 选择器不会显示可选项 |
+| `api_key_env` | string | 从该环境变量读取 API 密钥 |
+| `api_key` | string | API 密钥明文(建议使用 `api_key_env`) |
+
+### `health` 字段
+
+| 字段 | 类型 | 默认值 | 说明 |
+| -------------------------- | ---- | ------ | -------------------------- |
+| `fail_threshold` | int | `3` | 连续失败多少次后标记为不健康 |
+| `recover_probe_sec` | int | `30` | 恢复探测间隔(秒) |
+| `recover_success_threshold` | int | `2` | 连续成功多少次后标记为健康 |
+| `window_size` | int | `60` | 滚动窗口大小(秒) |
+
+### 模型切换流程
+
+1. 在 `provider_runtime.providers` 下配置多个 Provider,每个都写好 `models` 列表。
+2. 启动 ByteMind — 默认使用 `current_provider` 及其对应的 `model`。
+3. 输入 `/model` 打开交互式选择器,或输入 `/model <provider>/<model>` 直接切换。
+4. 
配置文件会被自动更新:`current_provider` 和对应 Provider 的 `model` 字段会被重写。 + +### 多 Provider 示例 + +```json +{ + "provider_runtime": { + "current_provider": "deepseek", + "default_provider": "deepseek", + "default_model": "deepseek-v4-flash", + "allow_fallback": false, + "providers": { + "deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro"] + }, + "openai": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + "model": "gpt-5.4-mini", + "models": ["gpt-5.4-mini", "gpt-5.4"] + } + }, + "health": { + "fail_threshold": 3, + "recover_probe_sec": 30, + "recover_success_threshold": 2, + "window_size": 60 + } + } +} +``` + +### 添加新 Provider + +编辑 `config.json`,在 `provider_runtime.providers` 下新增一个条目: + +```json +"providers": { + "deepseek": { ... }, + "openai": { ... }, + "my-new-provider": { + "type": "openai-compatible", + "base_url": "https://api.my-provider.com/v1", + "api_key_env": "MY_PROVIDER_API_KEY", + "model": "my-model", + "models": ["my-model", "my-other-model"] + } +} +``` + +重启 ByteMind 或使用 `/model` 即可在切换器中看到新增的 Provider 及其模型。 + +:::tip 从旧版 `provider` 迁移 +如果配置文件中只有旧版的 `provider` 字段,ByteMind 启动时会自动将其转换为 `provider_runtime`。使用 `/model` 切换模型后,选择结果会持久化到 `provider_runtime`。你也可以手动将配置改为上面的多 Provider 格式。 +::: + +## 通过环境变量设置 API Key + +推荐使用 `api_key_env` — 避免密钥写入配置文件。但 `export` 只在当前终端窗口临时生效,关闭窗口后丢失。 + +### 永久设置 + +**Windows (PowerShell)** — 写入用户级注册表,重启电脑后依然有效: +```powershell +[Environment]::SetEnvironmentVariable("DEEPSEEK_API_KEY", "sk-...", "User") +``` +执行后需重启终端窗口。 + +**Linux** — 写入 shell 配置文件: +```bash +echo 'export DEEPSEEK_API_KEY="sk-..."' >> ~/.bashrc +``` + +**macOS** — 写入 zsh 配置文件(macOS 默认 shell): +```bash +echo 'export DEEPSEEK_API_KEY="sk-..."' >> ~/.zshrc +``` + +### 临时设置(仅当前终端有效) + +```bash +# Linux / macOS +export DEEPSEEK_API_KEY="sk-..." 
+ +# Windows PowerShell +$env:DEEPSEEK_API_KEY = "sk-..." +``` + +### `api_key` 与 `api_key_env` 同时存在时的优先级 + +`api_key`(明文)始终优先于 `api_key_env`。解析顺序为: + +1. `api_key` — 非空则直接使用 +2. `api_key_env` — 从指定的环境变量读取 +3. `BYTEMIND_API_KEY` — 最终兜底环境变量 + +如果 config 中同时写了 `api_key` 和 `api_key_env`,环境变量会被忽略。想用环境变量需先删掉 `api_key` 字段。 ## `approval_policy` @@ -128,13 +277,37 @@ ## 完整示例 +### 多 Provider(推荐) + ```json { - "provider": { - "type": "openai-compatible", - "base_url": "https://api.openai.com/v1", - "model": "gpt-4o", - "api_key_env": "OPENAI_API_KEY" + "provider_runtime": { + "current_provider": "deepseek", + "default_provider": "deepseek", + "default_model": "deepseek-v4-flash", + "allow_fallback": false, + "providers": { + "deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro"] + }, + "openai": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + "model": "gpt-5.4-mini", + "models": ["gpt-5.4-mini", "gpt-5.4"] + } + }, + "health": { + "fail_threshold": 3, + "recover_probe_sec": 30, + "recover_success_threshold": 2, + "window_size": 60 + } }, "approval_policy": "on-request", "approval_mode": "interactive", @@ -161,3 +334,26 @@ } } ``` + +### 单 Provider(兼容旧版) + +```json +{ + "provider": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "model": "gpt-4o", + "api_key_env": "OPENAI_API_KEY" + }, + "approval_policy": "on-request", + "approval_mode": "interactive", + "max_iterations": 32, + "stream": true, + "sandbox_enabled": false, + "context_budget": { + "warning_ratio": 0.85, + "critical_ratio": 0.95, + "max_reactive_retry": 1 + } +} +``` diff --git a/www/zh/usage/provider-setup.md b/www/zh/usage/provider-setup.md index 7204312d..ae28f5c2 100644 --- a/www/zh/usage/provider-setup.md +++ b/www/zh/usage/provider-setup.md @@ -2,6 +2,53 @@ 
ByteMind 支持任何兼容 OpenAI API 的服务,以及 Anthropic 和 Gemini 原生 API。 +## 多 Provider 配置(模型切换) + +一次性配置多个 Provider,运行时通过 `/model` 切换: + +```json +{ + "provider_runtime": { + "current_provider": "deepseek", + "default_provider": "deepseek", + "default_model": "deepseek-v4-flash", + "providers": { + "deepseek": { + "type": "openai-compatible", + "base_url": "https://api.deepseek.com", + "api_key_env": "DEEPSEEK_API_KEY", + "model": "deepseek-v4-flash", + "models": ["deepseek-v4-flash", "deepseek-v4-pro"] + }, + "openai": { + "type": "openai-compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + "model": "gpt-5.4-mini", + "models": ["gpt-5.4-mini", "gpt-5.4"] + }, + "anthropic": { + "type": "anthropic", + "base_url": "https://api.anthropic.com", + "api_key_env": "ANTHROPIC_API_KEY", + "model": "claude-sonnet-4-20250514", + "models": ["claude-sonnet-4-20250514", "claude-opus-4-20250514"] + } + } + } +} +``` + +| 命令 | 作用 | +| ---- | ---- | +| `/model` | 打开交互式选择器,浏览所有已配置的模型 | +| `/model openai/gpt-5.4` | 直接切换到 GPT-5.4 | +| `/models` | 查看当前模型和所有已发现模型 | + +切换后配置文件会自动更新。完整字段参考见[配置参考](/zh/reference/config-reference#provider-runtime-多-provider)。 + +## 单 Provider 示例(兼容旧版) + ## OpenAI ```json @@ -82,11 +129,55 @@ ByteMind 支持任何兼容 OpenAI API 的服务,以及 Anthropic 和 Gemini { "provider": { "api_key_env": "MY_API_KEY_VAR" } } ``` +**在启动 ByteMind 之前**设置环境变量: + + + + +```powershell +# 临时(仅当前窗口有效): +$env:MY_API_KEY_VAR = "sk-..." + +# 永久(重启电脑后依然有效): +[Environment]::SetEnvironmentVariable("MY_API_KEY_VAR", "sk-...", "User") +# 执行后需重启终端窗口。 +``` + + + + + +```bash +# 临时(仅当前窗口有效): +export MY_API_KEY_VAR="sk-..." + +# 永久: +echo 'export MY_API_KEY_VAR="sk-..."' >> ~/.bashrc +``` + + + + + ```bash +# 临时(仅当前窗口有效): export MY_API_KEY_VAR="sk-..." 
+ +# 永久: +echo 'export MY_API_KEY_VAR="sk-..."' >> ~/.zshrc +``` + + + + +```bash bytemind ``` +:::warning `api_key` 会覆盖 `api_key_env` +如果同时写了 `api_key` 和 `api_key_env`,`api_key`(明文)优先。想用环境变量需先删掉 config 里的 `api_key` 字段。 +::: + ## 自定义鉴权头 对于需要非标准鉴权的网关或内部服务: