diff --git "a/2.\346\267\261\345\205\245LLM\346\250\241\345\236\213\345\267\245\347\250\213\350\210\207LLM\351\201\213\347\266\255/11.\347\217\276\344\273\243\345\260\215\351\275\212\346\226\271\346\263\2252024-2025/README.md" "b/2.\346\267\261\345\205\245LLM\346\250\241\345\236\213\345\267\245\347\250\213\350\210\207LLM\351\201\213\347\266\255/11.\347\217\276\344\273\243\345\260\215\351\275\212\346\226\271\346\263\2252024-2025/README.md" index 483c28d..53c194e 100644 --- "a/2.\346\267\261\345\205\245LLM\346\250\241\345\236\213\345\267\245\347\250\213\350\210\207LLM\351\201\213\347\266\255/11.\347\217\276\344\273\243\345\260\215\351\275\212\346\226\271\346\263\2252024-2025/README.md" +++ "b/2.\346\267\261\345\205\245LLM\346\250\241\345\236\213\345\267\245\347\250\213\350\210\207LLM\351\201\213\347\266\255/11.\347\217\276\344\273\243\345\260\215\351\275\212\346\226\271\346\263\2252024-2025/README.md" @@ -1,861 +1,170 @@ -# 現代LLM對齊方法 2024-2025 +# 現代 LLM 對齊方法 2024-2025 -> **最後更新**: 2025-12-14 -> **狀態**: 涵蓋RLHF之後的新一代對齊技術 +> **最後更新**: 2026-01 +> **涵蓋範圍**: RLHF、DPO、IPO、SimPO、KTO、ORPO 等 +> **難度**: 進階 --- -## 📋 目錄 +## 目錄 -1. [對齊技術演進](#1-對齊技術演進) -2. [DPO: Direct Preference Optimization](#2-dpo-direct-preference-optimization) -3. [IPO: Identity Preference Optimization](#3-ipo-identity-preference-optimization) -4. [SimPO: Simple Preference Optimization](#4-simpo-simple-preference-optimization) -5. [KTO: Kahneman-Tversky Optimization](#5-kto-kahneman-tversky-optimization) -6. [ORPO: Odds Ratio Preference Optimization](#6-orpo-odds-ratio-preference-optimization) -7. [方法對比與選擇指南](#7-方法對比與選擇指南) -8. [實戰案例](#8-實戰案例) +1. [對齊概述](#1-對齊概述) +2. [RLHF 回顧與局限](#2-rlhf-回顧與局限) +3. [DPO: Direct Preference Optimization](#3-dpo-direct-preference-optimization) +4. [IPO: Identity Preference Optimization](#4-ipo-identity-preference-optimization) +5. [SimPO: Simple Preference Optimization](#5-simpo-simple-preference-optimization) +6. 
[KTO: Kahneman-Tversky Optimization](#6-kto-kahneman-tversky-optimization) +7. [ORPO: Odds Ratio Preference Optimization](#7-orpo-odds-ratio-preference-optimization) +8. [方法對比與選擇指南](#8-方法對比與選擇指南) --- -## 1. 對齊技術演進 +## 1. 對齊概述 -### 1.1 從RLHF到直接偏好學習 +### 1.1 什麼是 LLM 對齊? -``` -┌─────────────────────────────────────────────────────────────────┐ -│ 對齊技術演進時間線 │ -├─────────────────────────────────────────────────────────────────┤ -│ │ -│ 2020 2022 2023 2024 2025 │ -│ │ │ │ │ │ │ -│ ▼ ▼ ▼ ▼ ▼ │ -│ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ -│ │RLHF│ → │RLHF│ → │DPO │ → │IPO │ → │KTO │ │ -│ │基礎│ │成熟│ │ │ │SimPO│ │ORPO│ │ -│ └────┘ └────┘ └────┘ └────┘ └────┘ │ -│ │ -│ 特點: │ -│ RLHF: 需要獎勵模型 + PPO訓練,複雜度高 │ -│ DPO: 直接從偏好學習,無需獎勵模型 │ -│ IPO: 解決DPO的過擬合問題 │ -│ SimPO: 簡化DPO,無需參考模型 │ -│ KTO: 使用前景理論,支持非配對數據 │ -│ ORPO: 整合SFT和對齊,一步完成 │ -│ │ -└─────────────────────────────────────────────────────────────────┘ -``` +LLM 對齊(Alignment)是指讓語言模型的行為符合人類偏好和價值觀的過程。 -### 1.2 方法核心對比 +### 1.2 對齊方法演進 -| 方法 | 需要獎勵模型 | 需要參考模型 | 數據需求 | 訓練複雜度 | 穩定性 | -|------|------------|------------|---------|-----------|--------| -| RLHF | ✅ | ✅ | 配對偏好 | 🔴 高 | 🟡 中 | -| DPO | ❌ | ✅ | 配對偏好 | 🟢 低 | 🟢 高 | -| IPO | ❌ | ✅ | 配對偏好 | 🟢 低 | 🟢 高 | -| SimPO | ❌ | ❌ | 配對偏好 | 🟢 最低 | 🟢 高 | -| KTO | ❌ | ✅ | 非配對 | 🟢 低 | 🟢 高 | -| ORPO | ❌ | ❌ | 配對偏好 | 🟢 低 | 🟢 高 | +| 年份 | 方法 | 特點 | +|------|------|------| +| 2022 | RLHF | 需要獎勵模型 + PPO | +| 2023 | DPO | 直接偏好優化,無需獎勵模型 | +| 2024 | SimPO | 無需參考模型,長度歸一化 | +| 2024 | KTO | 支持二元標籤數據 | +| 2024 | ORPO | SFT + 對齊一步完成 | --- -## 2. DPO: Direct Preference Optimization - -### 2.1 核心原理 - -DPO (Direct Preference Optimization) 是2023年由Stanford團隊提出的方法,通過數學推導將RLHF的目標函數轉化為簡單的分類損失,避免了訓練獎勵模型和使用PPO的複雜性。 - -**核心公式**: - -``` -L_DPO = -E[(x, y_w, y_l)] [log σ(β * (log π_θ(y_w|x) / π_ref(y_w|x) - - log π_θ(y_l|x) / π_ref(y_l|x)))] -``` - -其中: -- `y_w`: 優選回答 (winner) -- `y_l`: 劣選回答 (loser) -- `π_θ`: 當前模型 -- `π_ref`: 參考模型 (通常是SFT後的模型) -- `β`: 溫度參數,控制與參考模型的偏離程度 +## 2. 
RLHF 回顧與局限 -### 2.2 實現代碼 +### 2.1 RLHF 流程 -```python -import torch -import torch.nn.functional as F -from transformers import AutoModelForCausalLM, AutoTokenizer -from datasets import load_dataset -from trl import DPOTrainer, DPOConfig - -# 載入模型 -model = AutoModelForCausalLM.from_pretrained( - "meta-llama/Llama-2-7b-hf", - torch_dtype=torch.bfloat16, - device_map="auto" -) - -# 參考模型 (通常是SFT後的checkpoint) -ref_model = AutoModelForCausalLM.from_pretrained( - "path/to/sft-checkpoint", - torch_dtype=torch.bfloat16, - device_map="auto" -) - -tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf") -tokenizer.pad_token = tokenizer.eos_token - -# 準備數據集 -# 格式: {"prompt": str, "chosen": str, "rejected": str} -dataset = load_dataset("your_preference_dataset") - -# DPO配置 -dpo_config = DPOConfig( - output_dir="./dpo-output", - beta=0.1, # 關鍵超參數 - learning_rate=5e-7, - per_device_train_batch_size=4, - gradient_accumulation_steps=4, - num_train_epochs=1, - warmup_ratio=0.1, - logging_steps=10, - save_steps=100, - bf16=True, - gradient_checkpointing=True, - max_length=1024, - max_prompt_length=512, -) - -# 訓練 -trainer = DPOTrainer( - model=model, - ref_model=ref_model, - args=dpo_config, - train_dataset=dataset["train"], - tokenizer=tokenizer, -) - -trainer.train() -``` - -### 2.3 從頭實現DPO損失 +1. **SFT 階段**: 監督微調 +2. **RM 階段**: 訓練獎勵模型 +3. 
**PPO 階段**: 強化學習優化 -```python -def dpo_loss( - model_logps_chosen: torch.Tensor, - model_logps_rejected: torch.Tensor, - ref_logps_chosen: torch.Tensor, - ref_logps_rejected: torch.Tensor, - beta: float = 0.1 -) -> torch.Tensor: - """ - 計算DPO損失 - - Args: - model_logps_chosen: 當前模型對優選回答的log概率 - model_logps_rejected: 當前模型對劣選回答的log概率 - ref_logps_chosen: 參考模型對優選回答的log概率 - ref_logps_rejected: 參考模型對劣選回答的log概率 - beta: 溫度參數 - - Returns: - DPO損失值 - """ - # 計算log ratio - chosen_logratios = model_logps_chosen - ref_logps_chosen - rejected_logratios = model_logps_rejected - ref_logps_rejected - - # DPO損失 = -log(sigmoid(beta * (chosen_ratio - rejected_ratio))) - logits = beta * (chosen_logratios - rejected_logratios) - loss = -F.logsigmoid(logits).mean() - - # 計算準確率 (chosen分數是否高於rejected) - accuracy = (logits > 0).float().mean() - - return loss, accuracy - -def compute_log_probs( - model: AutoModelForCausalLM, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - labels: torch.Tensor -) -> torch.Tensor: - """計算序列的log概率""" - outputs = model(input_ids=input_ids, attention_mask=attention_mask) - logits = outputs.logits - - # Shift for causal LM - shift_logits = logits[..., :-1, :].contiguous() - shift_labels = labels[..., 1:].contiguous() - - # 計算每個token的log概率 - log_probs = F.log_softmax(shift_logits, dim=-1) - - # 選擇對應label的log概率 - per_token_logps = torch.gather( - log_probs, - dim=-1, - index=shift_labels.unsqueeze(-1) - ).squeeze(-1) - - # 使用mask過濾padding - mask = (shift_labels != -100).float() - log_prob_sum = (per_token_logps * mask).sum(dim=-1) - - return log_prob_sum -``` - -### 2.4 DPO最佳實踐 - -```python -# 超參數建議 -dpo_hyperparams = { - "beta": { - "range": [0.05, 0.5], - "default": 0.1, - "notes": "較小的beta允許更大的偏離,較大的beta保守學習" - }, - "learning_rate": { - "range": [1e-7, 5e-6], - "default": 5e-7, - "notes": "比SFT低10-100倍" - }, - "epochs": { - "range": [1, 3], - "default": 1, - "notes": "通常1-2 epochs足夠,過多會過擬合" - }, - "batch_size": { - "effective": 32, # 
gradient_accumulation * per_device - "notes": "較大的batch size更穩定" - } -} +### 2.2 局限性 -# 數據質量檢查 -def validate_preference_data(dataset): - """驗證偏好數據質量""" - issues = [] - - for idx, example in enumerate(dataset): - # 檢查必需字段 - if "prompt" not in example or "chosen" not in example or "rejected" not in example: - issues.append(f"樣本 {idx}: 缺少必需字段") - continue - - # 檢查chosen和rejected是否相同 - if example["chosen"] == example["rejected"]: - issues.append(f"樣本 {idx}: chosen和rejected相同") - - # 檢查長度 - if len(example["chosen"]) < 10 or len(example["rejected"]) < 10: - issues.append(f"樣本 {idx}: 回答過短") - - return issues -``` +- 需要 4 個模型同時在內存 +- PPO 訓練不穩定 +- 可能產生獎勵駭客 +- 資源消耗大 --- -## 3. IPO: Identity Preference Optimization +## 3. DPO: Direct Preference Optimization -### 3.1 核心改進 +### 3.1 核心思想 -IPO (Identity Preference Optimization) 解決了DPO的一個關鍵問題:當偏好數據確定性很高時(幾乎總是選擇y_w),DPO會過擬合。 - -**IPO損失函數**: - -``` -L_IPO = E[(log π_θ(y_w|x) / π_ref(y_w|x) - - log π_θ(y_l|x) / π_ref(y_l|x) - 1/2τ)²] -``` +直接從偏好數據優化策略,無需訓練獎勵模型。 -### 3.2 實現代碼 +### 3.2 損失函數 ```python -def ipo_loss( - model_logps_chosen: torch.Tensor, - model_logps_rejected: torch.Tensor, - ref_logps_chosen: torch.Tensor, - ref_logps_rejected: torch.Tensor, - tau: float = 0.1 -) -> torch.Tensor: - """ - 計算IPO損失 - - Args: - tau: 正則化參數 - """ - chosen_logratios = model_logps_chosen - ref_logps_chosen - rejected_logratios = model_logps_rejected - ref_logps_rejected - - # IPO使用MSE而非log sigmoid - logits = chosen_logratios - rejected_logratios - loss = (logits - 1 / (2 * tau)) ** 2 - - return loss.mean() - -# TRL中使用IPO -from trl import DPOConfig - -ipo_config = DPOConfig( - output_dir="./ipo-output", - loss_type="ipo", # 關鍵:指定IPO損失 - beta=0.1, - # ... 其他參數 -) +# DPO 損失 +def dpo_loss(chosen_log_ratios, rejected_log_ratios, beta=0.1): + diff = chosen_log_ratios - rejected_log_ratios + losses = -F.logsigmoid(beta * diff) + return losses.mean() ``` ---- - -## 4. 
SimPO: Simple Preference Optimization - -### 4.1 核心創新 - -SimPO (Simple Preference Optimization) 的主要創新是**不需要參考模型**,通過使用平均log概率作為隱式獎勵,簡化了訓練流程。 +### 3.3 優缺點 -**SimPO損失函數**: +| 優點 | 缺點 | +|------|------| +| 無需獎勵模型 | 需要高品質偏好數據 | +| 訓練穩定 | 對 β 參數敏感 | +| 內存效率高 | 可能有長度偏差 | -``` -L_SimPO = -log σ(β/|y_w| * log π_θ(y_w|x) - β/|y_l| * log π_θ(y_l|x) - γ) -``` +--- -其中: -- `|y_w|`, `|y_l|`: 回答長度(用於長度歸一化) -- `γ`: margin參數,確保優選和劣選之間有足夠差距 +## 4. IPO: Identity Preference Optimization -### 4.2 實現代碼 +解決 DPO 的過度擬合問題,使用恆等函數替代 sigmoid,添加正則化。 ```python -def simpo_loss( - model_logps_chosen: torch.Tensor, - model_logps_rejected: torch.Tensor, - chosen_lengths: torch.Tensor, - rejected_lengths: torch.Tensor, - beta: float = 2.0, - gamma: float = 0.5 -) -> torch.Tensor: - """ - 計算SimPO損失 - - Args: - beta: 溫度參數 (SimPO通常使用較大的beta) - gamma: margin參數 - """ - # 長度歸一化 - chosen_rewards = beta * model_logps_chosen / chosen_lengths - rejected_rewards = beta * model_logps_rejected / rejected_lengths - - # 帶margin的損失 - logits = chosen_rewards - rejected_rewards - gamma - loss = -F.logsigmoid(logits).mean() - - return loss - -# 使用TRL的SimPO -from trl import DPOConfig - -simpo_config = DPOConfig( - output_dir="./simpo-output", - loss_type="simpo", - beta=2.0, # SimPO推薦較大的beta - simpo_gamma=0.5, - # 注意: SimPO不需要ref_model -) - -trainer = DPOTrainer( - model=model, - ref_model=None, # SimPO不需要參考模型! - args=simpo_config, - train_dataset=dataset, - tokenizer=tokenizer, -) +def ipo_loss(chosen_log_ratios, rejected_log_ratios, beta=0.1): + diff = chosen_log_ratios - rejected_log_ratios + target = 1 / (2 * beta) + return ((diff - target) ** 2).mean() ``` -### 4.3 SimPO vs DPO - -| 特性 | DPO | SimPO | -|------|-----|-------| -| 參考模型 | 需要 | 不需要 | -| 記憶體使用 | 2x模型 | 1x模型 | -| 訓練速度 | 較慢 | 更快 | -| 長度偏見 | 可能存在 | 內建歸一化 | -| 推薦beta | 0.1 | 2.0 | - --- -## 5. KTO: Kahneman-Tversky Optimization - -### 5.1 核心理念 - -KTO (Kahneman-Tversky Optimization) 基於行為經濟學的**前景理論**,主要創新是: -1. **不需要配對數據** - 只需要標記好/壞的回答 -2. 
**損失厭惡** - 對壞回答的懲罰大於對好回答的獎勵 +## 5. SimPO: Simple Preference Optimization -**KTO損失函數**: - -``` -L_KTO = E_chosen[-λ_w * σ(-β * (r_θ(x, y_w) - z_0))] - + E_rejected[-λ_l * σ(β * (r_θ(x, y_l) - z_0))] +### 5.1 創新點 -其中 r_θ(x, y) = log π_θ(y|x) - log π_ref(y|x) -``` - -### 5.2 實現代碼 - -```python -def kto_loss( - model_logps_chosen: torch.Tensor, - model_logps_rejected: torch.Tensor, - ref_logps_chosen: torch.Tensor, - ref_logps_rejected: torch.Tensor, - beta: float = 0.1, - lambda_w: float = 1.0, - lambda_l: float = 1.0 -) -> torch.Tensor: - """ - 計算KTO損失 - - Args: - lambda_w: 優選回答的權重 - lambda_l: 劣選回答的權重 (損失厭惡時 lambda_l > lambda_w) - """ - # 計算獎勵 - chosen_rewards = model_logps_chosen - ref_logps_chosen - rejected_rewards = model_logps_rejected - ref_logps_rejected - - # KL散度作為baseline (z_0) - kl_chosen = (ref_logps_chosen - model_logps_chosen).mean().detach() - kl_rejected = (ref_logps_rejected - model_logps_rejected).mean().detach() - z_0 = (kl_chosen + kl_rejected) / 2 - - # KTO損失 - chosen_loss = -lambda_w * F.logsigmoid(beta * (chosen_rewards - z_0)) - rejected_loss = -lambda_l * F.logsigmoid(-beta * (rejected_rewards - z_0)) - - loss = chosen_loss.mean() + rejected_loss.mean() - - return loss - -# TRL配置 -kto_config = DPOConfig( - output_dir="./kto-output", - loss_type="kto", - beta=0.1, - desirable_weight=1.0, # lambda_w - undesirable_weight=1.33, # lambda_l (損失厭惡) -) -``` +1. **無需參考模型** +2. 
**長度歸一化** - 解決長度偏差 -### 5.3 KTO的優勢場景 +### 5.2 損失函數 ```python -# KTO特別適合的數據格式 -# 不需要配對,只需要單獨標記好/壞 - -kto_dataset = [ - {"prompt": "問題1", "completion": "好的回答1", "label": True}, - {"prompt": "問題2", "completion": "壞的回答1", "label": False}, - {"prompt": "問題3", "completion": "好的回答2", "label": True}, - # 注意: prompt可以不同,不需要同一個prompt有好壞配對 -] - -# 轉換現有的人類反饋數據 -def convert_feedback_to_kto(feedback_data): - """ - 將用戶反饋數據轉換為KTO格式 - - 原始格式: [{"prompt": ..., "response": ..., "rating": 1-5}] - """ - kto_data = [] - - for item in feedback_data: - kto_data.append({ - "prompt": item["prompt"], - "completion": item["response"], - "label": item["rating"] >= 4 # 4-5分視為好回答 - }) - - return kto_data +def simpo_loss(chosen_log_probs, rejected_log_probs, + chosen_lengths, rejected_lengths, + beta=2.0, gamma=0.5): + chosen_rewards = beta * chosen_log_probs / chosen_lengths + rejected_rewards = beta * rejected_log_probs / rejected_lengths + return -F.logsigmoid(chosen_rewards - rejected_rewards - gamma).mean() ``` --- -## 6. ORPO: Odds Ratio Preference Optimization - -### 6.1 核心創新 +## 6. 
KTO: Kahneman-Tversky Optimization -ORPO (Odds Ratio Preference Optimization) 的創新是**整合SFT和對齊為一步**,通過在SFT損失中加入對比項。 +### 6.1 特點 -**ORPO損失函數**: +- 基於前景理論(損失厭惡) +- **只需二元標籤數據**,不需要成對偏好 -``` -L_ORPO = L_SFT + λ * L_OR - -L_OR = -log σ(log odds_θ(y_w|x) / odds_θ(y_l|x)) -``` +### 6.2 數據格式 -### 6.2 實現代碼 - -```python -def orpo_loss( - model_logps_chosen: torch.Tensor, - model_logps_rejected: torch.Tensor, - chosen_nll: torch.Tensor, # SFT損失部分 - lambda_orpo: float = 1.0 -) -> torch.Tensor: - """ - 計算ORPO損失 - - Args: - chosen_nll: 優選回答的負對數似然 (SFT損失) - lambda_orpo: 對比項權重 - """ - # 計算odds ratio - log_odds_chosen = model_logps_chosen - torch.log1p(-torch.exp(model_logps_chosen).clamp(max=0.9999)) - log_odds_rejected = model_logps_rejected - torch.log1p(-torch.exp(model_logps_rejected).clamp(max=0.9999)) - - # Odds ratio損失 - or_loss = -F.logsigmoid(log_odds_chosen - log_odds_rejected).mean() - - # 總損失 = SFT + lambda * OR - total_loss = chosen_nll.mean() + lambda_orpo * or_loss - - return total_loss - -# TRL配置 -from trl import ORPOConfig, ORPOTrainer - -orpo_config = ORPOConfig( - output_dir="./orpo-output", - beta=0.1, - learning_rate=5e-6, # ORPO通常可以用較高的學習率 - per_device_train_batch_size=4, - num_train_epochs=1, - # ORPO不需要參考模型 -) - -trainer = ORPOTrainer( - model=model, - args=orpo_config, - train_dataset=dataset, - tokenizer=tokenizer, -) +```json +{ + "prompt": "問題", + "response": "回答", + "label": true // 好或壞 +} ``` -### 6.3 ORPO的優勢 - -1. **一步完成** - 無需先SFT再對齊 -2. **無需參考模型** - 節省記憶體 -3. **更快收斂** - 同時學習任務和偏好 - --- -## 7. 方法對比與選擇指南 - -### 7.1 決策樹 - -``` - 開始 - │ - ▼ - ┌─────────────────┐ - │ 是否有配對數據? │ - └────────┬────────┘ - │ - ┌───────────┴───────────┐ - │ │ - ▼ ▼ - 是 否 - │ │ - ▼ ▼ - ┌─────────┐ ┌─────────┐ - │需要SFT嗎│ │ KTO │ - └────┬────┘ └─────────┘ - │ - ┌────┴────┐ - │ │ - ▼ ▼ - 是 否 - │ │ - ▼ ▼ - ┌─────┐ ┌─────────────┐ - │ORPO │ │ 記憶體受限? 
│ - └─────┘ └──────┬──────┘ - │ - ┌────┴────┐ - │ │ - ▼ ▼ - 是 否 - │ │ - ▼ ▼ - ┌──────┐ ┌─────┐ - │SimPO │ │ DPO │ - └──────┘ └─────┘ -``` +## 7. ORPO: Odds Ratio Preference Optimization -### 7.2 場景推薦 +### 7.1 特點 -| 場景 | 推薦方法 | 原因 | -|------|---------|------| -| **資源有限** | SimPO | 無需參考模型,記憶體減半 | -| **數據質量高** | DPO | 標準方法,效果穩定 | -| **數據可能有噪音** | IPO | 抗過擬合能力強 | -| **只有單獨標記** | KTO | 不需要配對數據 | -| **從頭訓練** | ORPO | 一步完成SFT+對齊 | -| **生產環境** | DPO/SimPO | 成熟穩定 | +結合 SFT 和偏好對齊為單一訓練階段。 -### 7.3 超參數速查表 +### 7.2 損失函數 -```python -hyperparams_by_method = { - "DPO": { - "beta": 0.1, - "learning_rate": 5e-7, - "epochs": 1, - "batch_size": 32 - }, - "IPO": { - "tau": 0.1, # 替代beta - "learning_rate": 5e-7, - "epochs": 1 - }, - "SimPO": { - "beta": 2.0, # 較大 - "gamma": 0.5, - "learning_rate": 5e-7 - }, - "KTO": { - "beta": 0.1, - "desirable_weight": 1.0, - "undesirable_weight": 1.33 # 損失厭惡 - }, - "ORPO": { - "beta": 0.1, - "learning_rate": 5e-6 # 較高 - } -} +``` +L_ORPO = L_SFT + λ · L_OR ``` --- -## 8. 實戰案例 +## 8. 方法對比與選擇指南 -### 8.1 完整訓練流程 +### 8.1 對比表 -```python -import torch -from transformers import AutoModelForCausalLM, AutoTokenizer -from datasets import load_dataset -from trl import DPOTrainer, DPOConfig -from peft import LoraConfig, get_peft_model - -# 1. 載入基礎模型 -model = AutoModelForCausalLM.from_pretrained( - "meta-llama/Llama-2-7b-hf", - torch_dtype=torch.bfloat16, - device_map="auto", - trust_remote_code=True -) - -tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf") -tokenizer.pad_token = tokenizer.eos_token - -# 2. 添加LoRA (可選,節省記憶體) -lora_config = LoraConfig( - r=16, - lora_alpha=32, - lora_dropout=0.05, - target_modules=["q_proj", "k_proj", "v_proj", "o_proj"], - task_type="CAUSAL_LM" -) -model = get_peft_model(model, lora_config) - -# 3. 
載入偏好數據 -dataset = load_dataset("your_preference_dataset") - -def format_dataset(example): - """格式化數據""" - return { - "prompt": f"問題: {example['question']}\n回答: ", - "chosen": example["chosen_response"], - "rejected": example["rejected_response"] - } - -dataset = dataset.map(format_dataset) - -# 4. 選擇對齊方法 -# 方法A: DPO (需要參考模型) -if USE_DPO: - ref_model = AutoModelForCausalLM.from_pretrained( - "path/to/sft-model", - torch_dtype=torch.bfloat16, - device_map="auto" - ) - - config = DPOConfig( - output_dir="./dpo-output", - loss_type="sigmoid", # DPO默認 - beta=0.1, - learning_rate=5e-7, - per_device_train_batch_size=2, - gradient_accumulation_steps=8, - num_train_epochs=1, - bf16=True, - gradient_checkpointing=True, - ) - - trainer = DPOTrainer( - model=model, - ref_model=ref_model, - args=config, - train_dataset=dataset["train"], - tokenizer=tokenizer, - ) - -# 方法B: SimPO (不需要參考模型) -elif USE_SIMPO: - config = DPOConfig( - output_dir="./simpo-output", - loss_type="simpo", - beta=2.0, - simpo_gamma=0.5, - learning_rate=5e-7, - per_device_train_batch_size=2, - gradient_accumulation_steps=8, - num_train_epochs=1, - bf16=True, - ) - - trainer = DPOTrainer( - model=model, - ref_model=None, # SimPO不需要 - args=config, - train_dataset=dataset["train"], - tokenizer=tokenizer, - ) - -# 方法C: KTO (非配對數據) -elif USE_KTO: - # KTO數據格式不同 - kto_dataset = convert_to_kto_format(dataset) - - config = DPOConfig( - output_dir="./kto-output", - loss_type="kto", - beta=0.1, - desirable_weight=1.0, - undesirable_weight=1.33, - ) - - trainer = DPOTrainer( - model=model, - ref_model=ref_model, - args=config, - train_dataset=kto_dataset, - tokenizer=tokenizer, - ) - -# 5. 訓練 -trainer.train() - -# 6. 
保存模型 -trainer.save_model("./final-aligned-model") -``` - -### 8.2 評估對齊效果 - -```python -from datasets import load_dataset -import numpy as np - -def evaluate_alignment(model, tokenizer, eval_dataset, method="pairwise"): - """ - 評估對齊效果 - - Args: - method: "pairwise" (配對比較) 或 "rating" (絕對評分) - """ - if method == "pairwise": - wins, losses, ties = 0, 0, 0 - - for example in eval_dataset: - prompt = example["prompt"] - - # 生成回答 - response = generate(model, tokenizer, prompt) - - # 使用GPT-4評判 - judge_result = judge_preference( - prompt=prompt, - response_a=response, - response_b=example["baseline_response"] - ) - - if judge_result == "A": - wins += 1 - elif judge_result == "B": - losses += 1 - else: - ties += 1 - - win_rate = wins / (wins + losses + ties) - return {"win_rate": win_rate, "wins": wins, "losses": losses, "ties": ties} - - elif method == "rating": - ratings = [] - - for example in eval_dataset: - response = generate(model, tokenizer, example["prompt"]) - - # 使用GPT-4打分 - rating = rate_response( - prompt=example["prompt"], - response=response, - criteria=["helpfulness", "harmlessness", "honesty"] - ) - ratings.append(rating) - - return { - "mean_rating": np.mean(ratings), - "std_rating": np.std(ratings) - } - -def judge_preference(prompt: str, response_a: str, response_b: str) -> str: - """使用GPT-4作為judge""" - judge_prompt = f""" - 請比較以下兩個回答,選出更好的一個。 - - 問題: {prompt} - - 回答A: {response_a} - - 回答B: {response_b} - - 請回答 "A" 或 "B" 或 "Tie"。只需要回答字母,不需要解釋。 - """ - - response = client.chat.completions.create( - model="gpt-4o", - messages=[{"role": "user", "content": judge_prompt}], - max_tokens=1 - ) - - return response.choices[0].message.content.strip() -``` - ---- +| 方法 | 需要獎勵模型 | 需要參考模型 | 數據需求 | +|------|------------|------------|---------| +| RLHF | ✅ | ✅ | 偏好對 | +| DPO | ❌ | ✅ | 偏好對 | +| SimPO | ❌ | ❌ | 偏好對 | +| KTO | ❌ | ✅ | 二元標籤 | +| ORPO | ❌ | ❌ | 偏好對 | -## 📚 參考文獻 +### 8.2 選擇建議 -1. 
**DPO**: Rafailov et al., "Direct Preference Optimization: Your Language Model is Secretly a Reward Model" (2023) -2. **IPO**: Azar et al., "A General Theoretical Paradigm to Understand Learning from Human Feedback" (2023) -3. **SimPO**: Meng et al., "SimPO: Simple Preference Optimization with a Reference-Free Reward" (2024) -4. **KTO**: Ethayarajh et al., "KTO: Model Alignment as Prospect Theoretic Optimization" (2024) -5. **ORPO**: Hong et al., "ORPO: Monolithic Preference Optimization without Reference Model" (2024) +- **資源有限**: SimPO +- **高品質對齊**: DPO +- **只有二元標籤**: KTO +- **從頭訓練**: ORPO --- -## 🔗 相關章節 - -- [監督微調 (SFT)](../5.監督微調%20(SFT)/README.md) -- [偏好對齊技術](../6.偏好對齊%20(Alignment)%20技術/README.md) -- [模型評估](../9.模型評估/README.md) +*本文檔持續更新中* diff --git "a/2.\346\267\261\345\205\245LLM\346\250\241\345\236\213\345\267\245\347\250\213\350\210\207LLM\351\201\213\347\266\255/12.\346\216\250\347\220\206\346\250\241\345\236\213\346\207\211\347\224\250/1_\346\216\250\347\220\206\350\203\275\345\212\233\350\247\243\346\236\220.md" "b/2.\346\267\261\345\205\245LLM\346\250\241\345\236\213\345\267\245\347\250\213\350\210\207LLM\351\201\213\347\266\255/12.\346\216\250\347\220\206\346\250\241\345\236\213\346\207\211\347\224\250/1_\346\216\250\347\220\206\350\203\275\345\212\233\350\247\243\346\236\220.md" new file mode 100644 index 0000000..2c83e2c --- /dev/null +++ "b/2.\346\267\261\345\205\245LLM\346\250\241\345\236\213\345\267\245\347\250\213\350\210\207LLM\351\201\213\347\266\255/12.\346\216\250\347\220\206\346\250\241\345\236\213\346\207\211\347\224\250/1_\346\216\250\347\220\206\350\203\275\345\212\233\350\247\243\346\236\220.md" @@ -0,0 +1,802 @@ +# 推理能力解析:o1、DeepSeek-R1 與 Claude + +> **最後更新**: 2025-01-15 +> **涵蓋模型**: OpenAI o1/o3, DeepSeek-R1, Claude 3.5/Opus + +--- + +## 目錄 + +1. [推理模型概述](#1-推理模型概述) +2. [OpenAI o 系列](#2-openai-o-系列) +3. [DeepSeek-R1](#3-deepseek-r1) +4. [Claude 的推理能力](#4-claude-的推理能力) +5. [推理能力對比分析](#5-推理能力對比分析) +6. 
[技術原理深度解析](#6-技術原理深度解析) + +--- + +## 1. 推理模型概述 + +### 1.1 什麼是推理能力? + +推理能力指的是模型在面對複雜問題時,能夠進行多步驟邏輯思考、分解問題、驗證假設並得出結論的能力。 + +``` +傳統 LLM 的處理流程: +┌─────────┐ ┌─────────┐ +│ 輸入 │ ───────► │ 輸出 │ +│ Prompt │ 直接 │ Answer │ +└─────────┘ 生成 └─────────┘ + +推理模型的處理流程: +┌─────────┐ ┌─────────────────────┐ ┌─────────┐ +│ 輸入 │ ─► │ 推理過程 │ ─► │ 輸出 │ +│ Prompt │ │ • 問題分解 │ │ Answer │ +└─────────┘ │ • 假設驗證 │ └─────────┘ + │ • 多路徑探索 │ + │ • 自我修正 │ + └─────────────────────┘ +``` + +### 1.2 推理能力的核心指標 + +| 指標 | 描述 | 測量方式 | +|------|------|----------| +| **邏輯連貫性** | 推理步驟之間的邏輯關聯 | 人工評估/自動驗證 | +| **問題分解** | 將複雜問題拆解為子問題 | 結構化分析 | +| **錯誤修正** | 識別並修正推理錯誤 | 對比實驗 | +| **知識整合** | 綜合運用多領域知識 | 跨領域測試 | +| **深度推理** | 處理多層嵌套邏輯 | 複雜度測試 | + +--- + +## 2. OpenAI o 系列 + +### 2.1 模型家族總覽 + +```python +# OpenAI o 系列模型發展時間線 +o_series = { + "o1-preview": { + "release": "2024-09", + "focus": "首個推理模型", + "reasoning_tokens": True, + "status": "deprecated" + }, + "o1": { + "release": "2024-12", + "focus": "完整推理能力", + "reasoning_tokens": True, + "status": "active" + }, + "o1-mini": { + "release": "2024-09", + "focus": "STEM 優化,高速度", + "reasoning_tokens": True, + "status": "active" + }, + "o3": { + "release": "2025-01", + "focus": "下一代推理", + "reasoning_tokens": True, + "status": "limited" + }, + "o3-mini": { + "release": "2025-01", + "focus": "平衡速度與能力", + "reasoning_tokens": True, + "status": "active" + } +} +``` + +### 2.2 o1 推理機制詳解 + +#### 思考鏈 (Chain-of-Thought) 架構 + +```python +# o1 的推理過程(概念性展示) +class O1ReasoningProcess: + """ + o1 使用內部思考鏈進行推理 + 用戶看不到完整的思考過程,只能看到結果 + """ + + def reason(self, problem: str) -> dict: + # 階段 1: 問題理解 + understanding = self.understand_problem(problem) + + # 階段 2: 策略制定 + strategies = self.formulate_strategies(understanding) + + # 階段 3: 探索多條推理路徑 + paths = [] + for strategy in strategies: + path = self.explore_path(strategy) + verification = self.verify_path(path) + paths.append({ + "path": path, + "confidence": verification.confidence, + "steps": verification.steps + }) + + # 階段 4: 
選擇最佳路徑 + best_path = self.select_best_path(paths) + + # 階段 5: 生成最終答案 + answer = self.synthesize_answer(best_path) + + return { + "answer": answer, + "reasoning_tokens_used": self.token_counter, + "confidence": best_path["confidence"] + } +``` + +#### 推理 Token 機制 + +```python +from openai import OpenAI + +client = OpenAI() + +def analyze_o1_response(prompt: str) -> dict: + """分析 o1 的推理 Token 使用情況""" + + response = client.chat.completions.create( + model="o1", + messages=[{"role": "user", "content": prompt}], + max_completion_tokens=16000 + ) + + usage = response.usage + reasoning_tokens = usage.completion_tokens_details.reasoning_tokens + visible_tokens = usage.completion_tokens - reasoning_tokens + + return { + "input_tokens": usage.prompt_tokens, + "reasoning_tokens": reasoning_tokens, # 內部思考 + "visible_tokens": visible_tokens, # 最終輸出 + "total_tokens": usage.total_tokens, + "reasoning_ratio": reasoning_tokens / usage.completion_tokens, + "response": response.choices[0].message.content + } + +# 示例:數學問題 +result = analyze_o1_response(""" +證明:對於任意正整數 n, +1 + 2 + 3 + ... + n = n(n+1)/2 +使用數學歸納法。 +""") + +print(f"推理 Tokens: {result['reasoning_tokens']}") +print(f"可見 Tokens: {result['visible_tokens']}") +print(f"推理佔比: {result['reasoning_ratio']:.2%}") +``` + +### 2.3 o1 的能力邊界 + +#### 強項 + +```markdown +1. **數學推理**: 競賽級數學問題 + - AIME: 83.3% + - MATH: 94.8% + +2. **代碼推理**: 複雜算法設計 + - Codeforces: 89th 百分位 + - SWE-bench: 48.9% + +3. **科學推理**: PhD 級別問題 + - GPQA Diamond: 78.3% + +4. **邏輯推理**: 多步驟推導 + - BBH: 94.6% +``` + +#### 限制 + +```python +# o1 目前的限制 +o1_limitations = { + "no_system_message": "不支持 system message", + "no_streaming": "不支持流式輸出", + "no_tools": "有限的工具調用支持", + "no_vision": "o1 不支持圖像(o1-pro 支持)", + "high_latency": "複雜問題需要較長時間", + "high_cost": "價格是 GPT-4o 的 6-24 倍", + "no_temperature": "不支持 temperature 參數" +} +``` + +--- + +## 3. 
DeepSeek-R1
+
+### 3.1 模型概述
+
+DeepSeek-R1 是 DeepSeek 於 2025 年 1 月發布的開源推理模型,以極低成本實現了與 o1 相當的推理能力。
+
+```python
+# DeepSeek-R1 特性
+deepseek_r1_features = {
+    "architecture": "MoE (Mixture of Experts)",
+    "total_params": "671B",
+    "active_params": "37B",
+    "training_method": "Pure RL + Distillation",
+    "open_source": True,
+    "license": "MIT",
+    "api_cost": "$0.55/1M input, $2.19/1M output",
+    "distilled_versions": [
+        "R1-Distill-Qwen-1.5B",
+        "R1-Distill-Qwen-7B",
+        "R1-Distill-Qwen-14B",
+        "R1-Distill-Qwen-32B",
+        "R1-Distill-Llama-8B",
+        "R1-Distill-Llama-70B"
+    ]
+}
+```
+
+### 3.2 技術創新
+
+#### 純強化學習訓練
+
+```python
+# DeepSeek-R1 的訓練方法(概念性)
+class DeepSeekR1Training:
+    """
+    創新點:直接使用 RL 訓練推理能力
+    不依賴大量人工標註的思考過程數據
+    """
+
+    def __init__(self):
+        self.reward_functions = {
+            "accuracy": self.accuracy_reward,
+            "format": self.format_reward,
+            "reasoning_quality": self.reasoning_quality_reward
+        }
+
+    def train_step(self, problem, ground_truth):
+        # 生成多個推理軌跡
+        trajectories = self.model.sample(
+            problem,
+            num_samples=16,
+            do_sample=True,
+            temperature=0.7
+        )
+
+        # 計算獎勵
+        rewards = []
+        for traj in trajectories:
+            # 組合多個獎勵信號
+            r = (
+                0.6 * self.accuracy_reward(traj, ground_truth)
+                + 0.2 * self.format_reward(traj)
+                + 0.2 * self.reasoning_quality_reward(traj)
+            )
+            rewards.append(r)
+
+        # 使用 GRPO (Group Relative Policy Optimization)
+        self.update_with_grpo(trajectories, rewards)
+
+    def accuracy_reward(self, trajectory, ground_truth):
+        """答案正確性獎勵"""
+        final_answer = self.extract_answer(trajectory)
+        return 1.0 if self.verify_answer(final_answer, ground_truth) else 0.0
+
+    def format_reward(self, trajectory):
+        """格式規範獎勵:檢查輸出是否帶有推理與答案標籤"""
+        has_think_tag = "<think>" in trajectory
+        has_answer_tag = "<answer>" in trajectory
+        return 1.0 if (has_think_tag and has_answer_tag) else 0.0
+```
+
+#### 推理過程可視化
+
+```python
+from openai import OpenAI
+
+# DeepSeek 使用 OpenAI 兼容 API
+client = OpenAI(
+    api_key="your-deepseek-api-key",
+    base_url="https://api.deepseek.com"
+)
+
+def 
solve_with_deepseek_r1(problem: str) -> dict: + """使用 DeepSeek-R1 解題,包含推理過程""" + + response = client.chat.completions.create( + model="deepseek-reasoner", # R1 模型 + messages=[{"role": "user", "content": problem}], + max_tokens=8000 + ) + + message = response.choices[0].message + + return { + "answer": message.content, + "reasoning": message.reasoning_content, # 完整推理過程! + "input_tokens": response.usage.prompt_tokens, + "output_tokens": response.usage.completion_tokens, + "reasoning_tokens": response.usage.completion_tokens_details.reasoning_tokens + } + +# 示例 +result = solve_with_deepseek_r1(""" +一個農夫要將狼、羊和白菜運過河。 +船每次只能載農夫和一樣物品。 +如果農夫不在,狼會吃羊,羊會吃白菜。 +請問農夫應該怎麼做? +""") + +print("推理過程:") +print(result["reasoning"]) +print("\n最終答案:") +print(result["answer"]) +``` + +### 3.3 本地部署 + +```python +# 使用 Ollama 本地部署蒸餾版本 +""" +# 安裝蒸餾版本 +ollama pull deepseek-r1:7b # 7B 版本,約 4GB +ollama pull deepseek-r1:14b # 14B 版本,約 8GB +ollama pull deepseek-r1:32b # 32B 版本,約 18GB +ollama pull deepseek-r1:70b # 70B 版本,約 40GB + +# 運行 +ollama run deepseek-r1:14b +""" + +import ollama + +def local_reasoning(problem: str, model: str = "deepseek-r1:14b") -> dict: + """本地 DeepSeek-R1 推理""" + + response = ollama.chat( + model=model, + messages=[{"role": "user", "content": problem}] + ) + + return { + "content": response["message"]["content"], + "model": model, + "eval_duration": response.get("eval_duration", 0) + } + +# 使用示例 +result = local_reasoning("解釋什麼是動態規劃,並給出一個例子") +print(result["content"]) +``` + +--- + +## 4. 
Claude 的推理能力 + +### 4.1 Claude 的推理特點 + +Claude(尤其是 Claude 3.5 Sonnet 和 Claude Opus)雖然不是專門的「推理模型」,但具備出色的分析和推理能力。 + +```python +# Claude 模型推理特性對比 +claude_reasoning = { + "claude-3-5-sonnet-20241022": { + "推理風格": "清晰結構化", + "代碼能力": "極強", + "數學能力": "中上", + "特色": "平衡的性價比", + "extended_thinking": False + }, + "claude-3-opus-20240229": { + "推理風格": "深度分析", + "代碼能力": "強", + "數學能力": "上", + "特色": "複雜任務首選", + "extended_thinking": False + }, + "claude-opus-4-20250514": { # 假設的新版本 + "推理風格": "擴展思考", + "代碼能力": "極強", + "數學能力": "極強", + "特色": "支持 extended thinking", + "extended_thinking": True + } +} +``` + +### 4.2 Claude 的擴展思考 (Extended Thinking) + +```python +import anthropic + +client = anthropic.Anthropic() + +def claude_extended_thinking(problem: str) -> dict: + """使用 Claude 的擴展思考功能""" + + # 注意:extended_thinking 是較新的功能 + response = client.messages.create( + model="claude-opus-4-5-20251101", + max_tokens=16000, + thinking={ + "type": "enabled", + "budget_tokens": 10000 # 思考 token 預算 + }, + messages=[{"role": "user", "content": problem}] + ) + + # 解析回應 + thinking_content = "" + answer_content = "" + + for block in response.content: + if block.type == "thinking": + thinking_content = block.thinking + elif block.type == "text": + answer_content = block.text + + return { + "thinking": thinking_content, + "answer": answer_content, + "input_tokens": response.usage.input_tokens, + "output_tokens": response.usage.output_tokens + } + +# 示例 +result = claude_extended_thinking(""" +設計一個分布式系統來處理每秒百萬級別的事件流。 +需要考慮: +1. 高可用性 +2. 低延遲 +3. 數據一致性 +4. 水平擴展能力 +""") + +print("思考過程:") +print(result["thinking"]) +print("\n設計方案:") +print(result["answer"]) +``` + +### 4.3 Claude 的結構化推理 + +```python +def structured_reasoning_with_claude(problem: str) -> dict: + """引導 Claude 進行結構化推理""" + + system_prompt = """你是一個嚴謹的問題解決專家。 + +對於每個問題,請按照以下結構進行分析: + +1. **問題理解**:明確問題的核心要求 +2. **已知條件**:列出所有已知信息 +3. **分析思路**:提出解決策略 +4. **逐步推導**:詳細的推理步驟 +5. **驗證檢查**:驗證答案的正確性 +6. 
**最終答案**:簡潔的結論""" + + response = client.messages.create( + model="claude-3-5-sonnet-20241022", + max_tokens=4000, + system=system_prompt, + messages=[{"role": "user", "content": problem}] + ) + + return { + "structured_response": response.content[0].text, + "tokens": response.usage.input_tokens + response.usage.output_tokens + } +``` + +--- + +## 5. 推理能力對比分析 + +### 5.1 基準測試對比 + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ 推理能力基準測試對比 │ +├─────────────────────────────────────────────────────────────────┤ +│ │ +│ AIME 2024 (數學競賽): │ +│ ┌──────────────────────────────────────────────────────┐ │ +│ │ o3 (high) █████████████████████████████ 96.7% │ │ +│ │ DeepSeek-R1 ███████████████████████████ 79.8% │ │ +│ │ o1 ██████████████████████████ 79.2% │ │ +│ │ Claude 3.5 Opus ████████████ 40.0% │ │ +│ │ Claude 3.5 Sonnet ██████ 16.0% │ │ +│ │ GPT-4o ███ 9.3% │ │ +│ └──────────────────────────────────────────────────────┘ │ +│ │ +│ GPQA Diamond (科學推理): │ +│ ┌──────────────────────────────────────────────────────┐ │ +│ │ o3 (high) ██████████████████████████████ 87.7%│ │ +│ │ o1 █████████████████████████ 78.3% │ │ +│ │ DeepSeek-R1 ███████████████████████ 71.5% │ │ +│ │ Claude 3.5 Opus █████████████████████ 68.0% │ │ +│ │ Claude 3.5 Sonnet ██████████████████ 65.0% │ │ +│ │ GPT-4o ████████████████ 56.1% │ │ +│ └──────────────────────────────────────────────────────┘ │ +│ │ +│ SWE-bench Verified (程式碼): │ +│ ┌──────────────────────────────────────────────────────┐ │ +│ │ o3 ██████████████████████████████ 71.7%│ │ +│ │ Claude 3.5 Sonnet ██████████████████████ 50.8% │ │ +│ │ DeepSeek-R1 ████████████████████ 49.2% │ │ +│ │ o1 ████████████████████ 48.9% │ │ +│ │ GPT-4o ██████████████ 38.2% │ │ +│ └──────────────────────────────────────────────────────┘ │ +│ │ +└─────────────────────────────────────────────────────────────────┘ +``` + +### 5.2 能力矩陣對比 + +| 能力維度 | o1 | o3 | DeepSeek-R1 | Claude 3.5 Sonnet | Claude Opus | 
+|---------|----|----|-------------|-------------------|-------------| +| **數學推理** | ★★★★★ | ★★★★★+ | ★★★★★ | ★★★☆☆ | ★★★★☆ | +| **代碼生成** | ★★★★☆ | ★★★★★ | ★★★★☆ | ★★★★★ | ★★★★☆ | +| **科學推理** | ★★★★★ | ★★★★★+ | ★★★★☆ | ★★★★☆ | ★★★★★ | +| **邏輯推理** | ★★★★★ | ★★★★★ | ★★★★★ | ★★★★☆ | ★★★★★ | +| **創意寫作** | ★★☆☆☆ | ★★☆☆☆ | ★★★☆☆ | ★★★★★ | ★★★★★ | +| **指令遵循** | ★★★★☆ | ★★★★☆ | ★★★★☆ | ★★★★★ | ★★★★★ | +| **響應速度** | ★★☆☆☆ | ★★☆☆☆ | ★★★☆☆ | ★★★★★ | ★★★☆☆ | +| **成本效益** | ★★☆☆☆ | ★☆☆☆☆ | ★★★★★ | ★★★★☆ | ★★★☆☆ | + +### 5.3 推理風格對比 + +```python +# 同一問題,不同模型的推理風格 + +problem = "一個桶裝滿水重 15kg,裝半桶水重 9kg,求桶和水的重量" + +# o1 風格:內部大量推理,簡潔輸出 +o1_style = """ +設桶重 x kg,水重 y kg。 +- 滿桶:x + y = 15 +- 半桶:x + y/2 = 9 +解得:y = 12 kg,x = 3 kg +答案:桶重 3kg,水重 12kg +""" + +# DeepSeek-R1 風格:展示完整思考過程 +deepseek_style = """ + +讓我仔細分析這個問題... +已知條件: +1. 桶 + 滿水 = 15kg +2. 桶 + 半桶水 = 9kg + +我可以設變量來解決: +設桶的重量為 B,滿桶水的重量為 W + +方程組: +B + W = 15 ... (1) +B + W/2 = 9 ... (2) + +用 (1) - (2): +W - W/2 = 15 - 9 +W/2 = 6 +W = 12 + +代入 (1): +B = 15 - 12 = 3 + +驗證: +桶(3) + 滿水(12) = 15 ✓ +桶(3) + 半水(6) = 9 ✓ + + +答案:桶重 3 公斤,水重 12 公斤。 +""" + +# Claude 風格:結構化且解釋充分 +claude_style = """ +## 問題分析 +這是一個經典的二元一次方程組問題。 + +## 解題步驟 + +**設定變量:** +- 設桶重為 x kg +- 設滿桶水重為 y kg + +**建立方程組:** +根據題目條件: +1. 桶 + 滿水:x + y = 15 +2. 桶 + 半水:x + y/2 = 9 + +**求解:** +將方程(1)減去方程(2): +(x + y) - (x + y/2) = 15 - 9 +y/2 = 6 +**y = 12 kg** + +代入方程(1): +x + 12 = 15 +**x = 3 kg** + +## 答案 +- 桶的重量:**3 公斤** +- 水的重量:**12 公斤** + +## 驗證 +- 滿桶:3 + 12 = 15 kg ✓ +- 半桶:3 + 6 = 9 kg ✓ +""" +``` + +--- + +## 6. 
技術原理深度解析 + +### 6.1 訓練方法對比 + +```python +# 不同模型的訓練方法概念對比 + +training_methods = { + "o1": { + "base": "GPT-4 架構", + "training": [ + "大規模預訓練", + "指令微調 (SFT)", + "人類反饋強化學習 (RLHF)", + "推理專項強化學習" + ], + "key_innovation": "專門的推理獎勵模型", + "data": "人工標註的思考過程 + 自動生成", + "compute": "極高" + }, + + "deepseek_r1": { + "base": "DeepSeek-V3 (MoE)", + "training": [ + "大規模預訓練", + "純強化學習推理訓練", + "蒸餾到小模型" + ], + "key_innovation": "無需標註即可湧現推理能力", + "data": "不需要人工標註的思考過程", + "compute": "相對較低 (<$6M)" + }, + + "claude": { + "base": "Constitutional AI 架構", + "training": [ + "大規模預訓練", + "CAI (Constitutional AI)", + "RLHF", + "Safety training" + ], + "key_innovation": "通過原則約束實現對齊", + "data": "高質量人類反饋", + "compute": "高" + } +} +``` + +### 6.2 推理機制差異 + +```python +class ReasoningMechanismComparison: + """推理機制對比""" + + @staticmethod + def o1_mechanism(): + """ + o1: 內部思考鏈 + - 用戶看不到完整推理過程 + - 模型自主決定思考深度 + - 推理 token 單獨計費 + """ + return { + "visibility": "hidden", + "control": "automatic", + "billing": "separate_reasoning_tokens" + } + + @staticmethod + def deepseek_r1_mechanism(): + """ + DeepSeek-R1: 顯式思考過程 + - 推理過程對用戶可見 + - 使用 標記 + - 可以選擇是否輸出推理 + """ + return { + "visibility": "visible", + "control": "semi_automatic", + "billing": "unified" + } + + @staticmethod + def claude_mechanism(): + """ + Claude: 結構化推理 + - 通過 prompt 引導推理結構 + - 可選的 extended thinking + - 強調清晰的邏輯表達 + """ + return { + "visibility": "controllable", + "control": "user_guided", + "billing": "standard" + } +``` + +### 6.3 選擇建議 + +```python +def recommend_model(task: dict) -> str: + """根據任務特性推薦模型""" + + task_type = task.get("type", "general") + accuracy_need = task.get("accuracy", "medium") + budget = task.get("budget", "medium") + latency = task.get("latency_tolerance", "medium") + need_transparency = task.get("need_reasoning_transparency", False) + + # 決策邏輯 + if task_type in ["math", "science", "complex_reasoning"]: + if accuracy_need == "critical": + if budget == "high": + return "o3" + else: + return "DeepSeek-R1" + else: + 
return "o1-mini" + + elif task_type == "coding": + if accuracy_need == "critical": + return "Claude 3.5 Sonnet" + else: + return "DeepSeek-R1 或 o1-mini" + + elif task_type in ["writing", "conversation"]: + return "Claude 3.5 Sonnet" + + elif need_transparency: + return "DeepSeek-R1 (可見推理過程)" + + elif budget == "low": + return "DeepSeek-R1-Distill-7B (本地部署)" + + else: + return "Claude 3.5 Sonnet (平衡選擇)" + +# 使用示例 +task = { + "type": "math", + "accuracy": "critical", + "budget": "medium", + "latency_tolerance": "high", + "need_reasoning_transparency": True +} + +recommendation = recommend_model(task) +print(f"推薦模型: {recommendation}") +``` + +--- + +## 參考資源 + +- [OpenAI o1 Documentation](https://platform.openai.com/docs/guides/reasoning) +- [DeepSeek-R1 Paper](https://arxiv.org/abs/2501.12948) +- [DeepSeek-R1 GitHub](https://github.com/deepseek-ai/DeepSeek-R1) +- [Anthropic Claude Documentation](https://docs.anthropic.com/) +- [Chain-of-Thought Prompting](https://arxiv.org/abs/2201.11903) + +--- + +## 相關章節 + +- [使用場景與最佳實踐](./2_使用場景與最佳實踐.md) +- [成本效益分析](./3_成本效益分析.md) +- [與傳統模型的對比](./4_與傳統模型的對比.md) diff --git "a/3.LLM\346\207\211\347\224\250\345\267\245\347\250\213/11.MCP\345\215\224\350\255\260\350\210\207\345\267\245\345\205\267\350\252\277\347\224\250/examples/01_basic_mcp_server.py" "b/3.LLM\346\207\211\347\224\250\345\267\245\347\250\213/11.MCP\345\215\224\350\255\260\350\210\207\345\267\245\345\205\267\350\252\277\347\224\250/examples/01_basic_mcp_server.py" new file mode 100644 index 0000000..7f5748c --- /dev/null +++ "b/3.LLM\346\207\211\347\224\250\345\267\245\347\250\213/11.MCP\345\215\224\350\255\260\350\210\207\345\267\245\345\205\267\350\252\277\347\224\250/examples/01_basic_mcp_server.py" @@ -0,0 +1,247 @@ +""" +基礎 MCP 伺服器範例 +展示如何使用 Python SDK 創建一個簡單的 MCP 伺服器 + +安裝依賴: + pip install mcp + +運行方式: + python 01_basic_mcp_server.py +""" + +import asyncio +from typing import Any +from datetime import datetime + +# MCP SDK 導入 +try: + from mcp.server import Server 
+ from mcp.server.stdio import stdio_server + from mcp.types import ( + Tool, + TextContent, + Resource, + ResourceTemplate, + ) + MCP_AVAILABLE = True +except ImportError: + MCP_AVAILABLE = False + print("MCP SDK 未安裝,此範例展示概念結構") + + +# ============ 工具定義 ============ + +def get_current_time() -> str: + """獲取當前時間""" + return datetime.now().strftime("%Y-%m-%d %H:%M:%S") + + +def calculate(expression: str) -> str: + """ + 安全的數學計算器 + 僅支持基礎運算 + """ + # 安全檢查:只允許數字和基本運算符 + allowed_chars = set("0123456789+-*/.() ") + if not all(c in allowed_chars for c in expression): + return "錯誤:表達式包含不允許的字符" + + try: + # 使用 eval 前進行嚴格驗證 + result = eval(expression, {"__builtins__": {}}, {}) + return str(result) + except Exception as e: + return f"計算錯誤: {str(e)}" + + +def search_notes(query: str, limit: int = 5) -> list: + """ + 搜尋筆記(模擬實現) + 實際應用中應連接真實的搜尋後端 + """ + # 模擬搜尋結果 + mock_notes = [ + {"id": 1, "title": "Transformer 架構解析", "content": "Self-attention 機制..."}, + {"id": 2, "title": "RAG 系統設計", "content": "檢索增強生成..."}, + {"id": 3, "title": "LLM 微調技巧", "content": "LoRA, QLoRA..."}, + {"id": 4, "title": "Prompt Engineering", "content": "Few-shot, CoT..."}, + {"id": 5, "title": "Agent 架構", "content": "ReAct, Tool Use..."}, + ] + + # 簡單的關鍵字匹配 + results = [ + note for note in mock_notes + if query.lower() in note["title"].lower() or query.lower() in note["content"].lower() + ] + + return results[:limit] + + +# ============ MCP 伺服器實現 ============ + +if MCP_AVAILABLE: + # 創建 MCP 伺服器實例 + server = Server("ai-learning-notes-server") + + # 定義可用工具 + @server.list_tools() + async def list_tools() -> list[Tool]: + """列出所有可用工具""" + return [ + Tool( + name="get_time", + description="獲取當前系統時間", + inputSchema={ + "type": "object", + "properties": {}, + "required": [] + } + ), + Tool( + name="calculate", + description="執行數學計算,支持加減乘除和括號", + inputSchema={ + "type": "object", + "properties": { + "expression": { + "type": "string", + "description": "數學表達式,例如: 2+3*4" + } + }, + "required": ["expression"] + } + 
), + Tool( + name="search_notes", + description="搜尋 AI 學習筆記", + inputSchema={ + "type": "object", + "properties": { + "query": { + "type": "string", + "description": "搜尋關鍵字" + }, + "limit": { + "type": "integer", + "description": "返回結果數量上限", + "default": 5 + } + }, + "required": ["query"] + } + ) + ] + + # 處理工具調用 + @server.call_tool() + async def call_tool(name: str, arguments: dict) -> list[TextContent]: + """執行工具調用""" + if name == "get_time": + result = get_current_time() + elif name == "calculate": + result = calculate(arguments.get("expression", "")) + elif name == "search_notes": + notes = search_notes( + arguments.get("query", ""), + arguments.get("limit", 5) + ) + result = "\n".join([f"- {n['title']}: {n['content']}" for n in notes]) + if not result: + result = "未找到匹配的筆記" + else: + result = f"未知工具: {name}" + + return [TextContent(type="text", text=result)] + + # 定義資源 + @server.list_resources() + async def list_resources() -> list[Resource]: + """列出可用資源""" + return [ + Resource( + uri="notes://learning-path", + name="學習路線圖", + description="AI 學習路線完整指南", + mimeType="text/markdown" + ), + Resource( + uri="notes://glossary", + name="術語表", + description="AI/ML 常用術語解釋", + mimeType="text/markdown" + ) + ] + + @server.read_resource() + async def read_resource(uri: str) -> str: + """讀取資源內容""" + if uri == "notes://learning-path": + return """ +# AI 學習路線圖 + +## 階段一:基礎 +- Python 程式設計 +- 數學基礎(線性代數、微積分、概率統計) + +## 階段二:機器學習 +- Scikit-learn +- 特徵工程 +- 模型評估 + +## 階段三:深度學習 +- PyTorch / TensorFlow +- CNN, RNN, Transformer + +## 階段四:LLM 應用 +- Prompt Engineering +- RAG 系統 +- Agent 開發 +""" + elif uri == "notes://glossary": + return """ +# AI/ML 術語表 + +- **LLM**: Large Language Model,大型語言模型 +- **RAG**: Retrieval-Augmented Generation,檢索增強生成 +- **LoRA**: Low-Rank Adaptation,低秩適應微調 +- **MCP**: Model Context Protocol,模型上下文協議 +- **Agent**: 能夠自主使用工具完成任務的 AI 系統 +""" + else: + return f"資源不存在: {uri}" + + +# ============ 主程序 ============ + +async def main(): + """主函數""" + if not 
MCP_AVAILABLE: + print("\n" + "="*50) + print("MCP SDK 概念演示") + print("="*50) + + print("\n工具列表:") + print("1. get_time - 獲取當前時間") + print("2. calculate - 數學計算") + print("3. search_notes - 搜尋筆記") + + print("\n範例調用:") + print(f"get_time() = {get_current_time()}") + print(f"calculate('2+3*4') = {calculate('2+3*4')}") + print(f"search_notes('RAG') = {search_notes('RAG')}") + + print("\n要實際運行 MCP 伺服器,請安裝:") + print(" pip install mcp") + return + + # 啟動 MCP 伺服器(透過 stdio) + async with stdio_server() as (read_stream, write_stream): + await server.run( + read_stream, + write_stream, + server.create_initialization_options() + ) + + +if __name__ == "__main__": + asyncio.run(main()) diff --git "a/3.LLM\346\207\211\347\224\250\345\267\245\347\250\213/11.MCP\345\215\224\350\255\260\350\210\207\345\267\245\345\205\267\350\252\277\347\224\250/examples/02_mcp_client_example.py" "b/3.LLM\346\207\211\347\224\250\345\267\245\347\250\213/11.MCP\345\215\224\350\255\260\350\210\207\345\267\245\345\205\267\350\252\277\347\224\250/examples/02_mcp_client_example.py" new file mode 100644 index 0000000..c01156c --- /dev/null +++ "b/3.LLM\346\207\211\347\224\250\345\267\245\347\250\213/11.MCP\345\215\224\350\255\260\350\210\207\345\267\245\345\205\267\350\252\277\347\224\250/examples/02_mcp_client_example.py" @@ -0,0 +1,325 @@ +""" +MCP 客戶端範例 +展示如何連接和使用 MCP 伺服器 + +這個範例展示了如何: +1. 連接到 MCP 伺服器 +2. 列出可用工具 +3. 調用工具 +4. 
處理響應 +""" + +import asyncio +import json +from typing import Any, Optional +from dataclasses import dataclass, field + + +# ============ MCP 客戶端模擬實現 ============ + +@dataclass +class MCPTool: + """MCP 工具定義""" + name: str + description: str + input_schema: dict + + +@dataclass +class MCPResource: + """MCP 資源定義""" + uri: str + name: str + description: str + mime_type: str = "text/plain" + + +@dataclass +class MCPClient: + """ + MCP 客戶端模擬 + 實際使用時應該使用 mcp.client 模組 + """ + server_name: str + tools: list = field(default_factory=list) + resources: list = field(default_factory=list) + connected: bool = False + + async def connect(self) -> bool: + """連接到 MCP 伺服器""" + print(f"正在連接到 MCP 伺服器: {self.server_name}") + await asyncio.sleep(0.1) # 模擬連接延遲 + self.connected = True + print("✅ 連接成功") + return True + + async def disconnect(self) -> None: + """斷開連接""" + self.connected = False + print("已斷開連接") + + async def list_tools(self) -> list[MCPTool]: + """列出可用工具""" + if not self.connected: + raise ConnectionError("未連接到伺服器") + + # 模擬從伺服器獲取工具列表 + self.tools = [ + MCPTool( + name="get_weather", + description="獲取指定城市的天氣資訊", + input_schema={ + "type": "object", + "properties": { + "city": {"type": "string", "description": "城市名稱"}, + "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]} + }, + "required": ["city"] + } + ), + MCPTool( + name="search_web", + description="搜尋網路資訊", + input_schema={ + "type": "object", + "properties": { + "query": {"type": "string"}, + "limit": {"type": "integer", "default": 10} + }, + "required": ["query"] + } + ), + MCPTool( + name="read_file", + description="讀取檔案內容", + input_schema={ + "type": "object", + "properties": { + "path": {"type": "string"} + }, + "required": ["path"] + } + ) + ] + return self.tools + + async def call_tool(self, name: str, arguments: dict) -> dict: + """ + 調用工具 + + Args: + name: 工具名稱 + arguments: 工具參數 + + Returns: + 工具執行結果 + """ + if not self.connected: + raise ConnectionError("未連接到伺服器") + + print(f"📞 調用工具: {name}") + 
print(f" 參數: {json.dumps(arguments, ensure_ascii=False)}") + + # 模擬工具執行 + await asyncio.sleep(0.2) + + # 模擬響應 + if name == "get_weather": + result = { + "status": "success", + "data": { + "city": arguments.get("city", "Unknown"), + "temperature": 25, + "unit": arguments.get("unit", "celsius"), + "condition": "晴天", + "humidity": 60 + } + } + elif name == "search_web": + result = { + "status": "success", + "data": { + "results": [ + {"title": f"搜尋結果 1 - {arguments.get('query')}", "url": "https://example.com/1"}, + {"title": f"搜尋結果 2 - {arguments.get('query')}", "url": "https://example.com/2"}, + ], + "total": 100 + } + } + elif name == "read_file": + result = { + "status": "success", + "data": { + "content": f"模擬檔案內容: {arguments.get('path')}", + "size": 1024 + } + } + else: + result = { + "status": "error", + "error": f"未知工具: {name}" + } + + print(f" 結果: {json.dumps(result, ensure_ascii=False, indent=2)}") + return result + + async def list_resources(self) -> list[MCPResource]: + """列出可用資源""" + if not self.connected: + raise ConnectionError("未連接到伺服器") + + self.resources = [ + MCPResource( + uri="file:///docs/readme.md", + name="README", + description="專案說明文件", + mime_type="text/markdown" + ), + MCPResource( + uri="db://users", + name="用戶資料庫", + description="用戶資料表", + mime_type="application/json" + ) + ] + return self.resources + + async def read_resource(self, uri: str) -> str: + """讀取資源內容""" + if not self.connected: + raise ConnectionError("未連接到伺服器") + + print(f"📖 讀取資源: {uri}") + return f"資源內容 ({uri}): 這是模擬的資源內容..." 
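

# ============ 補充:帶重試的工具調用(示意) ============
# 實際環境中,工具調用可能因網路或伺服器暫時故障而失敗。
# 以下為一個通用的指數退避重試輔助函數(假設性的示意實作,並非 MCP SDK 內建 API),
# 可用來包裝 client.call_tool 等任何非同步呼叫。
# 注意:coro_factory 每次重試時需回傳一個「新的」coroutine,
# 因為同一個 coroutine 不能被 await 兩次。

import asyncio  # 與檔案頂部相同;重複匯入無害,便於此段獨立閱讀

async def call_with_retry(coro_factory, max_retries: int = 3, base_delay: float = 0.05):
    """以指數退避重試非同步呼叫(假設 max_retries >= 1)"""
    last_error = None
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except ConnectionError as e:
            last_error = e
            # 退避時間:base_delay * 1, 2, 4, ...
            await asyncio.sleep(base_delay * (2 ** attempt))
    raise last_error

# 使用方式(示意):
#   result = await call_with_retry(
#       lambda: client.call_tool("get_weather", {"city": "Taipei"})
#   )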
+ + +# ============ 使用範例 ============ + +async def example_basic_usage(): + """基礎使用範例""" + print("\n" + "="*60) + print("MCP 客戶端基礎使用範例") + print("="*60) + + client = MCPClient(server_name="example-server") + + try: + # 連接到伺服器 + await client.connect() + + # 列出工具 + print("\n📋 可用工具:") + tools = await client.list_tools() + for tool in tools: + print(f" - {tool.name}: {tool.description}") + + # 調用工具 + print("\n🔧 工具調用示範:") + + # 獲取天氣 + weather = await client.call_tool("get_weather", {"city": "Taipei", "unit": "celsius"}) + + # 搜尋網頁 + search = await client.call_tool("search_web", {"query": "MCP protocol", "limit": 5}) + + # 列出資源 + print("\n📚 可用資源:") + resources = await client.list_resources() + for resource in resources: + print(f" - {resource.name} ({resource.uri})") + + finally: + await client.disconnect() + + +async def example_with_llm_integration(): + """ + 與 LLM 整合的範例 + 展示如何將 MCP 工具轉換為 LLM 可用的格式 + """ + print("\n" + "="*60) + print("MCP + LLM 整合範例") + print("="*60) + + client = MCPClient(server_name="llm-integration-server") + await client.connect() + + try: + # 獲取工具並轉換為 OpenAI 格式 + tools = await client.list_tools() + + openai_tools = [] + for tool in tools: + openai_tools.append({ + "type": "function", + "function": { + "name": tool.name, + "description": tool.description, + "parameters": tool.input_schema + } + }) + + print("\n🤖 轉換為 OpenAI 工具格式:") + print(json.dumps(openai_tools, ensure_ascii=False, indent=2)) + + # 模擬 LLM 決定使用工具 + print("\n💭 模擬 LLM 決策過程:") + print(" 用戶: '台北今天天氣如何?'") + print(" LLM: 我需要使用 get_weather 工具...") + + # 執行工具調用 + result = await client.call_tool("get_weather", {"city": "Taipei"}) + + # 模擬 LLM 整合結果 + print("\n🎯 LLM 最終回覆:") + if result["status"] == "success": + data = result["data"] + print(f" 台北今天天氣{data['condition']}," + f"氣溫 {data['temperature']}°C," + f"濕度 {data['humidity']}%。") + + finally: + await client.disconnect() + + +async def example_error_handling(): + """錯誤處理範例""" + print("\n" + "="*60) + print("MCP 錯誤處理範例") + print("="*60) + 
+ client = MCPClient(server_name="error-demo-server") + + # 嘗試在未連接時調用工具 + print("\n❌ 測試未連接錯誤:") + try: + await client.call_tool("test", {}) + except ConnectionError as e: + print(f" 捕獲到錯誤: {e}") + + await client.connect() + + # 調用不存在的工具 + print("\n❌ 測試工具不存在錯誤:") + result = await client.call_tool("nonexistent_tool", {}) + if result["status"] == "error": + print(f" 錯誤訊息: {result['error']}") + + await client.disconnect() + + +# ============ 主程序 ============ + +async def main(): + """運行所有範例""" + await example_basic_usage() + await example_with_llm_integration() + await example_error_handling() + + print("\n" + "="*60) + print("✅ 所有範例執行完成") + print("="*60) + + +if __name__ == "__main__": + asyncio.run(main()) diff --git "a/3.LLM\346\207\211\347\224\250\345\267\245\347\250\213/12.\351\200\262\351\232\216\346\217\220\347\244\272\345\267\245\347\250\213\350\210\207\347\265\220\346\247\213\345\214\226\350\274\270\345\207\272/1_\347\265\220\346\247\213\345\214\226\350\274\270\345\207\272\346\214\207\345\215\227.md" "b/3.LLM\346\207\211\347\224\250\345\267\245\347\250\213/12.\351\200\262\351\232\216\346\217\220\347\244\272\345\267\245\347\250\213\350\210\207\347\265\220\346\247\213\345\214\226\350\274\270\345\207\272/1_\347\265\220\346\247\213\345\214\226\350\274\270\345\207\272\346\214\207\345\215\227.md" new file mode 100644 index 0000000..10fee89 --- /dev/null +++ "b/3.LLM\346\207\211\347\224\250\345\267\245\347\250\213/12.\351\200\262\351\232\216\346\217\220\347\244\272\345\267\245\347\250\213\350\210\207\347\265\220\346\247\213\345\214\226\350\274\270\345\207\272/1_\347\265\220\346\247\213\345\214\226\350\274\270\345\207\272\346\214\207\345\215\227.md" @@ -0,0 +1,1464 @@ +# 結構化輸出完整指南 + +> **最後更新**: 2025-01 +> **難度**: 中級到進階 +> **預計閱讀時間**: 45 分鐘 + +--- + +## 目錄 + +1. [為什麼需要結構化輸出](#1-為什麼需要結構化輸出) +2. [JSON Schema 強制輸出](#2-json-schema-強制輸出) +3. [Function Calling 深度實踐](#3-function-calling-深度實踐) +4. [XML 結構化輸出](#4-xml-結構化輸出) +5. [Pydantic 模型整合](#5-pydantic-模型整合) +6. 
[錯誤處理與重試機制](#6-錯誤處理與重試機制) +7. [實戰案例](#7-實戰案例) + +--- + +## 1. 為什麼需要結構化輸出 + +### 1.1 傳統輸出的問題 + +```python +# 傳統方式:期望 AI 返回 JSON,但結果不可預測 +prompt = "分析這個產品評論的情感,返回 JSON 格式" + +# 可能的輸出問題: +# 1. 返回 Markdown 包裹的 JSON +# 2. 額外的解釋文字 +# 3. 格式不一致 +# 4. 欄位命名不固定 +``` + +### 1.2 結構化輸出的優勢 + +| 特性 | 傳統輸出 | 結構化輸出 | +|------|---------|-----------| +| **可靠性** | 依賴模型理解 | Schema 強制約束 | +| **解析難度** | 需要複雜正則 | 直接 JSON 解析 | +| **錯誤率** | 5-15% 解析失敗 | 接近 0% | +| **開發效率** | 需要大量驗證代碼 | 類型安全 | + +--- + +## 2. JSON Schema 強制輸出 + +### 2.1 OpenAI Structured Output (推薦) + +```python +from openai import OpenAI +from pydantic import BaseModel, Field +from typing import List, Optional, Literal +from enum import Enum + +client = OpenAI() + +# 定義情感枚舉 +class Sentiment(str, Enum): + POSITIVE = "positive" + NEGATIVE = "negative" + NEUTRAL = "neutral" + MIXED = "mixed" + +# 定義輸出結構 +class ReviewAnalysis(BaseModel): + """產品評論分析結果""" + sentiment: Sentiment = Field(description="整體情感傾向") + confidence: float = Field(ge=0, le=1, description="置信度 0-1") + key_points: List[str] = Field(description="評論要點", min_length=1, max_length=5) + pros: List[str] = Field(default=[], description="優點列表") + cons: List[str] = Field(default=[], description="缺點列表") + summary: str = Field(description="一句話總結", max_length=100) + recommended: bool = Field(description="是否推薦") + +# 使用 Structured Output +def analyze_review(review_text: str) -> ReviewAnalysis: + response = client.beta.chat.completions.parse( + model="gpt-4o-2024-08-06", + messages=[ + { + "role": "system", + "content": "你是專業的產品評論分析師。請仔細分析評論並提取關鍵資訊。" + }, + { + "role": "user", + "content": f"請分析以下產品評論:\n\n{review_text}" + } + ], + response_format=ReviewAnalysis + ) + + return response.choices[0].message.parsed + +# 使用範例 +review = """ +這款耳機音質非常棒,低音渾厚,高音清晰。 +佩戴也很舒適,可以長時間使用。 +唯一的缺點是價格有點貴,而且不支援有線連接。 +整體來說還是很滿意的。 +""" + +result = analyze_review(review) +print(f"情感: {result.sentiment.value}") +print(f"置信度: {result.confidence:.2%}") +print(f"要點: {result.key_points}") +print(f"推薦: 
{'是' if result.recommended else '否'}")
+```
+
+### 2.2 複雜嵌套結構
+
+```python
+from pydantic import BaseModel, Field
+from typing import List, Optional, Dict, Any, Literal
+from datetime import datetime
+
+class ContactInfo(BaseModel):
+    """聯絡資訊"""
+    email: Optional[str] = Field(default=None, pattern=r'^[\w.-]+@[\w.-]+\.\w+$')
+    phone: Optional[str] = None
+    address: Optional[str] = None
+
+class Skill(BaseModel):
+    """技能"""
+    name: str
+    level: Literal["beginner", "intermediate", "advanced", "expert"]
+    years_of_experience: float = Field(ge=0)
+
+class Education(BaseModel):
+    """教育背景"""
+    institution: str
+    degree: str
+    field_of_study: str
+    start_year: int = Field(ge=1950, le=2030)
+    end_year: Optional[int] = Field(default=None, ge=1950, le=2030)
+    gpa: Optional[float] = Field(default=None, ge=0, le=4.0)
+
+class WorkExperience(BaseModel):
+    """工作經驗"""
+    company: str
+    position: str
+    start_date: str # YYYY-MM
+    end_date: Optional[str] = None # YYYY-MM 或 "present"
+    responsibilities: List[str] = Field(min_length=1)
+    achievements: List[str] = Field(default=[])
+
+class ResumeExtraction(BaseModel):
+    """履歷資訊提取"""
+    name: str
+    title: Optional[str] = None
+    contact: ContactInfo
+    summary: str = Field(max_length=500)
+    skills: List[Skill] = Field(min_length=1)
+    education: List[Education] = Field(min_length=1)
+    work_experience: List[WorkExperience] = Field(default=[])
+    certifications: List[str] = Field(default=[])
+    languages: List[Dict[str, str]] = Field(default=[])
+
+    class Config:
+        json_schema_extra = {
+            "examples": [{
+                "name": "張小明",
+                "title": "資深軟體工程師",
+                "contact": {"email": "zhang@example.com"},
+                "summary": "10年軟體開發經驗",
+                "skills": [{"name": "Python", "level": "expert", "years_of_experience": 8}],
+                "education": [{"institution": "台灣大學", "degree": "碩士", "field_of_study": "資訊工程", "start_year": 2010, "end_year": 2012}]
+            }]
+        }
+
+def extract_resume(resume_text: str) -> ResumeExtraction:
+    """從履歷文本中提取結構化資訊"""
+    response = client.beta.chat.completions.parse(
+        
model="gpt-4o", + messages=[ + { + "role": "system", + "content": """你是專業的履歷分析師。請從履歷中準確提取所有相關資訊。 + +注意事項: +- 技能等級根據描述判斷:初學(beginner)、中級(intermediate)、進階(advanced)、專家(expert) +- 日期格式使用 YYYY-MM +- 如果某項資訊未提供,使用適當的預設值或 null +- 確保所有提取的資訊準確無誤""" + }, + { + "role": "user", + "content": f"請提取以下履歷的資訊:\n\n{resume_text}" + } + ], + response_format=ResumeExtraction + ) + + return response.choices[0].message.parsed +``` + +### 2.3 Anthropic Claude 結構化輸出 + +```python +from anthropic import Anthropic +import json +from pydantic import BaseModel, ValidationError + +client = Anthropic() + +class TaskExtraction(BaseModel): + """任務提取結果""" + tasks: List[Dict[str, Any]] + priority_order: List[int] + estimated_total_hours: float + warnings: List[str] + +def extract_tasks_claude(text: str) -> TaskExtraction: + """使用 Claude 提取任務""" + + # Claude 使用 XML 標籤強制 JSON 輸出 + system_prompt = """你是一個任務提取專家。請分析文本並提取所有任務。 + +嚴格按照以下 JSON 格式輸出,不要有任何其他文字: +{ + "tasks": [ + { + "id": 1, + "title": "任務標題", + "description": "任務描述", + "priority": "high/medium/low", + "estimated_hours": 2.0, + "dependencies": [] + } + ], + "priority_order": [1, 2, 3], + "estimated_total_hours": 10.0, + "warnings": ["任何需要注意的事項"] +}""" + + response = client.messages.create( + model="claude-sonnet-4-20250514", + max_tokens=2048, + system=system_prompt, + messages=[ + { + "role": "user", + "content": f"請從以下文本中提取任務:\n\n{text}" + } + ] + ) + + # 解析 JSON + try: + result_text = response.content[0].text + # 清理可能的 markdown 包裹 + if result_text.startswith("```"): + result_text = result_text.split("```")[1] + if result_text.startswith("json"): + result_text = result_text[4:] + + data = json.loads(result_text.strip()) + return TaskExtraction(**data) + except (json.JSONDecodeError, ValidationError) as e: + raise ValueError(f"Failed to parse response: {e}") + +# 更可靠的方法:使用 Claude 的 Tool Use +def extract_tasks_with_tools(text: str) -> dict: + """使用 Claude Tool Use 獲得結構化輸出""" + + tools = [ + { + "name": "submit_task_extraction", + 
"description": "提交提取的任務列表", + "input_schema": { + "type": "object", + "properties": { + "tasks": { + "type": "array", + "items": { + "type": "object", + "properties": { + "id": {"type": "integer"}, + "title": {"type": "string"}, + "description": {"type": "string"}, + "priority": {"type": "string", "enum": ["high", "medium", "low"]}, + "estimated_hours": {"type": "number"}, + "dependencies": {"type": "array", "items": {"type": "integer"}} + }, + "required": ["id", "title", "priority", "estimated_hours"] + } + }, + "priority_order": { + "type": "array", + "items": {"type": "integer"} + }, + "estimated_total_hours": {"type": "number"}, + "warnings": { + "type": "array", + "items": {"type": "string"} + } + }, + "required": ["tasks", "priority_order", "estimated_total_hours"] + } + } + ] + + response = client.messages.create( + model="claude-sonnet-4-20250514", + max_tokens=2048, + tools=tools, + tool_choice={"type": "tool", "name": "submit_task_extraction"}, + messages=[ + { + "role": "user", + "content": f"分析以下文本並提取所有任務,然後使用 submit_task_extraction 工具提交結果:\n\n{text}" + } + ] + ) + + # 提取工具調用結果 + for block in response.content: + if block.type == "tool_use": + return block.input + + raise ValueError("No tool use in response") +``` + +--- + +## 3. 
Function Calling 深度實踐 + +### 3.1 完整的 Function Calling 流程 + +```python +from openai import OpenAI +import json +from typing import Callable, Dict, Any, List +from dataclasses import dataclass +import asyncio + +client = OpenAI() + +@dataclass +class ToolDefinition: + """工具定義""" + name: str + description: str + parameters: dict + function: Callable + +class FunctionCallingAgent: + """Function Calling 代理""" + + def __init__(self, model: str = "gpt-4o"): + self.model = model + self.tools: Dict[str, ToolDefinition] = {} + self.conversation_history: List[dict] = [] + + def register_tool(self, tool: ToolDefinition): + """註冊工具""" + self.tools[tool.name] = tool + + def _get_tool_schemas(self) -> List[dict]: + """獲取工具 Schema 列表""" + return [ + { + "type": "function", + "function": { + "name": tool.name, + "description": tool.description, + "parameters": tool.parameters + } + } + for tool in self.tools.values() + ] + + def _execute_tool(self, name: str, arguments: dict) -> Any: + """執行工具""" + if name not in self.tools: + raise ValueError(f"Unknown tool: {name}") + + tool = self.tools[name] + return tool.function(**arguments) + + def chat(self, user_message: str, system_prompt: str = None) -> str: + """對話並處理 Function Calling""" + + # 添加系統提示 + if system_prompt and not self.conversation_history: + self.conversation_history.append({ + "role": "system", + "content": system_prompt + }) + + # 添加用戶訊息 + self.conversation_history.append({ + "role": "user", + "content": user_message + }) + + # 第一次 API 調用 + response = client.chat.completions.create( + model=self.model, + messages=self.conversation_history, + tools=self._get_tool_schemas() if self.tools else None, + tool_choice="auto" + ) + + assistant_message = response.choices[0].message + + # 處理工具調用 + while assistant_message.tool_calls: + self.conversation_history.append(assistant_message) + + # 執行所有工具調用 + for tool_call in assistant_message.tool_calls: + function_name = tool_call.function.name + function_args = 
json.loads(tool_call.function.arguments) + + try: + result = self._execute_tool(function_name, function_args) + tool_result = {"success": True, "data": result} + except Exception as e: + tool_result = {"success": False, "error": str(e)} + + # 添加工具結果 + self.conversation_history.append({ + "role": "tool", + "tool_call_id": tool_call.id, + "content": json.dumps(tool_result, ensure_ascii=False) + }) + + # 繼續對話獲取最終回應 + response = client.chat.completions.create( + model=self.model, + messages=self.conversation_history, + tools=self._get_tool_schemas() if self.tools else None + ) + + assistant_message = response.choices[0].message + + # 添加最終回應 + self.conversation_history.append({ + "role": "assistant", + "content": assistant_message.content + }) + + return assistant_message.content + +# 定義實際工具函數 +def search_products( + query: str, + category: str = None, + min_price: float = None, + max_price: float = None, + sort_by: str = "relevance" +) -> List[dict]: + """搜索產品(模擬)""" + # 實際實現會連接資料庫或 API + return [ + {"id": "P001", "name": f"{query} - 產品A", "price": 299, "rating": 4.5}, + {"id": "P002", "name": f"{query} - 產品B", "price": 199, "rating": 4.2}, + {"id": "P003", "name": f"{query} - 產品C", "price": 399, "rating": 4.8}, + ] + +def get_product_details(product_id: str) -> dict: + """獲取產品詳情(模擬)""" + return { + "id": product_id, + "name": "示例產品", + "description": "這是一個很棒的產品", + "price": 299, + "stock": 50, + "specifications": {"weight": "200g", "dimensions": "10x10x5cm"} + } + +def add_to_cart(product_id: str, quantity: int = 1) -> dict: + """添加到購物車(模擬)""" + return { + "success": True, + "cart_id": "CART123", + "item": {"product_id": product_id, "quantity": quantity}, + "message": f"已添加 {quantity} 件商品到購物車" + } + +def calculate_shipping(address: str, items: List[dict]) -> dict: + """計算運費(模擬)""" + return { + "standard": {"price": 60, "days": "3-5"}, + "express": {"price": 120, "days": "1-2"}, + "same_day": {"price": 200, "days": "當日"} + } + +# 創建代理並註冊工具 +agent = 
FunctionCallingAgent() + +agent.register_tool(ToolDefinition( + name="search_products", + description="搜索產品目錄。可以按關鍵詞、分類、價格範圍搜索", + parameters={ + "type": "object", + "properties": { + "query": {"type": "string", "description": "搜索關鍵詞"}, + "category": {"type": "string", "description": "產品分類", "enum": ["electronics", "clothing", "books", "home"]}, + "min_price": {"type": "number", "description": "最低價格"}, + "max_price": {"type": "number", "description": "最高價格"}, + "sort_by": {"type": "string", "description": "排序方式", "enum": ["relevance", "price_asc", "price_desc", "rating"]} + }, + "required": ["query"] + }, + function=search_products +)) + +agent.register_tool(ToolDefinition( + name="get_product_details", + description="獲取特定產品的詳細資訊", + parameters={ + "type": "object", + "properties": { + "product_id": {"type": "string", "description": "產品 ID"} + }, + "required": ["product_id"] + }, + function=get_product_details +)) + +agent.register_tool(ToolDefinition( + name="add_to_cart", + description="將產品添加到購物車", + parameters={ + "type": "object", + "properties": { + "product_id": {"type": "string", "description": "產品 ID"}, + "quantity": {"type": "integer", "description": "數量", "minimum": 1, "default": 1} + }, + "required": ["product_id"] + }, + function=add_to_cart +)) + +agent.register_tool(ToolDefinition( + name="calculate_shipping", + description="計算運費選項", + parameters={ + "type": "object", + "properties": { + "address": {"type": "string", "description": "配送地址"}, + "items": { + "type": "array", + "items": { + "type": "object", + "properties": { + "product_id": {"type": "string"}, + "quantity": {"type": "integer"} + } + }, + "description": "購物車商品" + } + }, + "required": ["address", "items"] + }, + function=calculate_shipping +)) + +# 使用範例 +system_prompt = """你是一個電商購物助手。幫助用戶搜索商品、查看詳情、添加購物車和計算運費。 +請根據用戶需求使用適當的工具。如果需要多步操作,請依序完成。""" + +response = agent.chat( + "我想找一個藍牙耳機,預算在 500 以內,幫我看看有什麼選擇", + system_prompt=system_prompt +) +print(response) +``` + +### 3.2 並行工具調用 + +```python 

+import asyncio
+import json
+from concurrent.futures import ThreadPoolExecutor
+from typing import List, Dict, Any, Callable
+
+class ParallelToolExecutor:
+    """並行工具執行器"""
+
+    def __init__(self, max_workers: int = 5):
+        self.executor = ThreadPoolExecutor(max_workers=max_workers)
+
+    async def execute_tools_parallel(
+        self,
+        tool_calls: List[dict],
+        tool_functions: Dict[str, Callable]
+    ) -> List[Dict[str, Any]]:
+        """並行執行多個工具調用"""
+
+        loop = asyncio.get_running_loop()
+
+        async def execute_single(tool_call):
+            function_name = tool_call["function"]["name"]
+            function_args = json.loads(tool_call["function"]["arguments"])
+
+            if function_name not in tool_functions:
+                return {
+                    "tool_call_id": tool_call["id"],
+                    "role": "tool",
+                    "content": json.dumps({"error": f"Unknown function: {function_name}"})
+                }
+
+            try:
+                # 在線程池中執行同步函數
+                result = await loop.run_in_executor(
+                    self.executor,
+                    lambda: tool_functions[function_name](**function_args)
+                )
+
+                return {
+                    "tool_call_id": tool_call["id"],
+                    "role": "tool",
+                    "content": json.dumps({"success": True, "data": result}, ensure_ascii=False)
+                }
+            except Exception as e:
+                return {
+                    "tool_call_id": tool_call["id"],
+                    "role": "tool",
+                    "content": json.dumps({"success": False, "error": str(e)})
+                }
+
+        # 並行執行所有工具調用
+        results = await asyncio.gather(*[
+            execute_single(tc) for tc in tool_calls
+        ])
+
+        return results
+
+# 使用範例
+async def main():
+    executor = ParallelToolExecutor()
+
+    tool_functions = {
+        "search_products": search_products,
+        "get_product_details": get_product_details
+    }
+
+    # 模擬多個工具調用
+    tool_calls = [
+        {"id": "call_1", "function": {"name": "search_products", "arguments": '{"query": "耳機"}'}},
+        {"id": "call_2", "function": {"name": "search_products", "arguments": '{"query": "手機"}'}},
+        {"id": "call_3", "function": {"name": "get_product_details", "arguments": '{"product_id": "P001"}'}}
+    ]
+
+    results = await executor.execute_tools_parallel(tool_calls, tool_functions)
+    print(results)
+
+# asyncio.run(main())
+```
+
+---
+
+## 4. 
XML 結構化輸出
+
+### 4.1 使用 XML 格式的場景
+
+XML 格式適合:
+- 層次化內容(如文檔結構)
+- 需要屬性標注的資料
+- 與 Claude 模型搭配使用
+
+```python
+import xml.etree.ElementTree as ET
+from typing import Dict, List, Any, Optional
+from dataclasses import dataclass
+
+@dataclass
+class DocumentSection:
+    """文檔章節"""
+    id: str
+    title: str
+    content: str
+    subsections: List['DocumentSection']
+    metadata: Dict[str, str]
+
+def xml_prompt_for_document_analysis() -> str:
+    """文檔分析的 XML 格式提示"""
+    return """
+請分析文檔並以 XML 格式輸出結構化結果:
+
+<document_analysis>
+  <metadata>
+    <title>文檔標題</title>
+    <author>作者</author>
+    <date>日期</date>
+    <doc_type>文檔類型:report/article/manual/other</doc_type>
+  </metadata>
+
+  <summary importance="high|medium|low">
+    文檔摘要
+  </summary>
+
+  <sections>
+    <section id="s1" level="1">
+      <title>章節標題</title>
+      <content>章節內容摘要</content>
+      <key_points>
+        <point importance="high">要點1</point>
+        <point importance="medium">要點2</point>
+      </key_points>
+      <subsections>
+        <section id="s1-1" level="2">
+          ...
+        </section>
+      </subsections>
+    </section>
+  </sections>
+
+  <entities>
+    <entity type="person" relevance="high">人名</entity>
+    <entity type="organization" relevance="medium">組織名</entity>
+    <entity type="location" relevance="medium">地點</entity>
+    <entity type="date" relevance="low">日期</entity>
+  </entities>
+
+  <conclusions>
+    <conclusion confidence="high">結論1</conclusion>
+  </conclusions>
+
+  <recommendations>
+    <recommendation priority="1">建議1</recommendation>
+  </recommendations>
+</document_analysis>
+""" + +class XMLParser: + """XML 解析器""" + + @staticmethod + def parse_document_analysis(xml_string: str) -> Dict[str, Any]: + """解析文檔分析 XML""" + + # 清理可能的 markdown 包裹 + xml_string = xml_string.strip() + if xml_string.startswith("```"): + lines = xml_string.split("\n") + xml_string = "\n".join(lines[1:-1]) + + root = ET.fromstring(xml_string) + + result = { + "metadata": {}, + "summary": "", + "sections": [], + "entities": [], + "conclusions": [], + "recommendations": [] + } + + # 解析 metadata + metadata = root.find("metadata") + if metadata is not None: + for child in metadata: + result["metadata"][child.tag] = child.text + + # 解析 summary + summary = root.find("summary") + if summary is not None: + result["summary"] = { + "text": summary.text, + "importance": summary.get("importance", "medium") + } + + # 遞歸解析 sections + def parse_section(section_elem) -> Dict: + section = { + "id": section_elem.get("id"), + "level": int(section_elem.get("level", 1)), + "title": "", + "content": "", + "key_points": [], + "subsections": [] + } + + title = section_elem.find("title") + if title is not None: + section["title"] = title.text + + content = section_elem.find("content") + if content is not None: + section["content"] = content.text + + key_points = section_elem.find("key_points") + if key_points is not None: + for point in key_points.findall("point"): + section["key_points"].append({ + "text": point.text, + "importance": point.get("importance", "medium") + }) + + subsections = section_elem.find("subsections") + if subsections is not None: + for sub in subsections.findall("section"): + section["subsections"].append(parse_section(sub)) + + return section + + sections = root.find("sections") + if sections is not None: + for section in sections.findall("section"): + result["sections"].append(parse_section(section)) + + # 解析 entities + entities = root.find("entities") + if entities is not None: + for entity in entities.findall("entity"): + result["entities"].append({ + "text": 
entity.text, + "type": entity.get("type"), + "relevance": entity.get("relevance", "medium") + }) + + # 解析 conclusions + conclusions = root.find("conclusions") + if conclusions is not None: + for conclusion in conclusions.findall("conclusion"): + result["conclusions"].append({ + "text": conclusion.text, + "confidence": conclusion.get("confidence", "medium") + }) + + # 解析 recommendations + recommendations = root.find("recommendations") + if recommendations is not None: + for rec in recommendations.findall("recommendation"): + result["recommendations"].append({ + "text": rec.text, + "priority": int(rec.get("priority", 0)) + }) + + return result + +# 使用範例 +def analyze_document_with_xml(document_text: str) -> Dict[str, Any]: + """使用 XML 格式分析文檔""" + + response = client.chat.completions.create( + model="gpt-4o", + messages=[ + { + "role": "system", + "content": xml_prompt_for_document_analysis() + }, + { + "role": "user", + "content": f"請分析以下文檔:\n\n{document_text}" + } + ] + ) + + xml_output = response.choices[0].message.content + return XMLParser.parse_document_analysis(xml_output) +``` + +--- + +## 5. 
Pydantic 模型整合 + +### 5.1 進階 Pydantic 模型設計 + +```python +from pydantic import BaseModel, Field, field_validator, model_validator +from typing import List, Optional, Dict, Any, Union +from datetime import datetime, date +from enum import Enum +import re + +class Priority(str, Enum): + CRITICAL = "critical" + HIGH = "high" + MEDIUM = "medium" + LOW = "low" + +class Status(str, Enum): + TODO = "todo" + IN_PROGRESS = "in_progress" + REVIEW = "review" + DONE = "done" + BLOCKED = "blocked" + +class Tag(BaseModel): + """標籤""" + name: str = Field(min_length=1, max_length=50) + color: str = Field(default="#808080", pattern=r'^#[0-9A-Fa-f]{6}$') + +class User(BaseModel): + """用戶""" + id: str + name: str + email: str = Field(pattern=r'^[\w.-]+@[\w.-]+\.\w+$') + + @field_validator('email') + @classmethod + def validate_email(cls, v): + if not re.match(r'^[\w.-]+@[\w.-]+\.\w+$', v): + raise ValueError('Invalid email format') + return v.lower() + +class Comment(BaseModel): + """評論""" + id: str + author: User + content: str = Field(min_length=1) + created_at: datetime + mentions: List[str] = Field(default=[]) + +class Subtask(BaseModel): + """子任務""" + id: str + title: str = Field(min_length=1, max_length=200) + completed: bool = False + assignee: Optional[User] = None + +class Task(BaseModel): + """任務""" + id: str + title: str = Field(min_length=1, max_length=200) + description: Optional[str] = Field(default=None, max_length=2000) + priority: Priority = Priority.MEDIUM + status: Status = Status.TODO + assignee: Optional[User] = None + reporter: User + tags: List[Tag] = Field(default=[]) + subtasks: List[Subtask] = Field(default=[]) + comments: List[Comment] = Field(default=[]) + due_date: Optional[date] = None + estimated_hours: Optional[float] = Field(default=None, ge=0) + actual_hours: Optional[float] = Field(default=None, ge=0) + dependencies: List[str] = Field(default=[], description="依賴的任務 ID 列表") + created_at: datetime = Field(default_factory=datetime.now) + updated_at: 
datetime = Field(default_factory=datetime.now) + + @model_validator(mode='after') + def validate_dates(self): + if self.due_date and self.due_date < date.today(): + # 允許過去的截止日期,但添加警告 + pass + return self + + @field_validator('dependencies') + @classmethod + def validate_no_self_dependency(cls, v, info): + task_id = info.data.get('id') + if task_id and task_id in v: + raise ValueError('Task cannot depend on itself') + return v + +class Sprint(BaseModel): + """衝刺""" + id: str + name: str + goal: str = Field(max_length=500) + start_date: date + end_date: date + tasks: List[Task] = Field(default=[]) + velocity: Optional[float] = None + + @model_validator(mode='after') + def validate_sprint_dates(self): + if self.end_date <= self.start_date: + raise ValueError('End date must be after start date') + return self + + @property + def total_estimated_hours(self) -> float: + return sum(t.estimated_hours or 0 for t in self.tasks) + + @property + def completion_rate(self) -> float: + if not self.tasks: + return 0.0 + completed = sum(1 for t in self.tasks if t.status == Status.DONE) + return completed / len(self.tasks) + +class ProjectAnalysis(BaseModel): + """專案分析結果""" + project_name: str + analysis_date: datetime = Field(default_factory=datetime.now) + sprints: List[Sprint] = Field(default=[]) + backlog: List[Task] = Field(default=[]) + + # 分析結果 + health_score: float = Field(ge=0, le=100) + risk_factors: List[str] = Field(default=[]) + recommendations: List[str] = Field(default=[]) + + # 統計 + total_tasks: int + completed_tasks: int + overdue_tasks: int + blocked_tasks: int + + @property + def completion_percentage(self) -> float: + if self.total_tasks == 0: + return 0.0 + return (self.completed_tasks / self.total_tasks) * 100 + +# 使用 Structured Output 與複雜 Pydantic 模型 +def analyze_project_status(project_description: str) -> ProjectAnalysis: + """分析專案狀態""" + + response = client.beta.chat.completions.parse( + model="gpt-4o", + messages=[ + { + "role": "system", + "content": 
"""你是專案管理專家。請分析專案描述,提取任務、評估健康狀況,並提供建議。 + +分析要點: +1. 識別所有任務和子任務 +2. 評估任務優先級和狀態 +3. 識別風險因素 +4. 提供改進建議 +5. 計算整體健康分數 (0-100)""" + }, + { + "role": "user", + "content": f"請分析以下專案:\n\n{project_description}" + } + ], + response_format=ProjectAnalysis + ) + + return response.choices[0].message.parsed +``` + +--- + +## 6. 錯誤處理與重試機制 + +### 6.1 穩健的結構化輸出處理 + +```python +from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type +from pydantic import ValidationError +import json +import logging + +logger = logging.getLogger(__name__) + +class StructuredOutputError(Exception): + """結構化輸出錯誤""" + pass + +class OutputParsingError(StructuredOutputError): + """輸出解析錯誤""" + pass + +class SchemaValidationError(StructuredOutputError): + """Schema 驗證錯誤""" + pass + +class StructuredOutputHandler: + """結構化輸出處理器""" + + def __init__( + self, + client: OpenAI, + model: str = "gpt-4o", + max_retries: int = 3, + fallback_model: str = "gpt-4o-mini" + ): + self.client = client + self.model = model + self.max_retries = max_retries + self.fallback_model = fallback_model + + @retry( + stop=stop_after_attempt(3), + wait=wait_exponential(multiplier=1, min=1, max=10), + retry=retry_if_exception_type((OutputParsingError, SchemaValidationError)) + ) + def get_structured_output( + self, + messages: List[dict], + response_model: type, + temperature: float = 0.0 + ) -> Any: + """獲取結構化輸出,帶重試機制""" + + try: + response = self.client.beta.chat.completions.parse( + model=self.model, + messages=messages, + response_format=response_model, + temperature=temperature + ) + + result = response.choices[0].message.parsed + + if result is None: + raise OutputParsingError("Parsed result is None") + + return result + + except ValidationError as e: + logger.warning(f"Validation error: {e}") + raise SchemaValidationError(f"Schema validation failed: {e}") + + except json.JSONDecodeError as e: + logger.warning(f"JSON decode error: {e}") + raise OutputParsingError(f"Failed to parse JSON: {e}") + + 
except Exception as e:
+            logger.error(f"Unexpected error: {e}")
+            raise
+
+    def get_with_fallback(
+        self,
+        messages: List[dict],
+        response_model: type,
+        fallback_prompt: Optional[str] = None
+    ) -> Any:
+        """帶降級策略的結構化輸出"""
+
+        try:
+            return self.get_structured_output(messages, response_model)
+
+        except StructuredOutputError as e:
+            logger.warning(f"Primary model failed: {e}, trying fallback")
+
+            # 嘗試使用更詳細的提示
+            # 注意:重建最後一則訊息,避免原地修改呼叫方傳入的 messages
+            if fallback_prompt:
+                enhanced_messages = messages[:-1] + [
+                    {
+                        **messages[-1],
+                        "content": messages[-1]["content"] + f"\n\n{fallback_prompt}"
+                    }
+                ]
+
+                try:
+                    return self.get_structured_output(
+                        enhanced_messages,
+                        response_model
+                    )
+                except StructuredOutputError:
+                    pass
+
+            # 嘗試降級模型(無論成敗都還原原模型)
+            original_model = self.model
+            try:
+                self.model = self.fallback_model
+                return self.get_structured_output(messages, response_model)
+            finally:
+                self.model = original_model
+
+    def get_with_repair(
+        self,
+        messages: List[dict],
+        response_model: type
+    ) -> Any:
+        """帶自動修復的結構化輸出"""
+
+        try:
+            return self.get_structured_output(messages, response_model)
+
+        except SchemaValidationError as e:
+            # 嘗試讓模型修復輸出
+            repair_messages = messages + [
+                {
+                    "role": "assistant",
+                    "content": str(e)
+                },
+                {
+                    "role": "user",
+                    "content": f"""上一個輸出有驗證錯誤:{e}
+
+請修正輸出,確保符合要求的 Schema。只輸出修正後的 JSON,不要有其他文字。"""
+                }
+            ]
+
+            return self.get_structured_output(repair_messages, response_model)
+
+# 使用範例
+handler = StructuredOutputHandler(client)
+
+class SimpleAnalysis(BaseModel):
+    summary: str
+    score: float = Field(ge=0, le=10)
+    tags: List[str]
+
+result = handler.get_with_fallback(
+    messages=[
+        {"role": "user", "content": "分析這段文字的情感"}
+    ],
+    response_model=SimpleAnalysis,
+    fallback_prompt="請確保輸出有效的 JSON,score 在 0-10 之間"
+)
+```
+
+---
+
+## 7. 
實戰案例 + +### 7.1 電商評論分析系統 + +```python +from pydantic import BaseModel, Field +from typing import List, Optional, Dict +from enum import Enum +from datetime import datetime + +class AspectSentiment(str, Enum): + VERY_POSITIVE = "very_positive" + POSITIVE = "positive" + NEUTRAL = "neutral" + NEGATIVE = "negative" + VERY_NEGATIVE = "very_negative" + +class ProductAspect(BaseModel): + """產品面向分析""" + aspect: str = Field(description="分析面向,如:品質、價格、物流、客服") + sentiment: AspectSentiment + confidence: float = Field(ge=0, le=1) + evidence: str = Field(description="支持該判斷的原文摘錄") + +class ReviewIntent(str, Enum): + PRAISE = "praise" + COMPLAINT = "complaint" + QUESTION = "question" + SUGGESTION = "suggestion" + COMPARISON = "comparison" + NEUTRAL = "neutral" + +class ReviewAnalysisResult(BaseModel): + """評論分析結果""" + review_id: str + overall_sentiment: AspectSentiment + overall_score: float = Field(ge=1, le=5, description="1-5 星評分") + intent: ReviewIntent + aspects: List[ProductAspect] = Field(min_length=1) + key_phrases: List[str] = Field(description="關鍵詞/短語") + is_spam: bool = Field(description="是否為垃圾評論") + spam_reason: Optional[str] = None + buyer_verified: bool = Field(description="是否為已購買用戶") + response_suggestion: Optional[str] = Field( + default=None, + description="建議的商家回覆" + ) + +class BatchReviewAnalysis(BaseModel): + """批量評論分析結果""" + total_reviews: int + analysis_timestamp: datetime + reviews: List[ReviewAnalysisResult] + + # 聚合統計 + sentiment_distribution: Dict[str, int] + average_score: float + top_complaints: List[str] + top_praises: List[str] + spam_count: int + response_needed_count: int + +def analyze_reviews_batch( + reviews: List[Dict[str, str]], + product_name: str +) -> BatchReviewAnalysis: + """批量分析產品評論""" + + reviews_text = "\n\n".join([ + f"評論 ID: {r['id']}\n內容: {r['content']}\n購買驗證: {r.get('verified', '否')}" + for r in reviews + ]) + + response = client.beta.chat.completions.parse( + model="gpt-4o", + messages=[ + { + "role": "system", + "content": 
f"""你是專業的電商評論分析師。分析以下 {product_name} 的評論。 + +分析要求: +1. 識別每條評論的情感傾向和評分 +2. 提取多個面向(品質、價格、物流、客服等)的情感 +3. 判斷評論意圖 +4. 識別垃圾評論(刷單、無關內容等) +5. 對負面評論提供回覆建議 +6. 最後提供聚合統計""" + }, + { + "role": "user", + "content": f"請分析以下評論:\n\n{reviews_text}" + } + ], + response_format=BatchReviewAnalysis + ) + + return response.choices[0].message.parsed + +# 使用範例 +sample_reviews = [ + {"id": "R001", "content": "品質很好,但物流太慢了,等了一周才到", "verified": "是"}, + {"id": "R002", "content": "性價比超高!推薦購買", "verified": "是"}, + {"id": "R003", "content": "好評好評好評好評", "verified": "否"}, + {"id": "R004", "content": "客服態度很差,問了幾次都不回覆", "verified": "是"}, +] + +result = analyze_reviews_batch(sample_reviews, "無線藍牙耳機") +print(f"平均評分: {result.average_score}") +print(f"情感分佈: {result.sentiment_distribution}") +print(f"主要投訴: {result.top_complaints}") +``` + +### 7.2 智能表單填寫助手 + +```python +from pydantic import BaseModel, Field +from typing import List, Optional, Dict, Any, Union +from enum import Enum + +class FieldType(str, Enum): + TEXT = "text" + NUMBER = "number" + DATE = "date" + EMAIL = "email" + PHONE = "phone" + SELECT = "select" + MULTISELECT = "multiselect" + BOOLEAN = "boolean" + +class FormField(BaseModel): + """表單欄位""" + id: str + label: str + type: FieldType + required: bool = True + options: Optional[List[str]] = None # 用於 select/multiselect + validation_pattern: Optional[str] = None + placeholder: Optional[str] = None + +class ExtractedValue(BaseModel): + """提取的值""" + field_id: str + value: Union[str, int, float, bool, List[str], None] + confidence: float = Field(ge=0, le=1) + source_text: Optional[str] = Field( + default=None, + description="值來源的原文" + ) + needs_confirmation: bool = Field( + default=False, + description="是否需要用戶確認" + ) + +class FormFillingResult(BaseModel): + """表單填寫結果""" + form_id: str + extracted_values: List[ExtractedValue] + missing_required_fields: List[str] + ambiguous_fields: List[str] + suggestions: List[str] + completion_percentage: float = Field(ge=0, le=100) + +def 
create_form_filling_assistant(form_fields: List[FormField]) -> Callable: + """創建表單填寫助手""" + + fields_description = "\n".join([ + f"- {f.id}: {f.label} ({f.type.value})" + + (f", 選項: {f.options}" if f.options else "") + + (f", {'必填' if f.required else '選填'}") + for f in form_fields + ]) + + def fill_form(user_input: str) -> FormFillingResult: + response = client.beta.chat.completions.parse( + model="gpt-4o", + messages=[ + { + "role": "system", + "content": f"""你是智能表單填寫助手。根據用戶提供的資訊,自動填寫表單欄位。 + +表單欄位: +{fields_description} + +填寫原則: +1. 從用戶輸入中準確提取對應欄位的值 +2. 對於不確定的值,設置較低的 confidence 和 needs_confirmation=true +3. 識別缺失的必填欄位 +4. 如果有多個可能的值,標記為 ambiguous +5. 提供改進輸入的建議""" + }, + { + "role": "user", + "content": f"請根據以下資訊填寫表單:\n\n{user_input}" + } + ], + response_format=FormFillingResult + ) + + return response.choices[0].message.parsed + + return fill_form + +# 使用範例 +fields = [ + FormField(id="name", label="姓名", type=FieldType.TEXT), + FormField(id="email", label="電子郵件", type=FieldType.EMAIL), + FormField(id="phone", label="電話號碼", type=FieldType.PHONE), + FormField(id="birth_date", label="出生日期", type=FieldType.DATE), + FormField(id="education", label="最高學歷", type=FieldType.SELECT, + options=["高中", "大學", "碩士", "博士"]), + FormField(id="skills", label="技能", type=FieldType.MULTISELECT, + options=["Python", "JavaScript", "Java", "Go", "Rust"]), +] + +fill_form = create_form_filling_assistant(fields) + +result = fill_form(""" +我叫張小明,今年 28 歲(1996年5月出生), +目前是碩士學位,會寫 Python 和 JavaScript。 +聯絡方式是 zhangxm@email.com,手機 0912-345-678 +""") + +for v in result.extracted_values: + print(f"{v.field_id}: {v.value} (信心度: {v.confidence:.0%})") +print(f"完成度: {result.completion_percentage}%") +``` + +--- + +## 總結 + +### 最佳實踐 + +1. **選擇正確的方法** + - 簡單結構:直接 JSON mode + - 複雜結構:Pydantic + Structured Output + - Claude:Tool Use 或 XML + +2. **設計健壯的 Schema** + - 使用 Pydantic 進行類型驗證 + - 添加合理的預設值 + - 使用 Field 描述增加清晰度 + +3. **處理錯誤情況** + - 實現重試機制 + - 準備降級策略 + - 記錄失敗案例用於改進 + +4. 
**優化效能** + - 並行處理多個請求 + - 快取常用結構 + - 選擇合適的模型 + +--- + +## 相關資源 + +- [OpenAI Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs) +- [Pydantic Documentation](https://docs.pydantic.dev/) +- [Anthropic Tool Use](https://docs.anthropic.com/claude/docs/tool-use) +- [JSON Schema](https://json-schema.org/) + +--- + +## 相關章節 + +- [提示優化框架](./2_提示優化框架.md) +- [Function Calling 深度指南](./README.md#3-function-calling深度指南) +- [LLM API 整合](../2.LLM%20as%20API/README.md) diff --git "a/9.\351\235\242\350\251\246\346\272\226\345\202\231\350\210\207\350\201\267\346\245\255\347\231\274\345\261\225/1.LLM\351\235\242\350\251\246\351\241\214\345\272\253/01_\345\237\272\347\244\216\346\246\202\345\277\265\351\241\214.md" "b/9.\351\235\242\350\251\246\346\272\226\345\202\231\350\210\207\350\201\267\346\245\255\347\231\274\345\261\225/1.LLM\351\235\242\350\251\246\351\241\214\345\272\253/01_\345\237\272\347\244\216\346\246\202\345\277\265\351\241\214.md" new file mode 100644 index 0000000..d554f04 --- /dev/null +++ "b/9.\351\235\242\350\251\246\346\272\226\345\202\231\350\210\207\350\201\267\346\245\255\347\231\274\345\261\225/1.LLM\351\235\242\350\251\246\351\241\214\345\272\253/01_\345\237\272\347\244\216\346\246\202\345\277\265\351\241\214.md" @@ -0,0 +1,2139 @@ +# LLM 基礎概念題 + +> **題目數量**: 35 題 +> **難度分佈**: 基礎 40% | 中等 40% | 高級 20% +> **涵蓋主題**: Transformer架構、Attention機制、訓練技巧、RAG、Agent、推理優化 + +--- + +## 目錄 + +1. [Transformer 架構](#1-transformer-架構) +2. [Attention 機制](#2-attention-機制) +3. [訓練與微調技術](#3-訓練與微調技術) +4. [RAG 基礎概念](#4-rag-基礎概念) +5. [Agent 基礎概念](#5-agent-基礎概念) +6. [推理優化基礎](#6-推理優化基礎) + +--- + +## 1. 
Transformer 架構 + +### Q1: 解釋 Transformer 的整體架構 +**難度**: 基礎 + +**參考答案**: + +Transformer 由 **Encoder** 和 **Decoder** 兩部分組成: + +``` +輸入 → Embedding + Position Encoding + ↓ + ┌─────────────┐ + │ Encoder │ × N層 + │ (自注意力) │ + └─────────────┘ + ↓ + ┌─────────────┐ + │ Decoder │ × N層 + │ (交叉注意力) │ + └─────────────┘ + ↓ + Linear + Softmax → 輸出 +``` + +**Encoder 層結構**: +1. Multi-Head Self-Attention +2. Add & Norm (殘差連接 + 層歸一化) +3. Feed-Forward Network +4. Add & Norm + +**Decoder 層結構**: +1. Masked Multi-Head Self-Attention +2. Add & Norm +3. Multi-Head Cross-Attention (連接Encoder輸出) +4. Add & Norm +5. Feed-Forward Network +6. Add & Norm + +**重要特點**: +- 完全基於注意力機制,捨棄了 RNN 的遞迴結構 +- 支持並行計算,訓練效率高 +- 能夠捕獲長距離依賴關係 + +--- + +### Q2: GPT 和 BERT 在架構上有什麼區別? +**難度**: 基礎 + +**參考答案**: + +| 特性 | GPT | BERT | +|------|-----|------| +| **架構** | 只有 Decoder | 只有 Encoder | +| **注意力** | Causal (單向) | Bidirectional (雙向) | +| **預訓練任務** | 下一個詞預測 | MLM + NSP | +| **生成能力** | 強 (自回歸) | 弱 (需額外設計) | +| **理解能力** | 較弱 | 強 | +| **典型應用** | 文本生成、對話 | 分類、NER、問答 | + +**GPT 的 Causal Mask**: +``` + t1 t2 t3 t4 +t1 1 0 0 0 +t2 1 1 0 0 +t3 1 1 1 0 +t4 1 1 1 1 +``` +- 每個 token 只能看到自己和之前的 token +- 適合自回歸生成任務 + +**BERT 的 MLM**: +- 隨機遮蓋 15% 的 token +- 模型預測被遮蓋的 token +- 能夠利用雙向上下文 + +--- + +### Q3: 什麼是 Layer Normalization?為什麼 Transformer 使用它而不是 Batch Normalization? +**難度**: 中等 + +**參考答案**: + +**Layer Normalization 公式**: +``` +LayerNorm(x) = γ * (x - μ) / (σ + ε) + β +``` +其中 μ 和 σ 在特徵維度上計算,而非 batch 維度。 + +**為什麼選擇 LayerNorm**: + +1. **序列長度變化**: NLP 任務中序列長度不固定,BatchNorm 統計量不穩定 +2. **小 batch 友好**: LayerNorm 在單個樣本上計算,不依賴 batch 大小 +3. **推理一致性**: BatchNorm 推理時需要 running statistics,LayerNorm 不需要 +4. 
**並行計算**: LayerNorm 更適合 Transformer 的並行結構 + +**計算差異**: +```python +# BatchNorm: 在 batch 維度歸一化 +# 形狀: (batch, seq, hidden) → 在 batch 維度計算統計量 + +# LayerNorm: 在 hidden 維度歸一化 +# 形狀: (batch, seq, hidden) → 在 hidden 維度計算統計量 +``` + +--- + +### Q4: 解釋 Pre-Norm 和 Post-Norm 的區別 +**難度**: 中等 + +**參考答案**: + +**Post-Norm (原始 Transformer)**: +``` +x → Attention → Add(x) → LayerNorm → FFN → Add → LayerNorm → output +``` + +**Pre-Norm (GPT-2, LLaMA 等)**: +``` +x → LayerNorm → Attention → Add(x) → LayerNorm → FFN → Add(x) → output +``` + +**關鍵差異**: + +| 特性 | Post-Norm | Pre-Norm | +|------|-----------|----------| +| **訓練穩定性** | 較差,需要 warmup | 更穩定 | +| **最終性能** | 略好 | 略差 | +| **梯度流動** | 可能有梯度消失 | 更順暢 | +| **深層模型** | 訓練困難 | 更容易訓練 | + +**Pre-Norm 更穩定的原因**: +- 殘差連接直接傳遞梯度,不經過 LayerNorm +- 輸入在進入子層前已歸一化,數值更穩定 + +**現代實踐**: 大多數大型 LLM (GPT-3, LLaMA, Claude) 使用 Pre-Norm + +--- + +### Q5: 什麼是 RMSNorm?它相比 LayerNorm 有什麼優勢? +**難度**: 中等 + +**參考答案**: + +**RMSNorm (Root Mean Square Normalization)**: +``` +RMSNorm(x) = x / RMS(x) * γ +RMS(x) = sqrt(mean(x²)) +``` + +**與 LayerNorm 的對比**: + +| 特性 | LayerNorm | RMSNorm | +|------|-----------|---------| +| **計算** | 需要 mean 和 variance | 只需要 RMS | +| **參數** | γ 和 β | 只有 γ | +| **計算量** | 較多 | 減少約 7-64% | +| **效果** | 基準 | 相當或略好 | + +**LLaMA 使用 RMSNorm 的原因**: +1. 計算更高效(省去計算均值和偏移) +2. 實驗表明效果相當 +3. 更適合大規模模型 + +**代碼實現**: +```python +class RMSNorm(nn.Module): + def __init__(self, dim, eps=1e-6): + super().__init__() + self.eps = eps + self.weight = nn.Parameter(torch.ones(dim)) + + def forward(self, x): + rms = torch.sqrt(torch.mean(x ** 2, dim=-1, keepdim=True) + self.eps) + return x / rms * self.weight +``` + +--- + +### Q6: 什麼是 Feed-Forward Network (FFN)?它在 Transformer 中的作用是什麼? +**難度**: 基礎 + +**參考答案**: + +**FFN 結構**: +``` +FFN(x) = Activation(xW₁ + b₁)W₂ + b₂ +``` + +**標準配置**: +- 隱藏層維度通常是輸入的 4 倍 (d_ff = 4 * d_model) +- 原始 Transformer 使用 ReLU,現代模型常用 GELU 或 SwiGLU + +**作用**: +1. **非線性變換**: 引入非線性,增強表達能力 +2. **特徵變換**: 在每個位置獨立進行特徵變換 +3. 
**容量載體**: FFN 參數佔模型總參數約 2/3,是知識存儲的主要位置 + +**SwiGLU (LLaMA 使用)**: +```python +def swiglu(x, W, V, W2): + return (F.silu(x @ W) * (x @ V)) @ W2 +``` + +**為什麼使用 GLU 變體**: +- 更好的訓練穩定性 +- 更強的表達能力 +- 在相同參數量下性能更好 + +--- + +## 2. Attention 機制 + +### Q7: 詳細解釋 Self-Attention 的計算過程 +**難度**: 基礎 + +**參考答案**: + +**Self-Attention 公式**: +``` +Attention(Q, K, V) = softmax(QK^T / √d_k) V +``` + +**計算步驟**: + +1. **線性投影生成 Q, K, V**: +```python +Q = X @ W_Q # (batch, seq, d_model) @ (d_model, d_k) → (batch, seq, d_k) +K = X @ W_K +V = X @ W_V +``` + +2. **計算注意力分數**: +```python +scores = Q @ K.T # (batch, seq, d_k) @ (batch, d_k, seq) → (batch, seq, seq) +``` + +3. **縮放**: +```python +scores = scores / sqrt(d_k) +``` + +4. **Softmax 歸一化**: +```python +attention_weights = softmax(scores, dim=-1) # 每行和為1 +``` + +5. **加權求和**: +```python +output = attention_weights @ V # (batch, seq, seq) @ (batch, seq, d_v) → (batch, seq, d_v) +``` + +**為什麼要縮放**: +- 當 d_k 較大時,點積值會變得很大 +- 大的點積值導致 softmax 輸出接近 one-hot +- 梯度會變得非常小,影響訓練 + +--- + +### Q8: Multi-Head Attention 如何實現?為什麼要使用多頭? 

+**難度**: 中等
+
+**參考答案**:
+
+**Multi-Head Attention 公式**:
+```
+MultiHead(Q, K, V) = Concat(head_1, ..., head_h) W^O
+head_i = Attention(QW_i^Q, KW_i^K, VW_i^V)
+```
+
+**實現方式**:
+```python
+import torch.nn as nn
+import torch.nn.functional as F
+from math import sqrt
+
+class MultiHeadAttention(nn.Module):
+    def __init__(self, d_model, num_heads):
+        super().__init__()
+        self.num_heads = num_heads
+        self.head_dim = d_model // num_heads
+
+        self.W_q = nn.Linear(d_model, d_model)
+        self.W_k = nn.Linear(d_model, d_model)
+        self.W_v = nn.Linear(d_model, d_model)
+        self.W_o = nn.Linear(d_model, d_model)
+
+    def forward(self, x):
+        batch, seq, d_model = x.shape
+
+        # 投影並分頭
+        Q = self.W_q(x).view(batch, seq, self.num_heads, self.head_dim).transpose(1, 2)
+        K = self.W_k(x).view(batch, seq, self.num_heads, self.head_dim).transpose(1, 2)
+        V = self.W_v(x).view(batch, seq, self.num_heads, self.head_dim).transpose(1, 2)
+
+        # 計算注意力
+        scores = (Q @ K.transpose(-2, -1)) / sqrt(self.head_dim)
+        attn = F.softmax(scores, dim=-1)
+        context = attn @ V
+
+        # 合併頭並投影
+        context = context.transpose(1, 2).contiguous().view(batch, seq, d_model)
+        return self.W_o(context)
+```
+
+**使用多頭的原因**:
+1. **學習不同類型的關係**: 不同頭可以關注語法、語義、位置等不同方面
+2. **增加表達能力**: 相當於在不同子空間並行執行注意力
+3. **提高穩定性**: 多頭的聚合減少了單頭的隨機性
+4. **計算效率**: 與單個大頭相比,計算量相同但表達能力更強
+
+---
+
+### Q9: 什麼是 Grouped Query Attention (GQA)?
+**難度**: 中等
+
+**參考答案**:
+
+**GQA 是 Multi-Head Attention (MHA) 和 Multi-Query Attention (MQA) 的折中方案**。
+
+**三種注意力對比**:
+
+| 方法 | Q 頭數 | K/V 頭數 | KV-Cache 大小 |
+|------|--------|----------|---------------|
+| MHA | H | H | 2 × H × d |
+| MQA | H | 1 | 2 × d |
+| GQA | H | G (1 北京(2189萬)
+Final Answer: 上海的人口更多,約 2487 萬人,比北京多約 300 萬人。
+```
+
+**實現**:
+```python
+class ReActAgent:
+    def __init__(self, llm, tools):
+        self.llm = llm
+        self.tools = tools
+
+    def run(self, question):
+        prompt = f"""Answer the following question using the tools available.
+
+Question: {question}
+
+Available Tools:
+{self.format_tools()}
+
+Use this format:
+Thought: [your reasoning]
+Action: [tool_name(args)]
+Observation: [tool result]
+... 
(repeat as needed) +Thought: [final reasoning] +Final Answer: [your answer] + +Begin! +""" + context = prompt + + while True: + response = self.llm.generate(context) + + if "Final Answer:" in response: + return self.extract_answer(response) + + # 解析並執行動作 + action = self.parse_action(response) + observation = self.execute_tool(action) + + context += f"\n{response}\nObservation: {observation}\n" +``` + +**ReAct 優勢**: +1. 推理過程可解釋 +2. 支持自我糾錯 +3. 適合多步驟任務 + +--- + +### Q25: 什麼是 Function Calling?如何設計好的工具描述? +**難度**: 中等 + +**參考答案**: + +**Function Calling** 讓 LLM 調用預定義的函數/工具。 + +**工作流程**: +``` +1. 定義工具 schema +2. 用戶提問 +3. LLM 決定調用哪個工具及參數 +4. 執行工具 +5. 將結果返回給 LLM +6. LLM 生成最終回答 +``` + +**好的工具描述要素**: + +**1. 清晰的名稱**: +```json +// 好 +"name": "search_web" +"name": "calculate_math" +"name": "send_email" + +// 差 +"name": "do_stuff" +"name": "function1" +``` + +**2. 詳細的描述**: +```json +// 好 +{ + "description": "Search the web for current information. Use when the user asks about recent events, news, or information that may have changed after your knowledge cutoff." +} + +// 差 +{ + "description": "Search" +} +``` + +**3. 完整的參數定義**: +```json +{ + "name": "get_weather", + "description": "Get current weather for a location", + "parameters": { + "type": "object", + "properties": { + "location": { + "type": "string", + "description": "City name or coordinates (e.g., 'Beijing' or '39.9,116.4')" + }, + "unit": { + "type": "string", + "enum": ["celsius", "fahrenheit"], + "default": "celsius", + "description": "Temperature unit" + } + }, + "required": ["location"] + } +} +``` + +**4. 使用示例**: +```json +{ + "examples": [ + { + "query": "北京今天天氣怎麼樣", + "function_call": { + "name": "get_weather", + "arguments": {"location": "Beijing", "unit": "celsius"} + } + } + ] +} +``` + +**常見錯誤**: +- 描述模糊不清 +- 缺少參數約束 +- 沒有說明何時使用 +- 缺少錯誤處理說明 + +--- + +### Q26: 比較 ReAct、Plan-and-Execute、Reflexion 三種 Agent 模式 +**難度**: 高級 + +**參考答案**: + +**1. 
ReAct (Reasoning + Acting)**: +``` +特點: 交替進行推理和行動 +流程: Think → Act → Observe → Think → Act → ... +優點: 實時調整,可解釋 +缺點: 可能陷入循環 +``` + +**2. Plan-and-Execute**: +``` +特點: 先規劃後執行 +流程: Plan (生成完整計劃) → Execute (逐步執行) +優點: 結構化,適合複雜任務 +缺點: 計劃可能不完美,需要重新規劃 +``` + +**3. Reflexion**: +``` +特點: 自我反思改進 +流程: Act → Evaluate → Reflect → Update → Act (improved) +優點: 能從錯誤中學習 +缺點: 需要更多計算 +``` + +**對比**: + +| 模式 | 適用場景 | 複雜度 | 可靠性 | +|------|---------|-------|-------| +| ReAct | 簡單任務,需要實時反應 | 低 | 中 | +| Plan-and-Execute | 複雜多步驟任務 | 中 | 中高 | +| Reflexion | 需要高準確率的任務 | 高 | 高 | + +**代碼示意**: + +```python +# ReAct +def react(task): + while not done: + thought = think(context) + action = act(thought) + observation = observe(action) + context.update(thought, action, observation) + +# Plan-and-Execute +def plan_execute(task): + plan = generate_plan(task) + for step in plan: + result = execute(step) + if failed(result): + plan = replan(plan, result) + +# Reflexion +def reflexion(task): + for attempt in range(max_attempts): + result = execute(task) + if success(result): + return result + reflection = reflect(result) + update_strategy(reflection) +``` + +--- + +## 6. 推理優化基礎 + +### Q27: 什麼是 KV-Cache?它如何加速推理? +**難度**: 中等 + +**參考答案**: + +**問題**: 自回歸生成時,每生成一個 token 都要計算所有之前 token 的注意力。 + +**KV-Cache 原理**: +``` +不使用 KV-Cache: +Step 1: Q₁ @ K₁ᵀ → Attn(Q₁, K₁, V₁) +Step 2: Q₂ @ [K₁, K₂]ᵀ → Attn(Q₂, [K₁,K₂], [V₁,V₂]) ← 重複計算 K₁, V₁ +Step 3: Q₃ @ [K₁, K₂, K₃]ᵀ → ... 
← 重複計算 K₁, K₂, V₁, V₂ + +使用 KV-Cache: +Step 1: 計算 K₁, V₁ → Cache = {K₁, V₁} +Step 2: 只計算 K₂, V₂ → Cache = {K₁K₂, V₁V₂} +Step 3: 只計算 K₃, V₃ → Cache = {K₁K₂K₃, V₁V₂V₃} +``` + +**複雜度改善**: +- 不使用 Cache: O(n³) 總計算量(生成 n 個 token) +- 使用 Cache: O(n²) 總計算量 + +**記憶體需求**: +```python +kv_cache_memory = 2 * num_layers * batch_size * seq_len * num_heads * head_dim * precision_bytes + +# 例如 LLaMA-7B, seq_len=2048, batch=1, FP16 +# 2 * 32 * 1 * 2048 * 32 * 128 * 2 bytes ≈ 1 GB +``` + +**實現**: +```python +class KVCache: + def __init__(self, num_layers): + self.cache = [{"k": None, "v": None} for _ in range(num_layers)] + + def update(self, layer_idx, new_k, new_v): + if self.cache[layer_idx]["k"] is None: + self.cache[layer_idx]["k"] = new_k + self.cache[layer_idx]["v"] = new_v + else: + self.cache[layer_idx]["k"] = torch.cat([ + self.cache[layer_idx]["k"], new_k + ], dim=-2) + self.cache[layer_idx]["v"] = torch.cat([ + self.cache[layer_idx]["v"], new_v + ], dim=-2) + + def get(self, layer_idx): + return self.cache[layer_idx]["k"], self.cache[layer_idx]["v"] +``` + +--- + +### Q28: 解釋模型量化的基本原理 +**難度**: 中等 + +**參考答案**: + +**量化**: 將高精度數值(FP32/FP16)轉換為低精度(INT8/INT4)。 + +**量化公式**: +``` +量化: Q(x) = round(x / scale) + zero_point +反量化: x' = (Q(x) - zero_point) * scale +``` + +**量化類型**: + +| 類型 | 描述 | 精度影響 | +|------|------|---------| +| W8A8 | 權重和激活都 8-bit | 小 | +| W4A16 | 權重 4-bit,激活 16-bit | 中等 | +| W4A8 | 權重 4-bit,激活 8-bit | 較大 | + +**量化方法**: + +**1. PTQ (Post-Training Quantization)**: +```python +# 訓練後直接量化 +def ptq_quantize(weights, bits=8): + min_val = weights.min() + max_val = weights.max() + scale = (max_val - min_val) / (2**bits - 1) + zero_point = round(-min_val / scale) + quantized = round(weights / scale) + zero_point + return quantized, scale, zero_point +``` + +**2. 
QAT (Quantization-Aware Training)**: +```python +# 訓練時模擬量化 +class FakeQuantize(nn.Module): + def forward(self, x): + # 模擬量化和反量化 + x_q = torch.round(x / self.scale) + self.zero_point + x_dq = (x_q - self.zero_point) * self.scale + # 直通估計器 + return x + (x_dq - x).detach() +``` + +**3. GPTQ (針對 LLM 的量化)**: +- 基於二階信息(Hessian) +- 逐層量化,補償誤差 +- 4-bit 量化效果好 + +**4. AWQ (Activation-aware Weight Quantization)**: +- 基於激活分佈保護重要權重 +- 自動搜索縮放因子 + +--- + +### Q29: 什麼是 Speculative Decoding?它如何加速生成? +**難度**: 高級 + +**參考答案**: + +**Speculative Decoding** 使用小模型生成草稿,大模型並行驗證。 + +**基本原理**: +``` +傳統生成: 大模型串行生成 n 個 token +推測解碼: 小模型生成 k 個草稿 token → 大模型一次驗證 +``` + +**流程**: +``` +1. 小模型生成 k 個 token: [t₁, t₂, t₃, t₄, t₅] + +2. 大模型並行計算所有位置的概率: + P(t₁|prompt), P(t₂|prompt,t₁), ..., P(t₅|prompt,t₁..t₄) + +3. 驗證每個 token: + - 如果 P_large(tᵢ) >= P_small(tᵢ): 接受 + - 否則: 以一定概率拒絕,從此處重新生成 + +4. 假設接受了 3 個: [t₁, t₂, t₃, ✗] + - 大模型已經計算了正確的 t₄ + - 返回 [t₁, t₂, t₃, t₄_correct] +``` + +**加速原理**: +``` +傳統: n 次大模型前向傳播 +推測: n/acceptance_rate 次大模型前向傳播(每次並行驗證多個 token) +``` + +**實現示意**: +```python +def speculative_decode(draft_model, target_model, prompt, k=5): + generated = [] + + while not finished: + # 1. 小模型生成 k 個草稿 token + draft_tokens, draft_probs = draft_model.generate(prompt + generated, k) + + # 2. 大模型並行驗證 + target_probs = target_model.get_probs(prompt + generated + draft_tokens) + + # 3. 決定接受哪些 + accepted = [] + for i in range(k): + r = random.random() + if r < min(1, target_probs[i] / draft_probs[i]): + accepted.append(draft_tokens[i]) + else: + # 從目標分佈採樣 + accepted.append(sample_corrected(target_probs[i], draft_probs[i])) + break + + generated.extend(accepted) + + return generated +``` + +**加速效果**: 通常 1.5-2.5 倍(取決於小模型質量) + +--- + +### Q30: 什麼是 Continuous Batching?它解決什麼問題? 
+**難度**: 中等 + +**參考答案**: + +**問題**: 傳統靜態 batching 效率低 + +``` +傳統 Batching: +Request 1: [A, B, C, D, EOS] +Request 2: [A, B, C, D, E, F, EOS] +Request 3: [A, B, EOS, PAD, PAD, PAD, PAD] + +→ Request 1, 3 完成後仍需等待 Request 2 +→ PAD token 浪費計算資源 +``` + +**Continuous Batching 解決方案**: +``` +動態加入和移除請求: + +Step 1: [R1, R2, R3] → 生成一個 token +Step 2: [R1, R2, R3] → R3 完成,移除 +Step 3: [R1, R2, R4] → R4 加入,繼續 +Step 4: [R1, R2, R4] → R1 完成,移除 +... +``` + +**優勢**: +1. **更高吞吐**: 無需等待最長請求 +2. **更低延遲**: 完成即返回 +3. **資源利用率高**: 無 padding 浪費 + +**實現架構**: +```python +class ContinuousBatcher: + def __init__(self, model, max_batch_size): + self.model = model + self.max_batch_size = max_batch_size + self.active_requests = [] + self.waiting_queue = [] + + def add_request(self, request): + self.waiting_queue.append(request) + + def step(self): + # 填充 batch + while len(self.active_requests) < self.max_batch_size and self.waiting_queue: + self.active_requests.append(self.waiting_queue.pop(0)) + + if not self.active_requests: + return + + # 執行一步推理 + outputs = self.model.forward(self.active_requests) + + # 處理完成的請求 + completed = [] + for req, output in zip(self.active_requests, outputs): + req.add_token(output) + if req.is_finished(): + completed.append(req) + + # 移除完成的請求 + for req in completed: + self.active_requests.remove(req) + req.return_result() +``` + +**vLLM 的 PagedAttention**: +- 類似操作系統的分頁內存管理 +- KV-Cache 分成固定大小的頁 +- 進一步提高記憶體利用率 + +--- + +### Q31: 解釋模型並行的不同策略 +**難度**: 高級 + +**參考答案**: + +**為什麼需要模型並行**: +- 模型太大,單 GPU 放不下 +- 例如 LLaMA-70B 需要 ~140GB (FP16) + +**並行策略**: + +**1. Data Parallelism (數據並行)**: +``` +每個 GPU 有完整模型副本,處理不同數據 + +GPU 0: Model + Data[0:B/n] +GPU 1: Model + Data[B/n:2B/n] +... +GPU n: Model + Data[(n-1)B/n:B] + +→ 梯度同步 → 更新 +``` + +**2. Tensor Parallelism (張量並行)**: +``` +將單層拆分到多個 GPU + +例如 FFN: Y = XW₁W₂ +GPU 0: Y₀ = XW₁₀W₂₀ +GPU 1: Y₁ = XW₁₁W₂₁ +→ AllReduce: Y = Y₀ + Y₁ +``` + +**3. 
Pipeline Parallelism (流水線並行)**:
+```
+不同層放在不同 GPU
+
+GPU 0: Layer 0-7
+GPU 1: Layer 8-15
+GPU 2: Layer 16-23
+GPU 3: Layer 24-31
+
+Micro-batch 流水線處理
+```
+
+**4. Sequence Parallelism (序列並行)**:
+```
+將長序列拆分到多個 GPU
+
+GPU 0: tokens[0:L/n]
+GPU 1: tokens[L/n:2L/n]
+...
+
+需要處理跨設備注意力
+```
+
+**組合使用**:
+```
+大規模訓練通常使用 3D 並行:
+- Data Parallel across nodes
+- Tensor Parallel within node
+- Pipeline Parallel across layers
+```
+
+**選擇指南**:
+
+| 策略 | 適用場景 | 通信開銷 |
+|------|---------|---------|
+| Data Parallel | 模型可放入單 GPU | 梯度同步 |
+| Tensor Parallel | 單層太大 | 高(每層 AllReduce)|
+| Pipeline Parallel | 模型層數多 | 低(層間通信)|
+| Sequence Parallel | 序列太長 | 中(注意力通信)|
+
+---
+
+### Q32: 什麼是混合精度訓練?
+**難度**: 中等
+
+**參考答案**:
+
+**混合精度訓練**: 同時使用 FP16/BF16 和 FP32 進行訓練。
+
+**為什麼使用混合精度**:
+1. **速度**: FP16 計算比 FP32 快 2-8 倍
+2. **記憶體**: FP16 佔用減半
+3. **精度**: 通過技巧保持 FP32 精度
+
+**核心技術**:
+
+**1. Loss Scaling**:
+```python
+# FP16 梯度可能太小(underflow)
+# 放大 loss 使梯度在 FP16 範圍內
+
+scaled_loss = loss * scale_factor  # e.g., 1024
+scaled_loss.backward()
+
+# 更新前先把梯度除以 scale_factor 還原
+# (標準的 optimizer.step() 沒有 scale 參數,需自行反縮放梯度)
+for param in model.parameters():
+    if param.grad is not None:
+        param.grad.div_(scale_factor)
+optimizer.step()
+```
+
+**2. Master Weights**:
+```python
+# 概念示意:保持 FP32 的權重副本用於更新
+fp32_weights = model.float()
+fp16_weights = fp32_weights.half()
+
+# 前向: FP16
+output = model_fp16(input.half())
+
+# 反向: FP16 梯度
+loss.backward()
+
+# 更新: FP32
+fp32_weights -= lr * fp16_gradients.float()
+fp16_weights = fp32_weights.half()
+```
+
+**3. 
BF16 的優勢**: +``` +FP16: 1 sign + 5 exponent + 10 mantissa +BF16: 1 sign + 8 exponent + 7 mantissa + +BF16 範圍更大(與 FP32 相同),可能不需要 loss scaling +``` + +**PyTorch 實現**: +```python +# 使用 autocast +from torch.cuda.amp import autocast, GradScaler + +scaler = GradScaler() + +for data, target in dataloader: + optimizer.zero_grad() + + with autocast(): + output = model(data) + loss = criterion(output, target) + + scaler.scale(loss).backward() + scaler.step(optimizer) + scaler.update() +``` + +--- + +### Q33: 解釋 vLLM 的 PagedAttention +**難度**: 高級 + +**參考答案**: + +**PagedAttention** 借鑑操作系統的虛擬內存管理 KV-Cache。 + +**問題**: 傳統 KV-Cache 管理 +``` +每個請求預分配最大長度的 KV-Cache +→ 實際生成長度 << 最大長度 +→ 大量內存浪費(fragmentation) +``` + +**PagedAttention 解決方案**: +``` +1. 將 KV-Cache 分成固定大小的"頁" +2. 按需動態分配頁 +3. 使用頁表映射邏輯位置到物理位置 +``` + +**工作原理**: +``` +Physical Memory (KV-Cache Pool): +┌─────┬─────┬─────┬─────┬─────┬─────┐ +│ P0 │ P1 │ P2 │ P3 │ P4 │ ... │ +└─────┴─────┴─────┴─────┴─────┴─────┘ + +Request A Page Table: Request B Page Table: +Logical → Physical Logical → Physical +0 → P2 0 → P0 +1 → P5 1 → P3 +2 → P1 2 → P4 +``` + +**優勢**: +1. **高記憶體利用率**: ~100%(vs 傳統方法 ~20-40%) +2. **支持更大 batch**: 相同記憶體可處理更多請求 +3. 
**高效記憶體共享**: 相同前綴可共享頁
+
+**記憶體共享示例**:
+```
+Prompt: "Translate to French: "
+
+Request 1: "Translate to French: Hello" → "Bonjour"
+Request 2: "Translate to French: World" → "Monde"
+
+共享前綴的 KV-Cache 頁,只為不同部分分配新頁
+```
+
+**實現示意**:
+```python
+import torch
+from torch.nn.functional import scaled_dot_product_attention
+
+class PagedAttention:
+    def __init__(self, page_size=16, num_pages=1000, head_dim=128):
+        self.page_size = page_size
+        # K 與 V 需要各自獨立的頁池(示意,省略 batch / head 維度)
+        self.k_pages = torch.zeros(num_pages, page_size, head_dim)
+        self.v_pages = torch.zeros(num_pages, page_size, head_dim)
+        self.free_pages = list(range(num_pages))
+        self.page_tables = {}  # request_id -> [page_ids]
+
+    def allocate_page(self, request_id):
+        if not self.free_pages:
+            raise MemoryError("No free pages")
+        page_id = self.free_pages.pop()
+        self.page_tables.setdefault(request_id, []).append(page_id)
+        return page_id
+
+    def attention(self, query, request_id):
+        page_ids = self.page_tables[request_id]
+        keys = torch.cat([self.k_pages[pid] for pid in page_ids])
+        values = torch.cat([self.v_pages[pid] for pid in page_ids])
+        return scaled_dot_product_attention(query, keys, values)
+```
+
+---
+
+### Q34: 什麼是 Token Streaming?如何實現?
+**難度**: 基礎
+
+**參考答案**:
+
+**Token Streaming**: 生成過程中實時返回 token,而非等待完成。
+
+**用戶體驗對比**:
+```
+非 Streaming:
+用戶: "寫一首詩"
+[等待 10 秒...]
+AI: [一次性顯示整首詩]
+
+Streaming:
+用戶: "寫一首詩"
+AI: "春" [0.1s]
+AI: "春風" [0.2s]
+AI: "春風拂" [0.3s]
+...
+```
+
+**實現方式**:
+
+**1. Server-Sent Events (SSE)**:
+```python
+import json
+
+from fastapi import FastAPI
+from fastapi.responses import StreamingResponse
+
+app = FastAPI()
+
+async def generate_stream(prompt):
+    for token in model.generate_tokens(prompt):
+        yield f"data: {json.dumps({'token': token})}\n\n"
+    yield "data: [DONE]\n\n"
+
+@app.get("/chat/stream")
+async def chat_stream(prompt: str):
+    return StreamingResponse(
+        generate_stream(prompt),
+        media_type="text/event-stream"
+    )
+```
+
+**2. 
WebSocket**: +```python +@app.websocket("/chat/ws") +async def chat_websocket(websocket: WebSocket): + await websocket.accept() + + prompt = await websocket.receive_text() + + for token in model.generate_tokens(prompt): + await websocket.send_json({"token": token}) + + await websocket.send_json({"done": True}) + await websocket.close() +``` + +**3. 前端處理**: +```javascript +// SSE +const eventSource = new EventSource('/chat/stream?prompt=...'); +eventSource.onmessage = (event) => { + if (event.data === '[DONE]') { + eventSource.close(); + return; + } + const data = JSON.parse(event.data); + displayToken(data.token); +}; + +// WebSocket +const ws = new WebSocket('ws://localhost/chat/ws'); +ws.onmessage = (event) => { + const data = JSON.parse(event.data); + if (data.done) { + ws.close(); + return; + } + displayToken(data.token); +}; +``` + +**實現注意事項**: +1. 及時 flush buffer +2. 處理連接中斷 +3. 支持取消生成 + +--- + +### Q35: 如何處理 LLM 的長上下文問題? +**難度**: 高級 + +**參考答案**: + +**長上下文挑戰**: +1. 注意力複雜度 O(n²) +2. KV-Cache 記憶體線性增長 +3. 長距離依賴捕獲困難 + +**解決方案**: + +**1. 高效注意力機制**: + +| 方法 | 複雜度 | 說明 | +|------|-------|------| +| Flash Attention | O(n) 記憶體 | 分塊計算 | +| Sliding Window | O(n×w) | 局部注意力 | +| Sparse Attention | O(n√n) | 稀疏模式 | +| Linear Attention | O(n) | 近似注意力 | + +**2. 位置編碼外推**: +```python +# RoPE 外推 +# 訓練: 4K context +# 推理: 通過插值支持 16K + +def extend_rope(original_max_len, new_max_len): + scale = original_max_len / new_max_len + # 調整頻率 + freqs = 1.0 / (base ** (torch.arange(0, dim, 2) / dim)) + freqs = freqs * scale # 縮放頻率 + return freqs +``` + +**3. 檢索增強**: +```python +# 長文檔分塊存儲,按需檢索 +class LongContextRAG: + def process(self, long_document, query): + # 1. 分塊索引 + chunks = chunk_document(long_document) + index = create_index(chunks) + + # 2. 檢索相關塊 + relevant_chunks = index.search(query, top_k=5) + + # 3. 用檢索結果回答 + return llm.answer(query, relevant_chunks) +``` + +**4. 
壓縮歷史**: +```python +# 壓縮舊對話 +def compress_history(messages, max_tokens): + if count_tokens(messages) < max_tokens: + return messages + + # 保留 system 和最近消息 + system = messages[0] + recent = messages[-5:] + + # 壓縮中間消息 + middle = messages[1:-5] + summary = llm.summarize(middle) + + return [system, {"role": "system", "content": f"Earlier: {summary}"}] + recent +``` + +**5. 流式處理**: +```python +# 滑動窗口處理超長文本 +def process_long_text(text, window_size=4000, overlap=500): + results = [] + + for i in range(0, len(text), window_size - overlap): + window = text[i:i + window_size] + result = llm.process(window) + results.append(result) + + return merge_results(results) +``` + +**現代 LLM 長上下文能力**: +- GPT-4 Turbo: 128K +- Claude 3: 200K +- Gemini 1.5: 1M (最長) + +--- + +## 總結 + +本基礎概念題庫涵蓋了 LLM 面試中最重要的基礎知識點: + +1. **Transformer 架構**: 理解 Encoder-Decoder、各種 Norm、FFN 等核心組件 +2. **Attention 機制**: 掌握 Self-Attention、Multi-Head、位置編碼等關鍵技術 +3. **訓練技術**: 了解 SFT、RLHF、DPO、LoRA 等微調方法 +4. **RAG**: 理解檢索增強生成的原理和評估方法 +5. **Agent**: 掌握 ReAct、Function Calling 等 Agent 模式 +6. 
**推理優化**: 了解 KV-Cache、量化、並行等加速技術 + +建議按難度循序漸進學習,結合實際代碼加深理解。 diff --git "a/9.\351\235\242\350\251\246\346\272\226\345\202\231\350\210\207\350\201\267\346\245\255\347\231\274\345\261\225/1.LLM\351\235\242\350\251\246\351\241\214\345\272\253/02_\346\236\266\346\247\213\350\250\255\350\250\210\351\241\214.md" "b/9.\351\235\242\350\251\246\346\272\226\345\202\231\350\210\207\350\201\267\346\245\255\347\231\274\345\261\225/1.LLM\351\235\242\350\251\246\351\241\214\345\272\253/02_\346\236\266\346\247\213\350\250\255\350\250\210\351\241\214.md" new file mode 100644 index 0000000..eba6391 --- /dev/null +++ "b/9.\351\235\242\350\251\246\346\272\226\345\202\231\350\210\207\350\201\267\346\245\255\347\231\274\345\261\225/1.LLM\351\235\242\350\251\246\351\241\214\345\272\253/02_\346\236\266\346\247\213\350\250\255\350\250\210\351\241\214.md" @@ -0,0 +1,418 @@ +# LLM 架構設計面試題 + +> **題目數量**: 25 題 +> **難度分佈**: 中等 40% | 高級 60% +> **涵蓋主題**: RAG系統、Agent架構、推理優化、分佈式訓練、生產部署 + +--- + +## 目錄 + +1. [RAG 系統設計](#1-rag-系統設計) +2. [Agent 架構設計](#2-agent-架構設計) +3. [推理服務設計](#3-推理服務設計) +4. [訓練架構設計](#4-訓練架構設計) +5. [生產部署設計](#5-生產部署設計) + +--- + +## 1. 
RAG 系統設計 + +### Q1: 設計一個企業級 RAG 系統 +**難度**: 高級 + +**問題**: +設計一個支援百萬級文檔、多用戶的企業級 RAG 系統。需要考慮: +- 文檔攝取和索引 +- 檢索策略 +- 生成質量保證 +- 系統可擴展性 + +**參考答案**: + +``` + 企業級 RAG 架構 +┌─────────────────────────────────────────────────────────────┐ +│ 用戶層 │ +│ ┌─────────────────────────────────────────────────────┐ │ +│ │ Web UI │ API Gateway │ SDK │ │ +│ └─────────────────────────────────────────────────────┘ │ +├─────────────────────────────────────────────────────────────┤ +│ 應用服務層 │ +│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ +│ │ Query Router │ │ Reranker │ │ Response Gen │ │ +│ └──────────────┘ └──────────────┘ └──────────────┘ │ +├─────────────────────────────────────────────────────────────┤ +│ 檢索引擎層 │ +│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ +│ │Vector Search │ │ Keyword BM25 │ │ Graph Search │ │ +│ │ (FAISS) │ │ (ES/OS) │ │ (Neo4j) │ │ +│ └──────────────┘ └──────────────┘ └──────────────┘ │ +├─────────────────────────────────────────────────────────────┤ +│ 數據處理層 │ +│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ +│ │ Doc Parser │ │ Chunker │ │ Embedding │ │ +│ │ (Unstructd) │ │ (Adaptive) │ │ Service │ │ +│ └──────────────┘ └──────────────┘ └──────────────┘ │ +├─────────────────────────────────────────────────────────────┤ +│ 存儲層 │ +│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ +│ │ Object Store │ │ VectorDB │ │ MetaDB │ │ +│ │ (S3) │ │ (Qdrant) │ │ (PostgreSQL) │ │ +│ └──────────────┘ └──────────────┘ └──────────────┘ │ +└─────────────────────────────────────────────────────────────┘ +``` + +**關鍵設計決策**: + +1. **混合檢索策略** + - 語義檢索 (向量) + 關鍵字檢索 (BM25) + 知識圖譜 + - 使用 Reciprocal Rank Fusion 融合結果 + +2. **自適應分塊** + - 根據文檔類型選擇分塊策略 + - 代碼:按函數/類分塊 + - 文檔:按段落/章節分塊 + +3. **多級緩存** + - Query 緩存:相同問題直接返回 + - Embedding 緩存:避免重複計算 + - Document 緩存:熱門文檔預加載 + +--- + +### Q2: 如何評估和提升 RAG 系統質量? 
+**難度**: 中等 + +**參考答案**: + +**評估指標**: + +| 指標類型 | 具體指標 | 說明 | +|---------|---------|------| +| 檢索質量 | Recall@K | 相關文檔是否被檢索到 | +| 檢索質量 | NDCG@K | 相關文檔排序質量 | +| 生成質量 | Faithfulness | 回答是否基於檢索內容 | +| 生成質量 | Answer Relevancy | 回答是否回答了問題 | +| 端到端 | EM / F1 | 與標準答案的匹配度 | + +**提升策略**: +1. **檢索優化**: 查詢改寫、HyDE、多路召回 +2. **上下文優化**: 重排序、壓縮、去重 +3. **生成優化**: 更好的提示詞、Self-RAG + +--- + +### Q3: Self-RAG 與傳統 RAG 的區別? +**難度**: 中等 + +**參考答案**: + +``` +傳統 RAG: +Query → 檢索 → LLM 生成 → 回答 + +Self-RAG: +Query → LLM 判斷是否需要檢索 + → 如需要:檢索 → LLM 評估相關性 → 篩選 + → LLM 生成 → 自我批評 → 最終回答 +``` + +**Self-RAG 的特殊 Token**: +- `[Retrieve]`: 是否需要檢索 +- `[ISREL]`: 檢索結果是否相關 +- `[ISSUP]`: 回答是否被檢索支持 +- `[ISUSE]`: 回答是否有用 + +--- + +## 2. Agent 架構設計 + +### Q4: 設計一個多 Agent 協作系統 +**難度**: 高級 + +**問題**: 設計一個能處理複雜任務的多 Agent 系統,包含規劃、執行、驗證等能力。 + +**參考答案**: + +``` + Multi-Agent 協作架構 +┌────────────────────────────────────────────────────┐ +│ Orchestrator │ +│ ┌──────────────────────────────────────────────┐ │ +│ │ Task Planning & Routing │ │ +│ └──────────────────────────────────────────────┘ │ +├────────────────────────────────────────────────────┤ +│ Specialized Agents │ +│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │ +│ │ Planner │ │ Executor │ │ Critic │ │ +│ │ Agent │ │ Agents │ │ Agent │ │ +│ └────────────┘ └────────────┘ └────────────┘ │ +│ │ │ +│ ┌─────────────┼─────────────┐ │ +│ ▼ ▼ ▼ │ +│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ +│ │ Code Gen │ │ Research │ │ Analysis │ │ +│ │ Agent │ │ Agent │ │ Agent │ │ +│ └──────────┘ └──────────┘ └──────────┘ │ +├────────────────────────────────────────────────────┤ +│ Tool Layer │ +│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │ +│ │ Code │ │ Search │ │ Database │ │ +│ │ Execution │ │ APIs │ │ Access │ │ +│ └────────────┘ └────────────┘ └────────────┘ │ +├────────────────────────────────────────────────────┤ +│ Shared Memory │ +│ ┌──────────────────────────────────────────────┐ │ +│ │ Context Store │ Task Queue │ Results │ │ +│ 
└──────────────────────────────────────────────┘ │ +└────────────────────────────────────────────────────┘ +``` + +**關鍵組件**: + +1. **Orchestrator**: 任務分解、路由、協調 +2. **Planner Agent**: 制定執行計劃 +3. **Executor Agents**: 專門化的執行代理 +4. **Critic Agent**: 驗證和質量保證 +5. **Shared Memory**: 狀態共享和通信 + +--- + +### Q5: ReAct vs Function Calling vs Tool Use 的區別? +**難度**: 中等 + +**參考答案**: + +| 特性 | ReAct | Function Calling | Tool Use (MCP) | +|------|-------|------------------|----------------| +| **實現方式** | 提示詞工程 | API 原生支持 | 標準協議 | +| **思維鏈** | 顯式 Thought | 隱式 | 可選 | +| **可靠性** | 依賴提示詞 | 較高 | 高 | +| **工具發現** | 手動配置 | 手動配置 | 動態發現 | +| **跨平台** | 通用 | API 特定 | 標準化 | + +--- + +## 3. 推理服務設計 + +### Q6: 設計一個高併發 LLM 推理服務 +**難度**: 高級 + +**參考答案**: + +``` + 高併發 LLM 推理服務架構 +┌─────────────────────────────────────────────────────┐ +│ 負載均衡層 │ +│ ┌─────────────────────────────────────────────┐ │ +│ │ Nginx / AWS ALB │ │ +│ │ (流式連接支持, WebSocket 支持) │ │ +│ └─────────────────────────────────────────────┘ │ +├─────────────────────────────────────────────────────┤ +│ API 網關 │ +│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ +│ │ 認證 │ │ 限流 │ │ 路由 │ │ +│ └──────────┘ └──────────┘ └──────────┘ │ +├─────────────────────────────────────────────────────┤ +│ 推理服務層 │ +│ ┌─────────────────────────────────────────────┐ │ +│ │ Request Scheduler │ │ +│ │ (Continuous Batching, Priority Queue) │ │ +│ └─────────────────────────────────────────────┘ │ +│ │ │ │ │ +│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ +│ │ vLLM │ │ vLLM │ │ vLLM │ │ +│ │ Worker 1 │ │ Worker 2 │ │ Worker N │ │ +│ │ (GPU 0) │ │ (GPU 1) │ │ (GPU N) │ │ +│ └──────────┘ └──────────┘ └──────────┘ │ +├─────────────────────────────────────────────────────┤ +│ 優化技術 │ +│ • PagedAttention (KV-Cache 分頁管理) │ +│ • Continuous Batching (動態批處理) │ +│ • Speculative Decoding (投機解碼) │ +│ • Tensor Parallelism (張量並行) │ +└─────────────────────────────────────────────────────┘ +``` + +**關鍵優化**: +1. **Continuous Batching**: 動態合併請求 +2. **KV-Cache 優化**: PagedAttention 減少內存浪費 +3. 
**量化**: INT8/INT4 減少內存和計算 +4. **Speculative Decoding**: 使用小模型加速 + +--- + +### Q7: 如何實現 LLM 的流式輸出? +**難度**: 中等 + +**參考答案**: + +```python +# Server-Sent Events (SSE) 實現 +from fastapi import FastAPI +from fastapi.responses import StreamingResponse +import asyncio + +app = FastAPI() + +async def generate_stream(prompt: str): + """流式生成""" + # 模擬 token 逐個生成 + response = "這是一個流式回應的範例..." + for char in response: + yield f"data: {char}\n\n" + await asyncio.sleep(0.05) # 模擬生成延遲 + yield "data: [DONE]\n\n" + +@app.post("/v1/chat/completions") +async def chat_completions(request: dict): + if request.get("stream", False): + return StreamingResponse( + generate_stream(request["messages"][-1]["content"]), + media_type="text/event-stream" + ) + # 非流式回應... +``` + +--- + +## 4. 訓練架構設計 + +### Q8: 設計一個分佈式 LLM 訓練系統 +**難度**: 高級 + +**參考答案**: + +``` + 分佈式訓練並行策略 + +1. Data Parallelism (數據並行) + ┌────────┐ ┌────────┐ ┌────────┐ + │ GPU 0 │ │ GPU 1 │ │ GPU 2 │ + │ Model │ │ Model │ │ Model │ + │(完整) │ │(完整) │ │(完整) │ + │ Data │ │ Data │ │ Data │ + │ Batch1 │ │ Batch2 │ │ Batch3 │ + └────────┘ └────────┘ └────────┘ + ↓ ↓ ↓ + └──────────┼──────────┘ + ↓ + AllReduce Gradients + +2. Tensor Parallelism (張量並行) + ┌─────────────────────────────┐ + │ Attention Layer │ + │ ┌───────┐ ┌───────┐ │ + │ │ Head │ │ Head │ ... │ + │ │ 0-3 │ │ 4-7 │ │ + │ │ GPU 0 │ │ GPU 1 │ │ + │ └───────┘ └───────┘ │ + └─────────────────────────────┘ + +3. Pipeline Parallelism (流水線並行) + GPU 0: Layers 0-7 ─► + GPU 1: Layers 8-15 ─► + GPU 2: Layers 16-23 ─► + GPU 3: Layers 24-31 ─► +``` + +**選擇策略**: +- 模型能放入單卡: DP +- 單層太大: TP +- 層數太多: PP +- 實際應用: 3D 並行 (DP + TP + PP) + +--- + +### Q9: FSDP 與 DeepSpeed ZeRO 的區別? +**難度**: 高級 + +**參考答案**: + +| 特性 | FSDP | DeepSpeed ZeRO | +|------|------|----------------| +| 開發者 | Meta (PyTorch) | Microsoft | +| 整合度 | PyTorch 原生 | 獨立庫 | +| ZeRO 階段 | ZeRO-3 | ZeRO-1/2/3 + Infinity | +| CPU Offload | 支持 | 支持 | +| NVMe Offload | 有限 | 完整支持 | +| 易用性 | 較簡單 | 配置較複雜 | + +--- + +## 5. 
生產部署設計
+
+### Q10: 設計一個 LLM 應用的監控系統
+**難度**: 中等
+
+**參考答案**:
+
+```
+              LLM 監控指標體系
+
+┌─────────────────────────────────────────────────┐
+│                 業務指標                        │
+│  • 用戶滿意度 (CSAT)                            │
+│  • 任務完成率                                   │
+│  • 人工干預率                                   │
+└─────────────────────────────────────────────────┘
+                      ↑
+┌─────────────────────────────────────────────────┐
+│                 質量指標                        │
+│  • 幻覺率 (Hallucination Rate)                  │
+│  • 相關性分數                                   │
+│  • 安全違規率                                   │
+└─────────────────────────────────────────────────┘
+                      ↑
+┌─────────────────────────────────────────────────┐
+│                 性能指標                        │
+│  • TTFT (Time To First Token)                   │
+│  • TPS (Tokens Per Second)                      │
+│  • P50/P95/P99 延遲                             │
+│  • 吞吐量 (QPS)                                 │
+└─────────────────────────────────────────────────┘
+                      ↑
+┌─────────────────────────────────────────────────┐
+│                 資源指標                        │
+│  • GPU 使用率 / 內存                            │
+│  • KV-Cache 命中率                              │
+│  • 批處理效率                                   │
+└─────────────────────────────────────────────────┘
+```
+
+---
+
+### Q11: 如何實現 LLM 的 A/B 測試?
+**難度**: 中等
+
+**參考答案**:
+
+1. **流量分流**: 基於用戶 ID 哈希分配
+2. **指標收集**: 延遲、質量、用戶反饋
+3. **統計顯著性**: 使用假設檢驗
+4. **漸進式發布**: 1% → 10% → 50% → 100%
+
+```python
+import hashlib
+
+def ab_test_router(user_id: str, experiment: str) -> str:
+    """A/B 測試路由"""
+    # 使用 hashlib 取得穩定雜湊;內建 hash() 受 PYTHONHASHSEED 影響,
+    # 同一用戶在不同進程/重啟後可能被分到不同組
+    digest = hashlib.md5(f"{user_id}:{experiment}".encode()).hexdigest()
+    hash_value = int(digest, 16) % 100
+
+    if hash_value < 10:  # 10% 新模型
+        return "model_v2"
+    return "model_v1"
+```
+
+---
+
+## 總結
+
+架構設計面試重點:
+1. **系統思維**: 考慮擴展性、可靠性、成本
+2. **權衡取捨**: 沒有完美方案,說清楚 trade-offs
+3. **實際經驗**: 結合真實項目經驗
+4. 
**持續學習**: 了解最新技術趨勢 + +--- + +*本題庫持續更新中* diff --git a/DEPENDENCY_UPDATE_REPORT.md b/DEPENDENCY_UPDATE_REPORT.md new file mode 100644 index 0000000..7029956 --- /dev/null +++ b/DEPENDENCY_UPDATE_REPORT.md @@ -0,0 +1,189 @@ +# 依賴版本更新報告 + +> **生成日期**: 2026-01-15 +> **檢查範圍**: pyproject.toml, requirements*.txt + +--- + +## 當前版本 vs 最新建議版本 + +### 核心依賴 + +| 套件 | 當前版本 | 建議更新 | 狀態 | 備註 | +|------|---------|---------|------|------| +| numpy | >=1.24.0 | >=2.0.0 | 🟡 可選 | NumPy 2.0 有 breaking changes | +| pandas | >=2.0.0 | >=2.2.0 | ✅ 建議 | 性能改進 | +| scipy | >=1.10.0 | >=1.14.0 | ✅ 建議 | 新功能 | +| matplotlib | >=3.7.0 | >=3.9.0 | ✅ 建議 | | +| pytest | >=7.4.0 | >=8.0.0 | ✅ 建議 | 新功能 | + +### 機器學習 + +| 套件 | 當前版本 | 建議更新 | 狀態 | +|------|---------|---------|------| +| scikit-learn | >=1.3.0 | >=1.5.0 | ✅ 建議 | +| xgboost | >=2.0.0 | >=2.1.0 | ✅ OK | +| lightgbm | >=4.0.0 | >=4.5.0 | ✅ 建議 | +| mlflow | >=2.8.0 | >=2.18.0 | ✅ 建議 | + +### 深度學習 + +| 套件 | 當前版本 | 建議更新 | 狀態 | 備註 | +|------|---------|---------|------|------| +| torch | >=2.5.0 | >=2.5.1 | ✅ OK | 最新穩定版 | +| tensorflow | >=2.20.0 | >=2.18.0 | ⚠️ 檢查 | 版本號需確認 | +| transformers | >=4.45.0 | >=4.47.0 | ✅ 建議 | 新模型支持 | +| accelerate | >=1.0.0 | >=1.2.0 | ✅ 建議 | | + +### LLM 應用 + +| 套件 | 當前版本 | 建議更新 | 狀態 | 備註 | +|------|---------|---------|------|------| +| openai | >=1.50.0 | >=1.57.0 | ✅ 建議 | 新 API 功能 | +| anthropic | >=0.39.0 | >=0.40.0 | ✅ 建議 | Claude API 更新 | +| langchain | >=0.3.0 | >=0.3.11 | ✅ 建議 | Bug 修復 | +| langchain-openai | >=0.2.0 | >=0.2.12 | ✅ 建議 | | +| langgraph | >=0.2.0 | >=0.2.56 | ✅ 建議 | 新功能 | +| chromadb | >=0.5.0 | >=0.5.23 | ✅ 建議 | | +| gradio | >=5.0.0 | >=5.9.0 | ✅ 建議 | UI 改進 | +| streamlit | >=1.39.0 | >=1.41.0 | ✅ 建議 | | + +### 開發工具 + +| 套件 | 當前版本 | 建議更新 | 狀態 | +|------|---------|---------|------| +| pytest | >=7.4.0 | >=8.0.0 | ✅ 建議 | +| pytest-cov | >=4.1.0 | >=6.0.0 | ✅ 建議 | +| ruff | >=0.1.0 | >=0.8.0 | ✅ 建議 | +| black | >=23.0.0 | >=24.10.0 | ✅ 建議 | +| mypy | >=1.6.0 | >=1.13.0 | ✅ 建議 | + +--- + +## 
建議的更新操作
+
+### 優先級 1: 安全性更新 (立即執行)
+
+```bash
+# 版本規格需加引號,否則 shell 會把 ">=" 解析為重導向
+pip install --upgrade \
+  "openai>=1.57.0" \
+  "anthropic>=0.40.0" \
+  "requests>=2.32.0"
+```
+
+### 優先級 2: 功能性更新 (本週)
+
+```bash
+pip install --upgrade \
+  "langchain>=0.3.11" \
+  "langchain-openai>=0.2.12" \
+  "langgraph>=0.2.56" \
+  "transformers>=4.47.0"
+```
+
+### 優先級 3: 開發工具更新 (本月)
+
+```bash
+pip install --upgrade \
+  "pytest>=8.0.0" \
+  "ruff>=0.8.0" \
+  "mypy>=1.13.0"
+```
+
+---
+
+## 新增建議套件
+
+考慮新增以下套件以增強專案功能:
+
+### 推理優化
+```toml
+# pyproject.toml [project.optional-dependencies.inference]
+vllm = ">=0.6.0"              # 高效推理引擎
+text-generation = ">=0.7.0"   # TGI 客戶端
+```
+
+### MCP 協議
+```toml
+# pyproject.toml [project.optional-dependencies.mcp]
+mcp = ">=1.0.0"               # MCP SDK
+```
+
+### 監控與追蹤
+```toml
+# pyproject.toml [project.optional-dependencies.monitoring]
+langfuse = ">=2.0.0"          # LLM 追蹤
+opentelemetry-api = ">=1.27.0"
+opentelemetry-sdk = ">=1.27.0"
+```
+
+### 進階 RAG
+```toml
+# pyproject.toml [project.optional-dependencies.rag-advanced]
+llama-index = ">=0.11.0"      # 替代 RAG 框架
+cohere = ">=5.0.0"            # Reranking
+```
+
+---
+
+## 注意事項
+
+### Breaking Changes 警告
+
+1. **NumPy 2.0**
+   - 許多函數簽名變更
+   - 建議先在測試環境驗證
+
+2. **LangChain 0.3**
+   - 已穩定,但與 0.2 有 API 差異
+   - 確保使用正確的 import 路徑
+
+3. 
**Transformers 4.46+** + - 部分舊模型類被棄用 + - 檢查 deprecation warnings + +### 相容性矩陣 + +| Python | PyTorch | TensorFlow | 建議 | +|--------|---------|------------|------| +| 3.9 | 2.5.x | 2.17.x | ✅ | +| 3.10 | 2.5.x | 2.18.x | ✅ 推薦 | +| 3.11 | 2.5.x | 2.18.x | ✅ 推薦 | +| 3.12 | 2.5.x | 2.18.x | ✅ | +| 3.13 | 待測試 | 待測試 | ⚠️ | + +--- + +## 自動化更新建議 + +### 使用 Dependabot + +```yaml +# .github/dependabot.yml +version: 2 +updates: + - package-ecosystem: "pip" + directory: "/" + schedule: + interval: "weekly" + groups: + python-packages: + patterns: + - "*" +``` + +### 使用 pre-commit 自動更新 + +```yaml +# .pre-commit-config.yaml +repos: + - repo: https://github.com/python-poetry/poetry + rev: "1.8.0" + hooks: + - id: poetry-check + - id: poetry-lock +``` + +--- + +*報告自動生成,建議定期執行依賴審查* diff --git a/PROJECT_STATUS_ANALYSIS_2026.md b/PROJECT_STATUS_ANALYSIS_2026.md new file mode 100644 index 0000000..0b324dc --- /dev/null +++ b/PROJECT_STATUS_ANALYSIS_2026.md @@ -0,0 +1,339 @@ +# My-AI-Learning-Notes 專案現狀分析報告 + +> **分析日期**: 2026-01-15 +> **分析範圍**: 全面專案評估 +> **專案規模**: 1.8GB | 1,227個文件 | 161個目錄 | 982個Notebooks + +--- + +## 執行摘要 + +這是一個非常完整且專業的 AI 學習筆記專案,涵蓋從數學基礎、機器學習、深度學習到 LLM 應用的完整學習路徑。專案展現出優秀的工程化實踐,但在測試覆蓋率、安全性和國際化方面仍有改進空間。 + +### 總體評分矩陣 + +| 維度 | 評分 | 狀態 | 備註 | +|------|------|------|------| +| 內容完整性 | ⭐⭐⭐⭐⭐ (5/5) | ✅ 優秀 | 從基礎到進階完整覆蓋 | +| LLM/Agent技術 | ⭐⭐⭐⭐⭐ (5/5) | ✅ 優秀 | RAG、Agent、MCP皆有涵蓋 | +| 專案工程化 | ⭐⭐⭐⭐☆ (4/5) | ✅ 良好 | CI/CD、Docker、MkDocs完備 | +| 技術棧更新度 | ⭐⭐⭐⭐☆ (4/5) | ✅ 良好 | 主要依賴版本較新 | +| 測試覆蓋率 | ⭐☆☆☆☆ (1/5) | 🔴 待改進 | 覆蓋率 < 5% | +| 安全性實踐 | ⭐⭐⭐☆☆ (3/5) | 🟡 中等 | CORS、認證需加強 | +| 國際化支持 | ⭐⭐☆☆☆ (2/5) | 🟡 待改進 | 已有i18n目錄但內容不足 | +| 文檔品質 | ⭐⭐⭐⭐⭐ (5/5) | ✅ 優秀 | README、指南完備 | + +**整體評分: ⭐⭐⭐⭐☆ (3.9/5)** + +--- + +## 一、專案優勢亮點 + +### 1.1 完整的學習體系 + +專案提供從零基礎到 AI 工程師的完整學習路徑: + +``` +學習階段架構: +├── 1.從AI到LLM基礎 (1.7GB) +│ ├── 數學基礎(線性代數、微積分、概率統計) +│ ├── Python 快速入門 +│ ├── ML & 數據分析 +│ ├── 深度學習(PyTorch、TensorFlow、YOLO、SAM2) +│ └── 論文復現項目 +├── 2.深入LLM模型工程與運維 +│ ├── Transformer 架構 +│ ├── 預訓練與微調(LoRA、QLoRA) +│ 
├── 偏好對齊(RLHF、DPO) +│ └── 模型壓縮與部署 +├── 3.LLM應用工程 +│ ├── RAG 系統(基礎到進階) +│ ├── Agent 系統 +│ ├── MCP 協議 +│ └── 多模態生成 +├── 4.相關的更新Blog(鐵人賽30天) +├── 5.AI研究前沿_2024-2025(50+論文) +├── 6.DeepLearning.ai短課程紀錄 +└── 9.面試準備與職業發展 +``` + +### 1.2 豐富的技術生態 + +**核心依賴已更新至最新穩定版本:** +- PyTorch ≥2.5.0 +- TensorFlow ≥2.20.0 +- Transformers ≥4.45.0 +- LangChain ≥0.3.0 +- OpenAI ≥1.50.0 +- Anthropic ≥0.39.0 + +### 1.3 專業的工程化配置 + +| 工具/配置 | 狀態 | 說明 | +|-----------|------|------| +| pyproject.toml | ✅ 完備 | 完整的專案配置與依賴管理 | +| Docker Compose | ✅ 完備 | ChromaDB、Qdrant、Ollama等 | +| GitHub Actions | ✅ 完備 | CI、Deploy、Benchmark三條流水線 | +| MkDocs | ✅ 完備 | Material主題文檔系統 | +| pre-commit | ✅ 完備 | 自動化代碼品質檢查 | +| Makefile | ✅ 完備 | 28+個開發命令 | +| Dev Container | ✅ 完備 | 開發環境容器化 | + +### 1.4 實戰項目豐富 + +``` +demos/ +├── gradio/ # Gradio UI 示例 +└── streamlit/ # Streamlit 應用示例 + +exercises/ +├── agent/ # Agent 工具使用練習 +├── rag/ # RAG 分塊練習 +└── prompt-engineering/ # 提示工程練習 + +5.AI研究前沿/實戰項目/ +├── RAG-ChatBot/ # 完整 RAG 聊天機器人 +├── AI-Code-Review/ # AI 代碼審查工具 +├── Document-Analyzer/ # 文檔分析器 +└── web-ui/ # Web 介面項目 +``` + +--- + +## 二、需要改進的問題 + +### 2.1 🔴 P0 - 緊急修復(1-2週) + +#### 問題1: 測試覆蓋率極低 + +**現狀分析:** +- 測試文件僅4個(`tests/` 目錄) +- 估計覆蓋率 < 5% +- 982個 Notebooks 幾乎無測試 +- 191個 Python 腳本幾乎無測試 + +**影響:** +- 代碼品質無法保證 +- 重構風險極高 +- Bug 難以發現 + +**建議改進:** +``` +tests/ +├── unit/ # 單元測試 +│ ├── test_rag/ # RAG 檢索測試 +│ ├── test_agent/ # Agent 功能測試 +│ └── test_embedding/ # Embedding 生成測試 +├── integration/ # 集成測試 +│ ├── test_rag_pipeline/ +│ └── test_agent_workflow/ +├── e2e/ # 端到端測試 +├── fixtures/ # 測試數據 +└── conftest.py # pytest 配置 +``` + +**目標:** 3個月內達到 50% 測試覆蓋率 + +--- + +#### 問題2: 安全性問題 + +**發現的安全風險:** + +| 問題 | 位置 | 嚴重程度 | +|------|------|----------| +| CORS 配置過於寬鬆 (`allow_origins=["*"]`) | 實戰項目 API | 🔴 高 | +| 缺少 API 身份驗證 | 所有 API 端點 | 🔴 高 | +| 缺少速率限制 | 所有 API 端點 | 🟡 中 | +| 安全掃描設置 `continue-on-error: true` | CI 配置 | 🟡 中 | + +**建議修復:** +```python +# 1. 
收緊 CORS 配置 +allow_origins=[ + "https://yourdomain.com", + "https://app.yourdomain.com" +] + +# 2. 添加 API 認證 +from fastapi.security import HTTPBearer +security = HTTPBearer() + +# 3. 添加速率限制 +from slowapi import Limiter +limiter = Limiter(key_func=get_remote_address) +``` + +--- + +### 2.2 🟡 P1 - 短期補充(2-4週) + +#### 問題3: 國際化支持不足 + +**現狀:** +- 有 `i18n/` 目錄但內容有限 +- 核心文檔僅繁體中文 +- 缺少系統性的英文翻譯 + +**建議:** +1. 優先翻譯核心文檔(README、QUICKSTART、LEARNING_PATHS) +2. 建立翻譯工作流程 +3. 使用 MkDocs 的多語言支持 + +--- + +#### 問題4: 前端技術棧需更新 + +**版本對比:** + +| 技術 | 當前版本 | 最新穩定版 | 建議 | +|------|---------|-----------|------| +| Next.js | 14.x | 15.x | ⬆️ 升級 | +| React | 18.x | 19.x | ⬆️ 升級 | +| TypeScript | 5.2.x | 5.7.x | ⬆️ 升級 | +| Tailwind CSS | 3.x | 4.x | ⬆️ 升級 | + +--- + +### 2.3 🟢 P2 - 中期完善(1-3個月) + +#### 問題5: 新興技術覆蓋不足 + +**缺失內容:** + +| 領域 | 覆蓋率 | 建議新增 | +|------|--------|----------| +| Web3 + AI 融合 | 0% | 區塊鏈AI、去中心化ML | +| AR/VR/XR + AI | 0% | 空間計算、3D生成 | +| Quantum Computing | 0% | 量子ML基礎 | +| Edge AI | 部分 | 邊緣部署優化 | + +--- + +#### 問題6: DevOps 增強 + +**建議新增組件:** + +| 組件 | 用途 | 優先級 | +|------|------|--------| +| OpenTelemetry | 分佈式追蹤 | 🔴 高 | +| Jaeger | 追蹤可視化 | 🔴 高 | +| ArgoCD | GitOps 部署 | 🟡 中 | +| Helm Charts | K8s 部署模板 | 🟡 中 | + +--- + +## 三、具體改進建議清單 + +### 可立即執行的改進 + +| # | 改進項目 | 預計工時 | 優先級 | +|---|----------|---------|--------| +| 1 | 建立測試框架,新增核心功能單元測試 | 30-40h | 🔴 P0 | +| 2 | 修復 CORS 和 API 安全配置 | 15-20h | 🔴 P0 | +| 3 | 更新前端依賴版本 | 10-15h | 🔴 P0 | +| 4 | 完善 MCP 協議文檔和範例 | 12-16h | 🟡 P1 | +| 5 | 新增進階 Prompt Engineering 章節 | 16-20h | 🟡 P1 | +| 6 | 補充現代對齊方法(DPO、IPO、SimPO) | 8-10h | 🟡 P1 | +| 7 | 新增推理模型應用指南(o1、DeepSeek-R1) | 10-14h | 🟡 P1 | +| 8 | 添加 OpenTelemetry 監控 | 20-25h | 🟢 P2 | +| 9 | 核心文檔英文翻譯 | 20-25h | 🟢 P2 | +| 10 | 擴充面試題庫和職業發展內容 | 40-50h | 🟢 P2 | + +### 可選的新功能開發 + +| # | 新功能 | 說明 | 預計工時 | +|---|--------|------|----------| +| 1 | 互動式學習系統 | 添加在線練習和測驗 | 40h | +| 2 | 學習進度追蹤 | 讓用戶追蹤學習狀態 | 20h | +| 3 | 社區討論區整合 | GitHub Discussions | 10h | +| 4 | 視覺化演示系統 | 模型架構動態展示 | 40h | +| 5 | PDF/ePub 導出 | 
多格式文檔輸出 | 20h | +| 6 | AI 助教機器人 | 基於專案內容的問答 | 30h | + +--- + +## 四、建議實施路線圖 + +``` +Phase 1 (Week 1-2): 緊急修復 +├── 安全性修復(CORS/認證/限流) +├── 測試框架建立 +├── 前端版本更新 +└── CI/CD 品質門檻強化 + +Phase 2 (Week 3-4): 內容補充 +├── MCP 協議完整文檔 +├── Prompt Engineering 2.0 +├── 現代對齊方法 +└── 推理模型指南 + +Phase 3 (Week 5-8): 中等功能 +├── OpenTelemetry 監控 +├── 測試覆蓋率達 30% +├── 面試題庫基礎 +└── 核心文檔英文化 + +Phase 4 (Week 9-12): 進階擴展 +├── 新興技術模塊(選擇性) +├── 測試覆蓋率達 50% +├── 完整職業發展指南 +└── 社區建設 +``` + +--- + +## 五、專案統計資訊 + +### 文件統計 + +| 類型 | 數量 | 佔比 | +|------|------|------| +| Jupyter Notebooks | 982 | 80% | +| Markdown 文檔 | 332 | 27% | +| Python 腳本 | 191 | 16% | +| 配置文件 (JSON/YAML) | 54 | 4% | +| **總計** | **1,227** | - | + +### 目錄規模 + +| 目錄 | 大小 | 說明 | +|------|------|------| +| 1.從AI到LLM基礎 | 1.7GB | 最大模塊,包含大量notebooks | +| img/ | 3.1MB | 圖片資源 | +| 4.相關的更新Blog | 19MB | 鐵人賽與後續更新 | +| 3.LLM應用工程 | 2.7MB | 應用開發相關 | +| 2.深入LLM模型工程 | 2.6MB | 模型訓練與運維 | + +### 依賴生態 + +- **核心依賴**: 50+ 個 Python 套件 +- **開發工具**: pytest, ruff, black, mypy, pre-commit +- **支持服務**: ChromaDB, Qdrant, PostgreSQL, MongoDB, Redis, Ollama + +--- + +## 六、總結與建議 + +### 優勢總結 +1. 內容體系完整,從入門到進階全覆蓋 +2. 技術棧現代化,依賴版本較新 +3. 工程化配置專業,CI/CD 完備 +4. 實戰項目豐富,有實際應用價值 +5. 文檔品質高,學習路徑清晰 + +### 重點改進方向 +1. **測試覆蓋率**:這是最迫切需要改進的問題 +2. **安全性**:API 端點需要加強認證與限流 +3. **國際化**:擴大影響力需要英文支持 +4. **持續更新**:保持技術棧與內容的時效性 + +### 建議優先順序 +1. 🔴 **立即執行**: 安全性修復 + 測試框架建立 +2. 🟡 **短期補充**: 內容更新 + 前端升級 +3. 
🟢 **中期完善**: 國際化 + 新興技術覆蓋 + +--- + +*本報告自動生成於 2026-01-15* +*下次建議審查日期: 2026-02-15* diff --git a/i18n/en/README.md b/i18n/en/README.md index 72beb8c..2a445d5 100644 --- a/i18n/en/README.md +++ b/i18n/en/README.md @@ -1,159 +1,300 @@ -# AI Learning Notes +# My AI Learning Notes + +> **Traditional Chinese Version (Taiwan)** | Complete AI Engineer Learning Path Guide +> Systematic learning notes from foundational mathematics to LLM application development + +[![GitHub stars](https://img.shields.io/github/stars/markl-a/My-AI-Learning-Notes?style=social)](https://github.com/markl-a/My-AI-Learning-Notes) +[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE) +[![Last Update](https://img.shields.io/badge/last%20update-2025--01-green.svg)](https://github.com/markl-a/My-AI-Learning-Notes) -> A comprehensive learning guide from AI fundamentals to LLM application engineering +**[English](./README.md)** | [Traditional Chinese (Original)](../../README.md) -[![GitHub Stars](https://img.shields.io/github/stars/markl-a/My-AI-Learning-Notes?style=social)](https://github.com/markl-a/My-AI-Learning-Notes) -[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) -[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md) +## Related Frameworks and Algorithm Learning + +This project covers AI technology learning notes from beginner to advanced levels, including complete hands-on examples and project exercises. -## About This Project +### Core Learning Modules -**AI Learning Notes** is a systematic learning resource covering the complete journey from AI fundamentals to Large Language Model (LLM) applications. Whether you're a beginner or an experienced practitioner, you'll find valuable content here. +0. 
**[ETL Process with Python](https://github.com/markl-a/My-AI-Learning-Notes/blob/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/3.ML_%26_Data_Analysis/5_Best_Practices_and_MLOps_Basics/%E4%BD%BF%E7%94%A8_Python_%E9%80%B2%E8%A1%8C%E7%AE%A1%E9%81%93_ETL.ipynb)** + - Complete Extract, Transform, Load (ETL) pipeline + - Hands-on examples: Processing CSV, JSON, SQL data sources with Pandas + - Advanced techniques: Data cleaning, outlier handling, data validation + +1. **[Python Quick Start](https://github.com/markl-a/My-AI-Learning-Notes/blob/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/2.AI_Intro/1.%E5%BF%AB%E9%80%9F%E5%85%A5%E9%96%80python.ipynb)** + - Python fundamentals: Variables, data types, control flow + - Object-oriented programming: Classes, inheritance, polymorphism + - Practical packages: collections, itertools, functools + +2. **[NumPy & Pandas Learning Records](https://github.com/markl-a/My-AI-Learning-Notes/blob/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/3.ML_%26_Data_Analysis/1_Data_Acquisition_and_Analysis/2.Python%E7%9A%84%EF%BC%AD%EF%BC%AC%E7%9B%B8%E9%97%9C%E6%A8%A1%E5%A1%8A%E5%A5%97%E4%BB%B6%E4%BD%BF%E7%94%A8.ipynb)** + - NumPy array operations: Indexing, slicing, broadcasting + - Advanced Pandas DataFrame: merge, groupby, pivot_table + - Performance optimization: Vectorized operations, memory management + +3. **[TensorFlow Learning Records](https://github.com/markl-a/My-AI-Learning-Notes/tree/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/4.DL/01.Tensorflow2)** + - TensorFlow 2.x complete tutorial: Eager Execution, tf.function + - Keras API integration: Sequential, Functional, Subclassing models + - Model deployment: TensorFlow Serving, TensorFlow Lite + +4. 
**[Keras Learning Records](https://github.com/markl-a/My-AI-Learning-Notes/tree/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/4.DL/02.Keras3)** + - Keras 3.0 multi-backend support: TensorFlow, PyTorch, JAX + - Custom Layers and models + - Callback mechanisms: EarlyStopping, ModelCheckpoint, TensorBoard + +5. **[PyTorch Learning Records](https://github.com/markl-a/My-AI-Learning-Notes/tree/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/4.DL/03.Pytorch)** + - PyTorch basics: Tensor operations, automatic differentiation + - Dataset and DataLoader design patterns + - torch.compile() inference acceleration (PyTorch 2.x) + - Distributed training: DDP, FSDP + +6. **[YOLO Usage](https://github.com/markl-a/My-AI-Learning-Notes/tree/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/4.DL/04.Ultralytics)** + - Complete comparison of YOLOv8, YOLOv9, YOLOv10 + - Object detection, instance segmentation, pose estimation + - Custom dataset training: Annotation tools (Roboflow, CVAT) + +7. **[GaLore Experiment and Medical Chat](https://github.com/markl-a/My-AI-Learning-Notes/tree/main/2.%E6%B7%B1%E5%85%A5LLM%E6%A8%A1%E5%9E%8B%E5%B7%A5%E7%A8%8B%E8%88%87LLM%E9%81%8B%E7%B6%AD/GaLore_Demo)** + - GaLore memory-efficient training technique + - Medical dialogue system implementation + - Comparative analysis with LoRA and QLoRA + +8. **[LangChain Learning Records](https://github.com/markl-a/My-AI-Learning-Notes/tree/main/3.LLM%E6%87%89%E7%94%A8%E5%B7%A5%E7%A8%8B/1.LangchainDemos)** + - LangChain core concepts: Chains, Agents, Memory + - LangGraph workflow orchestration + - LangSmith observability and debugging + +9. 
**[MLflow Introduction](https://github.com/markl-a/My-AI-Learning-Notes/blob/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/4.DL/MLFLOW%E5%85%A5%E9%96%80%E4%BB%8B%E7%B4%B9%EF%BC%9A%E9%80%9A%E9%81%8ECOLAB%2C%20NGROK%2C%20PYCARET.ipynb)** + - Experiment tracking, model registry, model deployment + - MLflow Projects reproducibility + - MLflow Models multi-framework support + +10. **[Video Quality Assessment Paper Reading and Implementation](https://github.com/markl-a/My-AI-Learning-Notes/tree/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/4.DL/06.Paper_with_code/Exploring%20Video%20Quality%20Assessment%20on%20User%20Generated%20Contents%20from%20Aesthetic%20and%20Technical%20Perspectives)** + - DOVER, FAST-VQA and other paper implementations + - Video quality assessment metrics and methods + +11. **[Segment Anything 2 Paper Interpretation and Example Usage](https://github.com/markl-a/My-AI-Learning-Notes/tree/main/1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/4.DL/03.Pytorch/3.Segment%20Anything%202)** + - SAM2 architecture analysis: Memory attention mechanism + - Video object tracking implementation + - Medical image segmentation applications + +12. 
**[LangChain Multimodal RAG Example Modifications](https://github.com/markl-a/LLM-agent-Demo/tree/main/2.Multi_modal_RAG)**
+    - Image-text hybrid Retrieval-Augmented Generation
+    - CLIP, BLIP visual encoder integration
+    - PDF, image, table mixed processing
+
+## Table of Contents
+
+- [Introduction](#introduction)
+- [Quick Start](#-quick-start)
+- [2024-2025 Latest Technology Updates](#2024-2025-latest-technology-updates)
+- [Algorithms and Data Structures](#algorithms-and-data-structures)
+- [From AI to LLM Fundamentals](#from-ai-to-llm-fundamentals)
+- [Deep Dive into LLM Model Engineering and Operations](#deep-dive-into-llm-model-engineering-and-operations)
+- [LLM Application Engineering](#llm-application-engineering)
+- [AI Research Frontier 2024-2025](#ai-research-frontier-2024-2025)
+- [Hands-on Project Collection](#hands-on-project-collection)
+- [Related Update Blog](#related-update-blog)
+- [DeepLearning.AI Short Course Learning Records](#deeplearningai-short-course-learning-records)
+- [FAQ](#faq)
+- [Learning Resources](#learning-resources)
+
+![cover](../../img/aie_cover.png)
+
+## Introduction
-## Project Structure
+This repository collects my personal understanding and organization of AI engineering knowledge and skills. The main directory structure is based on [llm-course](https://github.com/mlabonne/llm-course), extended and expanded with essential knowledge and skills related to AI, ML, DL, and data analysis.
+
+The current focus is mainly on LLMs; beyond the content already covered, new material will be added only as needed.
+ +### Project Features + +- **Traditional Chinese (Taiwan)**: Uses professional terminology and expressions commonly used in Taiwan +- **Complete Hands-on Examples**: Each chapter includes executable Jupyter Notebooks and code +- **Project-Oriented Learning**: Learn technical applications from real projects, not just theory +- **Continuous Updates**: Tracks the latest technology developments and industry trends in 2024-2025 +- **Open Source Community**: Suggestions, bug reports, and content contributions are welcome + +### Target Audience + +| Role | Learning Focus | Expected Outcome | +|------|----------------|------------------| +| **AI Beginners** | Systematic learning starting from math fundamentals | Master AI/ML basics in 8-12 months | +| **Software Engineers** | Quick mastery of LLM application development | Can develop RAG systems in 2-3 months | +| **Data Scientists** | Deep learning and LLM expertise | Master model training and optimization in 4-6 months | +| **Enterprise Developers** | Integrating AI into products | Can deploy production-grade applications in 3-4 months | +| **Researchers** | Deep dive into model architectures and cutting-edge technology | Continuous learning of latest papers and implementations | -``` -📁 My-AI-Learning-Notes/ -├── 1.AI-to-LLM-Fundamentals/ # Deep Learning → Transformers → LLMs -├── 2.LLM-Model-Engineering/ # Training, Evaluation, Fine-tuning -├── 3.LLM-Application-Engineering/ # Deployment, RAG, Agents, Prompt Engineering -├── 4.Update-Blog/ # Latest updates and articles -├── 5.AI-Research-Frontier/ # 2024-2025 cutting-edge research -├── 6.DeepLearning-Short-Courses/ # DeepLearning.ai course notes -└── 9.Interview-Career/ # Interview prep and career development -``` +--- -## Learning Paths +## Quick Start -### 🎯 Path 1: AI Fundamentals (2-3 months) +> **New here? Don't know where to start?** We've prepared a complete learning navigation system for you! 
-For beginners with basic programming knowledge: +### Core Navigation Documents -1. Deep Learning Basics → Neural Networks, Backpropagation -2. Transformer Architecture → Attention Mechanism, Encoder-Decoder -3. Pre-trained Models → BERT, GPT Series +| Document | Description | Target Audience | +|----------|-------------|-----------------| +| **[QUICKSTART.md](./QUICKSTART.md)** | 5-minute quick start guide | Everyone - Quick overview of how to use this project | +| **[LEARNING_PATHS.md](./LEARNING_PATHS.md)** | Complete learning path planning | Those who want systematic learning | +| **[RESOURCES.md](../../RESOURCES.md)** | Comprehensive resource index | Those needing tools, APIs, datasets, etc. | +| **[CONTRIBUTING.md](./CONTRIBUTING.md)** | Contribution guide | Those who want to participate in project development | -### 🚀 Path 2: LLM Applications (1-2 months) +### 10-Minute Hands-on Scenarios -For developers who want to build LLM applications: +Choose the quick hands-on exercise based on your goal: -1. LLM APIs → OpenAI, Anthropic, local deployment -2. Prompt Engineering → Techniques, best practices -3. RAG Systems → Retrieval-Augmented Generation -4. Agent Development → Tool use, planning, multi-agent +1. **[LLM Chat Application](./QUICKSTART.md#llm-chat-application-10-minutes)** - Quickly build a chatbot +2. **[RAG System](./QUICKSTART.md#rag-retrieval-augmented-generation-10-minutes)** - Build a knowledge base Q&A system +3. **[Agent Workflow](./QUICKSTART.md#agent-workflow-10-minutes)** - Create an autonomous decision-making AI assistant +4. 
**[Image Generation](./QUICKSTART.md#image-generation-10-minutes)** - Generate images using Stable Diffusion -### 🔬 Path 3: LLM Engineering (3-4 months) +### Recommended Learning Paths -For ML engineers who want to train/fine-tune models: +- **[Application Development Path](./LEARNING_PATHS.md#llm-application-development-fast-track-3-months)** - Master LLM application development in 3 months +- **[Engineering Practice Path](./LEARNING_PATHS.md#zero-to-ai-engineer-12-months)** - From zero to AI engineer in 12 months +- **[Research Direction Path](./LEARNING_PATHS.md#ai-research-direction-continuous-learning)** - Deep dive into cutting-edge papers and algorithms -1. Training Techniques → Distributed training, optimization -2. Fine-tuning → LoRA, QLoRA, full fine-tuning -3. Evaluation → Benchmarks, human evaluation -4. Deployment → Quantization, inference optimization +--- -## Key Topics +### Learning Outcomes -### Fundamentals +After completing this course, you will be able to: -| Topic | Description | Directory | -|-------|-------------|-----------| -| Deep Learning | Neural networks, optimization | `1.從AI到LLM基礎/深度學習基礎` | -| Transformers | Attention, architecture | `1.從AI到LLM基礎/Transformer架構` | -| Pre-training | BERT, GPT, training objectives | `1.從AI到LLM基礎/預訓練模型` | +1. **Build Complete ML/DL Pipelines** + - Data collection -> Preprocessing -> Model training -> Evaluation -> Deployment + - Hands-on projects: Handwritten digit recognition, sentiment analysis system -### Application Engineering +2. 
**Develop LLM Applications** + - RAG systems: Q&A bots combined with external knowledge bases + - Agent systems: Multi-step reasoning autonomous AI + - Hands-on projects: Traditional Chinese customer service bot, document analysis assistant -| Topic | Description | Directory | -|-------|-------------|-----------| -| LLM Deployment | vLLM, TGI, local deployment | `3.LLM應用工程/1.LLM 部署` | -| Prompt Engineering | Techniques, frameworks | `3.LLM應用工程/3.提示工程學` | -| RAG | Retrieval, embedding, ranking | `3.LLM應用工程/4.RAG` | -| Agents | Tool use, planning, frameworks | `3.LLM應用工程/5.Agent` | +3. **Fine-tune and Deploy Models** + - Fine-tune open-source LLMs using LoRA/QLoRA + - Quantization and optimization: GGUF, GPTQ, AWQ + - Deployment: vLLM, Ollama, TensorRT-LLM -### Cutting-Edge Research (2024-2025) +4. **Master MLOps Workflows** + - Experiment tracking: MLflow, Weights & Biases + - Model monitoring: LangSmith, Arize Phoenix + - CI/CD: GitHub Actions, Docker -- Multi-modal AI (Vision-Language Models) -- Reasoning Models (o1, DeepSeek-R1) -- Agent Systems -- AI Safety & Alignment -- Efficient Inference +## 2024-2025 Latest Technology Updates -## Quick Start +The evolution of LLM and multimodal technology has accelerated since 2024. Here's a summary of recent key developments worth following, with corresponding sections in these notes for extended reading or hands-on practice: -### Prerequisites +### Large Foundation Models and Reasoning Models -- Python 3.9+ -- Basic understanding of machine learning -- Familiarity with PyTorch or TensorFlow +- **OpenAI GPT-4o / GPT-4o mini (2024.05)**: Multimodal (text, image, audio) one-shot integration. The derived *Responses API* has been added to [LLM Application Engineering](../../3.LLM%E6%87%89%E7%94%A8%E5%B7%A5%E7%A8%8B/README.md) with directly executable example programs. +- **OpenAI o1 reasoning series (2024.09)**: Provides multi-step thinking results for long-chain reasoning and programming problems. 
Recommended to test with `model=gpt-4o` or `model=o1-mini` for comparison. +- **Meta Llama 3 (2024.04)**: Provides 8B/70B open-source weights. The instruction-tuned version supports 8K+ context and can be applied in the "1. LLM Deployment" and "6. Inference Optimization" chapters. +- **Google Gemini 1.5 Pro / Flash (2024.02)**: Native support for million-token long context and multimodal input, suitable for integration into multimedia RAG or Agent tasks. +- **Mistral Large 2 (2024.07)** and **Microsoft Phi-3.5 / Phi-4 (2024 Q3-Q4)**: Provide lighter model choices for cloud and edge, convenient for deployment scenarios described in "8. Edge and On-device." -### Setup +### Agentic Ecosystem and Toolchain -```bash -# Clone the repository -git clone https://github.com/markl-a/My-AI-Learning-Notes.git -cd My-AI-Learning-Notes +- **LangGraph 0.2+**: Officially provides durable execution, checkpoints, LangSmith tracing, and LangGraph Studio visualization integration. Updated summaries and example programs have been added to the [Agent Workflow Notes](../../3.LLM%E6%87%89%E7%94%A8%E5%B7%A5%E7%A8%8B/3.Agent/). +- **Model Context Protocol (MCP)**: Becoming the common standard for tool invocation by OpenAI, Anthropic, etc. You can quickly set up an MCP server using the Python SDK. Related instructions have been added to the Agent tool integration chapter. +- **CrewAI, AutoGen, LlamaIndex 2025 series updates**: Enhanced for multi-Agent collaboration, task scheduling, and observability. Corresponding data has been updated in the "3. Agent" and "5. Advanced RAG" chapters. -# Create virtual environment -python -m venv venv -source venv/bin/activate # Linux/Mac -# or: venv\Scripts\activate # Windows +### Inference and Deployment Ecosystem -# Install dependencies -pip install -r requirements.txt -``` +- **vLLM 0.4.x, SGLang, TensorRT-LLM 0.10**: Provide efficient inference with multi-path concurrency, chunked KV Cache, and Streaming. Notes in the "6. 
Inference Optimization" section have been supplemented with differences and deployment recommendations. +- **Ollama 0.3, LM Studio 0.3**: Local model managers support dynamic quantization and OpenAI-Compatible API, which can directly connect with RAG/Agent examples. +- **OpenTelemetry GenAI, LangSmith / Arize Phoenix 2024**: Establish evaluation and tracing infrastructure. New guidance has been added in the "Production-grade Evaluation" chapter. -### Running Examples +### Practice and Exercise Recommendations -```bash -# Run RAG example -cd 3.LLM應用工程/4.RAG/examples -python simple_rag.py +- First read the updated chapters of [LLM Application Engineering](../../3.LLM%E6%87%89%E7%94%A8%E5%B7%A5%E7%A8%8B/README.md), which summarizes the latest models, inference frameworks, and security checklists. +- Try running the "Responses API x LangGraph Agent" combination example: + 1. Install `openai`, `langgraph` packages following the README instructions. + 2. After setting `OPENAI_API_KEY`, first test the `responses_quickstart.py` example to confirm multimodal input works correctly. + 3. Then execute `langgraph_agent_demo.py` to observe how durable execution and tool invocation records are tracked in LangSmith. -# Run Agent example -cd 3.LLM應用工程/5.Agent/examples -python tool_agent.py -``` +This summary will be continuously updated. When new models or tools emerge, evaluation and hands-on records will be added to the corresponding chapters. + +## Algorithms and Data Structures + +This section mainly records my algorithm practice and reading notes. While I'm not a university professor or expert-level, there may inevitably be some errors. Corrections are welcome. + +**My Algorithm Practice Repository**: [LeetcodePractice](https://github.com/markl-a/LeetcodePractice) + +### Why Do AI Engineers Need to Learn Algorithms? + +In the deep learning era, many question whether traditional algorithms still need to be learned. The answer is: **Absolutely!** + +1. 
**Optimizing Model Performance**: Understanding time/space complexity to choose appropriate data structures + - Example: Choosing HashTable over List for lookups, optimizing from O(n) to O(1) + +2. **Designing Efficient Pipelines**: Data processing and feature engineering require efficient algorithms + - Example: Using Heap for Top-K problems, more efficient than sorting + +3. **Understanding Deep Learning**: Many DL algorithms are essentially applications of graph theory and dynamic programming + - Example: The self-attention mechanism in Transformer is essentially a fully connected graph + - Example: Beam Search is a greedy algorithm based on priority queues -## Hands-on Projects +4. **Essential for Technical Interviews**: FAANG and other major companies' AI positions still value algorithm skills + - Google Brain, Meta AI, OpenAI and other teams' interviews include algorithm questions -| Project | Description | Difficulty | -|---------|-------------|------------| -| RAG Chatbot | Document Q&A system | ⭐⭐ | -| Code Review Agent | Automated code review | ⭐⭐⭐ | -| Document Analyzer | Multi-format document processing | ⭐⭐⭐ | -| Multi-Agent System | Collaborative AI agents | ⭐⭐⭐⭐ | +## From AI to LLM Fundamentals + +This section is primarily based on the LLM course directory structure with deepening and reinforcement. + +Main content includes: mathematical foundations, data analysis and processing, machine learning, NLP and CV in deep learning, and MLOps. 
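The heap-based Top-K point from the algorithms section above is easy to demonstrate. Below is a minimal, self-contained Python sketch (illustrative only, not code from this repository) comparing `heapq` selection against a full sort:

```python
import heapq
import random

# Pick the k highest-scoring candidates out of n items.
# Heap-based selection runs in O(n log k); a full sort costs O(n log n).
random.seed(0)
scores = [random.random() for _ in range(100_000)]

k = 5
top_k_heap = heapq.nlargest(k, scores)         # heap-based selection
top_k_sort = sorted(scores, reverse=True)[:k]  # full sort, for comparison

assert top_k_heap == top_k_sort  # identical result, less work for small k
```

When k is much smaller than n this avoids sorting the entire list, which is exactly the kind of win that matters in candidate re-ranking or feature-selection steps of a data pipeline.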
+ +(In addition to deepening the original content, CV and MLOps that may be used at work are added) + +**For detailed content, please see: [1.From AI to LLM Fundamentals/README.md](../../1.%E5%BE%9EAI%E5%88%B0LLM%E5%9F%BA%E7%A4%8E/README.md)** - Contains complete learning paths, hands-on project suggestions, and latest technology updates + +### Learning Objectives + +This chapter covers the complete knowledge system from traditional machine learning to deep learning: + +- **Mathematical Foundations**: Linear algebra, calculus, probability and statistics, optimization theory +- **Python and Data Science**: NumPy, Pandas, Matplotlib, Scikit-learn +- **Machine Learning**: Traditional ML algorithms, feature engineering, model evaluation +- **Deep Learning**: TensorFlow, PyTorch, Keras - three major frameworks +- **Computer Vision**: CNN, object detection, image segmentation +- **Natural Language Processing**: Transformer, BERT, GPT series +- **MLOps Basics**: Model deployment, version control, experiment tracking + +### Directory Structure + +``` +1.From AI to LLM Fundamentals/ +├── 1.Math_4_ML/ # Mathematics for Machine Learning +├── 2.AI_Intro/ # AI Introduction and History +├── 3.ML_&_Data_Analysis/ # Machine Learning and Data Analysis +└── 4.DL/ # Deep Learning (including CV and NLP) +``` + +--- ## Contributing -We welcome contributions! Please read our [Contributing Guide](CONTRIBUTING.md) before submitting. +We welcome all forms of contributions! Please read [CONTRIBUTING.md](./CONTRIBUTING.md) to learn about contribution guidelines. 
### Ways to Contribute -- 📝 Fix typos or improve documentation -- 💻 Add code examples -- 🌍 Help with translations -- 🐛 Report issues -- 💡 Suggest new topics +- Report Bugs +- Suggest Features +- Improve Documentation +- Submit Code +- Share Learning Notes -## Resources +--- -- [Learning Resources](RESOURCES.md) - Curated list of learning materials -- [Glossary](GLOSSARY.md) - AI/LLM terminology -- [Prerequisites](PREREQUISITES.md) - Self-assessment checklist -- [Changelog](CHANGELOG.md) - Version history +## Contact and Contribute -## License +- **GitHub Issues**: Report problems or suggestions +- **Pull Requests**: Contribute code or documentation +- **Discussions**: Technical discussions and Q&A +- **Star the Project**: Support project development -This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. +--- -## Acknowledgments +**Start your AI learning journey now!** -- OpenAI, Anthropic, Google for their research papers -- DeepLearning.ai for educational content -- The open-source AI community +Remember: **The best time to start is now!** --- -⭐ If you find this helpful, please star this repository! - -📬 Questions? 
Open an [issue](https://github.com/markl-a/My-AI-Learning-Notes/issues) +Last updated: 2025-01-19 diff --git a/tests/e2e/__init__.py b/tests/e2e/__init__.py new file mode 100644 index 0000000..2b12ef2 --- /dev/null +++ b/tests/e2e/__init__.py @@ -0,0 +1,6 @@ +""" +End-to-End Tests Package + +端到端測試模組:測試完整的用戶流程。 +這些測試模擬真實用戶場景,驗證整體系統行為。 +""" diff --git a/tests/fixtures/__init__.py b/tests/fixtures/__init__.py new file mode 100644 index 0000000..d604c02 --- /dev/null +++ b/tests/fixtures/__init__.py @@ -0,0 +1,6 @@ +""" +Test Fixtures Package + +測試數據和共享資源模組。 +提供測試所需的樣本數據、模擬對象和配置。 +""" diff --git a/tests/fixtures/sample_data.py b/tests/fixtures/sample_data.py new file mode 100644 index 0000000..d16876c --- /dev/null +++ b/tests/fixtures/sample_data.py @@ -0,0 +1,123 @@ +""" +Sample Test Data + +提供測試使用的範例數據集合。 +""" + +# 範例文本數據 +SAMPLE_TEXTS = { + "zh_ml_intro": """ + 機器學習是人工智能的一個分支,它使計算機能夠從數據中學習並做出決策或預測, + 而無需明確編程。深度學習是機器學習的一個子領域,使用多層神經網絡來學習 + 數據的層次表示。 + """, + "en_ml_intro": """ + Machine learning is a branch of artificial intelligence that enables computers + to learn from data and make decisions or predictions without being explicitly + programmed. Deep learning is a subset of machine learning that uses multi-layer + neural networks to learn hierarchical representations of data. 
+ """, + "zh_dl_intro": """ + 深度學習是一種機器學習方法,使用深層神經網絡來處理複雜的模式識別任務。 + 常見的架構包括卷積神經網絡(CNN)、循環神經網絡(RNN)和 Transformer。 + """, +} + +# 範例文檔集合 +SAMPLE_DOCUMENTS = [ + { + "id": "doc_001", + "title": "機器學習基礎", + "content": "監督學習、非監督學習和強化學習是機器學習的三大類型。", + "metadata": { + "category": "machine_learning", + "language": "zh", + "difficulty": "beginner", + }, + }, + { + "id": "doc_002", + "title": "深度學習與神經網絡", + "content": "神經網絡由輸入層、隱藏層和輸出層組成,通過反向傳播進行訓練。", + "metadata": { + "category": "deep_learning", + "language": "zh", + "difficulty": "intermediate", + }, + }, + { + "id": "doc_003", + "title": "Introduction to LLM", + "content": "Large Language Models are trained on massive text corpora using self-supervised learning.", + "metadata": { + "category": "llm", + "language": "en", + "difficulty": "advanced", + }, + }, + { + "id": "doc_004", + "title": "RAG 系統設計", + "content": "檢索增強生成(RAG)結合了檢索系統和生成模型,提高回答準確性。", + "metadata": { + "category": "rag", + "language": "zh", + "difficulty": "advanced", + }, + }, +] + +# LLM API 模擬回應 +MOCK_LLM_RESPONSES = { + "openai": { + "content": "這是一個模擬的 OpenAI GPT 回應。", + "model": "gpt-4", + "usage": { + "prompt_tokens": 100, + "completion_tokens": 50, + "total_tokens": 150, + }, + "finish_reason": "stop", + }, + "anthropic": { + "content": "這是一個模擬的 Anthropic Claude 回應。", + "model": "claude-3-opus", + "usage": { + "input_tokens": 100, + "output_tokens": 50, + }, + "stop_reason": "end_turn", + }, +} + +# 範例向量數據 +SAMPLE_EMBEDDINGS = { + "dimension": 384, + "vectors": [ + {"id": "vec_001", "values": [0.1] * 384, "metadata": {"text": "機器學習"}}, + {"id": "vec_002", "values": [0.2] * 384, "metadata": {"text": "深度學習"}}, + {"id": "vec_003", "values": [0.3] * 384, "metadata": {"text": "自然語言處理"}}, + ], +} + +# 範例配置 +SAMPLE_CONFIGS = { + "model_config": { + "model_name": "gpt-4", + "temperature": 0.7, + "max_tokens": 2048, + "top_p": 1.0, + }, + "rag_config": { + "chunk_size": 1000, + "chunk_overlap": 200, + "retrieval_k": 5, + "similarity_threshold": 0.7, + }, + 
"training_config": { + "batch_size": 32, + "learning_rate": 0.001, + "epochs": 10, + "early_stopping_patience": 3, + }, +} diff --git a/tests/integration/__init__.py b/tests/integration/__init__.py new file mode 100644 index 0000000..87dd2ad --- /dev/null +++ b/tests/integration/__init__.py @@ -0,0 +1,6 @@ +""" +Integration Tests Package + +集成測試模組:測試多個組件之間的交互。 +這些測試可能需要外部資源如數據庫或 API。 +""" diff --git a/tests/unit/__init__.py b/tests/unit/__init__.py new file mode 100644 index 0000000..3204758 --- /dev/null +++ b/tests/unit/__init__.py @@ -0,0 +1,6 @@ +""" +Unit Tests Package + +單元測試模組:測試單一函數或類的獨立功能。 +這些測試應該快速執行且不依賴外部資源。 +""" diff --git a/tests/unit/test_agent/__init__.py b/tests/unit/test_agent/__init__.py new file mode 100644 index 0000000..fe3685c --- /dev/null +++ b/tests/unit/test_agent/__init__.py @@ -0,0 +1 @@ +"""Agent 系統單元測試模組""" diff --git a/tests/unit/test_agent/test_agent_system.py b/tests/unit/test_agent/test_agent_system.py new file mode 100644 index 0000000..4ba06cb --- /dev/null +++ b/tests/unit/test_agent/test_agent_system.py @@ -0,0 +1,462 @@ +""" +Agent 系統單元測試 +測試 Agent 架構、工具調用、記憶機制等功能 +""" + +import pytest +from dataclasses import dataclass, field +from typing import Any, Callable, Dict, List, Optional +from enum import Enum +import json + + +# ============ 測試用的 Agent 類實現 ============ + +class ToolCallStatus(Enum): + """工具調用狀態""" + SUCCESS = "success" + FAILED = "failed" + PENDING = "pending" + + +@dataclass +class Tool: + """工具定義""" + name: str + description: str + parameters: Dict[str, Any] + function: Callable = None + + def execute(self, **kwargs) -> Any: + """執行工具""" + if self.function: + return self.function(**kwargs) + return f"Mock result for {self.name}" + + +@dataclass +class ToolCall: + """工具調用記錄""" + tool_name: str + arguments: Dict[str, Any] + result: Any = None + status: ToolCallStatus = ToolCallStatus.PENDING + + +@dataclass +class Message: + """對話訊息""" + role: str # "system", "user", "assistant", "tool" + content: str + tool_calls: 
List[ToolCall] = field(default_factory=list) + + +@dataclass +class AgentMemory: + """Agent 記憶""" + messages: List[Message] = field(default_factory=list) + max_messages: int = 100 + + def add_message(self, message: Message): + """添加訊息""" + self.messages.append(message) + if len(self.messages) > self.max_messages: + # 保留系統訊息,刪除最舊的對話 + system_msgs = [m for m in self.messages if m.role == "system"] + other_msgs = [m for m in self.messages if m.role != "system"] + other_msgs = other_msgs[-(self.max_messages - len(system_msgs)):] + self.messages = system_msgs + other_msgs + + def get_context(self, n: int = 10) -> List[Message]: + """獲取最近 n 條訊息""" + return self.messages[-n:] + + def clear(self): + """清空記憶""" + self.messages = [] + + +class Agent: + """基礎 Agent 類""" + + def __init__( + self, + name: str, + system_prompt: str = "", + tools: List[Tool] = None, + max_iterations: int = 10 + ): + self.name = name + self.system_prompt = system_prompt + self.tools = {t.name: t for t in (tools or [])} + self.memory = AgentMemory() + self.max_iterations = max_iterations + + if system_prompt: + self.memory.add_message(Message(role="system", content=system_prompt)) + + def add_tool(self, tool: Tool): + """添加工具""" + self.tools[tool.name] = tool + + def remove_tool(self, tool_name: str): + """移除工具""" + if tool_name in self.tools: + del self.tools[tool_name] + + def execute_tool(self, tool_name: str, arguments: Dict[str, Any]) -> ToolCall: + """執行工具調用""" + tool_call = ToolCall(tool_name=tool_name, arguments=arguments) + + if tool_name not in self.tools: + tool_call.status = ToolCallStatus.FAILED + tool_call.result = f"Tool '{tool_name}' not found" + return tool_call + + try: + tool = self.tools[tool_name] + result = tool.execute(**arguments) + tool_call.result = result + tool_call.status = ToolCallStatus.SUCCESS + except Exception as e: + tool_call.status = ToolCallStatus.FAILED + tool_call.result = str(e) + + return tool_call + + def get_available_tools(self) -> List[Dict[str, Any]]: 
+ """獲取可用工具列表(OpenAI 格式)""" + return [ + { + "type": "function", + "function": { + "name": tool.name, + "description": tool.description, + "parameters": tool.parameters + } + } + for tool in self.tools.values() + ] + + +# ============ 測試類 ============ + +class TestTool: + """Tool 類測試""" + + def test_tool_creation(self): + """測試工具創建""" + tool = Tool( + name="calculator", + description="進行數學計算", + parameters={ + "type": "object", + "properties": { + "expression": {"type": "string"} + } + } + ) + + assert tool.name == "calculator" + assert tool.description == "進行數學計算" + + def test_tool_execution_mock(self): + """測試工具執行(Mock)""" + tool = Tool( + name="test_tool", + description="測試工具", + parameters={} + ) + + result = tool.execute() + assert "Mock result" in result + + def test_tool_execution_with_function(self): + """測試工具執行(真實函數)""" + def add(a: int, b: int) -> int: + return a + b + + tool = Tool( + name="add", + description="加法", + parameters={ + "type": "object", + "properties": { + "a": {"type": "integer"}, + "b": {"type": "integer"} + } + }, + function=add + ) + + result = tool.execute(a=2, b=3) + assert result == 5 + + +class TestToolCall: + """ToolCall 測試""" + + def test_tool_call_creation(self): + """測試工具調用創建""" + call = ToolCall( + tool_name="search", + arguments={"query": "test"} + ) + + assert call.tool_name == "search" + assert call.status == ToolCallStatus.PENDING + assert call.result is None + + def test_tool_call_status_update(self): + """測試狀態更新""" + call = ToolCall(tool_name="test", arguments={}) + call.status = ToolCallStatus.SUCCESS + call.result = "完成" + + assert call.status == ToolCallStatus.SUCCESS + assert call.result == "完成" + + +class TestMessage: + """Message 測試""" + + def test_message_creation(self): + """測試訊息創建""" + msg = Message(role="user", content="你好") + + assert msg.role == "user" + assert msg.content == "你好" + assert msg.tool_calls == [] + + def test_message_with_tool_calls(self): + """測試帶工具調用的訊息""" + tool_call = 
ToolCall(tool_name="search", arguments={"q": "AI"})
+        msg = Message(
+            role="assistant",
+            content="Let me search for that",
+            tool_calls=[tool_call]
+        )
+
+        assert len(msg.tool_calls) == 1
+        assert msg.tool_calls[0].tool_name == "search"
+
+
+class TestAgentMemory:
+    """AgentMemory tests"""
+
+    def test_memory_add_message(self):
+        """Adding a message"""
+        memory = AgentMemory()
+        memory.add_message(Message(role="user", content="test"))
+
+        assert len(memory.messages) == 1
+
+    def test_memory_max_messages(self):
+        """Maximum message limit"""
+        memory = AgentMemory(max_messages=5)
+
+        for i in range(10):
+            memory.add_message(Message(role="user", content=f"message {i}"))
+
+        assert len(memory.messages) <= 5
+
+    def test_memory_preserve_system_message(self):
+        """System messages survive trimming"""
+        memory = AgentMemory(max_messages=5)
+        memory.add_message(Message(role="system", content="You are an assistant"))
+
+        for i in range(10):
+            memory.add_message(Message(role="user", content=f"message {i}"))
+
+        # The system message should be preserved
+        system_msgs = [m for m in memory.messages if m.role == "system"]
+        assert len(system_msgs) == 1
+        assert system_msgs[0].content == "You are an assistant"
+
+    def test_memory_get_context(self):
+        """Fetching recent context"""
+        memory = AgentMemory()
+        for i in range(20):
+            memory.add_message(Message(role="user", content=f"message {i}"))
+
+        context = memory.get_context(n=5)
+        assert len(context) == 5
+        assert context[-1].content == "message 19"
+
+    def test_memory_clear(self):
+        """Clearing memory"""
+        memory = AgentMemory()
+        memory.add_message(Message(role="user", content="test"))
+        memory.clear()
+
+        assert len(memory.messages) == 0
+
+
+class TestAgent:
+    """Agent class tests"""
+
+    def test_agent_creation(self):
+        """Creating an Agent"""
+        agent = Agent(name="TestAgent", system_prompt="You are a test assistant")
+
+        assert agent.name == "TestAgent"
+        assert len(agent.memory.messages) == 1
+        assert agent.memory.messages[0].role == "system"
+
+    def test_agent_add_tool(self):
+        """Adding a tool"""
+        agent = Agent(name="TestAgent")
+        tool = Tool(name="calculator", description="Calculator", parameters={})
+
+        agent.add_tool(tool)
+
+        assert "calculator" in agent.tools
+
+    def test_agent_remove_tool(self):
+        """Removing a tool"""
+        tool = Tool(name="calculator", description="Calculator", parameters={})
+        agent = Agent(name="TestAgent", tools=[tool])
+
+        agent.remove_tool("calculator")
+
+        assert "calculator" not in agent.tools
+
+    def test_agent_execute_tool_success(self):
+        """Successful tool execution"""
+        def multiply(x: int, y: int) -> int:
+            return x * y
+
+        tool = Tool(
+            name="multiply",
+            description="Multiplication",
+            parameters={},
+            function=multiply
+        )
+        agent = Agent(name="TestAgent", tools=[tool])
+
+        result = agent.execute_tool("multiply", {"x": 3, "y": 4})
+
+        assert result.status == ToolCallStatus.SUCCESS
+        assert result.result == 12
+
+    def test_agent_execute_tool_not_found(self):
+        """Executing a tool that does not exist"""
+        agent = Agent(name="TestAgent")
+
+        result = agent.execute_tool("nonexistent", {})
+
+        assert result.status == ToolCallStatus.FAILED
+        assert "not found" in result.result
+
+    def test_agent_execute_tool_error(self):
+        """Tool execution raising an error"""
+        def error_func():
+            raise ValueError("test error")
+
+        tool = Tool(name="error_tool", description="Error tool", parameters={}, function=error_func)
+        agent = Agent(name="TestAgent", tools=[tool])
+
+        result = agent.execute_tool("error_tool", {})
+
+        assert result.status == ToolCallStatus.FAILED
+        assert "test error" in result.result
+
+    def test_agent_get_available_tools(self):
+        """Listing available tools"""
+        tools = [
+            Tool(name="tool1", description="Tool 1", parameters={"type": "object"}),
+            Tool(name="tool2", description="Tool 2", parameters={"type": "object"})
+        ]
+        agent = Agent(name="TestAgent", tools=tools)
+
+        available = agent.get_available_tools()
+
+        assert len(available) == 2
+        assert all(t["type"] == "function" for t in available)
+        tool_names = [t["function"]["name"] for t in available]
+        assert "tool1" in tool_names
+        assert "tool2" in tool_names
+
+
+class TestReActPattern:
+    """ReAct pattern tests"""
+
+    def test_thought_action_observation_cycle(self):
+        """Thought-Action-Observation cycle"""
+        # Simulate the ReAct steps
+        steps = []
+
+        # Thought
+        steps.append({"type": "thought", "content": "I need to search for the latest AI news"})
+
+        # Action
+        steps.append({"type": "action", "tool": "search", "input": "latest AI news"})
+
+        # Observation
+        steps.append({"type": "observation", "result": "Found 10 relevant articles"})
+
+        # Final Answer
+        steps.append({"type": "answer", "content": "Based on the search results..."})
+
+        assert len(steps) == 4
+        assert steps[0]["type"] == "thought"
+        assert steps[1]["type"] == "action"
+        assert steps[2]["type"] == "observation"
+        assert steps[3]["type"] == "answer"
+
+    def test_max_iterations(self):
+        """Maximum iteration limit"""
+        agent = Agent(name="TestAgent", max_iterations=5)
+
+        assert agent.max_iterations == 5
+
+
+class TestFunctionCalling:
+    """Function Calling tests"""
+
+    def test_function_schema_generation(self):
+        """Generating a function schema"""
+        schema = {
+            "type": "function",
+            "function": {
+                "name": "get_weather",
+                "description": "Get weather information",
+                "parameters": {
+                    "type": "object",
+                    "properties": {
+                        "location": {
+                            "type": "string",
+                            "description": "City name"
+                        },
+                        "unit": {
+                            "type": "string",
+                            "enum": ["celsius", "fahrenheit"]
+                        }
+                    },
+                    "required": ["location"]
+                }
+            }
+        }
+
+        assert schema["type"] == "function"
+        assert schema["function"]["name"] == "get_weather"
+        assert "location" in schema["function"]["parameters"]["properties"]
+
+    def test_parse_function_call(self):
+        """Parsing a function call"""
+        # Simulate an OpenAI-style function_call response
+        function_call = {
+            "name": "get_weather",
+            "arguments": '{"location": "Taipei", "unit": "celsius"}'
+        }
+
+        name = function_call["name"]
+        args = json.loads(function_call["arguments"])
+
+        assert name == "get_weather"
+        assert args["location"] == "Taipei"
+        assert args["unit"] == "celsius"
+
+
+if __name__ == "__main__":
+    pytest.main([__file__, "-v"])
diff --git a/tests/unit/test_rag/__init__.py b/tests/unit/test_rag/__init__.py
new file mode 100644
index 0000000..a180d88
--- /dev/null
+++ b/tests/unit/test_rag/__init__.py
@@ -0,0 +1 @@
+"""Unit tests for the RAG system."""
diff --git a/tests/unit/test_rag/test_text_splitter.py b/tests/unit/test_rag/test_text_splitter.py
new file mode 100644
index 0000000..4d1b288
--- /dev/null
+++ b/tests/unit/test_rag/test_text_splitter.py
@@ -0,0 +1,271 @@
+"""
+RAG unit tests - TextSplitter
+Exercises the text splitter's behaviour on various inputs.
+"""
+
+import pytest
+import sys
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Dict
+
+# Add the project root to the import path
+sys.path.insert(0, str(Path(__file__).parent.parent.parent.parent))
+
+
+# Simplified TextSplitter implementation used only for these tests
+# (avoids pulling in the full RAG dependency stack)
+class TextSplitter:
+    """Text splitter"""
+
+    def __init__(self, chunk_size: int = 500, chunk_overlap: int = 50):
+        self.chunk_size = chunk_size
+        self.chunk_overlap = chunk_overlap
+
+    def split_text(self, text: str) -> list:
+        """Split text into chunks"""
+        chunks = []
+        start = 0
+
+        while start < len(text):
+            end = start + self.chunk_size
+
+            if end < len(text):
+                # Prefer to break at a natural boundary inside the window
+                for separator in ['\n\n', '\n', '。', '. ', ' ']:
+                    pos = text.rfind(separator, start, end)
+                    if pos != -1:
+                        end = pos + len(separator)
+                        break
+
+            chunk = text[start:end].strip()
+            if chunk:
+                chunks.append(chunk)
+
+            # Step forward keeping the overlap; max() guarantees progress even
+            # if a separator match would otherwise move the window backwards
+            start = max(end - self.chunk_overlap, start + 1)
+
+        return chunks
+
+
+@dataclass
+class Document:
+    """Minimal Document stand-in for testing"""
+    content: str
+    metadata: Dict = None
+
+    def __post_init__(self):
+        if self.metadata is None:
+            self.metadata = {}
+
+
+class TestTextSplitter:
+    """Unit tests for TextSplitter"""
+
+    def test_basic_split(self):
+        """Basic splitting"""
+        splitter = TextSplitter(chunk_size=100, chunk_overlap=10)
+        text = "這是一段測試文字。" * 20  # CJK data: the splitter breaks on '。'
+
+        chunks = splitter.split_text(text)
+
+        assert len(chunks) > 1, "should produce multiple chunks"
+        assert all(len(chunk) <= 100 for chunk in chunks), "no chunk should exceed chunk_size"
+
+    def test_empty_text(self):
+        """Empty input"""
+        splitter = TextSplitter()
+        chunks = splitter.split_text("")
+
+        assert chunks == [], "empty text should return an empty list"
+
+    def test_small_text(self):
+        """Text shorter than chunk_size"""
+        splitter = TextSplitter(chunk_size=500)
+        text = "這是一小段文字"
+
+        chunks = splitter.split_text(text)
+
+        assert len(chunks) == 1, "small text should produce a single chunk"
+        assert chunks[0] == text, "the chunk should equal the original text"
+
+    def test_chunk_overlap(self):
+        """Chunk overlap"""
+        splitter = TextSplitter(chunk_size=50, chunk_overlap=10)
+        text = "A" * 100
+
+        chunks = splitter.split_text(text)
+
+        assert len(chunks) >= 2, "should produce multiple chunks"
+        # Consecutive chunks share the overlap region
+        assert chunks[1].startswith(chunks[0][-10:])
+
+    def test_sentence_boundary_split(self):
+        """Splitting at sentence boundaries"""
+        splitter = TextSplitter(chunk_size=50, chunk_overlap=5)
+        text = "第一句話。" * 20  # 100 chars, so the window must break mid-text
+
+        chunks = splitter.split_text(text)
+
+        assert len(chunks) >= 2
+        # Every chunk should end at a sentence boundary
+        for chunk in chunks:
+            assert chunk.endswith('。')
+
+    def test_newline_split(self):
+        """Splitting on newlines"""
+        splitter = TextSplitter(chunk_size=50, chunk_overlap=5)
+        text = "第一段\n\n第二段\n\n第三段"
+
+        chunks = splitter.split_text(text)
+
+        assert len(chunks) >= 1
+
+    def test_custom_chunk_size(self):
+        """Custom chunk sizes"""
+        for chunk_size in [100, 200, 500, 1000]:
+            splitter = TextSplitter(chunk_size=chunk_size, chunk_overlap=10)
+            text = "測試文字。" * 500
+
+            chunks = splitter.split_text(text)
+
+            # Every chunk is bounded by the configured size
+            for chunk in chunks:
+                assert len(chunk) <= chunk_size, f"chunk should not exceed {chunk_size}"
+
+    def test_unicode_text(self):
+        """Unicode handling"""
+        splitter = TextSplitter(chunk_size=50, chunk_overlap=5)
+        text = "你好世界🌍!這是一段包含 emoji 的文字。繁體中文、简体中文、日本語、한국어"
+
+        chunks = splitter.split_text(text)
+
+        assert len(chunks) >= 1
+        # Unicode characters must survive splitting intact
+        assert any("🌍" in chunk for chunk in chunks)
+
+    def test_no_empty_chunks(self):
+        """No empty chunks are produced"""
+        splitter = TextSplitter(chunk_size=100, chunk_overlap=20)
+        text = "  \n\n  內容  \n\n  "
+
+        chunks = splitter.split_text(text)
+
+        for chunk in chunks:
+            assert chunk.strip() != "", "no chunk should be empty"
+
+
+class TestDocument:
+    """Document class tests"""
+
+    def test_document_creation(self):
+        """Creating a document"""
+        doc = Document(content="test content", metadata={"source": "test.txt"})
+
+        assert doc.content == "test content"
+        assert doc.metadata["source"] == "test.txt"
+
+    def test_document_without_metadata(self):
+        """Document without metadata"""
+        doc = Document(content="test content")
+
+        assert doc.metadata == {}
+
+
+class TestRetrievalMetrics:
+    """Retrieval metric tests"""
+
+    @staticmethod
+    def precision_at_k(relevant: list, retrieved: list, k: int) -> float:
+        """Compute Precision@K"""
+        retrieved_k = retrieved[:k]
+        relevant_set = set(relevant)
+        hits = sum(1 for doc in retrieved_k if doc in relevant_set)
+        return hits / k if k > 0 else 0.0
+
+    @staticmethod
+    def recall_at_k(relevant: list, retrieved: list, k: int) -> float:
+        """Compute Recall@K"""
+        retrieved_k = retrieved[:k]
+        relevant_set = set(relevant)
+        hits = sum(1 for doc in retrieved_k if doc in relevant_set)
+        return hits / len(relevant_set) if relevant_set else 0.0
+
+    @staticmethod
+    def ndcg_at_k(relevant: list, retrieved: list, k: int) -> float:
+        """Compute NDCG@K (simplified, binary relevance)"""
+        import math
+
+        retrieved_k = retrieved[:k]
+        relevant_set = set(relevant)
+
+        # DCG: each hit is discounted by its rank
+        dcg = 0.0
+        for i, doc in enumerate(retrieved_k):
+            if doc in relevant_set:
+                dcg += 1.0 / math.log2(i + 2)  # i + 2 because ranks start at 1
+
+        # Ideal DCG: all relevant documents ranked first
+        ideal_k = min(k, len(relevant_set))
+        idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_k))
+
+        return dcg / idcg if idcg > 0 else 0.0
+
+    def test_precision_at_k(self):
+        """Precision@K"""
+        relevant = ["doc1", "doc2", "doc3"]
+        retrieved = ["doc1", "doc4", "doc2", "doc5", "doc3"]
+
+        p_at_1 = self.precision_at_k(relevant, retrieved, 1)
+        p_at_3 = self.precision_at_k(relevant, retrieved, 3)
+        p_at_5 = self.precision_at_k(relevant, retrieved, 5)
+
+        assert p_at_1 == 1.0, "P@1 should be 1.0"
+        assert abs(p_at_3 - 2/3) < 0.01, "P@3 should be 2/3"
+        assert abs(p_at_5 - 3/5) < 0.01, "P@5 should be 3/5"
+
+    def test_recall_at_k(self):
+        """Recall@K"""
+        relevant = ["doc1", "doc2", "doc3"]
+        retrieved = ["doc1", "doc4", "doc2", "doc5", "doc3"]
+
+        r_at_1 = self.recall_at_k(relevant, retrieved, 1)
+        r_at_3 = self.recall_at_k(relevant, retrieved, 3)
+        r_at_5 = self.recall_at_k(relevant, retrieved, 5)
+
+        assert abs(r_at_1 - 1/3) < 0.01, "R@1 should be 1/3"
+        assert abs(r_at_3 - 2/3) < 0.01, "R@3 should be 2/3"
+        assert r_at_5 == 1.0, "R@5 should be 1.0"
+
+    def test_ndcg_at_k(self):
+        """NDCG@K"""
+        relevant = ["doc1", "doc2", "doc3"]
+        retrieved = ["doc1", "doc2", "doc3", "doc4", "doc5"]
+
+        ndcg = self.ndcg_at_k(relevant, retrieved, 3)
+
+        assert ndcg == 1.0, "a perfect ranking should give NDCG@3 of 1.0"
+
+    def test_empty_results(self):
+        """Empty retrieval results"""
+        relevant = ["doc1", "doc2"]
+        retrieved = []
+
+        p = self.precision_at_k(relevant, retrieved, 5)
+        r = self.recall_at_k(relevant, retrieved, 5)
+
+        assert p == 0.0
+        assert r == 0.0
+
+
+if __name__ == "__main__":
+    pytest.main([__file__, "-v"])