diff --git a/docs.json b/docs.json index f6aba9f15..51fefa6e1 100644 --- a/docs.json +++ b/docs.json @@ -120,6 +120,7 @@ "pages": [ "tutorials/basic/text-to-image", "tutorials/basic/image-to-image", + "tutorials/basic/style-transfer", "tutorials/basic/inpaint", "tutorials/basic/outpaint", "tutorials/basic/upscale", @@ -870,6 +871,7 @@ "pages": [ "zh-CN/tutorials/basic/text-to-image", "zh-CN/tutorials/basic/image-to-image", + "zh-CN/tutorials/basic/style-transfer", "zh-CN/tutorials/basic/inpaint", "zh-CN/tutorials/basic/outpaint", "zh-CN/tutorials/basic/upscale", diff --git a/tutorials/basic/style-transfer.mdx b/tutorials/basic/style-transfer.mdx new file mode 100644 index 000000000..2663978e1 --- /dev/null +++ b/tutorials/basic/style-transfer.mdx @@ -0,0 +1,144 @@ +--- +title: "ComfyUI Style Transfer Guide" +sidebarTitle: "Style Transfer" +description: "Learn how to apply artistic styles from reference images to your generations using style transfer workflows in ComfyUI" +--- + +Style transfer lets you apply the visual style of one image to the content of another. Instead of manually describing a style in your prompt, you provide a reference image and ComfyUI extracts and applies that style to your generation. + +Common use cases include: +- Applying an artist's style to your own compositions +- Converting photos into paintings, sketches, or illustrations +- Maintaining a consistent visual style across multiple images +- Combining the composition of one image with the aesthetics of another + +This guide covers three approaches to style transfer in ComfyUI, from basic to advanced: + +1. **Image-to-image style transfer** — the simplest method using prompts and denoise +2. **IP-Adapter style transfer** — using a reference image to guide style without changing composition +3. 
**ControlNet + IP-Adapter** — combining structural control with style guidance
+
+## Method 1: image-to-image style transfer
+
+The simplest way to do style transfer is through the [image-to-image](/tutorials/basic/image-to-image) workflow with style-focused prompts.
+
+### How it works
+
+This method encodes your input image into latent space, then denoises it with a style-descriptive prompt. The `denoise` value in the KSampler controls how much the output deviates from the original.
+
+### When to use
+
+- Quick style experiments
+- When you want to change both style and content
+- When you don't need precise control over which elements change
+
+### Key parameters
+
+| Parameter | Recommended range | Effect |
+|-----------|------------------|--------|
+| `denoise` | 0.4–0.7 | Lower values keep more of the original image; higher values allow more stylistic freedom |
+| Prompt | Style-descriptive | Describe the target style (e.g., "oil painting style, impressionist, thick brushstrokes") |
+
+
+Start with a `denoise` value of 0.55 and adjust from there. Values below 0.3 may not change the style enough, while values above 0.8 may lose the original composition entirely.
+
+
+For more details on this approach, see the [image-to-image tutorial](/tutorials/basic/image-to-image).
+
+## Method 2: IP-Adapter style transfer
+
+IP-Adapter (Image Prompt Adapter) is the most popular method for style transfer in ComfyUI. It allows you to use a reference image as a visual prompt, guiding the generation style without relying solely on text descriptions.
+
+### Model installation
+
+You need three models for IP-Adapter style transfer:
+
+1. **CLIP Vision model** — Download [CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors](https://huggingface.co/h94/IP-Adapter/resolve/main/models/image_encoder/model.safetensors) and place it in your `ComfyUI/models/clip_vision` folder
+
+2. 
**IP-Adapter model** — Download [ip-adapter_sd15.safetensors](https://huggingface.co/h94/IP-Adapter/resolve/main/models/ip-adapter_sd15.safetensors) and place it in your `ComfyUI/models/ipadapter` folder + +3. **Checkpoint model** — Download [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors) and place it in your `ComfyUI/models/checkpoints` folder + + +For SDXL-based workflows, use the corresponding SDXL IP-Adapter models instead. Check the [h94/IP-Adapter](https://huggingface.co/h94/IP-Adapter) repository for the full list of available models. + + +### Workflow overview + +The IP-Adapter style transfer workflow uses these key nodes: + +1. **Load Checkpoint** — loads your base model +2. **Load Image** — loads your style reference image +3. **CLIP Vision Encode** — encodes the reference image into a visual embedding +4. **IPAdapter Apply** — applies the style embedding to guide generation +5. **KSampler** — generates the final image + +### Key parameters + +| Parameter | Recommended range | Effect | +|-----------|------------------|--------| +| `weight` | 0.5–1.0 | Controls how strongly the reference style influences the output | +| `noise` | 0.0–0.5 | Adds variation; higher values create more diverse results | + + +For pure style transfer (keeping your own composition), use a weight around 0.6–0.8. Higher weights may start transferring content elements from the reference image as well. + + +### Style vs. composition control + +IP-Adapter transfers both style and content by default. To focus primarily on style: + +- Use a lower `weight` value (0.5–0.7) +- Write a detailed prompt describing your desired composition +- The prompt guides the composition while IP-Adapter guides the style + +## Method 3: ControlNet + IP-Adapter + +For maximum control, combine ControlNet (for structure) with IP-Adapter (for style). 
This lets you precisely define the composition while applying a reference style. + +### How it works + +- **ControlNet** extracts structural information (edges, depth, pose) from your input image and enforces that structure during generation +- **IP-Adapter** provides style guidance from a separate reference image +- Together, they let you say: "generate an image with *this* structure in *that* style" + +### Additional models needed + +In addition to the IP-Adapter models above, you need a ControlNet model. For example: + +- **Canny ControlNet** — Download [control_v11p_sd15_canny_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_canny_fp16.safetensors) and place it in your `ComfyUI/models/controlnet` folder + +### Workflow overview + +This workflow extends the IP-Adapter workflow with ControlNet: + +1. **Load Image (content)** — the image whose structure you want to preserve +2. **Canny edge detection** — extracts edges from the content image +3. **ControlNet Apply** — enforces the structural guidance +4. **Load Image (style)** — the reference image whose style you want to apply +5. **CLIP Vision Encode + IPAdapter Apply** — applies style from the reference +6. **KSampler** — generates the final image combining both controls + +### Recommended settings + +| Parameter | Recommended value | Notes | +|-----------|------------------|-------| +| ControlNet `strength` | 0.7–1.0 | Higher values enforce structure more strictly | +| IP-Adapter `weight` | 0.6–0.8 | Balance between original and reference style | +| KSampler `denoise` | 1.0 | Full denoise since ControlNet provides structure | + +## Tips for better results + +- **Choose clear style references** — images with distinct, consistent styles work best. Avoid reference images with mixed or subtle styles. +- **Match model to task** — SD 1.5 models work well for general style transfer. 
SDXL models produce higher quality results but require SDXL-specific IP-Adapter models. +- **Iterate on weights** — small changes in IP-Adapter weight (0.05 increments) can significantly affect results. Take time to find the sweet spot. +- **Combine with LoRA** — for consistent style across many images, consider training a [LoRA](/tutorials/basic/lora) on your target style and combining it with IP-Adapter for even stronger style adherence. +- **Use negative prompts** — describe what you don't want (e.g., "blurry, low quality, distorted") to improve output quality. + +## Try it yourself + +1. Start with method 1 (image-to-image) to understand how denoise affects style transfer +2. Move to method 2 (IP-Adapter) for more precise style control using a reference image +3. Combine ControlNet with IP-Adapter (method 3) when you need both structural accuracy and style transfer + +For more background on style transfer techniques in ComfyUI, see the [complete style transfer handbook](https://blog.comfy.org/p/the-complete-style-transfer-handbook) on the Comfy blog. diff --git a/zh-CN/tutorials/basic/style-transfer.mdx b/zh-CN/tutorials/basic/style-transfer.mdx new file mode 100644 index 000000000..bf0874daf --- /dev/null +++ b/zh-CN/tutorials/basic/style-transfer.mdx @@ -0,0 +1,144 @@ +--- +title: "ComfyUI 风格迁移指南" +sidebarTitle: "风格迁移" +description: "学习如何在 ComfyUI 中使用风格迁移工作流,将参考图片的艺术风格应用到你的生成图像中" +--- + +风格迁移(Style Transfer)可以将一张图片的视觉风格应用到另一张图片的内容上。你不需要在提示词中手动描述风格,只需提供一张参考图片,ComfyUI 就能提取并应用该风格到你的生成结果中。 + +常见使用场景包括: +- 将某位艺术家的风格应用到你自己的构图中 +- 将照片转换为油画、素描或插画风格 +- 在多张图片之间保持一致的视觉风格 +- 将一张图片的构图与另一张图片的美学风格结合 + +本指南涵盖 ComfyUI 中三种风格迁移方法,从基础到进阶: + +1. **图生图风格迁移** — 使用提示词和去噪参数的最简单方法 +2. **IP-Adapter 风格迁移** — 使用参考图片引导风格,不改变构图 +3. 
**ControlNet + IP-Adapter** — 结合结构控制与风格引导
+
+## 方法一:图生图风格迁移
+
+最简单的风格迁移方式是通过[图生图](/zh-CN/tutorials/basic/image-to-image)工作流配合风格描述提示词。
+
+### 工作原理
+
+该方法将输入图片编码到潜空间,然后使用风格描述提示词进行去噪。KSampler 中的 `denoise` 值控制输出与原图的偏离程度。
+
+### 适用场景
+
+- 快速风格实验
+- 需要同时改变风格和内容时
+- 不需要精确控制哪些元素发生变化时
+
+### 关键参数
+
+| 参数 | 推荐范围 | 效果 |
+|------|---------|------|
+| `denoise` | 0.4–0.7 | 较低值保留更多原始图像;较高值允许更多风格自由度 |
+| 提示词 | 风格描述 | 描述目标风格(如 "oil painting style, impressionist, thick brushstrokes") |
+
+
+建议从 `denoise` 值 0.55 开始调整。低于 0.3 的值可能无法充分改变风格,而高于 0.8 的值可能会完全丢失原始构图。
+
+
+更多详情请参阅[图生图教程](/zh-CN/tutorials/basic/image-to-image)。
+
+## 方法二:IP-Adapter 风格迁移
+
+IP-Adapter(Image Prompt Adapter)是 ComfyUI 中最流行的风格迁移方法。它允许你将参考图片作为视觉提示,引导生成风格,而不仅仅依赖文字描述。
+
+### 模型安装
+
+IP-Adapter 风格迁移需要三个模型:
+
+1. **CLIP Vision 模型** — 下载 [CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors](https://huggingface.co/h94/IP-Adapter/resolve/main/models/image_encoder/model.safetensors) 并放入 `ComfyUI/models/clip_vision` 文件夹
+
+2. **IP-Adapter 模型** — 下载 [ip-adapter_sd15.safetensors](https://huggingface.co/h94/IP-Adapter/resolve/main/models/ip-adapter_sd15.safetensors) 并放入 `ComfyUI/models/ipadapter` 文件夹
+
+3. **基础模型** — 下载 [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors) 并放入 `ComfyUI/models/checkpoints` 文件夹
+
+
+对于基于 SDXL 的工作流,请使用对应的 SDXL IP-Adapter 模型。完整模型列表请查看 [h94/IP-Adapter](https://huggingface.co/h94/IP-Adapter) 仓库。
+
+
+### 工作流概述
+
+IP-Adapter 风格迁移工作流使用以下关键节点:
+
+1. **Load Checkpoint** — 加载基础模型
+2. **Load Image** — 加载风格参考图片
+3. **CLIP Vision Encode** — 将参考图片编码为视觉嵌入
+4. **IPAdapter Apply** — 应用风格嵌入来引导生成
+5. 
**KSampler** — 生成最终图像 + +### 关键参数 + +| 参数 | 推荐范围 | 效果 | +|------|---------|------| +| `weight` | 0.5–1.0 | 控制参考风格对输出的影响强度 | +| `noise` | 0.0–0.5 | 增加变化;较高值产生更多样的结果 | + + +对于纯风格迁移(保持自己的构图),建议使用 0.6–0.8 左右的 weight 值。更高的值可能会开始从参考图片中迁移内容元素。 + + +### 风格与构图控制 + +IP-Adapter 默认同时迁移风格和内容。如果主要关注风格: + +- 使用较低的 `weight` 值(0.5–0.7) +- 编写详细的提示词描述你想要的构图 +- 提示词引导构图,而 IP-Adapter 引导风格 + +## 方法三:ControlNet + IP-Adapter + +要实现最大程度的控制,可以将 ControlNet(用于结构)与 IP-Adapter(用于风格)结合使用。这样你可以精确定义构图,同时应用参考风格。 + +### 工作原理 + +- **ControlNet** 从输入图片中提取结构信息(边缘、深度、姿态),并在生成过程中强制执行该结构 +- **IP-Adapter** 从另一张参考图片提供风格引导 +- 两者结合,你可以说:"生成一张具有 *这个* 结构、*那个* 风格的图像" + +### 额外需要的模型 + +除了上面的 IP-Adapter 模型外,还需要一个 ControlNet 模型。例如: + +- **Canny ControlNet** — 下载 [control_v11p_sd15_canny_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_canny_fp16.safetensors) 并放入 `ComfyUI/models/controlnet` 文件夹 + +### 工作流概述 + +此工作流在 IP-Adapter 工作流基础上增加了 ControlNet: + +1. **Load Image(内容图)** — 你想保留其结构的图片 +2. **Canny 边缘检测** — 从内容图中提取边缘 +3. **ControlNet Apply** — 强制执行结构引导 +4. **Load Image(风格图)** — 你想应用其风格的参考图片 +5. **CLIP Vision Encode + IPAdapter Apply** — 应用参考图的风格 +6. **KSampler** — 生成结合两种控制的最终图像 + +### 推荐设置 + +| 参数 | 推荐值 | 说明 | +|------|--------|------| +| ControlNet `strength` | 0.7–1.0 | 较高值更严格地强制执行结构 | +| IP-Adapter `weight` | 0.6–0.8 | 在原始风格和参考风格之间取得平衡 | +| KSampler `denoise` | 1.0 | 完全去噪,因为 ControlNet 提供了结构 | + +## 获得更好效果的技巧 + +- **选择风格鲜明的参考图** — 具有明显、一致风格的图片效果最好。避免使用风格混杂或不明显的参考图。 +- **匹配模型** — SD 1.5 模型适用于一般风格迁移。SDXL 模型可产生更高质量的结果,但需要 SDXL 专用的 IP-Adapter 模型。 +- **反复调整权重** — IP-Adapter weight 的微小变化(0.05 的增量)可以显著影响结果。花时间找到最佳值。 +- **结合 LoRA 使用** — 如果需要在多张图片间保持一致风格,可以考虑在目标风格上训练一个 [LoRA](/zh-CN/tutorials/basic/lora),并与 IP-Adapter 结合使用以获得更强的风格一致性。 +- **使用负面提示词** — 描述你不想要的内容(如 "blurry, low quality, distorted")以提高输出质量。 + +## 自己动手试试 + +1. 从方法一(图生图)开始,了解去噪参数如何影响风格迁移 +2. 进阶到方法二(IP-Adapter),使用参考图片实现更精确的风格控制 +3. 
当需要同时保证结构准确性和风格迁移时,结合使用 ControlNet 和 IP-Adapter(方法三) + +更多关于 ComfyUI 风格迁移技术的背景知识,请参阅 Comfy 博客上的[完整风格迁移手册](https://blog.comfy.org/p/the-complete-style-transfer-handbook)。
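The Method 1 workflow described in the pages above can also be built and queued programmatically through ComfyUI's `/prompt` HTTP endpoint. The sketch below is a minimal, illustrative example only: the node IDs, checkpoint filename, sampler settings, and helper name are assumptions for demonstration, not values mandated by the tutorial. Links between nodes use ComfyUI's API (prompt) format, where each input reference is a `[source_node_id, output_index]` pair.

```python
import json

def build_img2img_style_prompt(image_name: str, style_prompt: str,
                               negative_prompt: str, denoise: float = 0.55,
                               seed: int = 0) -> dict:
    """Build a minimal image-to-image style transfer graph in ComfyUI's
    API (prompt) format. Node IDs and the checkpoint filename are
    illustrative assumptions."""
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "v1-5-pruned-emaonly-fp16.safetensors"}},
        "2": {"class_type": "LoadImage", "inputs": {"image": image_name}},
        # Encode the input image into latent space (VAE is output 2 of node 1)
        "3": {"class_type": "VAEEncode",
              "inputs": {"pixels": ["2", 0], "vae": ["1", 2]}},
        "4": {"class_type": "CLIPTextEncode",
              "inputs": {"text": style_prompt, "clip": ["1", 1]}},
        "5": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative_prompt, "clip": ["1", 1]}},
        # Partial denoise over the encoded image applies the prompted style
        "6": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "seed": seed, "steps": 20,
                         "cfg": 7.0, "sampler_name": "euler",
                         "scheduler": "normal", "positive": ["4", 0],
                         "negative": ["5", 0], "latent_image": ["3", 0],
                         "denoise": denoise}},
        "7": {"class_type": "VAEDecode",
              "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
        "8": {"class_type": "SaveImage",
              "inputs": {"images": ["7", 0],
                         "filename_prefix": "style_transfer"}},
    }

prompt = build_img2img_style_prompt(
    "input.png",
    "oil painting style, impressionist, thick brushstrokes",
    "blurry, low quality, distorted",
)
payload = json.dumps({"prompt": prompt})
# Queuing requires a running ComfyUI instance, e.g.:
#   requests.post("http://127.0.0.1:8188/prompt", data=payload)
print(prompt["6"]["inputs"]["denoise"])  # → 0.55
```

Raising or lowering the `denoise` argument here has the same effect as editing the KSampler widget in the graph UI, which makes this form convenient for sweeping the 0.4–0.7 range recommended above.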