skills/mlops/models/stable-diffusion/SKILL.md

---
name: stable-diffusion-image-generation
description: State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting, or building custom diffusion pipelines.
version: 1.0.0
author: Orchestra Research
license: MIT
dependencies: [diffusers>=0.30.0, transformers>=4.41.0, accelerate>=0.31.0, torch>=2.0.0]
metadata:
  hermes:
    tags: [Image Generation, Stable Diffusion, Diffusers, Text-to-Image, Multimodal, Computer Vision]

---

# Stable Diffusion Image Generation

Comprehensive guide to generating images with Stable Diffusion using the HuggingFace Diffusers library.

## When to use Stable Diffusion

**Use Stable Diffusion when:**
- Generating images from text descriptions
- Performing image-to-image translation (style transfer, enhancement)
- Inpainting (filling in masked regions)
- Outpainting (extending images beyond boundaries)
- Creating variations of existing images
- Building custom image generation workflows

**Key features:**
- **Text-to-Image**: Generate images from natural language prompts
- **Image-to-Image**: Transform existing images with text guidance
- **Inpainting**: Fill masked regions with context-aware content
- **ControlNet**: Add spatial conditioning (edges, poses, depth)
- **LoRA Support**: Efficient fine-tuning and style adaptation
- **Multiple Models**: SD 1.5, SDXL, SD 3.0, Flux support

**Use alternatives instead:**
- **DALL-E 3**: For API-based generation without GPU
- **Midjourney**: For artistic, stylized outputs
- **Imagen**: For Google Cloud integration
- **Leonardo.ai**: For web-based creative workflows

## Quick start

### Installation

```bash
pip install diffusers transformers accelerate torch
pip install xformers  # Optional: memory-efficient attention
```

### Basic text-to-image

```python
from diffusers import DiffusionPipeline
import torch

# Load pipeline (auto-detects model type)
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16
)
pipe.to("cuda")

# Generate image
image = pipe(
    "A serene mountain landscape at sunset, highly detailed",
    num_inference_steps=50,
    guidance_scale=7.5
).images[0]

image.save("output.png")
```

### Using SDXL (higher quality)

```python
from diffusers import AutoPipelineForText2Image
import torch

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16"
)

# Offload idle sub-models to CPU to save VRAM.
# Offloading manages device placement itself, so pipe.to("cuda") is not needed.
pipe.enable_model_cpu_offload()

image = pipe(
    prompt="A futuristic city with flying cars, cinematic lighting",
    height=1024,
    width=1024,
    num_inference_steps=30
).images[0]
```

## Architecture overview

### Three-pillar design

Diffusers is built around three core components:

```
Pipeline (orchestration)
├── Model (neural networks)
│   ├── UNet / Transformer (noise prediction)
│   ├── VAE (latent encoding/decoding)
│   └── Text Encoder (CLIP/T5)
└── Scheduler (denoising algorithm)
```

### Pipeline inference flow

```
Text Prompt → Text Encoder → Text Embeddings
                                    ↓
Random Noise → [Denoising Loop] ← Scheduler
                      ↓
               Predicted Noise
                      ↓
              VAE Decoder → Final Image
```
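
The control flow in the diagram can be sketched in plain Python. This is a toy illustration, not the Diffusers implementation: `run_pipeline` and all of its callables are hypothetical stand-ins for the text encoder, the noise-prediction model, the scheduler, and the VAE decoder, and the "latents" are short lists of floats rather than tensors.

```python
from typing import Callable, List, Sequence

Vector = List[float]

def run_pipeline(
    prompt: str,
    encode_text: Callable[[str], Vector],
    predict_noise: Callable[[Vector, int, Vector], Vector],
    scheduler_step: Callable[[Vector, int, Vector], Vector],
    decode: Callable[[Vector], Vector],
    initial_noise: Vector,
    timesteps: Sequence[int],
) -> Vector:
    """Mirror the order of operations in the diagram with caller-supplied stand-ins."""
    text_embeddings = encode_text(prompt)        # Text Prompt -> Text Encoder
    latents = list(initial_noise)                # Random Noise
    for t in timesteps:                          # Denoising loop
        noise_pred = predict_noise(latents, t, text_embeddings)
        latents = scheduler_step(noise_pred, t, latents)
    return decode(latents)                       # VAE Decoder -> Final Image

# Toy stand-ins: each step removes half of the current "noise".
toy_image = run_pipeline(
    "a cat",
    encode_text=lambda p: [float(len(p))],
    predict_noise=lambda lat, t, emb: [x * 0.5 for x in lat],
    scheduler_step=lambda noise, t, lat: [x - n for x, n in zip(lat, noise)],
    decode=lambda lat: [round(x, 4) for x in lat],
    initial_noise=[1.0, -2.0],
    timesteps=[3, 2, 1],
)
```

The point of the sketch is the ordering: text is encoded once, the model and scheduler alternate inside the loop, and decoding happens once at the end.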

## Core concepts

### Pipelines

Pipelines orchestrate complete workflows:

| Pipeline | Purpose |
|----------|---------|
| `StableDiffusionPipeline` | Text-to-image (SD 1.x/2.x) |
| `StableDiffusionXLPipeline` | Text-to-image (SDXL) |
| `StableDiffusion3Pipeline` | Text-to-image (SD 3.0) |
| `FluxPipeline` | Text-to-image (Flux models) |
| `StableDiffusionImg2ImgPipeline` | Image-to-image |
| `StableDiffusionInpaintPipeline` | Inpainting |

### Schedulers

Schedulers control the denoising process:

| Scheduler | Steps | Quality | Use Case |
|-----------|-------|---------|----------|
| `EulerDiscreteScheduler` | 20-50 | Good | Default choice |
| `EulerAncestralDiscreteScheduler` | 20-50 | Good | More variation |
| `DPMSolverMultistepScheduler` | 15-25 | Excellent | Fast, high quality |
| `DDIMScheduler` | 50-100 | Good | Deterministic |
| `LCMScheduler` | 4-8 | Good | Very fast |
| `UniPCMultistepScheduler` | 15-25 | Excellent | Fast convergence |

### Swapping schedulers

```python
from diffusers import DPMSolverMultistepScheduler

# Swap for faster generation
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config
)

# Now generate with fewer steps
image = pipe(prompt, num_inference_steps=20).images[0]
```
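
When switching schedulers it can help to adjust the step count along with them. The sketch below encodes the step ranges from the scheduler table as a lookup; `SUGGESTED_STEPS` and `suggested_steps` are hypothetical helper names, and the numbers are the table's rules of thumb, not values read from Diffusers.

```python
from typing import Dict, Tuple

# Step ranges copied from the scheduler table: (low, high).
SUGGESTED_STEPS: Dict[str, Tuple[int, int]] = {
    "EulerDiscreteScheduler": (20, 50),
    "EulerAncestralDiscreteScheduler": (20, 50),
    "DPMSolverMultistepScheduler": (15, 25),
    "DDIMScheduler": (50, 100),
    "LCMScheduler": (4, 8),
    "UniPCMultistepScheduler": (15, 25),
}

def suggested_steps(scheduler_name: str, prefer_quality: bool = False) -> int:
    """Return the low end of the range by default, the high end when quality matters."""
    low, high = SUGGESTED_STEPS[scheduler_name]
    return high if prefer_quality else low
```

A possible use is `num_inference_steps=suggested_steps(type(pipe.scheduler).__name__)`. Schedulers that are not in the table raise `KeyError`, which makes the gap visible instead of silently guessing.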

## Generation parameters

### Key parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `prompt` | Required | Text description of desired image |
| `negative_prompt` | None | What to avoid in the image |
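| `num_inference_steps` | 50 | Denoising steps; more steps take longer, with diminishing quality gains |
| `guidance_scale` | 7.5 | Prompt adherence (7-12 typical); higher values follow the prompt more closely at the cost of variety |
| `height`, `width` | 512 (SD 1.x) / 1024 (SDXL) | Output dimensions in pixels (must be multiples of 8) |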
| `generator` | None | Torch generator for reproducibility |
| `num_images_per_prompt` | 1 | Batch size |
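
The multiple-of-8 requirement follows from the VAE, which downsamples each spatial dimension by a factor of 8 in SD 1.x and SDXL. A minimal sketch, assuming that factor and 4 latent channels (other model families may differ); `snap_to_multiple` and `latent_shape` are hypothetical helpers:

```python
from typing import Tuple

def snap_to_multiple(value: int, multiple: int = 8) -> int:
    """Round a pixel dimension down to the nearest valid multiple (at least one multiple)."""
    return max(multiple, (value // multiple) * multiple)

def latent_shape(
    height: int,
    width: int,
    batch_size: int = 1,
    latent_channels: int = 4,   # assumption: SD 1.x / SDXL VAE
    vae_scale_factor: int = 8,  # assumption: SD 1.x / SDXL VAE
) -> Tuple[int, int, int, int]:
    """Shape of the latent tensor the denoising loop operates on."""
    if height % vae_scale_factor or width % vae_scale_factor:
        raise ValueError("height and width must be multiples of the VAE scale factor")
    return (
        batch_size,
        latent_channels,
        height // vae_scale_factor,
        width // vae_scale_factor,
    )
```

For example, a 512x512 request corresponds to a 64x64 latent grid, which is why the denoising loop is far cheaper than working in pixel space.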

### Reproducible generation

```python
import torch

generator = torch.Generator(device="cuda").manual_seed(42)

image = pipe(
    prompt="A cat wearing a top hat",
    generator=generator,
    num_inference_steps=50
).images[0]
```
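
To reproduce individual images from a batch, one option is to give every image its own seed and keep the seed list alongside the outputs. The helper below is a hypothetical sketch that only derives the seeds; it does not touch the pipeline.

```python
from typing import List

def make_seeds(base_seed: int, count: int) -> List[int]:
    """Derive one distinct, reproducible seed per image from a base seed."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [base_seed + offset for offset in range(count)]

seeds = make_seeds(42, 4)
```

The seeds can then be turned into generators with `[torch.Generator(device="cuda").manual_seed(s) for s in seeds]` and passed as `generator=` together with `num_images_per_prompt=len(seeds)`. Re-running with a single seed from the list should then regenerate just that image, assuming the other settings are unchanged.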

### Negative prompts

```python
image = pipe(
    prompt="Professional photo of a dog in a garden",
    negative_prompt="blurry, low quality, distorted, ugly, bad anatomy",
    guidance_scale=7.5
).images[0]
```
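
Both `guidance_scale` and `negative_prompt` act through classifier-free guidance: at each step the model makes one prediction for the prompt and one for the unconditional (or negative-prompt) input, and the two are combined. The sketch below shows only that combination on plain lists; `apply_guidance` is a hypothetical name, and real pipelines do this on tensors.

```python
from typing import List

def apply_guidance(
    noise_uncond: List[float],
    noise_text: List[float],
    guidance_scale: float,
) -> List[float]:
    """Classifier-free guidance: push the prediction away from the unconditional one."""
    return [
        uncond + guidance_scale * (text - uncond)
        for uncond, text in zip(noise_uncond, noise_text)
    ]

guided = apply_guidance([0.0, 1.0], [1.0, 1.0], guidance_scale=7.5)
```

With `guidance_scale=1.0` the result equals the prompt-conditioned prediction, so the unconditional or negative branch has no effect. Larger values amplify the difference between the two predictions.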

## Image-to-image

Transform existing images with text guidance:

```python
from diffusers import AutoPipelineForImage2Image
from PIL import Image
import torch

pipe = AutoPipelineForImage2Image.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

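# Convert to RGB and resize to the model's native resolution
init_image = Image.open("input.jpg").convert("RGB").resize((512, 512))

image = pipe(
    prompt="A watercolor painting of the scene",
    image=init_image,
    strength=0.75,  # 0 keeps the input unchanged, 1 ignores it almost entirely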
    num_inference_steps=50
).images[0]
```
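
`strength` also changes how many denoising steps actually run: the input image is noised part of the way, and only the remaining portion of the schedule is executed. The sketch below reproduces that arithmetic as commonly implemented in the Diffusers img2img pipelines; `img2img_steps` is a hypothetical helper, so treat it as an approximation of the behavior, not an API.

```python
def img2img_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps that run for a given strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start

steps_run = img2img_steps(50, 0.75)
```

With `num_inference_steps=50` and `strength=0.75`, 37 steps run. Lower strengths are therefore faster as well as more faithful to the input.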

## Inpainting

Fill masked regions:

```python
from diffusers import AutoPipelineForInpainting
from PIL import Image
import torch

pipe = AutoPipelineForInpainting.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-inpainting",  # mirror of the original runwayml checkpoint
    torch_dtype=torch.float16
).to("cuda")

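# Image and mask must have the same size
image = Image.open("photo.jpg").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("L").resize((512, 512))  # White = inpaint region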

result = pipe(
    prompt="A red car parked on the street",
    image=image,
    mask_image=mask,
    num_inference_steps=50
).images[0]
```

## ControlNet

Add spatial conditioning for precise control:

```python
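from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from PIL import Image
import cv2  # extra dependency for this example: pip install opencv-python
import numpy as np
import torch

def get_canny_image(image, low_threshold=100, high_threshold=200):
    """Return a 3-channel Canny edge map as a PIL image."""
    edges = cv2.Canny(np.array(image), low_threshold, high_threshold)
    return Image.fromarray(np.stack([edges] * 3, axis=-1))

# Load ControlNet for edge conditioning
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny",
    torch_dtype=torch.float16
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

# Use Canny edge image as control ("input.jpg" is a placeholder path)
input_image = Image.open("input.jpg").convert("RGB")
control_image = get_canny_image(input_image)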

image = pipe(
    prompt="A beautiful house in the style of Van Gogh",
    image=control_image,
    num_inference_steps=30
).images[0]
```

### Available ControlNets

| ControlNet | Input Type | Use Case |
|------------|------------|----------|
| `canny` | Edge maps | Preserve structure |
| `openpose` | Pose skeletons | Human poses |
| `depth` | Depth maps | 3D-aware generation |
| `normal` | Normal maps | Surface details |
| `mlsd` | Line segments | Architectural lines |
| `scribble` | Rough sketches | Sketch-to-image |

## LoRA adapters

Load fine-tuned style adapters:

```python
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

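# Load LoRA weights
pipe.load_lora_weights("path/to/lora", weight_name="style.safetensors")

# Generate with LoRA style; scale the LoRA contribution per call
image = pipe(
    "A portrait in the trained style",
    cross_attention_kwargs={"scale": 0.8}
).images[0]

# Optional: fuse the LoRA into the base weights at a fixed strength
pipe.fuse_lora(lora_scale=0.8)

# Undo the fusion before unloading, otherwise the fused weights remain
pipe.unfuse_lora()
pipe.unload_lora_weights()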
```
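
Conceptually, fusing a LoRA adds a scaled low-rank product to each adapted weight matrix. The sketch below shows that update on nested lists. It is a simplified illustration that ignores the alpha/rank scaling stored in real LoRA checkpoints, and `matmul` and `fuse_lora_weight` are hypothetical helpers.

```python
from typing import List

Matrix = List[List[float]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def fuse_lora_weight(base: Matrix, lora_up: Matrix, lora_down: Matrix, scale: float) -> Matrix:
    """Return base + scale * (lora_up @ lora_down)."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(base_row, delta_row)]
        for base_row, delta_row in zip(base, delta)
    ]

base = [[1.0, 0.0], [0.0, 1.0]]
lora_up = [[1.0], [2.0]]      # shape (2, rank=1)
lora_down = [[3.0, 4.0]]      # shape (rank=1, 2)
fused = fuse_lora_weight(base, lora_up, lora_down, scale=0.5)
```

A scale of 0 leaves the base weights untouched, which is why lowering the scale weakens the adapter's style.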

### Multiple LoRAs

```python
# Load multiple LoRAs under distinct adapter names (requires the peft package)
pipe.load_lora_weights("lora1", adapter_name="style")
pipe.load_lora_weights("lora2", adapter_name="character")

# Set weights for each
pipe.set_adapters(["style", "character"], adapter_weights=[0.7, 0.5])

image = pipe("A portrait").images[0]
```

## Memory optimization

### Enable CPU offloading

```python
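# Pick ONE of the following; they are alternatives and should not be combined.

# Model CPU offload: keeps one whole sub-model (text encoder, UNet, VAE) on the GPU at a time
pipe.enable_model_cpu_offload()

# Sequential CPU offload: offloads at the submodule level for the lowest VRAM use, but is much slower
# pipe.enable_sequential_cpu_offload()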
```

### Attention slicing

```python
# Reduce memory by computing attention in slices instead of all at once
pipe.enable_attention_slicing()

# "max" processes one slice at a time for the largest memory savings (and the slowest speed)
pipe.enable_attention_slicing("max")
```

### xFormers memory-efficient attention

```python
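# Requires the xformers package. With PyTorch 2.0+, Diffusers already uses
# scaled dot-product attention by default, so this is often unnecessary.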
pipe.enable_xformers_memory_efficient_attention()
```

### VAE slicing and tiling

```python
# Slicing decodes a batch one image at a time;
# tiling decodes a large image in overlapping tiles
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

## Model variants

### Loading different precisions

```python
# FP16 (recommended for GPU)
pipe = DiffusionPipeline.from_pretrained(
    "model-id",
    torch_dtype=torch.float16,
    variant="fp16"
)

# BF16 (wider dynamic range than FP16 but fewer mantissa bits; native support on Ampere+ GPUs)
pipe = DiffusionPipeline.from_pretrained(
    "model-id",
    torch_dtype=torch.bfloat16
)
```
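
The memory saved by half precision can be estimated from the parameter count alone. A rough sketch, counting weights only (activations, the CUDA context, and intermediate buffers add to this); `weight_memory_gib` is a hypothetical helper and the parameter count in the example is an illustrative round number, not a figure for any specific model.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}

def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Approximate memory needed to hold the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3

# Illustrative model with 2**30 (about 1.07 billion) parameters
fp32_gib = weight_memory_gib(1024**3, "float32")
fp16_gib = weight_memory_gib(1024**3, "float16")
```

FP16 and BF16 both halve the weight footprint relative to FP32, so the choice between them is about numerical behavior and hardware support, not memory.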

### Loading specific components

```python
from diffusers import AutoencoderKL, DiffusionPipeline
import torch

# Load a custom VAE in the same dtype as the rest of the pipeline
vae = AutoencoderKL.from_pretrained(
    "stabilityai/sd-vae-ft-mse",
    torch_dtype=torch.float16
)

# Use with pipeline
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    vae=vae,
    torch_dtype=torch.float16
)
```

## Batch generation

Generate multiple images efficiently:

```python
# Multiple prompts
prompts = [
    "A cat playing piano",
    "A dog reading a book",
    "A bird painting a picture"
]

images = pipe(prompts, num_inference_steps=30).images

# Multiple images per prompt
images = pipe(
    "A beautiful sunset",
    num_images_per_prompt=4,
    num_inference_steps=30
).images
```
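
Passing a long prompt list in one call can exhaust VRAM, since every prompt adds to the batch. One way to bound memory is to split the list into fixed-size chunks and call the pipeline once per chunk. The sketch below covers only the chunking; `chunked` is a hypothetical helper.

```python
from typing import Iterator, List, Sequence

def chunked(prompts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size prompts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(prompts), batch_size):
        yield list(prompts[start:start + batch_size])

batches = list(chunked(["a", "b", "c", "d", "e"], batch_size=2))
```

Each chunk can then be passed as `pipe(chunk, num_inference_steps=30).images` and the results concatenated. The right `batch_size` depends on the model, resolution, and available VRAM.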

## Common workflows

### Workflow 1: High-quality generation

```python
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
import torch

# 1. Load SDXL with optimizations
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Model CPU offload manages device placement, so pipe.to("cuda") is not needed
pipe.enable_model_cpu_offload()

# 2. Generate with quality settings
image = pipe(
    prompt="A majestic lion in the savanna, golden hour lighting, 8k, detailed fur",
    negative_prompt="blurry, low quality, cartoon, anime, sketch",
    num_inference_steps=30,
    guidance_scale=7.5,
    height=1024,
    width=1024
).images[0]
```

### Workflow 2: Fast prototyping

```python
from diffusers import AutoPipelineForText2Image, LCMScheduler
import torch

# Use LCM for 4-8 step generation
pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16
).to("cuda")

# Load LCM LoRA for fast generation
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.fuse_lora()

# Generate in 4 steps (latency depends on the GPU)
image = pipe(
    "A beautiful landscape",
    num_inference_steps=4,
    guidance_scale=1.0
).images[0]
```

## Common issues

**CUDA out of memory:**
```python
# Enable memory optimizations
pipe.enable_model_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()

# Or use lower precision
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
```

**Black/noise images:**
```python
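# Cause 1: the safety checker flagged the output and replaced it with a black image.
# Disabling it removes the NSFW filter, so do this only where appropriate.
pipe.safety_checker = None

# Cause 2: mismatched dtypes or half-precision overflow, often in the VAE.
# Keep all components in one dtype; if float16 still yields black images,
# try torch.bfloat16 or torch.float32 instead.
pipe = pipe.to(dtype=torch.float16)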
```

**Slow generation:**
```python
# Use faster scheduler
from diffusers import DPMSolverMultistepScheduler
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# Reduce steps
image = pipe(prompt, num_inference_steps=20).images[0]
```

## References

- **[Advanced Usage](references/advanced-usage.md)** - Custom pipelines, fine-tuning, deployment
- **[Troubleshooting](references/troubleshooting.md)** - Common issues and solutions

## Resources

- **Documentation**: https://huggingface.co/docs/diffusers
- **Repository**: https://github.com/huggingface/diffusers
- **Model Hub**: https://huggingface.co/models?library=diffusers
- **Discord**: https://discord.gg/diffusers