---
description: Specific model architectures and tools — computer vision (CLIP, SAM, Stable Diffusion), speech (Whisper), audio generation (AudioCraft), and multimodal models (LLaVA).
---