Unsloth Documentation
- Unsloth Docs: Train your own model with Unsloth, an open-source framework for LLM fine-tuning and reinforcement learning.
- Beginner? Start here!
- Unsloth Requirements: Here are Unsloth's requirements including system and GPU VRAM requirements.
- FAQ + Is Fine-tuning Right For Me?: If you're unsure whether fine-tuning is right for you, see here! Learn about fine-tuning misconceptions, how it compares to RAG, and more.
- Unsloth Notebooks: Explore our catalog of Unsloth notebooks:
- All Our Models
- Install & Update: Learn to install Unsloth locally or online.
- Updating: To update or use an old version of Unsloth, follow the steps below:
- Pip Install: To install Unsloth locally via Pip, follow the steps below:
- Docker: Install Unsloth using our official Docker container
- Windows Installation: See how to install Unsloth on Windows with or without WSL.
- AMD: Fine-tune with Unsloth on AMD GPUs.
- Conda Install: To install Unsloth locally on Conda, follow the steps below:
- Google Colab: To install and run Unsloth on Google Colab, follow the steps below:
- Fine-tuning LLMs Guide: Learn all the basics and best practices of fine-tuning. Beginner-friendly.
- What Model Should I Use?
- Datasets Guide: Learn how to create & prepare a dataset for fine-tuning.
- LoRA Hyperparameters Guide: Optimal LoRA rank, alpha, number of epochs, batch size & gradient accumulation, QLoRA vs LoRA, target modules and more!
- Tutorial: How to Finetune Llama-3 and Use In Ollama: Beginner's Guide for creating a customized personal assistant (like ChatGPT) to run locally on Ollama
- Reinforcement Learning (RL) Guide: Learn all about Reinforcement Learning (RL) and how to train your own DeepSeek-R1 reasoning model with Unsloth using GRPO. A complete guide from beginner to advanced.
- Tutorial: Train your own Reasoning model with GRPO: Beginner's Guide to transforming a model like Llama 3.1 (8B) into a reasoning model by using Unsloth and GRPO.
- Advanced RL Documentation: Advanced settings and documentation for using Unsloth with GRPO.
- Memory Efficient RL
- RL Reward Hacking: Learn what Reward Hacking is in Reinforcement Learning and how to counter it.
- GSPO Reinforcement Learning: Train with GSPO (Group Sequence Policy Optimization) RL in Unsloth.
- Reinforcement Learning - DPO, ORPO & KTO: To use the reward modelling functions for DPO, GRPO, ORPO or KTO with Unsloth, follow the steps below:
- DeepSeek-OCR: How to Run & Fine-tune: Guide on how to run and fine-tune DeepSeek-OCR locally.
- How to Fine-tune LLMs with Unsloth & Docker: Learn how to fine-tune LLMs or do Reinforcement Learning (RL) with Unsloth's Docker image.
- Vision Reinforcement Learning (VLM RL): Train Vision/multimodal models via GRPO and RL with Unsloth!
- gpt-oss Reinforcement Learning
- Tutorial: How to Train gpt-oss with RL: Learn to train OpenAI gpt-oss with GRPO to autonomously beat 2048 locally or on Colab.
- Unsloth Dynamic GGUFs on Aider Polyglot: Performance of Unsloth Dynamic GGUFs on Aider Polyglot Benchmarks
- Qwen3-VL: How to Run & Fine-tune: Learn to fine-tune and run Qwen3-VL locally with Unsloth.
- gpt-oss: How to Run & Fine-tune: Run & fine-tune OpenAI's new open-source models!
- Tutorial: How to Fine-tune gpt-oss: Learn step-by-step how to train OpenAI gpt-oss locally with Unsloth.
- Long Context gpt-oss Training
- GLM-4.6: How to Run Locally: A guide on how to run Z.ai's new GLM-4.6 model on your own local device!
- IBM Granite 4.0: How to run IBM Granite-4.0 with Unsloth GGUFs on llama.cpp and Ollama, and how to fine-tune it!
- DeepSeek-V3.1: How to Run Locally: A guide on how to run DeepSeek-V3.1 and Terminus on your own local device!
- Qwen3-Coder: How to Run Locally: Run Qwen3-Coder-30B-A3B-Instruct and 480B-A35B locally with Unsloth Dynamic quants.
- Gemma 3: How to Run & Fine-tune: How to run Gemma 3 effectively with our GGUFs on llama.cpp, Ollama and Open WebUI, and how to fine-tune it with Unsloth!
- Gemma 3n: How to Run & Fine-tune: Run Google's new Gemma 3n locally with Dynamic GGUFs on llama.cpp, Ollama, Open WebUI and fine-tune with Unsloth!
- Qwen3: How to Run & Fine-tune: Learn to run & fine-tune Qwen3 locally with Unsloth + our Dynamic 2.0 quants
- Qwen3-2507: Run Qwen3-30B-A3B-2507 and 235B-A22B Thinking and Instruct versions locally on your device!
- Tutorials: How To Fine-tune & Run LLMs: Learn how to run and fine-tune models for optimal performance 100% locally with Unsloth.
- DeepSeek-R1-0528: How to Run Locally: A guide on how to run DeepSeek-R1-0528, including the Qwen3-8B distill, on your own local device!
- Magistral: How to Run & Fine-tune: Meet Magistral - Mistral's new reasoning models.
- Llama 4: How to Run & Fine-tune: How to run Llama 4 locally using our dynamic GGUFs, which recover accuracy compared to standard quantization.
- Kimi K2: How to Run Locally: Guide on running Kimi K2 and Kimi-K2-Instruct-0905 on your own local device!
- Grok 2: Run xAI's Grok 2 model locally!
- Devstral: How to Run & Fine-tune: Run and fine-tune Mistral Devstral 1.1, including Small-2507 and 2505.
- DeepSeek-V3-0324: How to Run Locally: How to run DeepSeek-V3-0324 locally using our dynamic quants, which recover accuracy compared to standard quantization.
- DeepSeek-R1: How to Run Locally: A guide on how you can run our 1.58-bit Dynamic Quants for DeepSeek-R1 using llama.cpp.
- DeepSeek-R1 Dynamic 1.58-bit: See performance comparison tables for Unsloth's Dynamic GGUF Quants vs Standard IMatrix Quants.
- QwQ-32B: How to Run Effectively: How to run QwQ-32B effectively with our bug fixes and without endless generations + GGUFs.
- Phi-4 Reasoning: How to Run & Fine-tune: Learn to run & fine-tune Phi-4 reasoning models locally with Unsloth + our Dynamic 2.0 quants
- Running & Saving Models: Learn how to save your finetuned model so you can run it in your favorite inference engine.
- Saving to GGUF: Saving models to 16bit for GGUF so you can use them with Ollama, Jan AI, Open WebUI and more!
- Saving to Ollama
- Saving to vLLM for deployment: Saving models to 16bit for vLLM deployment and serving
- Saving to SGLang for deployment: Saving models to 16bit for SGLang deployment and serving.
- Unsloth Inference: Learn how to run your finetuned model with Unsloth's faster inference.
- Troubleshooting Inference: If you're experiencing issues when running or saving your model.
- vLLM Engine Arguments
- LoRA Hot Swapping Guide
- Text-to-Speech (TTS) Fine-tuning: Learn how to fine-tune TTS & STT voice models with Unsloth.
- Unsloth Dynamic 2.0 GGUFs: A big new upgrade to our Dynamic Quants!
- Vision Fine-tuning: Learn how to fine-tune vision/multimodal LLMs with Unsloth
- Fine-tuning LLMs with NVIDIA DGX Spark and Unsloth: Tutorial on how to fine-tune and do reinforcement learning (RL) with OpenAI gpt-oss on NVIDIA DGX Spark.
- Fine-tuning LLMs with Blackwell, RTX 50 series & Unsloth: Learn how to fine-tune LLMs on NVIDIA's Blackwell RTX 50 series and B200 GPUs with our step-by-step guide.
- Multi-GPU Training with Unsloth: Learn how to fine-tune LLMs across multiple GPUs with parallelism in Unsloth.
- Finetuning from Last Checkpoint: Checkpointing allows you to save your finetuning progress so you can pause it and then continue.
- Troubleshooting & FAQs: Tips to solve issues, and frequently asked questions.
- Chat Templates: Learn the fundamentals and customization options of chat templates, including Conversational, ChatML, ShareGPT, Alpaca formats, and more!
- Quantization-Aware Training (QAT): Quantize models to 4-bit with Unsloth and PyTorch to recover accuracy.
- Unsloth Environment Flags: Advanced flags which may be useful if you see breaking finetunes or want to turn features off.
- Continued Pretraining: Also known as continued finetuning. Unsloth allows you to continually pretrain so a model can learn a new language.
- Unsloth Benchmarks: Unsloth recorded benchmarks on NVIDIA GPUs.
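As a quick illustration of the batch size and gradient accumulation relationship covered in the LoRA Hyperparameters Guide above, here is a minimal sketch (not Unsloth code; the function name is illustrative) of how the effective batch size is typically computed:

```python
# Minimal sketch: gradient accumulation lets a small per-device batch
# behave like a larger one, trading memory for extra forward/backward passes.
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_gpus: int = 1) -> int:
    """Effective batch size = per-device batch * accumulation steps * GPUs."""
    return per_device_batch_size * gradient_accumulation_steps * num_gpus

# e.g. a per-device batch of 2 with 4 accumulation steps on one GPU
# updates weights as if the batch size were 8
print(effective_batch_size(2, 4))  # -> 8
```

This is why the guide treats batch size and gradient accumulation together: raising either one increases the effective batch size, but only raising the per-device batch size increases VRAM usage.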