Fine-Tuning
Also known as: fine tuning, model fine-tuning, supervised fine-tuning, SFT
What is Fine-Tuning?
Fine-tuning is the process of taking a pre-trained model and further training it on a smaller, task-specific dataset to specialize its behavior. While pre-training teaches a model general language understanding from trillions of tokens, fine-tuning adapts that broad knowledge to a particular domain, style, or capability. For example, a general-purpose LLM can be fine-tuned on medical literature to improve clinical reasoning, on a company’s internal documents to match its terminology and processes, or on examples of high-quality code to become a better programming assistant.
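The task-specific dataset described above is usually a collection of demonstration examples. As a sketch (the `prompt`/`completion` field names are an assumption; real fine-tuning APIs use varying schemas, some message-based), one record in a JSON Lines training file might look like:

```python
import json

# One hypothetical fine-tuning record; the prompt/completion schema is an
# assumption for illustration, not any particular provider's format.
record = {
    "prompt": "Summarize the patient's chief complaint:\n<clinical note text>",
    "completion": "Chief complaint: progressive shortness of breath over two weeks.",
}

# Datasets are commonly stored as JSON Lines: one JSON object per line.
line = json.dumps(record)
assert json.loads(line) == record
print(line)
```

A few hundred to a few hundred thousand such records, depending on the task, typically make up the fine-tuning corpus.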
Methods and Approaches
The most common form is supervised fine-tuning (SFT), where the model trains on curated input-output pairs that demonstrate desired behavior. Reinforcement learning from human feedback (RLHF) refines the model further: humans rank candidate outputs, a reward model is trained on those rankings, and the model is then optimized with reinforcement learning to produce responses the reward model scores highly. Parameter-efficient methods like LoRA (Low-Rank Adaptation) reduce the cost of fine-tuning by training only a small number of added weights while freezing the original model, making it practical to customize even very large models on modest hardware. Full fine-tuning updates all parameters and requires more compute but can achieve deeper behavioral changes.
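The parameter savings behind LoRA can be shown with a minimal sketch (NumPy, hypothetical dimensions, no training loop): a frozen weight matrix `W` is augmented with a trainable low-rank update `B @ A` of rank `r`, so only `r * (d_in + d_out)` parameters are trained instead of `d_out * d_in`.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 512, 512, 8        # hypothetical layer sizes; r << d_in, d_out

W = rng.normal(size=(d_out, d_in))  # frozen pre-trained weights (not updated)
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, initialized small
B = np.zeros((d_out, r))               # trainable, initialized to zero
alpha = 16                             # scaling hyperparameter

def lora_forward(x):
    # Base layer output plus the scaled low-rank correction.
    return W @ x + (alpha / r) * (B @ (A @ x))

full_params = W.size             # parameters touched by full fine-tuning
lora_params = A.size + B.size    # parameters trained under LoRA
print(full_params, lora_params)  # 262144 vs. 8192: a 32x reduction here

# Because B starts at zero, the adapted layer is initially identical
# to the frozen base layer; training only moves it away gradually.
x = rng.normal(size=(d_in,))
assert np.allclose(lora_forward(x), W @ x)
```

After training, the update can be merged (`W + (alpha / r) * B @ A`) so inference costs nothing extra, which is one reason the method is popular in practice.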
When to Fine-Tune (and When Not To)
Fine-tuning is appropriate when you need consistent behavioral changes that prompting alone cannot achieve: adopting a specific output format, mastering domain-specific terminology, or reliably performing a specialized task. However, it is expensive, requires high-quality training data, and risks degrading the model’s general capabilities (catastrophic forgetting). For many use cases, prompt engineering or retrieval-augmented generation (RAG) achieves comparable results at a fraction of the cost. The decision between fine-tuning, RAG, and prompting is one of the most consequential architectural choices in AI application development.
Related Reading
- Pre-training - The foundation that fine-tuning builds on
- Reinforcement Learning - Used in RLHF fine-tuning