supervised fine-tuning (SFT)

Definition

Supervised fine-tuning (SFT) is a foundational technique where a pre-trained language model is further trained on a dataset of high-quality, human-curated input-output pairs. This process aligns the model's behavior with desired instructions and response formats, serving as a crucial step before advanced alignment methods like RLHF.

At a glance

Executive summary

Supervised fine-tuning (SFT) trains a pre-trained AI model on specific examples to teach it how to follow instructions and generate desired responses. It's a fundamental step that makes large language models useful for tasks like answering questions or writing text, often preceding more advanced training methods.

TL;DR

SFT teaches a big AI model to follow instructions by showing it many examples of what to do, making it more useful for specific tasks.

Key points

Trains a pre-trained LLM on a dataset of instruction-response pairs using supervised learning.
Solves the problem of making general LLMs follow specific instructions and generate desired outputs.
Used by virtually all developers of instruction-tuned LLMs, including OpenAI, Google, and open-source communities.
Unlike pre-training which learns general language, SFT focuses on aligning model behavior to specific tasks.
Research trend involves combining SFT with advanced alignment techniques like RLHF and process-level feedback.

Use cases

Developing conversational AI chatbots that respond coherently to user queries.

Creating instruction-following models for code generation or debugging.

Fine-tuning models for specific writing styles or personas in content creation.

Adapting LLMs for specialized domains like legal or medical text generation.

Building models capable of performing multi-step mathematical reasoning or problem-solving.

Definition

At a glance

Executive summary

TL;DR

Key points

Use cases

Also known as

Related papers

Related topics