Instruction fine-tuning
Instruction fine-tuning resembles supervised fine-tuning (SFT) in that the dataset consists of input-output pairs for a task. The key difference lies in the form of each pair and in the training objective. In instruction fine-tuning, the input for each data point includes not only the context but also an explicit task instruction, and the model is trained with the same language modeling (next-token prediction) objective used during pretraining. In task-specific SFT, by contrast, the training objective is tailored to the task, for example cross-entropy over class labels when training a classifier. Instruction tuning helps the model generalize to and align with new tasks while retaining its language modeling capabilities. Figure 5.3 contrasts examples of SFT and instruction tuning, and the sketch below illustrates the same contrast in code.
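To make the difference concrete, the following minimal sketch (illustrative only, not the book's code; the field names and the "### Instruction" prompt template are assumptions) shows how the same sentiment task would be represented in each setup:

```python
# Task-specific SFT: an input-output pair where the output is a class
# label, trained with a task-tailored objective such as cross-entropy
# over the label set.
sft_example = {
    "input": "The movie was a complete waste of time.",
    "label": 0,  # e.g., 0 = negative, 1 = positive
}

# Instruction fine-tuning: the input carries an explicit task instruction,
# and the target is ordinary text, so the model keeps its next-token
# (language modeling) objective.
instruction_example = {
    "instruction": "Classify the sentiment of the following review "
                   "as positive or negative.",
    "input": "The movie was a complete waste of time.",
    "output": "negative",
}

def format_prompt(example):
    """Serialize an instruction-tuning example into one training string."""
    return (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Input:\n{example['input']}\n\n"
        f"### Response:\n{example['output']}"
    )

print(format_prompt(instruction_example))
```

Because the serialized example is plain text, any task can be cast into this format, which is what lets a single instruction-tuned model cover many tasks at once.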

Figure 5.3: Comparing the dataset setup between supervised fine-tuning and instruction tuning
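As a concrete illustration of the shared objective, here is a hedged sketch of one training step using the Hugging Face transformers API; the model choice (gpt2) and the prompt template are assumptions for demonstration, not the book's setup:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# One serialized instruction-tuning example (template is an assumption).
text = (
    "### Instruction:\nClassify the sentiment of the following review "
    "as positive or negative.\n\n"
    "### Input:\nThe movie was a complete waste of time.\n\n"
    "### Response:\nnegative"
)
batch = tokenizer(text, return_tensors="pt")

# With labels set to the input ids, the model computes the standard
# next-token cross-entropy loss -- the same objective as pretraining.
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()  # one gradient step (optimizer omitted for brevity)
```

Note that no task-specific head or loss is introduced: the "fine-tuning" is just continued language modeling on instruction-formatted text.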
The authors of the InstructGPT paper demonstrated that incorporating instructions enables the model to better understand and follow user intent, even on instructions not explicitly seen during training.