MIT Develops Self-Distillation Technique That Teaches AI New Skills Without Erasing Old Ones

Feb 11, 2026
VentureBeat

Summary

MIT researchers unveil a breakthrough AI training technique called self-distillation fine-tuning that allows large language models to continuously learn new skills without forgetting old ones, potentially eliminating the need for companies to maintain multiple specialized AI models.

Key Points

  • MIT researchers, alongside teams from the Improbable AI Lab and ETH Zurich, introduce self-distillation fine-tuning (SDFT), a new technique that allows large language models to learn new skills without forgetting previously acquired capabilities.
  • SDFT uses a model's own in-context learning ability to create a teacher-student feedback loop within a single model. This enables on-policy learning from expert demonstrations without a reward function, and it outperforms traditional supervised fine-tuning.
  • In testing on Qwen 2.5, SDFT successfully accumulates multiple enterprise skills sequentially without performance regression, offering companies a path to maintain one model instead of separate specialized models, though it requires approximately 2.5 times more compute than standard fine-tuning.
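
The teacher-student loop described in the key points can be pictured with a toy calculation. The sketch below is an illustration only, not the researchers' implementation: it assumes the "teacher" is the same model scored with an expert demonstration in its context, the "student" is the model without that demonstration, and the training signal is the per-token KL divergence between the two on a response the student sampled itself. The function names, the direction of the KL term, and the example logits are all hypothetical.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def self_distillation_loss(student_logits, teacher_logits):
    """Average per-token KL(student || teacher) over one sampled response.

    Assumption: student_logits come from the model given only the prompt,
    teacher_logits from the same model given the prompt plus a demonstration.
    Each argument is a list of per-token logit vectors.
    """
    per_token = [
        kl_divergence(softmax(s), softmax(t))
        for s, t in zip(student_logits, teacher_logits)
    ]
    return sum(per_token) / len(per_token)

# Hypothetical logits for a two-token response over a three-word vocabulary.
student = [[1.0, 2.0, 0.5], [0.1, 0.1, 3.0]]
teacher = [[2.0, 1.0, 0.5], [0.1, 3.0, 0.1]]
loss = self_distillation_loss(student, teacher)
```

In this toy setup the loss is zero when the student already matches the demonstration-conditioned teacher and grows as the two diverge, which is why no separate reward function is needed to produce a training signal.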
