Apple Study Finds AI Models Hallucinate Facts, But New Training Method Boosts Accuracy by 30%

Apr 14, 2026
Apple Machine Learning Research

Summary

Apple researchers find that AI models hallucinate because they memorize facts poorly. A new training method combining data selection with loss-based pruning boosts fact accuracy by 30%, allowing a small 110M-parameter model to match the performance of a model 12 times its size.

Key Points

  • A new study from Apple Machine Learning Research reveals that large language models struggle to memorize factual knowledge, leading to hallucinations and poor performance on knowledge-intensive tasks.
  • Researchers propose data selection schemes based on training loss that limit the number of facts in training data and flatten their frequency distribution, effectively boosting fact accuracy to the capacity limit on semi-synthetic datasets.
  • On an annotated Wikipedia corpus, a 110-million-parameter GPT-2 Small model trained with this pruning method memorizes 1.3 times more entity facts than standard training, matching the performance of a 1.3-billion-parameter model trained on the full dataset.
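The second point can be illustrated with a toy sketch of loss-based pruning. All names, the `(fact_id, loss)` data shape, and the per-fact cap are illustrative assumptions for this summary, not the paper's actual scheme:

```python
from collections import defaultdict

def prune_by_loss(examples, keep_per_fact=3):
    """Toy sketch of loss-based data pruning.

    For each fact, keep only the `keep_per_fact` highest-loss occurrences.
    This caps how often any one fact repeats (flattening the frequency
    distribution) and preferentially drops examples the model has already
    memorized, which show low training loss.

    `examples` is a list of (fact_id, loss) pairs; both names are
    illustrative, not from the paper.
    """
    by_fact = defaultdict(list)
    for fact_id, loss in examples:
        by_fact[fact_id].append(loss)

    kept = []
    for fact_id, losses in by_fact.items():
        # Sort descending so the highest-loss (least-memorized) copies survive.
        for loss in sorted(losses, reverse=True)[:keep_per_fact]:
            kept.append((fact_id, loss))
    return kept

# A frequent, well-memorized fact is pruned to the cap; a rare fact is untouched.
examples = [("capital_of_france", 0.1)] * 10 + [("capital_of_japan", 0.8)] * 2
pruned = prune_by_loss(examples, keep_per_fact=3)
```

In this sketch the frequent fact shrinks from 10 copies to 3 while the rare fact keeps both of its copies, which is the flattening effect the key point describes.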
