AI Language Models Fail to Distinguish Belief from Knowledge in Major Study of 24 Systems

Nov 13, 2025

Nature


Summary

New research testing 24 advanced AI language models finds that they systematically fail to distinguish belief from knowledge: accuracy falls from over 90% to as low as 14.4% when the models process first-person false beliefs, exposing critical flaws in their reasoning capabilities.

Key Points

Researchers evaluate 24 cutting-edge language models on KaBLE, a new benchmark of 13,000 questions, and find critical failures in distinguishing belief from knowledge and fact
All tested models systematically fail to acknowledge first-person false beliefs: GPT-4o's accuracy drops from 98.2% to 64.4%, and DeepSeek R1's falls from over 90% to 14.4%
Models show a troubling attribution bias: newer models process third-person false beliefs with 95% accuracy but first-person false beliefs with only 62.6%, suggesting superficial pattern matching rather than robust epistemic understanding
