AI Models Use Only Small Parameter Subset for Social Reasoning Despite Activating Entire Network
Summary
New research shows that AI language models activate their entire networks for social reasoning tasks while actually relying on only a small, specialized subset of parameters. This inefficiency, which contrasts with the sparse activation of human brains, points toward more energy-efficient AI designs.
Key Points
- Researchers discover that large language models use only a small, specialized subset of parameters to perform Theory-of-Mind reasoning, despite activating their entire network for every task
- The AI models' social reasoning abilities depend heavily on rotary positional encoding (RoPE), which shapes how they track beliefs and perspectives during mental state inference
- This finding reveals a major efficiency gap relative to human brains and opens pathways toward more energy-efficient AI systems that activate only task-relevant parameters
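To make the RoPE mechanism mentioned above concrete, here is a minimal, illustrative sketch (not the studied models' actual implementation) of rotary positional encoding in numpy. RoPE rotates pairs of query/key dimensions by an angle proportional to the token's position, so the dot product between a rotated query and key depends only on their relative offset. This relative-position tracking is the property the research links to belief and perspective tracking.

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Apply rotary positional encoding to vector x at sequence position pos.

    Dimension pairs (2i, 2i+1) are rotated by angle pos * base^(-2i/d),
    so attention scores between rotated queries and keys depend only on
    the relative distance between their positions.
    """
    d = x.shape[-1]
    assert d % 2 == 0, "RoPE requires an even embedding dimension"
    i = np.arange(d // 2)
    theta = base ** (-2.0 * i / d)      # per-pair rotation frequency
    angle = pos * theta
    cos, sin = np.cos(angle), np.sin(angle)
    x1, x2 = x[0::2], x[1::2]           # even / odd dimension pairs
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin     # 2D rotation of each pair
    out[1::2] = x1 * sin + x2 * cos
    return out

# The relative-position property: the score for (query pos 5, key pos 3)
# equals the score for (query pos 9, key pos 7) -- same offset of 2.
q = np.random.default_rng(0).normal(size=8)
k = np.random.default_rng(1).normal(size=8)
score_a = rope(q, 5) @ rope(k, 3)
score_b = rope(q, 9) @ rope(k, 7)
assert np.allclose(score_a, score_b)
```

The assertion at the end demonstrates why RoPE is well suited to tracking "who knew what, when": attention between two tokens is invariant to where the pair sits in the sequence, depending only on how far apart they are.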