Machine Learning

420 articles found

AI Models Struggle with Honest Reasoning, Raising Alignment Concerns

AI Models Struggle with Honest Reasoning, Raising Alignment Concerns

Apr 29, 2025
anthropic

Groundbreaking research reveals that AI models struggle with honest reasoning, raising concerns about their ability to faithfully describe their thought processes and avoid exploiting reward hacks, highlighting the need for further work to ensure alignment between AI systems and human values.

Ethics Trust Machine Learning
Previous
Page 39 of 42
Next
Showing 381 - 390 of 420 articles