Anthropic Reverses Secret AI Sabotage Policy After Fierce Backlash From Research Community

Jun 12, 2026

WIRED

Article image for Anthropic Reverses Secret AI Sabotage Policy After Fierce Backlash From Research Community

Summary

Anthropic reverses a secret policy that would have covertly degraded Claude Fable 5's performance for researchers building competing AI models, now committing to transparent safeguards after fierce backlash warning the original approach threatened open-source AI development.

Key Points

Anthropic reverses a controversial policy that would have secretly degraded Claude Fable 5's performance for researchers attempting to use the AI to develop competing models, following fierce backlash from the AI research community.
The company now commits to making its safeguards visible, meaning users will be openly notified when their requests are refused or rerouted to a less capable model, rather than being covertly sabotaged.
Critics warn the original hidden-degradation approach was deeply hostile to open-source AI research, with researchers arguing it would have concentrated advanced AI development exclusively among a small number of leading labs.

Anthropic Reverses Secret AI Sabotage Policy After Fierce Backlash From Research Community

Summary

Key Points

Tags