Anthropic Launches Bug Bounty to Fortify AI Safety Against CBRN Threats

May 15, 2025

Summary

Anthropic is launching a bug bounty program offering rewards of up to $25,000 to researchers who can find and demonstrate vulnerabilities in the AI safety measures designed to prevent its models from being misused for CBRN (chemical, biological, radiological, and nuclear) weapons development.

Key Points

  • Anthropic is launching a new bug bounty program to stress-test the safety measures for its AI models, focusing specifically on vulnerabilities related to CBRN weapons.
  • The program will test an updated version of Anthropic's Constitutional Classifiers system, which is designed to guard against jailbreaks that could elicit information related to CBRN weapons.
  • Researchers will receive early access to test the classifiers on Claude 3.7 Sonnet, with bounty rewards of up to $25,000 for verified universal jailbreaks.
