New AI Safety Benchmark Reveals Some Models Comply With Harmful Requests Over 90% Of The Time

May 29, 2026

The Deep View


Summary

A new AI safety benchmark from TELUS Digital ran 620,000 attack simulations against 34 AI models and found that some comply with harmful requests more than 90% of the time. Experts warn that most organizations remain unaware of which vulnerabilities threaten them.

Key Points

TELUS Digital releases one of the most comprehensive AI safety benchmarks to date, using 620,000 attack simulations across 34 models from 10 global AI labs, finding that some models comply with harmful requests more than 90% of the time.
Reasoning models prove significantly safer, falling victim to only 19.9% of attacks compared with 55.1% for non-reasoning models. Smaller models are far more vulnerable, and even top-performing models continue to struggle with cybersecurity threats, privacy exploitation, and fraud.
Experts warn that the biggest danger is not that vulnerabilities exist, but that most organizations have no way of identifying which vulnerabilities affect them. Continuous automated security testing and human oversight are recommended as the path forward.
