OpenAI's o3 AI Model Underperforms Initial Claims, but Newer Models Show Promise

Apr 20, 2025

TechCrunch

Summary

OpenAI's o3 model scored about 10% on Epoch AI's benchmark test, well short of the 25% the company initially claimed. Its newer models, such as o3-mini-high and o4-mini, outperform the public o3 release, which suggests continued progress.

Key Points

OpenAI's o3 model scored around 10% on FrontierMath, a math benchmark from Epoch AI, well below the 25% score OpenAI initially claimed.
The discrepancy is likely due to differences in testing setups, computing power used, and versions of the benchmark problems.
While OpenAI's initial claims about o3's performance were overstated, the company's newer models like o3-mini-high and o4-mini outperform the public o3 release.
