Google's Gemini 3 Deep Think Shatters AI Benchmarks, Scoring 84.6% on Human Reasoning Test and Achieving Grandmaster Programming Level
Summary
Google's Gemini 3 Deep Think AI model achieves an unprecedented 84.6% score on a human reasoning benchmark, surpassing average human performance by 24 percentage points. The model also reaches Legendary Grandmaster status in competitive programming and earns gold medal-level results at international science and math competitions.
Key Points
- Google's Gemini 3 Deep Think achieves 84.6% on the ARC-AGI-2 benchmark, surpassing the human average of 60% and representing a massive leap over previous AI models, which struggled to break 20%
- The model reaches Legendary Grandmaster level in competitive programming with a 3455 Elo rating on Codeforces and scores 48.4% on Humanity's Last Exam without external tools
- Gemini 3 Deep Think earns gold medal-level performance at the 2025 International Physics, Chemistry, and Math Olympiads while demonstrating practical engineering capabilities such as converting sketches into 3D-printable objects