Google Launches Agentic Vision in Gemini 3 Flash, Enabling AI to Actively Investigate Images Through Code Execution

Jan 29, 2026

Google

Summary

Google launches Agentic Vision in Gemini 3 Flash, a capability that lets the model actively investigate images instead of processing them in a single static pass. The model writes and executes Python code to crop and annotate an image in a Think-Act-Observe loop, which Google says delivers 5-10% quality improvements across vision benchmarks.

Key Points

Google launches Agentic Vision in Gemini 3 Flash, a new capability that transforms static image processing into an active investigation process using visual reasoning combined with code execution
The technology follows a Think-Act-Observe loop where the model analyzes queries, generates Python code to manipulate images through cropping and annotation, then observes transformed results for better context
Agentic Vision delivers 5-10% quality improvements across vision benchmarks and enables new use cases including zooming into fine details, image annotation with visual scratchpads, and visual math with plotting capabilities
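
The Think-Act-Observe loop described above can be illustrated with a minimal sketch. This is not Gemini's implementation or API: the image is a plain grid of integers, and the names `think`, `crop`, and `investigate` are hypothetical. The "Think" step here is a simple stand-in (find the bounding box of non-zero pixels) for the model's actual reasoning about where to look.

```python
from typing import List, Tuple

# Hypothetical stand-ins: a grayscale pixel grid and a crop box.
Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Act: return the sub-grid of the image covered by the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def think(image: Image) -> Box:
    """Think: choose a region of interest.

    Assumption for illustration only: the region of interest is the
    bounding box of all non-zero pixels. A real model would decide this
    from the user's query and the image content.
    """
    width = len(image[0])
    rows = [y for y, row in enumerate(image) if any(row)]
    cols = [x for x in range(width) if any(row[x] for row in image)]
    if not rows or not cols:
        return (0, 0, width, len(image))  # nothing stands out: keep the full image
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)


def investigate(image: Image, max_steps: int = 3) -> Tuple[Image, List[Box]]:
    """Run the loop until zooming no longer changes the view."""
    current = image
    history: List[Box] = []
    for _ in range(max_steps):
        box = think(current)          # Think: plan where to look
        cropped = crop(current, box)  # Act: transform the image with code
        history.append(box)
        if cropped == current:        # Observe: the view is already tight
            break
        current = cropped             # Observe: continue from the new view
    return current, history


# Usage: a 5x5 image with two bright pixels.
image = [[0] * 5 for _ in range(5)]
image[1][2] = 7
image[2][3] = 9
detail, steps = investigate(image)
```

In this toy run the first pass zooms in on the two bright pixels, and the second pass observes that no further cropping is possible and stops. The point of the sketch is the control flow: each transformed image is fed back in as new context before the next decision.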
