Google Launches Agentic Vision in Gemini 3 Flash, Enabling AI to Actively Investigate Images Through Code Execution

Jan 29, 2026
Google

Summary

Google has launched Agentic Vision in Gemini 3 Flash, a capability that lets the model actively investigate images instead of processing them in a single static pass. The model writes and executes Python code to crop and annotate images in a Think-Act-Observe loop, which is reported to deliver 5-10% quality improvements across vision benchmarks.

Key Points

  • Agentic Vision is a new capability in Gemini 3 Flash that turns static image processing into an active investigation by combining visual reasoning with code execution
  • The model follows a Think-Act-Observe loop: it analyzes the query, generates Python code to manipulate the image (for example, by cropping or annotating it), then observes the transformed result to gain better context
  • Agentic Vision delivers 5-10% quality improvements across vision benchmarks and enables new use cases, including zooming into fine details, annotating images as a visual scratchpad, and visual math with plotting
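
The Think-Act-Observe loop described above can be illustrated with a toy sketch. This is not Google's implementation or the Gemini API: the names `Step`, `crop`, `think_act_observe`, and `toy_think` are hypothetical, the "image" is a small grid of numbers, and a hard-coded function stands in for the model. In the real feature the model generates and runs Python code; here that step is simplified to a structured crop request.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]          # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


@dataclass
class Step:
    """One decision by the hypothetical model: answer now, or zoom in first."""
    answer: Optional[str] = None
    crop_box: Optional[Box] = None


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image inside the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def think_act_observe(
    image: Image,
    query: str,
    think: Callable[[str, List[Image]], Step],
    max_steps: int = 3,
) -> Tuple[Optional[str], List[Image]]:
    """Run the loop until the model answers or the step budget runs out."""
    context = [image]                        # everything the model has seen so far
    for _ in range(max_steps):
        step = think(query, context)         # Think: decide what to do next
        if step.answer is not None:
            return step.answer, context
        if step.crop_box is None:
            break                            # no answer and no action: give up
        zoomed = crop(image, step.crop_box)  # Act: transform the image
        context.append(zoomed)               # Observe: add the result to context
    return None, context


def toy_think(query: str, context: List[Image]) -> Step:
    """Stand-in for the model (an assumption): zoom once, then answer."""
    if len(context) == 1:
        return Step(crop_box=(2, 1, 4, 3))
    brightest = max(max(row) for row in context[-1])
    return Step(answer=f"brightest pixel in zoomed region: {brightest}")


image = [
    [0, 0, 0, 0],
    [0, 0, 7, 9],
    [0, 0, 8, 6],
    [0, 0, 0, 0],
]
answer, context = think_act_observe(image, "What is the brightest detail?", toy_think)
print(answer)  # brightest pixel in zoomed region: 9
```

The point of the sketch is the control flow: each transformed image is appended to the context, so the next reasoning step sees the zoomed view as well as the original.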
