Google Launches Gemini 2.5 Computer Use Model That Controls Devices Like Humans
Summary
Google has released the Gemini 2.5 Computer Use model in public preview. Developers can use it to build AI agents that operate user interfaces by clicking, typing, and scrolling as a person would. The model outperforms leading alternatives on web and mobile control benchmarks, and it includes safety measures such as per-step safety services and system instructions to prevent misuse.
Key Points
- Google releases the Gemini 2.5 Computer Use model in public preview, enabling developers to build AI agents that interact with user interfaces by clicking, typing, and scrolling like humans
- The specialized model outperforms leading alternatives on web and mobile control benchmarks at lower latency, and is available through the Gemini API in Google AI Studio and Vertex AI
- Google builds in safety measures, including per-step safety services and system instructions, to prevent misuse; early testers are already using the model for UI testing, workflow automation, and personal assistants
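
The pattern described in the key points, an agent that clicks, types, and scrolls with a safety check before each step, can be sketched roughly as follows. This is an illustrative Python sketch only: it does not call the Gemini API, and every name in it (Action, FakeBrowser, is_allowed, run_agent) is hypothetical. The safety rule is a toy stand-in, not Google's per-step safety service.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Action:
    """One UI step proposed by a model (hypothetical shape, not a real API type)."""
    kind: str          # "click", "type", or "scroll"
    target: str = ""   # element label, used by click and type
    text: str = ""     # text to enter, used by type
    amount: int = 0    # pixels to scroll, used by scroll


@dataclass
class FakeBrowser:
    """Stand-in for a real UI driver; it only records what was done."""
    log: List[str] = field(default_factory=list)

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click {action.target}")
        elif action.kind == "type":
            self.log.append(f"type {action.text!r} into {action.target}")
        elif action.kind == "scroll":
            self.log.append(f"scroll {action.amount}")
        else:
            raise ValueError(f"unknown action kind: {action.kind}")


def is_allowed(action: Action) -> bool:
    """Toy per-step check: refuse clicks on targets that look destructive."""
    blocked_words = ("delete", "purchase")
    return not (
        action.kind == "click"
        and any(word in action.target.lower() for word in blocked_words)
    )


def run_agent(
    actions: Iterable[Action],
    browser: FakeBrowser,
    check: Callable[[Action], bool] = is_allowed,
) -> List[Action]:
    """Check each action before applying it; return the rejected actions."""
    rejected: List[Action] = []
    for action in actions:
        if check(action):
            browser.apply(action)
        else:
            rejected.append(action)  # skipped, never reaches the UI
    return rejected


if __name__ == "__main__":
    plan = [
        Action("click", target="Search box"),
        Action("type", target="Search box", text="flights"),
        Action("scroll", amount=400),
        Action("click", target="Delete account"),
    ]
    browser = FakeBrowser()
    skipped = run_agent(plan, browser)
    print(browser.log)
    print([a.target for a in skipped])
```

In a real agent the action list would come from the model one step at a time, based on a fresh screenshot after each action; the fixed plan here only keeps the sketch self-contained.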