Google Launches Gemini 3.1 Flash TTS With Natural Language Voice Control and AI Watermarking Across 70+ Languages

Apr 16, 2026

Google

Summary

Google launches Gemini 3.1 Flash TTS, its most expressive text-to-speech model yet, featuring natural language voice control via audio tags, support for 70+ languages, and built-in SynthID watermarking that makes AI-generated audio detectable and helps combat misinformation.

Key Points

Google launches Gemini 3.1 Flash TTS, its most natural and expressive text-to-speech model yet, now available for developers via the Gemini API, Google AI Studio, Vertex AI, and Google Vids.
The model introduces audio tags, natural language commands that give developers granular creative control over the vocal style, pacing, and delivery of AI-generated speech across 70+ languages.
All audio generated by Gemini 3.1 Flash TTS carries SynthID, an imperceptible watermark embedded in the audio so that AI-generated content can be detected, which helps prevent misinformation.
