Google Launches Gemini 3.1 Flash TTS With Natural Language Voice Control and AI Watermarking Across 70+ Languages
Summary
Google launches Gemini 3.1 Flash TTS, its most expressive text-to-speech model yet. The model offers natural language voice control via audio tags, supports 70+ languages, and applies built-in SynthID watermarking so that AI-generated audio can be identified, a measure intended to combat misinformation.
Key Points
- Google launches Gemini 3.1 Flash TTS, its most natural and expressive text-to-speech model yet, now available for developers via the Gemini API, Google AI Studio, Vertex AI, and Google Vids.
- The model introduces audio tags: natural language commands that let developers control vocal style, pacing, and delivery, giving them granular creative control over AI-generated speech across 70+ languages.
- All audio generated by Gemini 3.1 Flash TTS is watermarked with SynthID, an imperceptible watermark embedded in the audio so that AI-generated content can be detected, which is intended to help prevent misinformation.
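
To make the audio-tag idea from the key points more concrete, here is a minimal Python sketch of how a script with per-line style directions might be assembled into a single text input. Everything in it is an assumption for illustration: the `Segment` type, the `render_prompt` helper, and the square-bracket tag syntax are hypothetical and are not taken from Google's documentation. The sketch only builds a string and does not call the Gemini API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One spoken line plus an optional style direction (hypothetical type)."""
    text: str
    tag: Optional[str] = None  # e.g. "whispering"; free-form natural language


def render_prompt(segments: List[Segment]) -> str:
    """Join segments into one string, prefixing tagged lines with "[tag]".

    The bracket syntax is an assumed convention for this sketch,
    not a confirmed format for Gemini 3.1 Flash TTS.
    """
    parts = []
    for seg in segments:
        if seg.tag:
            parts.append(f"[{seg.tag.strip()}] {seg.text}")
        else:
            parts.append(seg.text)
    return " ".join(parts)


script = [
    Segment("Welcome back to the show.", tag="warm, upbeat"),
    Segment("Tonight's story is a strange one.", tag="slow, hushed"),
    Segment("Let's begin."),
]
prompt = render_prompt(script)
print(prompt)
```

Keeping the style direction separate from the spoken text, as the hypothetical `Segment` type does, would make it easy to swap in whatever tag format the actual API expects.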