Mistral Launches Voxtral TTS, Claims Best Open-Source Text-to-Speech Model That Runs on Smartwatches

Mar 26, 2026

The Deep View

Summary

Mistral launches Voxtral TTS, a 3-billion-parameter open-weight text-to-speech model that the company says outperforms ElevenLabs v2.5 Flash in human naturalness evaluations, adapts to a voice in five seconds, and is compact enough to run on a smartwatch.

Key Points

Mistral launches Voxtral TTS, its first open-weight text-to-speech model, claiming it is the best open-source TTS model to date and that it outperforms ElevenLabs v2.5 Flash in human naturalness evaluations.
Voxtral TTS is a 3-billion-parameter model that supports nine languages, adapts to a voice in just five seconds, produces audio within 90 milliseconds, and is compact enough to run on-device on a smartphone or even a smartwatch.
The model is now available for testing in Mistral Studio and as open weights on Hugging Face, targeting use cases such as customer support, real-time translation, and personal voice agents.
