Mistral AI Launches Voxtral Transcribe 2 with 480ms Ultra-Low Latency Speech-to-Text Across 13 Languages
Summary
Mistral AI unveils Voxtral Transcribe 2, a speech-to-text system whose Voxtral Realtime model delivers 480ms latency across 13 languages and can run on-device for privacy. The models outperform competitors from OpenAI and Gemini on key benchmarks and are released as open source for enterprise applications in healthcare, finance, and customer service.
Key Points
- Mistral AI launches Voxtral Transcribe 2, a new family of speech-to-text models offering state-of-the-art transcription quality, speaker diarization, and ultra-low latency; the models can run on-device for enhanced privacy and cost efficiency
- The launch includes Voxtral Realtime, with 480ms latency across 13 languages, and Voxtral Mini Transcribe V2, which offers high-quality transcription at lower cost; both outperform competitors such as Gemini and OpenAI on key benchmarks
- Mistral releases the models under the Apache 2.0 open-source license, targeting enterprise applications in healthcare, finance, customer service, and multilingual subtitling while maintaining its commitment to developer accessibility