NVIDIA Unveils Groundbreaking AI Models and Dataset for Multilingual Speech Recognition
Summary
NVIDIA has released Granary, a dataset of roughly 1 million hours of audio spanning 25 European languages, along with two models: Canary-1b-v2 for multilingual speech recognition and translation, and Parakeet-tdt-0.6b-v3 for real-time transcription.
Key Points
- NVIDIA releases Granary, an open dataset containing around 1 million hours of audio for 25 European languages
- NVIDIA unveils Canary-1b-v2, a high-accuracy model for multilingual speech recognition and translation, topping accuracy leaderboards
- NVIDIA introduces Parakeet-tdt-0.6b-v3, a high-throughput model for real-time transcription in the languages Granary supports