Audio - News Articles

WisprFlow Lets Mac Users Dictate Text at 179 WPM With AI-Powered Voice Transcription

Jun 27, 2026

Zachary Proser

WisprFlow lets Mac users dictate text at 179 WPM using AI-powered transcription that removes filler words, corrects grammar, and adapts tone across all apps, though its macOS exclusivity, 6-minute cap, subscription cost, and cloud-based processing present real trade-offs.

Audio Natural Language Generative AI Privacy Workplace

AI SDK 7 Launches With Reasoning Control, Durable Agents, and Real-Time Voice Support

Jun 26, 2026

Vercel

AI SDK 7 launches with reasoning control, durable agents, real-time voice support, and experimental video generation. The SDK has over 16 million weekly downloads.

Generative AI Natural Language Audio Autonomous Systems Deep Learning

NVIDIA Launches Open Source XR AI Beta, Bringing Real-Time Visual and Voice Intelligence to AR Glasses and XR Headsets

Jun 18, 2026

NVIDIA Technical Blog

NVIDIA launches an open source XR AI beta platform enabling developers to build real-time visual and voice AI agents for AR glasses and XR headsets, powered by GPU-accelerated services including Cosmos, Nemotron, and NeMo Agent Toolkit.

Augmented Reality Generative AI Computer Vision Audio Hardware

Google Launches Gemini 3.5 Live Translate, Delivering Real-Time Speech Translation Across 70+ Languages

Jun 10, 2026

Google

Google launches Gemini 3.5 Live Translate, an AI model delivering near real-time speech-to-speech translation across 70+ languages while preserving speakers' natural voice characteristics, now available to developers, enterprise users, and the public via Google Translate on Android and iOS.

Audio Natural Language Generative AI Deep Learning Human-Computer Interaction

Google's Magenta Releases Real-Time AI Music Generation Model With Up To 2.4B Parameters

Jun 10, 2026

GitHub

Google's Magenta team releases Magenta RealTime 2, an open-weights AI music generation model available in 230M and 2.4B parameter sizes, enabling real-time music streaming on Apple Silicon Macs and offline inference on NVIDIA GPUs.

Audio Generative AI Deep Learning Hardware Machine Learning

Miso Labs Launches MisoTTS: Open-Source 8B-Parameter Text-to-Speech Model With Voice Cloning Hits Hugging Face

Jun 04, 2026

GitHub

Miso Labs launches MisoTTS, an open-source 8-billion-parameter text-to-speech model now live on Hugging Face, featuring voice cloning, highly emotive conversational speech, and built-in audio watermarking for safety.

Audio Generative AI Deep Learning Natural Language Security

Google DeepMind Launches Gemma 4 12B, a Multimodal AI Model Built to Run on Consumer Laptops

Jun 04, 2026

Google

Google DeepMind launches Gemma 4 12B, a multimodal AI model that runs locally on consumer laptops with just 16GB of VRAM, featuring an encoder-free architecture that processes vision and audio directly through its LLM backbone for lower latency and memory use, available now on Hugging Face and Kaggle under …

Generative AI Deep Learning Hardware Natural Language Audio

Spotify Launches Verified Badges for Podcasts and Cracks Down on AI Impersonation

May 22, 2026

Spotify

Spotify launches 'Verified by Spotify' green checkmark badges for authenticated podcasts while cracking down on AI voice cloning and impersonation, with eligibility based on listener activity, policy compliance, and audience authenticity.

Social Media Generative AI Audio Security Trust

Spotify Rolls Out AI Music and Podcast Tools Amid Growing Creator Backlash

May 22, 2026

The Deep View

Spotify launches new AI music and podcast tools, including a Universal Music Group partnership for artist-consented AI remixes, even as musicians and podcasters push back with protest albums and class-action lawsuits over the platform's growing AI ambitions.

Generative AI Audio Entertainment Ethics Media

Meta AI Releases WavFlow, a Multimodal Audio Generation Framework That Produces Synchronized High-Fidelity Audio from Video and Text

May 21, 2026

GitHub

Meta AI releases WavFlow, a multimodal framework that generates synchronized, high-fidelity audio directly from video and text inputs, matching top latent-based models on major benchmarks, with its codebase publicly available on GitHub.

Audio Generative AI Deep Learning Computer Vision Natural Language