OpenAI Overhauls WebRTC Architecture To Scale Real-Time Voice AI For 900 Million Users Worldwide
Summary
OpenAI overhauled its WebRTC architecture with a split relay-plus-transceiver model, enabling real-time voice AI to scale globally on Kubernetes to over 900 million weekly active users while keeping latency low and requiring no custom client modifications.
Key Points
- OpenAI rearchitects its WebRTC stack using a split relay-plus-transceiver model, allowing real-time voice AI to scale globally across Kubernetes without exposing thousands of UDP ports.
- A lightweight relay layer routes incoming media packets to the correct stateful transceiver by decoding routing metadata embedded in the ICE username fragment, eliminating the need for per-session port allocation while preserving standard WebRTC behavior for clients.
- Global relay ingress points combined with geo-steered signaling minimize first-hop latency for over 900 million weekly active users, keeping voice interactions fast and natural without requiring kernel-bypass frameworks or custom client modifications.
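The ufrag-based routing described above can be sketched as follows. This is a hypothetical illustration, not OpenAI's actual code: it assumes the signaling layer mints local ICE ufrags shaped like `<transceiver-id>.<random>`, so a stateless relay can parse the STUN USERNAME attribute of an incoming packet and forward it to the right stateful transceiver without allocating a port per session.

```python
import struct
from typing import Optional

STUN_MAGIC_COOKIE = 0x2112A442  # fixed value from RFC 5389
ATTR_USERNAME = 0x0006          # STUN USERNAME attribute type

def parse_stun_username(packet: bytes) -> Optional[str]:
    """Extract the USERNAME attribute from a STUN message, if present."""
    if len(packet) < 20:
        return None
    _msg_type, msg_len, cookie = struct.unpack("!HHI", packet[:8])
    if cookie != STUN_MAGIC_COOKIE:
        return None  # not a STUN packet; media (RTP) would be routed separately
    offset, end = 20, 20 + msg_len
    while offset + 4 <= min(end, len(packet)):
        attr_type, attr_len = struct.unpack("!HH", packet[offset:offset + 4])
        if attr_type == ATTR_USERNAME:
            value = packet[offset + 4:offset + 4 + attr_len]
            return value.decode("utf-8", errors="replace")
        # STUN attributes are padded to 4-byte boundaries
        offset += 4 + ((attr_len + 3) & ~3)
    return None

def route_from_ufrag(username: str) -> Optional[str]:
    """Decode routing metadata from the local ufrag (an assumed encoding).

    The ICE USERNAME is "<local-ufrag>:<remote-ufrag>" from the receiver's
    view. If the local ufrag encodes a transceiver ID before a dot, the
    relay recovers the backend with no per-session state.
    """
    local_ufrag = username.split(":", 1)[0]
    transceiver_id, _, _ = local_ufrag.partition(".")
    return transceiver_id or None
```

Because the ufrag travels in every STUN binding request on the media path, this lets all sessions share one relay port while standard WebRTC clients remain completely unmodified.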