OpenAI Overhauls WebRTC Architecture To Scale Real-Time Voice AI For 900 Million Users Worldwide

May 05, 2026
OpenAI

Summary

OpenAI overhauled its WebRTC architecture around a split relay-plus-transceiver model. The design lets real-time voice AI scale globally on Kubernetes for more than 900 million weekly active users, keeping latency low without requiring custom client modifications.

Key Points

  • OpenAI rearchitects its WebRTC stack using a split relay-plus-transceiver model, allowing real-time voice AI to scale globally across Kubernetes without exposing thousands of UDP ports.
  • A lightweight relay layer routes incoming media packets to the correct stateful transceiver by decoding routing metadata embedded in the ICE username fragment, eliminating the need for per-session port allocation while preserving standard WebRTC behavior for clients.
  • Global relay ingress points combined with geo-steered signaling minimize first-hop latency for over 900 million weekly active users, keeping voice interactions fast and natural without requiring kernel-bypass frameworks or custom client modifications.
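
To make the relay's routing step concrete, here is a minimal Python sketch of how a relay might pick a transceiver from the ICE username fragment. It relies on two standard facts: STUN Binding requests carry a USERNAME attribute (type 0x0006), and ICE connectivity checks format that value as "<receiver ufrag>:<sender ufrag>", so the server-chosen fragment comes first. Everything else is an assumption made for illustration: the article does not describe how OpenAI encodes the routing metadata, so the 4-character transceiver prefix, the `routes` table, and the function names are hypothetical.

```python
import struct
from typing import Dict, Optional, Tuple

STUN_MAGIC_COOKIE = 0x2112A442   # fixed value in every STUN header (RFC 8489)
STUN_BINDING_REQUEST = 0x0001
STUN_ATTR_USERNAME = 0x0006
STUN_HEADER_LEN = 20

# Hypothetical layout: the first 4 characters of the server-side ufrag
# name the transceiver; the rest is random. Not taken from the article.
TRANSCEIVER_PREFIX_LEN = 4

Address = Tuple[str, int]


def extract_stun_username(packet: bytes) -> Optional[str]:
    """Return the USERNAME attribute of a STUN message, or None."""
    if len(packet) < STUN_HEADER_LEN:
        return None
    msg_type, msg_len, cookie = struct.unpack_from("!HHI", packet, 0)
    # STUN messages start with two zero bits and carry the magic cookie;
    # RTP and DTLS packets fail this check.
    if msg_type & 0xC000 or cookie != STUN_MAGIC_COOKIE:
        return None
    end = min(STUN_HEADER_LEN + msg_len, len(packet))
    offset = STUN_HEADER_LEN
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack_from("!HH", packet, offset)
        value_start = offset + 4
        if value_start + attr_len > end:
            return None  # truncated attribute
        if attr_type == STUN_ATTR_USERNAME:
            raw = packet[value_start:value_start + attr_len]
            return raw.decode("utf-8", errors="replace")
        # Attribute values are padded to a 4-byte boundary.
        offset = value_start + ((attr_len + 3) & ~3)
    return None


def route_for_packet(packet: bytes, routes: Dict[str, Address]) -> Optional[Address]:
    """Map a STUN packet to a transceiver address via its server ufrag."""
    username = extract_stun_username(packet)
    if username is None:
        return None
    server_ufrag = username.split(":", 1)[0]
    transceiver_id = server_ufrag[:TRANSCEIVER_PREFIX_LEN]
    return routes.get(transceiver_id)


def build_binding_request(username: str) -> bytes:
    """Build a bare Binding request with only USERNAME (for the demo)."""
    value = username.encode("utf-8")
    padding = b"\x00" * (-len(value) % 4)
    attr = struct.pack("!HH", STUN_ATTR_USERNAME, len(value)) + value + padding
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, len(attr), STUN_MAGIC_COOKIE)
    return header + bytes(12) + attr  # 12-byte all-zero transaction ID


if __name__ == "__main__":
    routes = {"tx07": ("10.0.3.17", 40000)}  # hypothetical transceiver pod
    packet = build_binding_request("tx07Kq9Zp3Lm:clientfrag")
    print(route_for_packet(packet, routes))
```

Because the server chooses its own username fragment during signaling, a scheme like this needs no client changes: the browser simply echoes the fragment back in its connectivity checks. Media packets that follow (RTP, DTLS) do not carry the fragment, so a real relay would presumably remember the mapping per client source address after the first STUN check; that part is likewise an assumption and is not shown here.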
