OpenAI Launches Three Realtime API Voice Models Including GPT-Realtime-2

OpenAI released three new voice models in its Realtime API: GPT-Realtime-2 (GPT-5-class reasoning for production voice agents), GPT-Realtime-Translate (streaming translation across 70+ input languages), and GPT-Realtime-Whisper (real-time audio transcription).

agenticonsult Intelligence

OpenAI released three models at once in its Realtime API: GPT-Realtime-2 (described as the company's most intelligent voice model yet, with GPT-5-class reasoning and the ability to handle interruptions), GPT-Realtime-Translate (streaming translation from 70+ input languages into 13 output languages), and GPT-Realtime-Whisper (real-time audio transcription for live captions). Sam Altman called GPT-Realtime-2 "a pretty big step forward" and noted that voice usage is growing, particularly among younger users.

Why It Matters

The coordinated launch of three models suggests a deliberate push to make voice a leading AI interface in the second half of 2026. Voice-native agent architectures will likely see faster adoption in customer service, accessibility, and real-time translation.

Primary source

OpenAI

This breaking-news item was assembled from the cited primary source with AI assistance. It is intended for rapid situational awareness; refer to the original publication for the definitive statement.