GPT Realtime is Now Available in Vapi

Vapi raises $50M Series B to power the next generation of enterprise voice AI

Vapi raises $50M Series B

Abhishek Sharma • Aug 28, 2025

2 min read

OpenAI’s new GPT Realtime model is now live in Vapi’s dashboard and API.

We’ve been testing it ahead of launch and it’s a noticeable step forward for real-time, production-grade use cases. Conversations feel more natural, with sharper turn-taking and clearer audio quality.

What’s New in GPT Realtime

Compared to earlier real-time models, GPT Realtime delivers:

Lower latency: The response time feels natural. It’s fast enough for real-time, production-grade use cases where back-and-forth conversation is critical.
Sharper instruction following: Handles 3–5 turn transactional flows with reliability.
Multilingual flexibility: Switches languages mid-sentence with high accuracy.
Improved tool‑calling design: Prompt-based creation is more intuitive and powerful.
Voice upgrades: Cedar delivers a warm, conversational tone and strong accent emulation; Marin adds clarity for structured communication.
Structured data handling: Names, phone numbers, and emails are repeated back with natural pacing, without sounding robotic.
Audio quality boost:: Clearer, crisper sound with fewer distortions compared to earlier versions.

Why This Matters

These improvements open up applications where timing, tone, and nuance matter.

Here are a few use cases already being explored with early adopters:

Healthcare triage: Collecting and repeating structured data like names and insurance IDs, with tone and pacing that patients trust.
Coaching and training: Real-time back-and-forth with natural pauses and encouragement instead of robotic interruptions.
Customer support: Transactional flows such as rescheduling appointments or troubleshooting accounts, handled in 3 to 5 turns reliably.
Global services: A user can start in English, switch to Spanish, and continue the conversation naturally.

Availability

The GPT Realtime model is available to all Vapi users now. You can try it directly from the Vapi dashboard or integrate it in your agents via API.

We’re excited to see what you build with it.

Join the Newsletter

JUN 17, 2026

Audio Preprocessing for Speech-to-Text: Definition, Implementation, and Use Cases

JUN 27, 2025

What Is Signal Processing? Voice AI Definition Guide

JUN 23, 2025

Speech Latency Solutions: Complete Guide to Sub-500ms Voice AI

JUN 20, 2025

Building a Grok-2 Voice Agent on Vapi

JUN 20, 2025

DeepSeek R1: Open-Source Reasoning for Voice Chat

JUN 20, 2025

How Sampling Rate Works in Voice AI

JUN 20, 2025

How to Use Grok 3 in a Voice Agent

JUN 19, 2025

Unpacking LLM Temperature

JUN 12, 2025

How to Build a GPT-4.1 Voice Agent

JUN 10, 2025

Building a Mistral Medium Voice Agent with Vapi

JUN 10, 2025

Multi-turn Conversations: Definition, Benefits, & Examples

JUN 10, 2025

Building a Llama 3 Voice Assistant with Vapi

JUN 09, 2025

Building GPT-4 Phone Agents with Vapi

JUN 09, 2025

What Is Gemma 3? Google's Open-Weight AI Model

JUN 05, 2025

Introducing Vapi Workflows

JUN 04, 2025

11 Great ElevenLabs Alternatives: Vapi-Native TTS Models

JUN 04, 2025

Tortoise TTS v2: Quality-Focused Voice Synthesis

MAY 30, 2025

How to Create Natural Audio Using Concatenative Synthesis

MAY 30, 2025

Why Word Error Rate Matters for Your Voice Applications

MAY 30, 2025

Parallel WaveGAN: Fast Neural Speech Synthesis for Modern Voice AI

MAY 30, 2025

Flow-Based Models: A Developer''s Guide to Advanced Voice AI

MAY 30, 2025

What Are IoT Devices? A Developer's Guide to Connected Hardware

MAY 29, 2025

Choosing Between Gemini Models for Voice AI

MAY 28, 2025

DeepSeek R1 vs V3 for Voice AI Developers

MAY 28, 2025

Building a GPT-4.1 Mini Phone Agent with Vapi

MAY 26, 2025

What Is GPT? Understanding A Core Technology for Voice AI

MAY 26, 2025

MMLU: The Ultimate Report Card for Voice AI

MAY 26, 2025

Homograph Disambiguation in Voice AI: Solving Pronunciation Puzzles

MAY 26, 2025

Env Files and Environment Variables for Voice AI Projects

MAY 26, 2025

Understanding VITS: Revolutionizing Voice AI With Natural-Sounding Speech

MAY 26, 2025

Text Normalization for Voice AI: Complete Guide to Speech Preprocessing in 2025

MAY 26, 2025

LLMs Benchmark Guide: Complete Evaluation Framework for Voice AI

MAY 23, 2025

A Developer's Guide to Optimizing Latency Reduction Through Audio Caching

MAY 23, 2025

Mastering SSML: Unlock Advanced Voice AI Customization

MAY 23, 2025

WaveNet Unveiled: Advancements and Applications in Voice AI

MAY 23, 2025

Glow-TTS: A Reliable Speech Synthesis Solution for Production Applications

MAY 23, 2025

A Developer’s Guide to Using WaveGlow in Voice AI Solutions

MAY 23, 2025

Mastering Environment Variables: Set Up for Vapi Voice AI Integration

MAY 23, 2025

Understanding Graphemes and Why They Matter in Voice AI

MAY 23, 2025

Revolutionize Voice Clarity with Vapi’s AI-Driven Noise Reduction Tools

MAY 23, 2025

LPCNet in Action: Accelerating Voice AI Solutions for Developers and Innovators

MAY 22, 2025

Understanding Dynamic Range Compression in Voice AI

MAY 22, 2025

Diffusion Models in AI: Explained

MAY 22, 2025

What is a Phoneme? An In-Depth Look for Technologists

MAY 22, 2025

Launching the Vapi for Creators Program

MAY 12, 2025

Speech-to-Text: What It Is, How It Works, & Why It Matters

MAY 09, 2025

Text-to-Speech: What It Is, How It Works, and Why It Matters

MAY 01, 2025

New in Vapi: Version Preview, Version History and Role-Based Access Control

APR 18, 2025

Bring Vapi Voice Agents into Your Workflows With The New Vapi MCP Server

APR 15, 2025

Vapi x Deepgram Aura-2 — The Most Natural TTS for Enterprise Voice AI

APR 01, 2025

Scaling Client Intake Engine with Vapi Voice AI agents

MAR 13, 2025

Introducing Vapi Voices

MAR 11, 2025

Vapi x Cartesia: Ultra-Realistic Voice AI with Sonic 2.0

MAR 06, 2025

AI Call Centers are changing Customer Support Industry

MAR 04, 2025

Voice AI is eating the world

FEB 25, 2025

Free Telephony with Vapi

FEB 20, 2025

Test Suites for Vapi agents

FEB 19, 2024

Let's Talk - Voicebots, Latency, and Artificially Intelligent Conversation

Start Building

Contact Sales Sign Up

OpenAI’s new GPT Realtime model is now live in Vapi’s dashboard and API.

What’s New in GPT Realtime

Compared to earlier real-time models, GPT Realtime delivers:

Lower latency: The response time feels natural. It’s fast enough for real-time, production-grade use cases where back-and-forth conversation is critical.
Sharper instruction following: Handles 3–5 turn transactional flows with reliability.
Multilingual flexibility: Switches languages mid-sentence with high accuracy.
Improved tool‑calling design: Prompt-based creation is more intuitive and powerful.
Voice upgrades: Cedar delivers a warm, conversational tone and strong accent emulation; Marin adds clarity for structured communication.
Structured data handling: Names, phone numbers, and emails are repeated back with natural pacing, without sounding robotic.
Audio quality boost:: Clearer, crisper sound with fewer distortions compared to earlier versions.

Why This Matters

These improvements open up applications where timing, tone, and nuance matter.

Here are a few use cases already being explored with early adopters:

Healthcare triage: Collecting and repeating structured data like names and insurance IDs, with tone and pacing that patients trust.
Coaching and training: Real-time back-and-forth with natural pauses and encouragement instead of robotic interruptions.
Customer support: Transactional flows such as rescheduling appointments or troubleshooting accounts, handled in 3 to 5 turns reliably.
Global services: A user can start in English, switch to Spanish, and continue the conversation naturally.

Availability

The GPT Realtime model is available to all Vapi users now. You can try it directly from the Vapi dashboard or integrate it in your agents via API.

We’re excited to see what you build with it.

GPT Realtime is Now Available in Vapi

What’s New in GPT Realtime

Why This Matters

Availability

Table of Contents

Read More

Built for the Ear: Designing Conversations for Voice

How we Bootstrapped the Voice Agents on the Vapi Homepage

AGI is here. Why am I still on hold?

Introducing Vapi Monitoring

Composer Webinar: Your Most-Asked Questions, Answered

Your AI Coding Assistant Just Learned to Build Voice Agents

Vibe code voice agents

Announcing Vapi Voices Beta: Lower Cost, Lower Latency for High-volume Voice AI

Your Voice Agents Need Tests. Now They Have Them.

GPT-5.1 Just Fixed the Thing That's Been Bugging Me for Years

Introducing Squads: Teams of Assistants

Build Using Free Cartesia Sonic 3 TTS All Week on Vapi

Build with Free, Unlimited MiniMax TTS All Week on Vapi

GPT-5 Now Live in Vapi

How We Solved DTMF Reliability in Voice AI Systems

How We Built Adaptive Background Speech Filtering at Vapi

How we solved latency at Vapi

Audio Preprocessing for Speech-to-Text: Definition, Implementation, and Use Cases

What Is Signal Processing? Voice AI Definition Guide

Speech Latency Solutions: Complete Guide to Sub-500ms Voice AI

Building a Grok-2 Voice Agent on Vapi

DeepSeek R1: Open-Source Reasoning for Voice Chat

How Sampling Rate Works in Voice AI

How to Use Grok 3 in a Voice Agent

Unpacking LLM Temperature

How to Build a GPT-4.1 Voice Agent

Building a Mistral Medium Voice Agent with Vapi

Multi-turn Conversations: Definition, Benefits, & Examples

Building a Llama 3 Voice Assistant with Vapi

Building GPT-4 Phone Agents with Vapi

What Is Gemma 3? Google's Open-Weight AI Model

Introducing Vapi Workflows

11 Great ElevenLabs Alternatives: Vapi-Native TTS Models

Tortoise TTS v2: Quality-Focused Voice Synthesis

How to Create Natural Audio Using Concatenative Synthesis

Why Word Error Rate Matters for Your Voice Applications

Parallel WaveGAN: Fast Neural Speech Synthesis for Modern Voice AI

Flow-Based Models: A Developer''s Guide to Advanced Voice AI

What Are IoT Devices? A Developer's Guide to Connected Hardware

Choosing Between Gemini Models for Voice AI

DeepSeek R1 vs V3 for Voice AI Developers

Building a GPT-4.1 Mini Phone Agent with Vapi

What Is GPT? Understanding A Core Technology for Voice AI

MMLU: The Ultimate Report Card for Voice AI

Homograph Disambiguation in Voice AI: Solving Pronunciation Puzzles

Env Files and Environment Variables for Voice AI Projects

Understanding VITS: Revolutionizing Voice AI With Natural-Sounding Speech

Text Normalization for Voice AI: Complete Guide to Speech Preprocessing in 2025

LLMs Benchmark Guide: Complete Evaluation Framework for Voice AI

A Developer's Guide to Optimizing Latency Reduction Through Audio Caching

Mastering SSML: Unlock Advanced Voice AI Customization

WaveNet Unveiled: Advancements and Applications in Voice AI

Glow-TTS: A Reliable Speech Synthesis Solution for Production Applications

A Developer’s Guide to Using WaveGlow in Voice AI Solutions

Mastering Environment Variables: Set Up for Vapi Voice AI Integration

Understanding Graphemes and Why They Matter in Voice AI

Revolutionize Voice Clarity with Vapi’s AI-Driven Noise Reduction Tools

LPCNet in Action: Accelerating Voice AI Solutions for Developers and Innovators

Understanding Dynamic Range Compression in Voice AI

Diffusion Models in AI: Explained

What is a Phoneme? An In-Depth Look for Technologists

Launching the Vapi for Creators Program

Speech-to-Text: What It Is, How It Works, & Why It Matters

Text-to-Speech: What It Is, How It Works, and Why It Matters

New in Vapi: Version Preview, Version History and Role-Based Access Control

Bring Vapi Voice Agents into Your Workflows With The New Vapi MCP Server

Vapi x Deepgram Aura-2 — The Most Natural TTS for Enterprise Voice AI

Scaling Client Intake Engine with Vapi Voice AI agents

Introducing Vapi Voices

Vapi x Cartesia: Ultra-Realistic Voice AI with Sonic 2.0

AI Call Centers are changing Customer Support Industry

Voice AI is eating the world

Free Telephony with Vapi

Test Suites for Vapi agents