Deepgram Nova-3 vs Nova-2: STT Evolved

Vapi Editorial Team • Jun 17, 2025

In-Brief

When Deepgram released Nova-3, they didn't just iterate on Nova-2; they rebuilt the engine. The results speak for themselves: streaming word error rate (WER) dropped from 8.4% to 6.84%, and batch processing hit 5.26% WER. Going from 8.4% to 6.84% is a 1.56-point drop, or a relative reduction of about 18.6% in streaming errors.
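The relative reduction implied by the two streaming figures quoted above can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the function name `relative_reduction` is made up for illustration, and the inputs are the percentages stated in this article.

```python
def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Return the relative drop from old_wer to new_wer, as a percentage."""
    if old_wer <= 0:
        raise ValueError("old_wer must be positive")
    return (old_wer - new_wer) / old_wer * 100.0


# Streaming WER figures quoted in this article, in percent.
nova2_streaming = 8.4
nova3_streaming = 6.84

# Absolute drop is measured in percentage points; relative drop is a
# fraction of the starting error rate.
absolute_drop = nova2_streaming - nova3_streaming
relative_drop = relative_reduction(nova2_streaming, nova3_streaming)

print(f"absolute: {absolute_drop:.2f} points, relative: {relative_drop:.1f}%")
```

Keeping the absolute drop (percentage points) separate from the relative drop (percent of the old error rate) avoids mixing the two when comparing vendor claims.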

The real breakthrough is Nova-3's ability to handle multilingual conversations in real-time. While Nova-2 transcribes different languages well, Nova-3 can seamlessly follow conversations that jump between languages mid-sentence. Both models work through Vapi's platform with identical endpoints, so switching between them is as simple as changing a parameter.

This article breaks down the differences between Deepgram Nova-3 and Nova-2 so you can decide which speech-to-text model to use in your next build.

» Read more about using Deepgram models on Vapi.

Deepgram Nova-3 vs. Nova-2 - At-a-Glance

Features compared: release timeline, streaming WER, batch WER, latency, multilingual support, customization, advanced features, and price tier.

Nova-3’s Architectural Upgrades

Nova-2 built its reputation on specialization: different models for meetings, finance calls, phone conversations, and automotive environments. Each one excelled in its lane, but switching contexts meant switching models entirely. It worked, but it was like having a different wrench for every bolt.

Nova-3 flips this approach. Instead of juggling multiple specialized models, it uses a single, more adaptable neural network that adjusts to different contexts on the fly. The technical upgrade centers on improved long-range attention and dynamic contextual adaptation. Now we have a model that remembers longer conversations and adapts more effectively to what it hears.

Practically, you get superior accuracy without the headache of managing multiple model variants. For developers building on Vapi's platform, this means more straightforward integration with better results.

Despite adding significant functionality, Nova-3 stays fast enough for the real-time applications that Nova-2 was known for. Through more efficient processing and improved parallelization, it delivers reduced latency, expanded concurrency, and improved reliability when handling multiple conversations simultaneously.

Language Support Compared

This is where Nova-3 truly shines. It's the first STT AI model that can process multilingual conversations in real time, something Nova-2 simply can't do. Unlike Gladia, which requires you to pick a language upfront, Nova-3 handles conversations that switch between languages without missing a beat.

The performance data backs this up. Nova-3 consistently outperformed Nova-2 and competing models, such as OpenAI's Whisper, across all tested languages, with a user preference advantage of up to 8-to-1 in some cases. For global applications like customer service, emergency response, and international collaboration, that means a single model can keep following the conversation even when speakers switch languages.

Nova-2 takes a different path with pre-built variants optimized for specific regions and domains. These work well for focused use cases but lack the dynamic flexibility that makes Nova-3 special.

How Customizable are Nova-3 and Nova-2?

Nova-3 offers real-time, self-serve customization, allowing you to add up to 100 domain-specific terms without requiring any retraining. Need it to recognize your company's product names or industry jargon? Add them instantly and watch the model adapt.
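As a rough illustration of how a term list like this might be passed along with a transcription request, the sketch below builds a URL query string and enforces the 100-term limit mentioned above. The parameter name `keyterm` and the idea of repeating it once per term are assumptions here, not something this article confirms, so check Deepgram's API reference before relying on them. The helper `build_listen_query` is hypothetical.

```python
from urllib.parse import urlencode

MAX_KEYTERMS = 100  # limit quoted in this article


def build_listen_query(keyterms: list[str], model: str = "nova-3") -> str:
    """Build a query string with one `keyterm` entry per domain term.

    The `keyterm` parameter name is an assumption for illustration.
    """
    if len(keyterms) > MAX_KEYTERMS:
        raise ValueError(f"at most {MAX_KEYTERMS} terms allowed, got {len(keyterms)}")
    # A list of pairs lets the same key appear more than once.
    params = [("model", model)] + [("keyterm", term) for term in keyterms]
    return urlencode(params)


query = build_listen_query(["Vapi", "Nova-3", "word error rate"])
print(query)
```

Validating the list length on the client side surfaces an oversized vocabulary before any request is sent.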

Beyond customization, Nova-3 includes features that solve real-world problems: enhanced numeric recognition for financial applications, real-time redaction for up to 50 entities to handle privacy compliance, improved timestamp precision for captioning workflows, and better performance in challenging audio environments with background noise or distant microphones.

Nova-2 offers solid pre-built variants for specific domains, but it can't match the flexibility and feature depth of the newer model.

Deploying Nova-3 and Nova-2 with Vapi

Both models are Vapi-native integrations, built directly into the platform without requiring external API management or authentication setup. You can switch between models through Vapi's interface by simply selecting your preferred Deepgram model in your voice agent configuration.
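A minimal sketch of what "changing a parameter" could look like in an agent configuration follows. The field names (`provider`, `model`, `language`) and the helper `transcriber_config` are assumptions made for illustration; the exact schema should be taken from Vapi's documentation rather than from this example.

```python
import json

ALLOWED_MODELS = {"nova-2", "nova-3"}


def transcriber_config(model: str, language: str = "en") -> dict:
    """Return a transcriber block for a voice agent (hypothetical shape)."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model}")
    return {"provider": "deepgram", "model": model, "language": language}


nova2 = transcriber_config("nova-2")
nova3 = transcriber_config("nova-3")

# Collect the keys whose values differ between the two configurations.
changed = {key for key in nova2 if nova2[key] != nova3[key]}

print(json.dumps(nova3, indent=2))
print("fields that differ:", changed)
```

Under these assumptions the two configurations differ in a single field, which is what makes side-by-side testing cheap.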

This native integration eliminates the complexity of managing separate API keys, endpoints, or infrastructure concerns. Both models are immediately available for testing and deployment, making it effortless to compare performance and determine which better suits your specific requirements. The ability to switch between models instantly allows for real-time A/B testing and optimization of your voice applications.
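One way to make such an A/B comparison concrete is to score each model's transcript against a human-verified reference using word error rate. The sketch below computes WER as word-level edit distance divided by the reference length. The sample transcripts are invented for illustration and are not real output from either model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


reference = "please transfer fifty dollars to my savings account"
# Made-up transcripts standing in for two models under test.
transcripts = {
    "model_a": "please transfer fifty dollars to my savings account",
    "model_b": "please transfer fifteen dollars to savings account",
}
scores = {name: word_error_rate(reference, text) for name, text in transcripts.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1%}")
```

Running the same reference set through both models and comparing the averaged scores gives a like-for-like accuracy number on your own audio rather than on a vendor benchmark.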

» Speak to a demo AI Voice Agent for Sales Follow-Ups with Nova-3.

Performance vs. Budget

Let's talk numbers. Nova-3 operates at Deepgram's premium pricing tier, while Nova-2 remains at standard rates. Since Vapi passes through Deepgram's pricing, you'll pay more for Nova-3's advanced capabilities.

Is the upgrade worth it? That depends on your needs. The enhanced accuracy, multilingual support, and customization features deliver a measurable return on investment (ROI) for businesses where transcription quality directly impacts outcomes. Fewer errors mean less manual correction, better user experiences, and more reliable voice applications.

Nova-2 remains an excellent choice for straightforward projects where budget matters more than cutting-edge features. The model's proven reliability and lower cost make it perfect for basic transcription needs without the premium frills.

In Summary

Nova-3 consistently outperforms Nova-2 in terms of accuracy, multilingual capabilities, and advanced features: there’s a reason Deepgram upgraded. It's the superior choice for demanding applications where precision and flexibility are most crucial. The premium pricing reflects genuine technological advancement: this isn't just marketing fluff.

Nova-2 remains an excellent option for budget-conscious projects and straightforward transcription needs. Its proven track record and accessible pricing make it perfect for teams prioritizing cost-effectiveness over cutting-edge features.

Now, instead of doing more research, why don’t you start comparing them yourself? 

» Start building with Nova-3 and Nova-2 on Vapi.

