eastern-cyan · 2mo ago

Call Latency Issue

In my VAPI assistant, the latency is around 0.7 seconds (≈700 ms). However, when I test it on a live call through Twilio, the latency increases to about 4 seconds. Why does this happen, and how can it be addressed?
18 Replies
Vapi · 2mo ago
Suggestion:
Twilio conferencing and call bridging can introduce significant latency. To reduce this, try minimizing the use of conference rooms or on-hold steps in your call flow. For example, directly connect the user to the Vapi assistant without intermediate Twilio conference logic:
// Assumes a configured Twilio client:
// const client = require('twilio')(accountSid, authToken);

// Directly update the inbound call to fetch Vapi TwiML
await client.calls(callSid).update({
  url: 'https://your-server/vapi-twiml-endpoint',
  method: 'POST'
});
Review your call flow to ensure the user is not unnecessarily placed on hold or in a conference, as this can help bring latency closer to Vapi's native ~800 ms.
See more in the Vapi+Twilio call handling guide.
Sources:
- Call Handling with Vapi and Twilio
- FAQ: Latency
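To audit a flow for the on-hold or conference steps mentioned above, a small helper can scan the TwiML your server returns for verbs known to add bridging latency. This helper is hypothetical (not part of the Twilio or Vapi SDKs), and the verb list is only a starting point:

```javascript
// Hypothetical helper: scan a TwiML document for verbs that typically add
// bridging/hold latency to a call flow (conference rooms, call queues).
function findLatencyRisks(twiml) {
  const riskyVerbs = ['Conference', 'Enqueue', 'Queue'];
  return riskyVerbs.filter((verb) => twiml.includes(`<${verb}`));
}

// Example: a flow that parks the caller in a conference before Vapi answers.
const flow = `
<Response>
  <Dial><Conference>vapi-room</Conference></Dial>
</Response>`;

console.log(findLatencyRisks(flow)); // → [ 'Conference' ]
```

If this returns anything for your inbound-call TwiML, that step is a candidate for removal in favor of the direct `calls(callSid).update()` approach shown above.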
⤵️ Mention @Vapi Support Bot for follow-ups 🐥
eastern-cyan (OP) · 2mo ago
@Vapi Team @Vapi Moderator Any solution? @Vapi @Shubham Bajaj @Mason
conventional-tan · 2mo ago
Getting the exact same issue. Considering switching telephony provider, but not sure if that will help.
eastern-cyan (OP) · 2mo ago
Yes, switching providers might be a solution.
conventional-tan · 2mo ago
Have you tried Telnyx?
eastern-cyan (OP) · 2mo ago
@Pedro Lourenco No, but have you given it a try?
Shubham Bajaj · 2mo ago
Hey! To help track down this issue, could you share:
- The call ID
- When exactly this happened (the timestamp)
- What response you expected to get
- What response you actually got instead
This would really help us figure out what went wrong!
eastern-cyan (OP) · 2mo ago
Call ID: 2d6f72da-c4e2-496f-97dc-56cdbe8c326f
When this happened: It's not a one-time issue; it happens on every call I make. I've already tested this across dozens of calls.
Expected behavior: I expect the conversation to flow naturally with low latency, similar to how it performs on VAPI (around 800 ms).
Actual behavior: On Twilio, the assistant's responses are delayed by about 3–4 seconds, which makes the conversation feel unnatural and laggy.
Impact: The high latency creates a poor user experience and breaks the natural back-and-forth flow of the conversation.
Thanks.
Shubham Bajaj · 2mo ago
Please adjust your endpointing settings:
"transcriptionEndpointingPlan": {
"onPunctuationSeconds": 0.1,
"onNoPunctuationSeconds": 0.8, // Reduce from 1.5s
"onNumberSeconds": 0.3 // Reduce from 0.5s
}
"transcriptionEndpointingPlan": {
"onPunctuationSeconds": 0.1,
"onNoPunctuationSeconds": 0.8, // Reduce from 1.5s
"onNumberSeconds": 0.3 // Reduce from 0.5s
}
Also consider eleven_turbo_v2_5 instead of eleven_flash_v2_5 for lower latency on the voice config.
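For anyone applying the endpointing settings above programmatically, the sketch below PATCHes them onto an assistant with Node's built-in fetch. The endpoint path, the placement under startSpeakingPlan, and the field names reflect my reading of Vapi's docs; verify them against the current API reference before relying on this:

```javascript
// Sketch, assuming Vapi's PATCH /assistant/{id} endpoint and that the
// endpointing plan lives under startSpeakingPlan. Both are assumptions;
// check the Vapi API reference for your account/version.
const ASSISTANT_ID = 'your-assistant-id'; // placeholder
const VAPI_API_KEY = process.env.VAPI_API_KEY;

function buildEndpointingPatch() {
  return {
    startSpeakingPlan: {
      transcriptionEndpointingPlan: {
        onPunctuationSeconds: 0.1,
        onNoPunctuationSeconds: 0.8, // reduced from 1.5 s
        onNumberSeconds: 0.3,        // reduced from 0.5 s
      },
    },
  };
}

async function applyEndpointingPatch() {
  const res = await fetch(`https://api.vapi.ai/assistant/${ASSISTANT_ID}`, {
    method: 'PATCH',
    headers: {
      Authorization: `Bearer ${VAPI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(buildEndpointingPatch()),
  });
  return res.json();
}
```

Shorter endpointing windows make the assistant respond sooner after the caller stops speaking, at the cost of occasionally cutting in early; tune the values against real calls.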
eastern-cyan (OP) · 2mo ago
Thanks Kyle, I will update and share feedback with you soon. Kyle, I have been calling from the UAE to the USA, and the latency is still high. If I bring my own (BYO) UAE number, would that reduce it? Currently it's an international call.
Shubham Bajaj · 2mo ago
It won't be reduced much, since VAPI's servers are located in the US. International calls will always have higher latency with VAPI until we add more server locations.
eastern-cyan (OP) · 2mo ago
Ok, thanks.
rival-black · 2mo ago
Why exactly would turbo v2.5 have lower latency than flash v2.5?
eastern-cyan (OP) · 2mo ago
Flash has the lower latency.
rival-black · 2mo ago
I know; I asked because Kyle said to use turbo rather than flash for lower latency. I assume this was a mistake, but want to check.
Shubham Bajaj · 2mo ago
Yes, that was a mistake on my part. I realize that flash is slightly faster than turbo.
conventional-tan · 2mo ago
Would self-hosting in a region close to me through AWS lead to a significant improvement?
Shubham Bajaj · 2mo ago
Yes, but our team is currently reworking the on-prem deployment, with a rough ETA of end of Q4.