When I use Groq Llama 4 Maverick on a test call it works great, but on a real call the latency goes to 5000 ms, whereas it shows 640 ms on Web and 1140 ms on Twilio. (I am using a Vapi number.)
That may be down to the Groq plan you are on. Are you using the free API access? From what I have seen, the free tier generates quickly once it starts, but it has a very long time to first token (TTFT).
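One way to check whether TTFT is the culprit is to time a streamed response yourself, outside the call pipeline. Below is a minimal Python sketch, not anything from Vapi or Groq: `measure_ttft` and `fake_stream` are hypothetical helper names, and the stream is simulated so the snippet runs without an API key. In practice you would pass in the text chunks from whatever streaming client you use.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def measure_ttft(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float, int]:
    """Return (ttft_seconds, total_seconds, non_empty_chunk_count).

    `chunks` is any iterable of text pieces from a streamed response.
    ttft_seconds is None if no non-empty chunk ever arrives.
    """
    start = clock()
    ttft: Optional[float] = None
    count = 0
    for chunk in chunks:
        if not chunk:
            # Skip empty keep-alive or role-only chunks.
            continue
        if ttft is None:
            # First real token: this is the time to first token.
            ttft = clock() - start
        count += 1
    total = clock() - start
    return ttft, total, count


def fake_stream(first_delay_s: float = 0.05, n: int = 3) -> Iterator[str]:
    """Simulated stream (assumption): waits, then yields n chunks."""
    time.sleep(first_delay_s)
    for i in range(n):
        yield f"tok{i} "


if __name__ == "__main__":
    ttft, total, n = measure_ttft(fake_stream())
    print(f"TTFT: {ttft * 1000:.0f} ms, total: {total * 1000:.0f} ms, chunks: {n}")
```

If the TTFT you measure this way is already in the seconds range, the delay is most likely on the model provider side rather than in the telephony leg.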