harsh-harlequin•4d ago
(Urgent) Vapi is passing agent speech transcription to the model, causing numerous spelling errors
I'm seeing user speech transcription + agent speech transcription fed to the model (using OpenAI 5.1) as the messages context. Can we use the model's output directly instead of the agent speech transcription, which can contain errors?
For example: in this call https://dashboard.vapi.ai/calls?callId=019ae00b-36d1-7117-b9e5-1a04fbad0033
- User says "So please send the email directly to me at triston at with pace dot com"
- Model outputs "I heard you say triston at with pace dot com"
- When agent says this^ it gets transcribed to "I heard you say tristin at with pace dot com" (INCORRECT)
- Next OpenAI request sends "I heard you say tristin at with pace dot com"
The result is that the email address is completely wrong, which is a common problem for us. Why does Vapi default to using the transcription of agent speech instead of the direct model output? This isn't how providers like LiveKit work.
Is there a configuration I can use to change this to only use user speech transcription + model output?
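To make the request concrete, here is a minimal sketch of the context building I'm asking for. It is not Vapi's API or configuration: `Turn`, `build_context`, and the field names are hypothetical and only illustrate the idea. Assistant turns use the exact text the model produced, and only user turns fall back to the speech transcription.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Turn:
    """One conversation turn (hypothetical structure, for illustration only)."""
    role: str                         # "user" or "assistant"
    transcript: str                   # what speech-to-text heard for this turn
    model_text: Optional[str] = None  # exact LLM output (assistant turns only)


def build_context(turns: List[Turn]) -> List[Dict[str, str]]:
    """Build the messages list for the next LLM request.

    Assistant turns use the model's own output when it is available,
    so transcription errors in the agent's speech never reach the model.
    User turns have no other source, so they use the transcript.
    """
    messages = []
    for turn in turns:
        if turn.role == "assistant" and turn.model_text is not None:
            content = turn.model_text
        else:
            content = turn.transcript
        messages.append({"role": turn.role, "content": content})
    return messages


# The turns from the call above.
turns = [
    Turn(
        role="user",
        transcript="So please send the email directly to me at triston at with pace dot com",
    ),
    Turn(
        role="assistant",
        transcript="I heard you say tristin at with pace dot com",  # mis-transcribed
        model_text="I heard you say triston at with pace dot com",  # what the model wrote
    ),
]

messages = build_context(turns)
print(messages[1]["content"])
```

With this approach, the next request would carry the "triston" spelling from the model output rather than the "tristin" spelling from the agent transcription.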
5 Replies
fascinating-indigo•4d ago
Hello @Jessica
I see the issue: agent speech transcription often introduces errors, like in your email example, which then feed back into the model. I can help set up a workflow using user transcription + model output directly, skipping agent transcription. This approach improves accuracy and aligns with how providers like LiveKit handle context. Are you looking to adjust your Vapi pipeline or stay within the default setup?
harsh-harlequinOP•4d ago
@Tremix I'm open to adjustments, whatever will allow us to use model output, not agent transcription, as input to the next conversation turn
fascinating-indigo•4d ago
Great! Since you’re open to adjustments, I can help implement a workflow where we bypass agent speech transcription and use user transcription + model output directly for the next turn. This will significantly reduce errors like misheard emails or numbers. We can discuss the technical approach and pipeline changes privately to make it work seamlessly for your setup.
@Jessica
harsh-harlequinOP•4d ago
Oh, no thank you. Prefer to keep everything out in the open
fascinating-indigo•4d ago
No problem at all @Jessica, we can keep everything right here. From what you shared, I can help set up a workflow that uses user transcription + the model’s direct output, so you avoid the agent-transcription errors entirely. If you want, I can also handle the configuration for you and make sure everything runs smoothly. Would you like someone like me to take this on directly for you?