Inconsistent Transcript Times Compared to Audio Times
Hi, I am trying to align the transcript to the speech in the audio (using duration, endTime and startTime), but all of the values returned by the VAPI Get Call API are noticeably off from the timing in the audio file.

For example, you can already see in the screenshot that when I say "Hello?", the audio starts at roughly 0.9s, but message #2's secondsFromStart returns 1.637s, off by about 0.7 seconds. The difference is even bigger for message #4: the audio starts at roughly 6.9s, but the secondsFromStart value is 7.927s, a delay of about 1s.

I also tried calculating from time alone, offsetting each message from the time of the first system message, but the delays are still present.

Note that this happens on every single call I make. The call id for this example is 019a1b1c-c2bb-7338-b57c-a21dce505ff2.
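
For anyone who wants to reproduce the comparison, here is a minimal sketch of the arithmetic. It uses only the two data points quoted above. The audio onsets are rough readings from the waveform, and the helper names (`Observation`, `drift`, `offsets_from_first`) are hypothetical, not part of any SDK. The second helper assumes that `time` is an epoch timestamp in milliseconds, which I have not confirmed.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    label: str
    reported_s: float     # secondsFromStart as returned for the message
    audio_onset_s: float  # approximate onset read off the audio waveform


def drift(obs: Observation) -> float:
    """Reported time minus observed audio onset, in seconds."""
    return round(obs.reported_s - obs.audio_onset_s, 3)


def offsets_from_first(times_ms: list[float]) -> list[float]:
    """Offsets in seconds relative to the first message.

    Assumption: each value is an epoch timestamp in milliseconds.
    """
    first = times_ms[0]
    return [(t - first) / 1000.0 for t in times_ms]


# The two data points from the example call above.
observations = [
    Observation("message #2", reported_s=1.637, audio_onset_s=0.9),
    Observation("message #4", reported_s=7.927, audio_onset_s=6.9),
]

drifts = {o.label: drift(o) for o in observations}
for label, d in drifts.items():
    print(f"{label}: reported - audio = {d:+.3f}s")
```

The drift is not constant between the two messages (about 0.74s versus about 1.03s), so subtracting a single fixed offset would not line the transcript up with the audio.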