Custom LLM + tool calls = stream chunks are getting enqueued but not spoken
We have a Custom LLM implementation that VAPI calls over an SSE connection.
We want this sequence:
- POST /vapi/llm to start
- the agent generates a response
- a tool call starts, and we speak "hang on, let me check that for you"
- the tool call executes
- VAPI speaks the final response to the user
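Roughly, our handler looks like this. A minimal sketch, assuming VAPI's Custom LLM integration expects OpenAI-style chat-completion chunks over SSE; the `runTool` helper and the hard-coded answer are placeholders for our real tool logic:

```ts
import express from "express";

const app = express();
app.use(express.json());

// Placeholder for our real tool-call logic.
async function runTool(): Promise<string> {
  return "It ships on Tuesday.";
}

// Write one OpenAI-style streaming chunk as an SSE event.
function sendChunk(res: express.Response, content: string) {
  const chunk = {
    id: "chatcmpl-1",
    object: "chat.completion.chunk",
    choices: [{ index: 0, delta: { content }, finish_reason: null }],
  };
  res.write(`data: ${JSON.stringify(chunk)}\n\n`);
}

app.post("/vapi/llm", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.flushHeaders();

  // 1. The waiting message, sent before the tool call starts.
  sendChunk(res, "Hang on, let me check that for you. ");

  // 2. Run the tool, then stream the real answer.
  const answer = await runTool();
  sendChunk(res, answer);

  // 3. Close out the stream OpenAI-style.
  res.write(
    `data: ${JSON.stringify({
      id: "chatcmpl-1",
      object: "chat.completion.chunk",
      choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
    })}\n\n`
  );
  res.write("data: [DONE]\n\n");
  res.end();
});

app.listen(3000);
```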
data: {"id": ...} chunks being streamed through fine.However, on the voice side, VAPI just queues all of the chunks up and speaks them together. Which sorta ruins the point of the waiting message.
How do you get VAPI to speak designated chunks as they arrive, rather than waiting for the entire stream to finish?
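For completeness, in case buffering is a factor: this is roughly how we open the stream on our side (Express sketch; the X-Accel-Buffering header only matters if an nginx-style proxy sits in front):

```ts
import type { Response } from "express";

// Sketch: headers we set so nothing between us and VAPI buffers the stream.
// If a proxy coalesced the SSE writes, the chunks would land at VAPI in one
// burst, which could look exactly like "queued up and spoken together".
function openSseStream(res: Response) {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  res.setHeader("X-Accel-Buffering", "no"); // nginx: pass chunks through as written
  res.flushHeaders(); // send headers now; each res.write() then goes out as its own chunk
}
```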
Alternatively, I've tried using the controlClient to say a response when the tool starts, but VAPI breaks the connection when this happens.
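For reference, the control-channel attempt looked roughly like this. The { type: "say" } payload shape is my reading of the Call Control docs, and controlUrl is the per-call control URL VAPI exposes (via the call's monitor object); treat both as assumptions:

```ts
// Sketch of the control-channel attempt. Sent while the SSE response to
// POST /vapi/llm is still open, which is when VAPI drops the connection.
async function sayWaitingMessage(controlUrl: string) {
  await fetch(controlUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      type: "say", // assumed message shape; check current Call Control docs
      message: "Hang on, let me check that for you.",
    }),
  });
}
```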