Custom LLM + tool calls: stream chunks are getting enqueued but not spoken

We have a Custom LLM implementation that VAPI calls over an SSE connection.

We want this sequence:
  1. POST /vapi/llm to start the stream
  2. the agent generates a response
  3. when a tool call starts, we speak "hang on, let me check that for you"
  4. the tool call runs
  5. VAPI speaks the tool result to the user
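For reference, here's a minimal sketch of how our endpoint streams the filler message before running the tool, using the OpenAI-compatible streaming-chunk format that Custom LLM endpoints return. The `chatcmpl-filler` id and the `run_tool` callback are placeholders, not real names from our code:

```python
import json

def sse_chunk(text, role=None, finish_reason=None):
    """Format one OpenAI-style streaming chunk as an SSE `data:` line."""
    delta = {}
    if role:
        delta["role"] = role
    if text:
        delta["content"] = text
    payload = {
        "id": "chatcmpl-filler",  # placeholder id
        "object": "chat.completion.chunk",
        "choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}],
    }
    return f"data: {json.dumps(payload)}\n\n"

def stream_with_filler(run_tool):
    # 1. Emit the filler *before* the slow tool call, as a complete
    #    sentence (ending punctuation included) so it can be spoken alone.
    yield sse_chunk("Hang on, let me check that for you.", role="assistant")
    # 2. Blocking tool call (placeholder callback).
    result = run_tool()
    # 3. Stream the real answer, then terminate the stream.
    yield sse_chunk(str(result))
    yield sse_chunk(None, finish_reason="stop")
    yield "data: [DONE]\n\n"
```

The intent is that step 1's chunk is flushed to VAPI (and spoken) while step 2 is still running, rather than being buffered until the stream ends.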
When I test the endpoint locally, I can see the data: {"id": ...} chunks being streamed through fine.

However, on the voice side, VAPI queues all of the chunks up and speaks them together at the end, which defeats the point of the waiting message.

How do you get VAPI to speak designated chunks as they arrive, rather than waiting for the entire stream to finish?
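One thing worth ruling out on the server side is mid-sentence fragments: if the streamed chunks never end on sentence-final punctuation, a TTS pipeline that buffers until it has a speakable unit can look like it's "queueing everything up". A hedged sketch (my own helper, not a VAPI API) of buffering raw tokens into complete sentences before emitting them:

```python
import re

# A sentence ends at . ! or ? followed by whitespace or end-of-buffer.
SENTENCE_END = re.compile(r"([.!?])(\s|$)")

def sentences_from_tokens(tokens):
    """Buffer streamed tokens and yield only complete sentences,
    so each emitted chunk is a unit a TTS layer can speak
    immediately instead of a mid-sentence fragment."""
    buf = ""
    for tok in tokens:
        buf += tok
        while True:
            m = SENTENCE_END.search(buf)
            if not m:
                break
            end = m.end(1)
            yield buf[:end].strip()
            buf = buf[end:]
    if buf.strip():
        yield buf.strip()  # flush any trailing partial sentence
```

Wrapping the model's token stream in something like this at least guarantees the filler message reaches VAPI as a standalone, speakable sentence.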

Alternatively, I've tried using the controlClient to say a response when the tool starts, but VAPI drops the connection when I do.