other-emerald · 17h ago

Does Custom LLM support streaming?

I'd like to send an initial "working on it" message back to the user, then call our LLM to generate text. Is this possible? It looks like you have to return the entire response.
1 Reply
Vapi Support Bot
Yes, Custom LLM supports streaming, and you can send an initial "working on it" message before streaming the full response.
Here's a code snippet to illustrate this pattern:
// Set up a server-sent events (SSE) response
res.setHeader("Content-Type", "text/event-stream");

// Send the interim message immediately
res.write(`data: ${JSON.stringify({ delta: "Working on it..." })}\n\n`);

// Call your LLM and stream its output chunk by chunk
for await (const chunk of openAIResponse) {
  res.write(`data: ${JSON.stringify(chunk)}\n\n`);
}

// Signal that the stream is complete
res.write("data: [DONE]\n\n");
res.end();
This lets you immediately inform the user, then stream the LLM's response as it becomes available.
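If it helps, here is a rough sketch of how that snippet could sit inside a full Express endpoint, assuming the OpenAI Node SDK for the streaming call. The route path (/chat/completions), model name, and request body shape are illustrative assumptions, not Vapi's exact contract:

// Illustrative Express handler sketch; route path, model, and request
// shape are assumptions for the example, not Vapi's exact contract
import express from "express";
import OpenAI from "openai";

const app = express();
app.use(express.json());
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

app.post("/chat/completions", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");

  // Interim message so the user hears something right away
  res.write(`data: ${JSON.stringify({ delta: "Working on it..." })}\n\n`);

  // Stream the real completion from the LLM
  const stream = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: req.body.messages,
    stream: true,
  });

  for await (const chunk of stream) {
    res.write(`data: ${JSON.stringify(chunk)}\n\n`);
  }

  res.write("data: [DONE]\n\n");
  res.end();
});

app.listen(3000);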
For more details, see the Vapi Custom LLM tool-calling integration guide.
Source: Custom LLM Tool Calling Integration
⤵️ Mention @Vapi Support Bot for follow-ups 🐥
