Do you have to use streaming for a custom LLM URL?

The following code, where I stream the response, works:
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=request.messages,
    stream=True,
)

async def stream_response(stream):
    # Forward each chunk as a server-sent event in the OpenAI format.
    for r in stream:
        yield f"data: {r.model_dump_json()}\n\n"
    # Terminate the event with a blank line so the final [DONE] is delivered.
    yield "data: [DONE]\n\n"

return StreamingResponse(stream_response(response), media_type="text/event-stream")
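For context, the streaming snippet above sits inside a FastAPI route, roughly like this (a minimal sketch; the /chat/completions path and the ChatRequest model are assumptions, not the exact code):

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
import openai

app = FastAPI()

class ChatRequest(BaseModel):
    # Assumed request model: Vapi posts an OpenAI-style chat completion body.
    model: str = "gpt-4o"
    messages: list

@app.post("/chat/completions")
async def chat_completions(request: ChatRequest):
    stream = openai.chat.completions.create(
        model="gpt-4o",
        messages=request.messages,
        stream=True,
    )

    async def stream_response(stream):
        # Forward each chunk as a server-sent event, then signal completion.
        for r in stream:
            yield f"data: {r.model_dump_json()}\n\n"
        yield "data: [DONE]\n\n"

    return StreamingResponse(stream_response(stream), media_type="text/event-stream")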

However, this code does not work; the Vapi assistant does not say anything:
res = openai.chat.completions.create(
    model="gpt-4o",
    messages=request.messages,
)
return res.model_dump_json()
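For reference, this is the non-streaming variant I would have expected to work, returning the completion as a JSON body rather than a JSON string (a sketch only; whether Vapi accepts non-streaming responses at all is exactly my question):

from fastapi.responses import JSONResponse

@app.post("/chat/completions")
async def chat_completions(request: ChatRequest):
    res = openai.chat.completions.create(
        model="gpt-4o",
        messages=request.messages,
    )
    # Return the parsed completion dict so FastAPI serializes it as a JSON
    # object instead of a quoted string.
    return JSONResponse(content=res.model_dump())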