fair-rose · 5mo ago

Deploying Custom LLM

In which region should I deploy my custom LLM so that latency from Vapi to my custom-hosted LLM is as low as possible? Thanks
2 Replies
Vapi Support Bot
To achieve the lowest latency between Vapi and your custom LLM, deploy your LLM server in the same cloud region or as geographically close as possible to Vapi's infrastructure.
If you are an enterprise customer or have strict data requirements, consider Vapi's on-premise deployment option, which allows you to host all components within your own cloud for maximum control over latency and data flow.
Tip: If you are unsure of Vapi's hosting region, reach out to Vapi support to confirm the optimal region for your deployment before proceeding.
Sources:
- On-Prem Deployments
- How Vapi works
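As a quick sanity check before committing to a region, you can compare candidate deployments by timing round trips to each endpoint. This is an illustrative sketch, not part of Vapi's tooling; the `measure_latency` helper and the way the request is issued are assumptions:

```python
import time
import statistics

def measure_latency(request_fn, n=5):
    """Call request_fn n times and return the median wall-clock latency in ms.

    request_fn should perform one round trip to the candidate endpoint,
    e.g. a small POST to your LLM server's health or completions route.
    """
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        request_fn()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)
```

Run it once per candidate region (with the same payload) and pick the region with the lowest median; the median is used rather than the mean so a single cold-start outlier does not skew the comparison.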
Kyle Brunker · 5mo ago
Deploy your model in the U.S. West region (where Vapi's infrastructure primarily runs), ensure your endpoint is accessible and secure, and implement best practices such as caching, load balancing, and monitoring.
