fair-rose · 5mo ago

Deploying Custom LLM

In which region should I deploy my custom LLM so that latency from Vapi to my custom-hosted LLM is as low as possible? Thanks
2 Replies
Vapi Support Bot
To achieve the lowest latency between Vapi and your custom LLM, deploy your LLM server in the same cloud region or as geographically close as possible to Vapi's infrastructure.
If you are an enterprise customer or have strict data requirements, consider Vapi's on-premise deployment option, which allows you to host all components within your own cloud for maximum control over latency and data flow.
Tip: If you are unsure of Vapi's hosting region, reach out to Vapi support to confirm the optimal region for your deployment before proceeding.
Sources:
- On-Prem Deployments
- How Vapi works
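As a quick sanity check before committing to a region, you can compare candidate deployments by timing round trips to each endpoint. This is an illustrative sketch, not part of Vapi's tooling; the `measure_latency` helper and the way the request is issued are assumptions:

```python
import time
import statistics

def measure_latency(request_fn, n=5):
    """Call request_fn n times and return the median wall-clock latency in ms.

    request_fn should perform one round trip to the candidate endpoint,
    e.g. a small POST to your LLM server's health or completions route.
    """
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        request_fn()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)
```

Run it once per candidate region (with the same payload) and pick the region with the lowest median; the median is used rather than the mean so a single cold-start outlier does not skew the comparison.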
Kyle Brunker · 5mo ago
Deploy your model in the U.S. West region (where Vapi's infrastructure primarily runs), ensure your endpoint is accessible and secure, and implement best practices such as caching, load balancing, and monitoring.
