few-sapphireF

latency

I use GPT 4o mini Cluster as my LLM model. You advertise 400 ms latency, but I'm seeing 3500 ms from the model alone.