Diacritic Sign Support for Improved Voice Conversations

I am building an assistant that speaks Hebrew. Hebrew is one of the languages that use diacritic signs (niqqud) to indicate how words are vocalised. A familiar example of diacritics in another language is the German umlaut: ä, ö, and ü represent modified vowel sounds compared to their plain counterparts (a, o, and u). Arabic is another notable language that uses diacritics.

We have tried instructing the LLM to return its Hebrew output with diacritic signs so that the TTS pronounces it better. However, even when the prompt asks for full diacritics, I cannot see them in the call log (neither in the messages tab nor in the logs). Please see Call ID 5427f3af-25a6-496a-8f31-4c47ce89239f as an example (you can check this call’s system prompt on your side; it instructs the model to answer with diacritics). We also tried giving the LLM its instructions in Hebrew, with the same result: the only place diacritic signs appear is the system message inside the messages tab. See Call IDs ff27397f-0af8-4c51-a0cc-6f0f9fadcda7 and 50261828-071c-4a67-bca9-fb865596bfe5, which use a short and a long prompt in Hebrew only.

Importantly, when we instruct the LLM directly, without Vapi involved (e.g., via the ChatGPT Playground, the official website, or the Azure Foundry Playground), it returns answers with full diacritics.
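
One way to confirm whether the marks are really missing from the text, rather than just not rendered by the log view, is to count the Hebrew combining marks in the raw string. The following is a minimal Python sketch under stated assumptions: the function name `count_hebrew_marks` is hypothetical, and how the assistant's message text is exported from a call is not shown here.

```python
import unicodedata


def count_hebrew_marks(text: str) -> int:
    """Count Hebrew points (niqqud) and cantillation marks in text.

    These marks live in U+0591..U+05C7. A few code points in that range
    are punctuation (e.g. maqaf), so only nonspacing marks (category
    "Mn") are counted.
    """
    return sum(
        1
        for ch in text
        if "\u0591" <= ch <= "\u05C7" and unicodedata.category(ch) == "Mn"
    )


# "shalom" written with niqqud and without, using escapes so the
# example does not depend on how this page renders Hebrew.
pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"
unpointed = "\u05E9\u05DC\u05D5\u05DD"

print(count_hebrew_marks(pointed))    # 3
print(count_hebrew_marks(unpointed))  # 0
```

Running the same check on the direct LLM response and on the message text taken from a call would show at which stage the marks disappear, if they do.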

Could you please clarify whether this is a limitation on the Vapi side, that is, whether diacritic signs from the LLM are not supported or not passed through as expected? If not, is there anything I can do to ensure that the text passed to the TTS includes the diacritic signs?

Thanks in advance for your assistance.