SSML Not Working with ElevenLabs Custom Voice Despite enableSsmlParsing Enabled
Description:
SSML is not being respected when I use an ElevenLabs voice, even though I have successfully set enableSsmlParsing: true via a PATCH request.
Configuration:
Assistant ID: 844995ac-4928-45b1-a262-3fd634b88658
Voice Provider: ElevenLabs (11labs)
Voice: "Raquel pro v1"
Model: eleven_multilingual_v2
SSML Parsing: Enabled via PATCH to /assistant/{id} endpoint (confirmed with GET request showing enableSsmlParsing: true)
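For reference, here is a minimal sketch of the PATCH request and the GET check described above. It only builds the request and does not send it. The base URL, the bearer-token header, and the provider, voiceId and model field names are assumptions that should be checked against the API reference. The voice ID is a placeholder, since only the voice name is known here.

```python
import json
import urllib.request

API_BASE = "https://api.vapi.ai"  # assumption: base URL is not stated in this post
ASSISTANT_ID = "844995ac-4928-45b1-a262-3fd634b88658"


def build_ssml_patch(assistant_id, api_key, voice_id):
    """Build (but do not send) a PATCH request that enables SSML parsing."""
    body = {
        "voice": {
            "provider": "11labs",               # assumed field name
            "voiceId": voice_id,                # placeholder, not the real ID
            "model": "eleven_multilingual_v2",  # assumed field name
            "enableSsmlParsing": True,
        }
    }
    return urllib.request.Request(
        url=f"{API_BASE}/assistant/{assistant_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


def ssml_enabled(assistant_json):
    """Check a parsed GET /assistant/{id} response for the voice-level flag."""
    return assistant_json.get("voice", {}).get("enableSsmlParsing") is True
```

Sending the request with urllib.request.urlopen(req) and passing the parsed GET response to ssml_enabled reproduces the two checks listed under "What I've Tried".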
Issue:
SSML tags like <break time="1.5s"/> and <speak></speak> are being completely ignored. The assistant reads through text without any pauses, making it impossible to control pacing when reading lists of available appointment times.
What I've Tried:
Successfully enabled enableSsmlParsing via PATCH request (200 OK response)
Verified setting with GET request - shows enableSsmlParsing: true in voice config
Tested with both single quotes <break time='1.5s'/> and double quotes <break time="1.5s"/>
Ensured SSML is wrapped in <speak></speak> tags
Verified that the webhook is sending the correct SSML string to Vapi
Tested with different break times (500ms, 1s, 5s) - none work
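To make the webhook output concrete, here is a rough sketch of how the SSML string for a list of appointment times could be built and checked for well-formedness before it is returned. The function names and sample times are hypothetical and only illustrate the shape of the string being sent. This is not the actual webhook code.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def times_to_ssml(times, pause="1.5s"):
    """Join appointment times with <break> tags inside one <speak> element."""
    parts = [escape(t) for t in times]  # escape &, <, > in the spoken text
    joined = f' <break time="{pause}"/> '.join(parts)
    return f"<speak>{joined}</speak>"


def is_well_formed(ssml):
    """Return True if the string parses as XML (rules out malformed tags)."""
    try:
        ET.fromstring(ssml)
        return True
    except ET.ParseError:
        return False


ssml = times_to_ssml(["9:00 AM", "10:30 AM", "2:15 PM"])
print(ssml)
# <speak>9:00 AM <break time="1.5s"/> 10:30 AM <break time="1.5s"/> 2:15 PM</speak>
print(is_well_formed(ssml))
# True
```

A string that passes this check and is still read without pauses suggests the problem lies downstream of the webhook rather than in the markup.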
Expected Behavior:
SSML tags should be processed and pauses should occur when specified, allowing control over speech pacing and rhythm.
Actual Behavior:
All SSML tags are ignored. The voice reads continuously without any pauses, making the output sound rushed and unnatural when listing multiple items.