Uncontrollable Prosody
Posted: Thu Aug 30, 2007 10:16 am
by moshe
I've tried setting prosody for both Cepstral (David) and Tom (RealSpeak) voices; neither one works correctly.
<prosody rate="1.0"> produces output that's ludicrously slow.
<prosody rate="medium"> produces output that's ludicrously fast.
I'm setting the voicename property to change voices globally, I'm running this on my hosted developer account, and I'm not explicitly setting the language.
This problem smells as if I'm only changing the gender of the speaker but I'm not actually getting the Cepstral or RealSpeak TTS engines, and instead I'm getting generic known-problem AT&T.
IVR issue for Uncontrollable Prosody
Posted: Thu Aug 30, 2007 10:35 am
by support
Hi,
When you first log in to your IVR account and go into your "Account" page, it should tell you what TTS engine you are using. If your TTS engine is not set to either Realspeak or Cepstral, you cannot test the voices of David (Cepstral) or Tom (Realspeak).
Also, Realspeak and Cepstral are only offered for on-site IVR hosting systems.
For more information on the IVR tag, <prosody>, click here:
http://www.plumvoice.com/docs/dev/voicexml:tags:prosody
Regards,
Plum Support
Posted: Thu Aug 30, 2007 11:02 am
by moshe
"on-site" hosting means -- what, exactly? My premises, or your hosting service?
Also, is it possible to request a switch to Cepstral or RealSpeak for use on the developer's platform?
IVR hosted account
Posted: Thu Aug 30, 2007 11:46 am
by support
Hi,
Sorry for the confusion. Let me try to clarify. I meant to say "on-site IVR system" earlier instead of "on-site IVR hosting systems".
AT&T Natural Voices is the only TTS engine supported on the Plum IVR Hosting site. It is not possible to get this TTS engine switched to Cepstral or Realspeak if you have a hosted account.
However, we do offer alternate TTS options for on-site IVR systems. If you are interested in on-site IVR systems, you should contact your sales representative.
Regards,
Plum Support
Posted: Thu Aug 30, 2007 2:31 pm
by moshe
Even with the AT&T voices, I'm having quite a bit of trouble:
* The prosody seems to ignore the rate I set. If I use <prosody rate="5"> I get very slow speech; if I use rate="medium" I get very fast speech; if I set rate="slow" I get slow speech (but not as slow as rate="1", which is supposed to set the rate to its base rate).
* If I set prosody, the voice is always male. Even if I set the voicename property to an AT&T female voice (e.g., "julia"), I still get a male voice.
Any suggestions?
Use commas before each digit to slow down speech in IVR scr
Posted: Thu Aug 30, 2007 3:22 pm
by support
Hi,
If you have digits in your IVR script and want them to be read more slowly, you can put a comma before each digit to slow the speech down.
AT&T Natural Voices has set up its <prosody> tag this way, and we have no control over the speed at which the speech comes back. However, as for the voice coming back male when using the <prosody> tag, we were not able to reproduce this IVR problem. We have tested the <prosody> tag with "julia" and "crystal" and got female voices back.
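A minimal sketch of the comma technique described above. The confirmation number is hypothetical, and the length of the pause each comma produces depends on the TTS engine, so treat this as a starting point rather than a guaranteed result:

```xml
<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt>
        <!-- Hypothetical digits; a comma is placed before each one to add a short pause. -->
        Your confirmation number is , 4 , 7 , 1 , 9.
      </prompt>
    </block>
  </form>
</vxml>
```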
Regards,
Plum Support
Posted: Thu Aug 30, 2007 5:19 pm
by moshe
Hi,
When I set the name of the voice using the voicename property, I get this effect of only male voices. I haven't tried the name attribute of the prosody element; I'll try that next, thanks.
IVR code for testing names with <prosody>
Posted: Fri Aug 31, 2007 8:49 am
by support
Hi,
When testing voice names with <prosody>, we used this IVR code:
Code:
<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt>
        This sentence uses the default prosody settings.
        <prosody volume="25.0" rate="slow">
          <voice name="julia">
            This sentence is slow and quiet.
          </voice>
        </prosody>
        <prosody volume="100.0" rate="medium">
          <voice name="julia">
            This sentence is medium and medium.
          </voice>
        </prosody>
        <prosody volume="200.0" rate="fast">
          <voice name="julia">
            This sentence is fast and loud.
          </voice>
        </prosody>
      </prompt>
    </block>
  </form>
</vxml>
This was able to produce female voices within the IVR tag, <prosody>.
Regards,
Plum Support
Posted: Fri Aug 31, 2007 9:35 am
by moshe
Thank you very much for this example.
According to plumVoice's Voice manual,
http://www.plumvoice.com/docs/dev/devel ... erence:tts, the prosody element is a child of voice, not the other way around as in your example. If I recall correctly, and I will have to check, when I used voice as a parent and prosody as a child I ran into trouble.
IVR clarification for <voice> and <prosody> tags
Posted: Fri Aug 31, 2007 10:11 am
by support
Hi,
<voice> and <prosody> can each be either the parent or the child of the other. For an IVR example, I could add a <voice> tag inside of the prompt tag like so:
Code:
<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt>
        <voice name="julia">
          This sentence uses the default prosody settings.
          <prosody volume="25.0" rate="slow">
            <voice name="julia">
              This sentence is slow and quiet.
            </voice>
          </prosody>
          <prosody volume="100.0" rate="medium">
            <voice name="julia">
              This sentence is medium and medium.
            </voice>
          </prosody>
          <prosody volume="200.0" rate="fast">
            <voice name="julia">
              This sentence is fast and loud.
            </voice>
          </prosody>
        </voice>
      </prompt>
    </block>
  </form>
</vxml>
Now, the <voice> tag is the parent of the <prosody> tags. However, because of limitations within the AT&T Natural Voices TTS engine, you still have to put <voice> tags within the <prosody> tags in order to hear the female voice.
Hope this sample IVR code clears up any confusion.
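As a further sketch, the same workaround can be combined with the global voicename property mentioned earlier in this thread. This is untested: it assumes the property is set with a standard VoiceXML <property> element and that "julia" is available on the account.

```xml
<?xml version="1.0"?>
<vxml version="2.0">
  <!-- Assumption: voicename is set document-wide, as described earlier in the thread. -->
  <property name="voicename" value="julia"/>
  <form>
    <block>
      <prompt>
        This sentence relies on the voicename property alone.
        <prosody rate="slow">
          <!-- The voice is repeated inside prosody so the female voice is kept. -->
          <voice name="julia">
            This sentence is slow and names the voice again.
          </voice>
        </prosody>
      </prompt>
    </block>
  </form>
</vxml>
```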
Regards,
Plum Support
IVR code for better way to use <prosody> tag
Posted: Wed Sep 12, 2007 9:21 am
by support
Hi,
There's a better way to control your voice rate using the <prosody> tag. The "rate" attribute can also be set to a numeric value such as "100.0" or "50.0". A normal voice rate is around "150.0". You can also adjust the voice rate by using percentages: "+50%" makes the voice rate 50% faster and "-50%" makes it 50% slower.
For an IVR example:
Code:
<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt>
        This sentence uses the default prosody settings.
        <prosody volume="25.0" rate="50.0">
          <voice name="julia">
            This sentence is slow and quiet.
          </voice>
        </prosody>
        <prosody volume="100.0" rate="150.0">
          <voice name="julia">
            This sentence is medium and medium.
          </voice>
        </prosody>
        <prosody volume="200.0" rate="250.0">
          <voice name="julia">
            This sentence is fast and loud.
          </voice>
        </prosody>
      </prompt>
    </block>
  </form>
</vxml>
This should help you control the voice rate better using the IVR tag, <prosody>.
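The percentage form of the rate attribute is not covered by the code above, so here is a minimal sketch of it. It assumes the "+50%" and "-50%" values behave as described in this post and keeps the <voice> tag inside <prosody> as in the earlier examples; the actual speed you hear may vary by engine.

```xml
<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt>
        This sentence uses the default prosody settings.
        <!-- Relative change: should be about 50% faster than the default rate. -->
        <prosody rate="+50%">
          <voice name="julia">
            This sentence should be faster than normal.
          </voice>
        </prosody>
        <!-- Relative change: should be about 50% slower than the default rate. -->
        <prosody rate="-50%">
          <voice name="julia">
            This sentence should be slower than normal.
          </voice>
        </prosody>
      </prompt>
    </block>
  </form>
</vxml>
```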
Regards,
Plum Support
Rate is not according to spec
Posted: Wed Sep 12, 2007 9:50 am
by moshe
Just as a matter of record, according to the SSML spec, rates are specified relative to 1. (2 is 200%, 0.5 is 50%).
This is a gotcha for people who are new to AT&T's interpretation of the spec. I originally tried setting the rates using the SSML values, and as you might imagine the results were briefly entertaining.
Thank you for this update.