Uncontrollable Prosody

moshe
Posts: 66
Joined: Wed Aug 15, 2007 5:36 pm
Location: Chicago

Uncontrollable Prosody

Post by moshe »

I've tried setting prosody for both Cepstral (David) and Tom (RealSpeak) voices; neither one works correctly.

<prosody rate="1.0"> produces output that's ludicrously slow.

<prosody rate="medium"> produces output that's ludicrously fast.

I'm setting the voicename property to change voices globally, I'm running this on my hosted developer account, and I'm not explicitly setting the language.

This problem smells as if I'm only changing the gender of the speaker but I'm not actually getting the Cepstral or RealSpeak TTS engines, and instead I'm getting generic known-problem AT&T.

support
Posts: 3632
Joined: Mon Jun 02, 2003 3:47 pm
Location: Boston, MA

IVR issue for Uncontrollable Prosody

Post by support »

Hi,

When you first log in to your IVR account and go to your "Account" page, it should tell you which TTS engine you are using. If your TTS engine is not set to either RealSpeak or Cepstral, you cannot test the voices of David (Cepstral) or Tom (RealSpeak).

Also, Realspeak and Cepstral are only offered for on-site IVR hosting systems.

For more information on the IVR tag, <prosody>, click here:
http://www.plumvoice.com/docs/dev/voicexml:tags:prosody

Regards,
Plum Support
Last edited by support on Wed Feb 24, 2010 4:45 pm, edited 4 times in total.

moshe

Post by moshe »

"on-site" hosting means -- what, exactly? My premises, or your hosting service?

Also, is it possible to request a switch to Cepstral or RealSpeak for use on the developer's platform?

support

IVR hosted account

Post by support »

Hi,

Sorry for the confusion. Let me try to clarify. I meant to say "on-site IVR system" earlier instead of "on-site IVR hosting systems".

AT&T Natural Voices is the only TTS engine supported on the Plum IVR Hosting site. It is not possible to have this TTS engine switched to Cepstral or RealSpeak if you have a hosted account.

However, we do offer alternate TTS options for on-site IVR systems. If you are interested in on-site IVR systems, you should contact your sales representative.

Regards,
Plum Support
Last edited by support on Wed Feb 24, 2010 4:45 pm, edited 4 times in total.

moshe

Post by moshe »

Even with the AT&T voices, I'm having quite a bit of trouble:

* The prosody seems to ignore the rate I set. If I use

<prosody rate="5">

I get very slow; if I use rate="medium" I get very fast; if I set rate="slow" I get slow (but not as slow as rate="1", which is supposed to set the rate to its base rate.)

* If I set prosody, the voice is always male. Even if I set the voicename property to an AT&T female voice (e.g., "julia"), I still get a male voice.

Any suggestions?

support

Use commas before each digit to slow down speech in IVR scr

Post by support »

Hi,

If your IVR script contains digits and you want them read more slowly, you can put a comma before each digit to slow the speech down.
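
As a minimal sketch of the comma approach (the prompt wording and the digits below are placeholders, not taken from any real script), a prompt that reads a number digit by digit could look like this:

```xml
<?xml version="1.0"?>
<vxml version="2.0">
    <form>
        <block>
            <prompt>
                <!-- A comma before each digit asks the TTS engine for a short pause there. -->
                Your confirmation number is, 4, 7, 1, 9.
            </prompt>
        </block>
    </form>
</vxml>
```

How long each pause lasts depends on the TTS engine, so treat the result as something to check by ear.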

AT&T Natural Voices implements its <prosody> tag this way, and we have no control over the speech rate it returns. As for the voice coming back male when using the <prosody> tag, we were not able to reproduce this IVR problem. We tested the <prosody> tag with "julia" and "crystal" and got female voices back.

Regards,
Plum Support
Last edited by support on Fri Feb 19, 2010 4:46 pm, edited 3 times in total.

moshe

Post by moshe »

Hi,

When I set the name of the voice using the voicename property, I get this effect of only male voices. I haven't tried the name attribute of the prosody element; I'll try that next, thanks.
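
A minimal sketch of that setup might look like the following. The prompt text and the rate value are placeholders; this is a reconstruction of the arrangement described above, not the exact document in question.

```xml
<?xml version="1.0"?>
<vxml version="2.0">
    <!-- Voice chosen globally through the voicename property -->
    <property name="voicename" value="julia"/>
    <form>
        <block>
            <prompt>
                This sentence is outside any prosody element.
                <!-- No voice element inside prosody: this is the case reported to come back male -->
                <prosody rate="slow">
                    This sentence is inside a prosody element.
                </prosody>
            </prompt>
        </block>
    </form>
</vxml>
```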

support

IVR code for testing names with <prosody>

Post by support »

Hi,

When testing voice names with <prosody>, we implemented this IVR code:

<?xml version="1.0"?>
<vxml version="2.0">
    <form>
        <block>
            <prompt>
                This sentence uses the default prosody settings.
                <prosody volume="25.0" rate="slow">
                    <voice name="julia">
                      This sentence is slow and quiet.
                    </voice>
                </prosody>
                <prosody volume="100.0" rate="medium">
                    <voice name="julia">
                      This sentence is medium and medium.
                    </voice>
                </prosody>
                <prosody volume="200.0" rate="fast">
                    <voice name="julia">
                      This sentence is fast and loud.
                    </voice>
                </prosody>
            </prompt>
        </block>
    </form>
</vxml>
This was able to produce female voices within the IVR tag, <prosody>.

Regards,
Plum Support
Last edited by support on Wed Feb 24, 2010 4:46 pm, edited 4 times in total.

moshe

Post by moshe »

Thank you very much for this example.

According to plumVoice's Voice manual, http://www.plumvoice.com/docs/dev/devel ... erence:tts, the prosody element is a child of voice, not the other way around as in your example. If I recall correctly (and I will have to check), when I used voice as the parent and prosody as the child I ran into trouble.

support

IVR clarification for <voice> and <prosody> tags

Post by support »

Hi,
<voice> and <prosody> can each be a parent or a child of the other. For example, I could add a <voice> tag inside the <prompt> tag like so:

<?xml version="1.0"?>
<vxml version="2.0">
    <form>
        <block>
            <prompt>
              <voice name="julia">
                This sentence uses the default prosody settings.
                <prosody volume="25.0" rate="slow">
                    <voice name="julia">
                      This sentence is slow and quiet.
                    </voice>
                </prosody>
                <prosody volume="100.0" rate="medium">
                    <voice name="julia">
                      This sentence is medium and medium.
                    </voice>
                </prosody>
                <prosody volume="200.0" rate="fast">
                    <voice name="julia">
                      This sentence is fast and loud.
                    </voice>
                </prosody>
              </voice>
            </prompt>
        </block>
    </form>
</vxml> 
Now, the <voice> tag is the parent of the <prosody> tags. However, because of limitations within the AT&T Natural Voices TTS engine, you still have to put <voice> tags within the <prosody> tags in order to hear the female voice.

Hope this sample IVR code clears up any confusion.

Regards,
Plum Support
Last edited by support on Fri Feb 19, 2010 4:48 pm, edited 4 times in total.

support

IVR code for better way to use <prosody> tag

Post by support »

Hi,

There's a better way to control your voice rate using the <prosody> tag. The "rate" attribute can also be set to a numeric value such as "100.0" or "50.0"; a normal voice rate is around "150.0". You can also adjust the voice rate with percentages: "+50%" makes the voice rate 50% faster and "-50%" makes it 50% slower.

For an IVR example:

<?xml version="1.0"?>
<vxml version="2.0">
    <form>
        <block>
            <prompt>
                This sentence uses the default prosody settings.
                <prosody volume="25.0" rate="50.0">
                    <voice name="julia">
                      This sentence is slow and quiet.
                    </voice>
                </prosody>
                <prosody volume="100.0" rate="150.0">
                    <voice name="julia">
                      This sentence is medium and medium.
                    </voice>
                </prosody>
                <prosody volume="200.0" rate="250.0">
                    <voice name="julia">
                      This sentence is fast and loud.
                    </voice>
                </prosody>
            </prompt>
        </block>
    </form>
</vxml>
This should help you control the voice rate better using the IVR tag, <prosody>.
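
The percentage form described in this post can be sketched the same way. This is a sketch only: the sentences are placeholders, and it assumes the engine treats "+50%" and "-50%" as relative changes from its default rate, as described above.

```xml
<?xml version="1.0"?>
<vxml version="2.0">
    <form>
        <block>
            <prompt>
                This sentence uses the default prosody settings.
                <!-- Relative change: 50% slower than the default rate -->
                <prosody rate="-50%">
                    <voice name="julia">
                        This sentence is slower than the default.
                    </voice>
                </prosody>
                <!-- Relative change: 50% faster than the default rate -->
                <prosody rate="+50%">
                    <voice name="julia">
                        This sentence is faster than the default.
                    </voice>
                </prosody>
            </prompt>
        </block>
    </form>
</vxml>
```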

Regards,
Plum Support
Last edited by support on Wed Feb 24, 2010 4:47 pm, edited 5 times in total.

moshe

Rate is not according to spec

Post by moshe »

Just as a matter of record, according to the SSML spec, rates are specified relative to 1 (2 is 200%, 0.5 is 50%).

This is a gotcha for people who are new to AT&T's interpretation of the spec. I originally tried setting the rates using the SSML values, and as you might imagine the results were briefly entertaining.

Thank you for this update.
