Can I convert arbitrary, free speech to text?

Posted: Fri Jun 13, 2003 1:55 pm
by support

The Plum Voice Platform supports state-of-the-art automatic speech recognition. However, the state of the art, from Plum or any other vendor, does not support reliable, speaker-independent transcription of arbitrary speech.

Some speech recognition systems are tuned to a particular speaker and require extensive training by that speaker. The Plum Voice system, like most telephony systems, is speaker-independent: it is designed to work without being trained for each individual user.

Useful speaker-independent speech recognition systems work by constraining and validating user speech. This is done by:

1. Carefully constructing the voice user interface so that prompts lead the user to respond in a form the system expects. For example, ask "What month would you like to travel?" rather than "When would you like to travel?"

2. Validating user input against a finite list of responses. For example, the response to "Say the name of the person you would like to reach" is constrained to the list of names in the system, perhaps the list of employees at a company site.
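
The second point can be illustrated with a rough sketch in Python. This is not Plum's API, and a real telephony platform constrains the recognizer itself with a grammar rather than checking text afterwards. The sketch assumes some recognizer has already returned a text hypothesis; the directory `EMPLOYEES`, the function `validate_name`, and the 0.8 similarity cutoff are all hypothetical choices made for illustration.

```python
from difflib import get_close_matches

# Hypothetical company directory; in a real system this is the finite
# list of names the caller is allowed to say.
EMPLOYEES = ["Alice Johnson", "Bob Martinez", "Carol Nguyen"]


def validate_name(heard, names=EMPLOYEES, cutoff=0.8):
    """Return the directory entry closest to what was heard, or None.

    `heard` is assumed to be a text hypothesis from a recognizer.
    Anything that is not close enough to a known name is rejected,
    so the application can re-prompt instead of guessing.
    """
    # Compare case-insensitively, but return the name as stored.
    lookup = {name.lower(): name for name in names}
    matches = get_close_matches(
        heard.lower().strip(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    for utterance in ["alice johnson", "bob martines", "xylophone quartz"]:
        print(utterance, "->", validate_name(utterance))
```

A slightly misheard name still resolves to a directory entry, while input that resembles nothing in the list is rejected. That is the effect the finite list provides: the system only ever has to choose among known answers.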

Reliable, speaker-independent recognition of arbitrary speech that is not constrained in this way, including names and addresses, is not possible with today's state of the art. Even human operators have trouble with names and addresses, and people with unusual names or spellings are accustomed to spelling them out for human operators. We can't expect machines to do better.

Note that human listeners also easily confuse individual letters. It's no accident that pilots use words like "Victor Tango Charlie Bravo" to spell out letters.
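
As a small illustration of why spelling words help, the sketch below maps letters to the ICAO radiotelephony spelling words and back again. The function names `spell_out` and `decode` are hypothetical, and the code works on text only; it shows the mapping, not a speech recognizer.

```python
# ICAO radiotelephony spelling words, one per letter of the alphabet.
ICAO_WORDS = [
    "Alfa", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot", "Golf",
    "Hotel", "India", "Juliett", "Kilo", "Lima", "Mike", "November",
    "Oscar", "Papa", "Quebec", "Romeo", "Sierra", "Tango", "Uniform",
    "Victor", "Whiskey", "X-ray", "Yankee", "Zulu",
]

# Each word starts with the letter it stands for.
LETTER_TO_WORD = {word[0].lower(): word for word in ICAO_WORDS}


def spell_out(text):
    """Return the spelling word for each letter in `text`, skipping non-letters."""
    return [LETTER_TO_WORD[ch] for ch in text.lower() if ch in LETTER_TO_WORD]


def decode(words):
    """Recover the letters from a list of spelling words."""
    return "".join(word[0].upper() for word in words)


if __name__ == "__main__":
    words = spell_out("VTCB")
    print(" ".join(words))
    print(decode(words))
```

Each spelling word is longer and more distinct than the bare letter it replaces, and the set of words is itself a small finite list, so the same constrain-and-validate approach applies to spelled input.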