Audio files caching, audio formats and quality related query
Posted: Fri Sep 25, 2009 4:41 am
We find that the three 8 kHz file formats available for locally hosting audio files on Plum give inadequate sound quality, and we want to use the MP3 format, which reproduces voices better. We have found that we can host audio files on our own server and that these play fine on Plum. To address our concerns about latency when playing the files in an IVR solution, we have been informed that Plum can cache these files, and that the caching can be configured to keep files for long periods: not just a few days, but even months.
I want to make sure that the above is true. Further, I want to ensure that the number and size of our files (400 to 500 audio files of up to 1 MB each) can be accommodated through caching on the Plum Voice system. If there are better solutions to the audio quality or playback latency issues, these would also be of interest.
IVR audio quality, fetching, and caching issues
Posted: Fri Sep 25, 2009 12:55 pm
Let's first tackle the audio quality issue. Every single audio file played over the phone (PSTN or VoIP) must be encoded as 8 kHz 8-bit u-law audio. In order for our IVR servers to start playing a high-quality MP3 file, we first have to download the file from your server, convert it to a WAV file, downsample it to 8 kHz, and then convert it to 8-bit u-law audio. Since what plays over the phone is always 8 kHz 8-bit u-law audio, if you can hear any difference between a u-law file and an MP3 file when calling into our IVR platform, it is because your audio conversion process is introducing artifacts that our IVR platform does not. Downsampling is most often the biggest source of audio quality issues. The best way to avoid this is to record all audio directly at 8 kHz. If you do have preexisting audio files that need to be downsampled, the software utility SoX performs high-quality conversions.
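To make the last step of that chain concrete, here is a minimal Python sketch of the standard G.711 u-law companding of a single 16-bit PCM sample. It is only an illustration of the encoding, not the converter our platform actually runs, and the function names are made up for this example. It also assumes the audio has already been downsampled to 8 kHz mono by a proper resampler such as SoX.

```python
# Illustrative G.711 u-law (mu-law) encoder; names are hypothetical.
BIAS = 0x84   # 132, added to the magnitude before the segment search
CLIP = 32635  # largest magnitude that still fits in 15 bits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as one 8-bit u-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # The segment (exponent, 0..7) follows from the highest set bit.
    exponent = magnitude.bit_length() - 8
    # Four bits just below the leading bit form the mantissa.
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # G.711 transmits the bitwise complement of the code word.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def encode_pcm16(samples) -> bytes:
    """Encode an iterable of 16-bit PCM samples (already at 8 kHz mono)."""
    return bytes(linear_to_ulaw(s) for s in samples)
```

Each 16-bit sample becomes a single byte, which is why silence (a sample of 0) encodes to 0xFF and why the encoded data is half the size of 16-bit PCM at the same sample rate.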
Due to this fetch/convert/downsample/convert process, using MP3 files creates the worst-case scenario for audio latency. Encoding your audio as 8 kHz 8-bit u-law will give you the lowest possible latency, with minimal file sizes and no conversion process. If low latency is your primary concern, then MP3s should never be used.
Regarding caching, what you stated is not entirely accurate. The caching system is designed to minimize fetch latency for frequently used files, and each server is given a substantial amount of drive space to minimize the expiration of infrequently used files. However, there is no guarantee that all of your audio files will be stored in all of our server caches at all times. You are on a multi-server shared system that caches other customers' files as well. If you call into a server and the next audio file to be played is either not in the cache or has expired from the cache, that server will fetch and cache your audio file.
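As a rough illustration of the file sizes involved: 8 kHz 8-bit mono audio is one byte per sample, so 8000 bytes per second. The sketch below does that arithmetic in Python. The helper names are made up for this example, and it assumes headerless mono audio, so a real file with a header will be slightly larger.

```python
# Size arithmetic for 8 kHz 8-bit mono u-law audio (headerless, assumed).
SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 1


def ulaw_size_bytes(duration_seconds: float) -> int:
    """Approximate size of a u-law recording of the given length."""
    return int(duration_seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


def ulaw_duration_seconds(size_bytes: int) -> float:
    """Approximate playing time of a u-law file of the given size."""
    return size_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
```

For example, one minute of prompt audio comes to about 480,000 bytes in this format.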
The best way to minimize latency is to follow the audio format mentioned above and to make sure you set the IVR global property "audiomaxage" to the highest possible value (currently 604800 seconds, i.e. one week) in all of your documents. Together, these steps minimize latency for audio files not yet cached and tell the cache to refetch only files that are more than a week old.
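For anyone generating their VoiceXML documents from a script, here is a small Python sketch that builds the property element with that value. The helper name is made up for this example. The element follows the standard VoiceXML `<property>` syntax, and where you place it in your own documents is up to your application.

```python
import xml.etree.ElementTree as ET

# One week in seconds; matches the current maximum quoted above.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def audiomaxage_property(seconds: int = SECONDS_PER_WEEK) -> str:
    """Return a VoiceXML <property> element that sets audiomaxage."""
    prop = ET.Element("property", {"name": "audiomaxage", "value": str(seconds)})
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Paste the printed element into each VoiceXML document.
    print(audiomaxage_property())
```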