TMCnet News
UM and Speech Recognition Make Voice Messaging More Manageable
“Unified messaging” (UM) was always viewed as simply enabling message recipients to manage and retrieve both text and voice messages across device interfaces. Email text messages could be proactively delivered to or retrieved from any phone through text-to-speech technology, giving such email messages some of the real-time flexibility and convenience of voice mail.
Similarly, recipients of voice mail messages benefited from UM through integration with email desktop clients, where the screen interface was extended to show all the voice (and fax) messages in the user’s voice mailbox. This visual interface for voice mail enabled more efficient random-access message retrieval, rather than the time-consuming, sequential playback of messages that the Telephone User Interface (TUI) offered. Voice message content, however, still had to be listened to through traditional playback controls, and important information in it still had to be transcribed manually for practical use.
Speech recognition technology, however, has now matured enough to enable voice mail messages to benefit from convergence with email visual interfaces, similar to how email benefited from text-to-speech (TTS) retrieval of email over the phone. Now, voice message retrieval can be managed with the productivity efficiencies of text and visual interfaces, rather than a voice-oriented TUI.
Using speech recognition to convert voice messages to text is starting to show up in new subscriber offerings from service providers such as CallWave and Vonage, so it won’t be long before this capability makes its way into enterprise UM from traditional communication technology providers and new players like SpinVox and TalkText.
Voice Message vs. Text Message Benefits to the Users
Voice mail systems provide important benefits to telephone callers by enabling them to leave a message if there is no answer or the line is busy. Recording a voice message minimizes costs and preserves message integrity, because the message is not transcribed or relayed by a third party.
While voice messages have been highly touted as being more “personal,” providing more emotional content (tone of voice) and accuracy (name pronunciation) to the recipient, their informational content is really no greater than that of person-to-person text messages. So, just because a caller uses the convenience of voice to create a message, it doesn’t necessarily mean that they want the message to be delivered in voice.
As a message originator, a caller should have the option of having a message they create in voice delivered as text, especially if it contains data that has to be converted to text to be useful. As business users become more mobile and start using multimodal “smartphones” with visual interfaces, the practice of creating a message in voice that gets delivered as text, and vice versa, will spread quickly.
By the same token, the option to transcribe voice messages into text can also be controlled by the recipient depending upon a number of factors pertinent to the recipient’s needs. So, although the caller still leaves a voice message, it gets delivered as text.
Pros and Cons of Converting a Voice Message to Text
The recipient of a voice message that has been converted to text will realize a number of productivity benefits including:
To offset these benefits, there are some disadvantages to converting a voice message to text, including:
Reducing Enterprise Overhead for Voice Messages
Voice messaging has always paid a penalty in terms of system resources it consumes. These penalties include:
Since the overhead for recording, transmitting, storing, and retrieving a voice message is much higher than a text version of the message, we do see benefits to both end user communication productivity and to the enterprise TCO from exploiting speech recognition for voice to text messaging. Such capabilities will also integrate nicely with new UC capabilities for integrating with real-time IM or initiating telephone call back responses to such messages.
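To give a rough sense of the storage gap described above, here is a minimal back-of-the-envelope sketch. It assumes uncompressed G.711 telephony audio (64 kbit/s), a speaking rate of about 150 words per minute, and about 6 bytes per word of plain text; the speaking rate, the bytes-per-word figure, and the function names are illustrative assumptions, not figures from any particular voice mail system.

```python
# Rough size comparison: one voice message vs. its text transcript.
# All figures are illustrative assumptions, not measurements.

G711_BYTES_PER_SECOND = 8_000   # G.711 telephony audio: 64 kbit/s = 8,000 bytes/s
WORDS_PER_MINUTE = 150          # assumed conversational speaking rate
BYTES_PER_WORD = 6              # assumed average: 5 letters plus a space, 1 byte each


def voice_bytes(duration_s: float) -> int:
    """Size of an uncompressed G.711 recording of the given length."""
    return int(duration_s * G711_BYTES_PER_SECOND)


def text_bytes(duration_s: float) -> int:
    """Approximate size of a plain-text transcript of the same message."""
    words = duration_s / 60 * WORDS_PER_MINUTE
    return int(words * BYTES_PER_WORD)


if __name__ == "__main__":
    duration = 30  # seconds
    v, t = voice_bytes(duration), text_bytes(duration)
    # prints "voice: 240,000 bytes, text: 450 bytes, ratio: 533x"
    print(f"voice: {v:,} bytes, text: {t:,} bytes, ratio: {v / t:.0f}x")
```

Systems that store compressed audio would show a smaller ratio, but under these assumptions the text version is still smaller by orders of magnitude.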
Voice to text message conversion won’t solve all the problems of efficient business communications and is highly dependent on speech recognition performance. The more critical the need for accurate and personalized contacts, the less likely that shortcuts will be acceptable.
What Do You Think?
Send your comments to me at [email protected].
News From UC Strategies
To get an idea of the different perspectives and issues involved with implementing UC technologies, go to the UC Strategies Web site for better insights on migrating the enterprise to UC.
You can also review the presentations given by the UC Strategies experts at TMC’s IT Expo.