MediaFile

I hear you, says AT&T

November 16, 2010


Anybody who’s been at the wrong end of an automated customer service conversation may understandably have doubts about speech recognition technology. Personally, I’ve been frustrated by systems that couldn’t understand something as basic as whether I’d answered “yes” or “no.”

But AT&T says that after working on speech recognition for more than 20 years, it’s come a long way, both in improving accuracy and in developing cool applications.

After years of profiting handsomely from touchscreen technology in the form of Apple Inc’s iPad, maybe voice will be the next hot mobile interface for the operator?

Of course, it’s not saying whether any of the ideas being cooked up in AT&T labs will actually become full-fledged services. One of its scientist types told me “I don’t know and I don’t really care” in answer to such a question at a technology showcase today.

But maybe it’s telling that speech recognition was the main theme for the event. Here’s a sample of some demos:
    

What: Search by voice for video stored in your digital video recorder, or currently playing on TV
How: Using an iPad set up as a television remote control, I searched for news videos by calling out keywords like “Myanmar” or “Irish debt crisis.” It didn’t always work, but it mostly figured out what I was saying.
Why: Maybe it could make DVRs easier to use for the gadget-challenged or the lazy among us?

What: Search by voice for upcoming TV shows listed in your programming guide. AT&T researcher Michael Johnston called out specific shows like “Deadliest Catch,” categories like “cooking shows,” or the time-specific “cooking shows on Wednesday evening” to come up with a list of options.
How: Johnston made it look easy by simply talking into an app he had developed for the iPhone 4. Once the choices showed up on the TV screen, you could easily choose what you wanted.
Why: It seemed easier and more fun to use than this lazy reporter’s clunky and convoluted remote control. And, as Johnston pointed out, if designed correctly, it could help blind people navigate to hear shows that they like.
    
What: Translation of video speech into text and then to the language of your choice. The text from a TV show appeared as a crawler at the bottom of the screen, in English on the left and Spanish on the right.
Why: If you’re traveling abroad and want to understand what’s going on without having to learn a new language. It’s not perfect, said demonstrator David Thomson, but “it’s accurate enough you can get the gist.”

What: Analyzing recordings of calls made to its customer service centers, with searches for certain phrases or words to highlight worrying trends, and eventually to change bad habits or identify good practices that could be taught more widely.
Why: The company already uses this to help find ways to keep consumers happy with customer service and hopefully stop them from dropping AT&T services.

AT&T also promises that it is working hard on issues such as accuracy, by taking into account the differences between the voices of children, men and women, as well as handling regional accents. It also showed off instances where the system might ask you a question twice, so that if it didn’t understand the first answer it could learn from the mistake and try again.
Hopefully the customer sticks around long enough to answer the second question.

(Photo: Screen demonstration of TV guide search. Reuters)
