It's the Data, Indeed | John Battelle's Search Blog

It's the Data, Indeed

By - April 13, 2007

Tim posits a theory as to why Google is building a 411 service: to get the data that Tellme and Nuance have. Indeed!

… it also seems to me that there’s a hidden story here about the speech recognition itself. I was talking recently to Eckart Walther of Yahoo!, who used to be at Tellme, and he pointed out that speech recognition took a huge leap in capability when automated speech recognition started being used for directory assistance. All of a sudden, there were millions of voices, millions of accents to train speech recognition systems on, and much less need for the individual user to train the system.

This is reminiscent of a comment that Peter Norvig, Director of Research at Google, made to me last year about automated translation, and why it’s getting better. “We don’t have better algorithms. We just have more data.”

In short, I’m speculating that the 1-800-GOOG-411 service is designed to harvest voice data to build Google’s own speech database, rather than licensing from Nuance or another player.



2 thoughts on “It's the Data, Indeed”

  1. komedi says:

    Thanks, very good information…

  2. Perry says:

    John, I picked up on the same comment. It is a much more strategic (and viable) picture of motive. Being “in DA” isn’t the point – being THE 411 in a truly holistic picture is much more to the point.