This paper “presents a high-level discussion of some problems in information retrieval that are unique to web search engines,” according to its abstract in the ACM library. (A reminder as to what this whole “Search Papers” thing is about: read this.) “The goal is to raise awareness and stimulate research in these areas,” it continues. How might such a lofty incitement be backed up? Well, it’s written by two senior employees of Google, Monika R. Henzinger and Craig Silverstein (I’ve met Craig; he was employee #1 after Larry and Sergey, and a nice guy to boot), as well as Rajeev Motwani, a professor at Stanford (Craig was his graduate student).
The paper is dated September 2002, so it does not rank as a missive from the early, more geeky phase of Google’s life, but rather as a more corporate product – the two Google authors knew they bore the weight of “being Google” when they wrote this paper, and it’s worth keeping that in mind when reading through it.
This is particularly clear in the paper’s scope and focus. It lays out six challenges for search engines – and they read like a laundry list of Google’s headaches. The paper then goes on to offer suggested paths for more research on the topics, which I could imagine might read either as genuine or a tiny bit patronizing, depending on who you are. (The paper does not tackle a range of other issues it says are already the subject of abundant research – natural language queries, image/audio search, improving text-based retrieval, language issues, or interface/clustering, for example.)