Had a good talk today with Jon Kleinberg, a professor at Cornell whom some credit with work that inspired PageRank, though he’s far too modest to accept that mantle. He says he’s proud that in academic citations, his work on hubs and authorities is cited alongside PageRank as seminal to the current state of web search. While talking to Kleinberg was great for the historical perspective of my book (he was at IBM Almaden, near Stanford, in the 96/97 timeframe, working on very similar stuff), it was also very interesting to hear his views on where search might be going.
He agrees with the consensus view that search is in its early days. The really hard problems (natural language queries, for example) have yet to be solved. “It’s kind of interesting to see how far search has gotten without actually understanding what’s in the document,” he noted. In other words, search has gotten pretty sophisticated using keyword matching and link/pattern analysis. But search technology still has no idea what a document actually *means* – in the human sense.
Kleinberg outlined one of his core frustrations with search engines, one I am sure all readers have experienced: the inverse search. In this scenario, you know there is a core term or phrase that, if typed into Google, would yield exactly the set of pages you’re looking for. But you don’t know the term, and your attempts to divine it keep bringing up frustrating, irrelevant results. Say, for example, you want to know more about a regulation you’ve heard about, the one that says you have the right to fly – with no additional charge – on a different airline if the one you are on cancels your flight. You want to find out the specifics of that regulation, but how?


