Being Jon Kleinberg

Had a good talk today with Jon Kleinberg, a professor at Cornell whom some credit with work that inspired PageRank, though he’s far too modest to accept that mantle. He says he’s proud that in academic citations, his work on hubs and authorities is cited alongside PageRank as seminal to the current state of web search. While talking to Kleinberg was great for the historical perspective of my book (he was at IBM Almaden in the 96/97 timeframe, near Stanford, working on very similar stuff), it was also very interesting to hear his views on where search might be going.

He agrees with the consensus view that search is in its early days. The really hard problems – natural language queries, for example – have yet to be solved. “It’s kind of interesting to see how far search has gotten without actually understanding what’s in the document,” he noted. In other words, search has gotten pretty sophisticated using keyword matching and link/pattern analysis. But search technology still has no idea what a document actually *means* in the human sense.

Kleinberg outlined one of his core frustrations with search engines, one I am sure all readers have experienced: the inverse search. In this scenario, you know there is a core term or phrase that, if typed into Google, would yield exactly the set of pages you’re looking for. But you don’t know the term, and your attempts to divine it continually bring up frustrating and non-relevant results. Say, for example, you want to know more about that regulation that you’ve heard about, the one that says you have the right to fly – with no additional charge – on a different airline if the one you are on cancels your flight. You want to find out the specifics of that regulation, but how?

Queries Getting Denser

Via DMNews, saw this study from OneStat (a web analytics company) on query trends. It basically said that folks are starting to use more words in their queries. Why? They’re not getting the results they want? They know more words will mean a better result? Little of both? Not much here on that piece of the story.

Slowly, The Battleship Comes About

Verizon’s Yellow Pages is finally getting into the pay-per-click game, Wonk reports via MediaPost. The massive phone co. plans to revamp its Superpages.com website to focus on the local advertising market. Funniest quote: the Verizon guy claiming this is not in response to Yahoo/Google handing them their ass. He does have a pretty funny slap at the leader: “If you want to find out about the history of plumbing, you go to Google. But if your sink’s backed up at 2 a.m., we get you right to what you want to know.”

I dunno, but if my plumbing is backed up at midnight, something tells me turning on the computer is not first on my mind.

Gigablast: One Man, 8 Machines, And Now Related Concepts

Gigablast now has related concepts at the top. In his blog, Matt Wells points out that his engine runs on eight desktop machines. Check it out. It’s damn fast (and good) for eight machines.

This Search Blows

Blowsearch aggregates 20 different engines and claims to be “fast as the wind.” The site also has a toolbar that’s got some buzz round the search community (link via Search Engine Lowdown).

Winer: Orkut Is Google’s Identity System

Dave says with Orkut, Google is one step closer to being our trusted agent for ecommerce transactions, comment aggregation, and much more. He likens it to the Liberty Alliance, or MSFT Passport…

Udell on Scylla and Charybdis

Over at InfoWorld (thanks Matt) Jon Udell is working out what might be a neat hack between the full-text approach to search found at most search engines and the rather utopian approach of the fully structured semantic web. It involves, among other things, converting RSS feeds into XHTML. Not for the faint of heart, but an interesting angle in terms of grokking how useful search may evolve from the feed-o-sphere….

When Gary Price Writes…

…many folks listen. Gary is the Editor of Resourceshelf and a strong voice in cutting-edge librarian/geek culture. In this piece, guest written for Pandia.com, Gary lists his top ten gripes about Google. Many of them run along a theme which might best be summed up as failures to nurture the open, geek culture from which Google sprang.

Highlights:

1) Google needs to fix several advanced search problems. Many of them have been known for several months. These are things that should work….

Sick of Orkut? Try Urkel

In his new book, Steven Johnson points out how unusual it is for folks to laugh out loud when they are alone. This Orkut parody did the job for me…thanks to Weinberger.
