The Reshuffle Button | John Battelle's Search Blog

The Reshuffle Button

By - February 14, 2004

The last entry on Yahoo’s new search got me thinking about search results, and in particular Google’s, which nearly everyone imitates in one form or another. We all know about the endless list of results, 10 to a page, stretching past what Tim Bray calls “the Google event horizon.” I used to think that horizon was 100 or so entries – no one will ever look further than that. But the truth is, it’s usually one page of listings, if not less.

I’ve gotten to thinking – what’s the use of having all those results? I mean, really, from a user interface point of view, the only information we gain from “Results 1 – 10 of about 3,950,000” is the rather attenuated sense that the search engine is, in fact, pretty darn thorough. That used to be a big deal, back when engines were really crappy. But these days we expect engines to be thorough. What’s the point of giving me a list of more than 3 million results when I am never, ever, ever going to go through them?

Seems to me it’s time to change the interface. Clearly many others have thought about this, from Grokker to Mooter to Vivisimo and beyond. But it’s the big guys, Google and Yahoo, that make the standards, and I think we’re getting close to the point where a new user interface paradigm is needed for search. Danny talks about invisible tabs, and that’s a good idea. But I’m not talking about intuiting what the user wants – that’s the hard stuff, and I know there are plenty of PhDs working on that. I’m talking about something much less difficult: changing the way the results we get are presented.

Here’s what I’d like to see, as a small step in a new direction: a button that I can hit when the results come up which reshuffles the search in an intelligent way. In a fit of originality, I’ll call it the “reshuffle” button. Show me the first ten results, and only those first ten. Just as I do now, I’ll scan them. If there’s nothing there, I’ll hit “Reshuffle”, and the engine shows me another 10 results, only this time, it eliminates pages that are similar to the ones it showed me before. This way, you can quickly and intuitively sift through all those results, grokking and pruning your search as you go. This is not some massively new visual approach; it’s just a quick hack that allows me to drill down. It’s this kind of stuff, I think, the simple stuff, which ends up being the most elegant and useful. I know there’s much work to be done, and there are plenty of NP-hard problems to solve in search (I know because I’m trying to grok them and write them up in plain English for my book). But solving those problems will take years.
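To make the idea concrete, here is a minimal sketch of what a reshuffle could look like. It is an illustration under loud assumptions, not how any engine actually works: the function names, the `(url, snippet)` result shape, the word-overlap (Jaccard) similarity test, and the 0.5 threshold are all made up for the example. A real engine would presumably use a much richer notion of “similar.”

```python
import re

def tokens(text):
    """Lowercased word set, used here as a crude page fingerprint (an assumption)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Overlap between two token sets: 0.0 means disjoint, 1.0 means identical."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def reshuffle(ranked, seen, page_size=10, threshold=0.5):
    """Return the next page of results, skipping anything already shown
    or too similar to something already shown.

    ranked: list of (url, snippet) pairs in the engine's original order.
    seen:   list of (url, snippet) pairs the user has already scanned.
    """
    seen_urls = {url for url, _ in seen}
    seen_tokens = [tokens(snippet) for _, snippet in seen]
    page = []
    for url, snippet in ranked:
        if url in seen_urls:
            continue  # already shown
        candidate = tokens(snippet)
        if any(jaccard(candidate, s) >= threshold for s in seen_tokens):
            continue  # looks like something the user already rejected
        page.append((url, snippet))
        if len(page) == page_size:
            break
    return page

# Hypothetical results for the query "REM".
ranked = [
    ("a.example/1", "REM band tour dates and tickets"),
    ("a.example/2", "REM band tour dates and concert tickets"),
    ("b.example/1", "REM sleep stages explained"),
    ("c.example/1", "REM sleep cycles and dreaming"),
]
first_page = ranked[:1]
next_page = reshuffle(ranked, first_page, page_size=2)
```

Here the second band page is dropped because it nearly duplicates the one already shown, so `next_page` holds the two sleep pages. The original ranking is otherwise left alone, which is the point: the hack sits on top of the current results rather than replacing them.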

In my discussions with folks at Google and elsewhere, I often hear a resistance to changing search approaches due to technical reasons – clustering, for example, is not used at Google because the results are not considered relevant enough. But what about the user interface for results? The most frustrating thing in the world is seeing “Results 1 – 10 of about 3,950,000” and knowing that somewhere in that haystack is your needle. But why sift blindly through the event horizon? Maybe some UI innovation on top of the current results can help.

(While I’m ranting, I’d like the engine to suggest better query terms for me. It can’t be that hard to store user queries and cluster those which have similar constructions, query words, or results/paths taken. I’d like to hit a button that says “show me similar searches.” I think this exists somewhere, but I can’t remember where (yeah, I know about Direct Hit, that’s not what I mean exactly). It’s not quite collaborative filtering, but it points that way.)
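As a rough sketch of the “show me similar searches” button, the snippet below ranks logged queries by how many words they share with the current one, breaking ties by how often each was issued. Everything in it is hypothetical: the function names, the in-memory query log, and the 0.3 overlap cutoff. It also uses only the “query words” signal; it is a simple overlap ranking rather than true clustering, and it ignores constructions and results/paths taken.

```python
import re
from collections import Counter

def normalize(query):
    """Lowercase a query and collapse it to space-separated words."""
    return " ".join(re.findall(r"[a-z0-9]+", query.lower()))

def similar_searches(query, query_log, limit=3, min_overlap=0.3):
    """Return up to `limit` logged queries that share words with `query`,
    best overlap first, more frequent queries first on ties."""
    target = normalize(query)
    target_terms = set(target.split())
    counts = Counter(normalize(q) for q in query_log)
    scored = []
    for past, freq in counts.items():
        if not past or past == target:
            continue  # skip empty entries and the query itself
        past_terms = set(past.split())
        overlap = len(target_terms & past_terms) / len(target_terms | past_terms)
        if overlap >= min_overlap:
            scored.append((overlap, freq, past))
    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return [past for _, _, past in scored[:limit]]

# Hypothetical query log.
log = ["REM music", "rem music", "REM sleep", "REM band albums",
       "giraffe osteoporosis", "REM"]
suggestions = similar_searches("REM", log)
```

With this log, `suggestions` comes back as “rem music”, “rem sleep”, “rem band albums”: the unrelated giraffe query is filtered out, and the more popular of the two equally close matches is listed first.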

Any readers out there know about tools or research that do some of what I’m on about, or have a response as to why UI innovation is a bad/too difficult idea?

  • http://www.stapleton-gray.com Ross Stapleton-Gray

    > While I’m ranting, I’d like the engine to suggest better query terms for me. It can’t be that hard to store user queries and cluster those which have similar constructions, query words, or results/paths taken.

    “You have chosen ‘fiscal responsibility.’ More Americans have chosen ‘war on terrorism.’ Use that to reform your query? (Y/N/Escape to Canada)”

  • http://battellemedia.com John Battelle

    Gary Price has wonderful comments on my initial post. If you read to the end, you’ll learn a lot…

    (reposted here with his permission)

    John, a couple of comments about your post. I hope they are useful to you. If
    you think it’s worthy, feel free to repost on your site.

    cheers,
    gary

    —-

    1) For most web search engines, the total number of hits is junk and means very
    little. First, the counts are far from accurate. Second, they include duplicates
    and near duplicates. Third, they mean nothing because most search engines
    will only show you about 1,000 results (if you would even look that far down
    the list). In other words, they are really just a promotion tool to make the
    user think their search is “seeing it all.”

    2) Don’t forget that you’re only seeing what the web crawler can find. I don’t
    need to tell you that a great deal of material is not on the open web. It might
    be in a fee-based system or reside on the invisible or opaque webs. How much
    time are people wasting searching the open web for something that doesn’t exist
    or isn’t current?

    In many cases, the searcher can save time and aggravation by using a specialized
    database that is focused on a specific topic. Instead of searching the massive
    open web (feel free to take a guess at its size), direct the user to a
    specialty database (smaller in size) and run the search there. This is
    another approach that increases the precision of your search and lowers the recall.
    Said another way, a bigger database doesn’t always mean better results.

    In the library world these days a great deal of interest is in what was once
    called “federated search” but has been renamed meta search. This is not the
    same way the web world thinks of metasearch. These products will allow an
    organization to build categories of search tools, for example “Business” or
    “International Relations” or “Biomechanical Engineering.” Then, the search
    technology will parse the query for each underlying database, run the search,
    remove duplicates, and merge the results onto a results page that has also been
    designed by the local organization. In other words, you can present the results
    in a way that will best serve the user group.

    This paper (which uses another term for the technology, “common user
    interface”) reviews a great deal of the technology:
    http://www.natlib.govt.nz/files/CUI_Report_Final.pdf

    This technology will work with open web resources (Google), specialized and
    Invisible Web databases (ResearchIndex), and proprietary info systems that a
    library or company might pay for (LexisNexis). Because some of these products
    are not dependent on any standard, advanced search features can be easily
    mapped to the search interface.

    3) A new user interface from Yahoo (http://search.yahoo.com) allows the user to
    pick which tabs/databases will be visible on their Yahoo search page. Of
    course, many people don’t use the tabs.

    4) I think something that is very interesting/exciting is Ask.Com’s Smart Answer
    technology. Instead of just returning links, it takes a stab at an answer
    and places it at the top of the results page. It does this not with the old Jeeves
    idea of presupposed question/answer sets but with NLP.

    For example:
    What is the capital of Spain.
    http://web.ask.com/web?q=what+is+the+capital+of+spain&o=0&qsrc=0

    Who won the academy award for best actor in 1972.
    http://tinyurl.com/2x5en

    5) Suggesting better query terms? AltaVista has had Prisma for a couple of
    years.
    http://www.altavista.com/web/results?q=battelle+media&kgs=0&kls=1&avkw=aapt

    Teoma does a good job by using the names of “user communities” to help narrow
    and focus.

    Gigablast just launched Giga Bits.
    http://www.gigablast.com/search?q=john+battelle+journalist

    Also, Yahoo’s SmartSort is very useful and works well at getting the user to
    what he or she is looking for.
    http://shopping.yahoo.com/smartsort

    Reading:
    This article by Silverstein and Henzinger of Google and a Stanford associate
    discusses several problems that engines face getting the material into the
    database in the first place.
    “Challenges in Web Search Engines”
    http://www.acm.org/sigs/sigir/forum/F2002/henzinger.pdf

    This White Paper from Vivisimo is also interesting reading.
    “Needed: A More Selective Ignorance”
    http://vivisimo.com/docs/overlook.pdf

    cheers,
    gary

  • http://www.constellationw3.com/carnet/ Anonymous

    My first thought was Google’s Similar Pages. But that feature is thrown off by the “Portfolio” or “My clients” pages of sites, which can hold many links to very different types of customers.

    I think you should look at, and use for a while, http://www.Kartoo.com (one of the first engines to do clustering) and its new Kapitalyser, which keeps a history of your searches and learns from it. They should re-release a plain HTML version soon.

  • Kevin

    I think the benefit of knowing that 3,000,000 versus say 300 items matched your query is it gives you a sense of how useful it would be to refine your query.

    In practice, if you search for “REM” and find 43,000,000 results, it’s an indication that maybe you should refine your search to “REM music”. On the other hand, if you search for “Giraffe osteoporosis” and get 34 results, you have a good idea that if the first 10 don’t suit you, refinement probably won’t help very much.

    Think of adding a second search term as a user-directed reshuffling.

  • Lorraine

    I’m currently trying to find a way to communicate your “reshuffle” function (it’s a very similar algorithm, except you keep one result and reshuffle the rest accordingly).
    Any suggestions for a very small icon?
    It needs to be displayed per result.
    It’s a real nail-biter.