I’ve just finished reading A Taxonomy of Web Search by Andrei Broder, written largely while the author was CTO of AltaVista (and using AltaVista query data), and published after he moved to IBM Research in 2001.
The paper has a trove of references to other papers, which is good for my work, and it has a singular thesis: that not all web searches are equal. Broder sets out to dispel the notion that all searches are “informational” in nature. He instead maintains that many are “transactional” or “navigational”. These two seemingly obvious categories are in fact relatively new to the academic field of Information Retrieval (IR), which developed largely in the context of large islands of data (i.e., in the 70s and 80s) rather than in the web era.
What I like about this paper is the use of the word “intent” – which over the years I’ve come to use quite a bit (see my last column on video advertising over the internet, in which I rant once again on “intent over content”, or my post on The Database of Intentions). Intent is behind every kind of search, Broder says, but “there is no assumption … that this intent can be inferred with any certitude from the query.” Ay, there’s the rub. To get at that intent, Broder ran a short survey on the AltaVista site.
A few fun facts from Broder’s analysis of the survey responses and related log data:
– Nearly 15% of searchers wished for “a good collection of links on a subject” as opposed to “a good document.”
– 12% of queries in the log data were sexual in nature.
– Nearly 25% of searchers were looking for “a specific website that I already had in mind.”
– An estimated 36% of searchers were looking for transactional information – what Broder calls “the intent to perform some web-mediated activity.”
Broder concludes that the next generation of search engines will need to take into account this new taxonomy of intent – transactional and navigational as well as informational. Given that this paper was published in late 2001, it’s interesting to see how far down that path the major engines already are, with Yahoo’s focus on shopping being one of the best examples.
7 thoughts on “The Search Papers: Defining Intent”
I think that web taxonomy has changed because of the war between web search engines for a high-value market. Yesterday I checked a store that supports IT devices such as terminals, barcode scanners and printers, but I am still searching for a solution to generate barcode tags from a Flash interface.
I RECEIVED A COPYRIGHT FOR MY POEM FOOTPRINTS IN THE SAND. I THINK IT WAS FOR APRIL 25TH, 1978 OR 1979. THE COPYRIGHT PEOPLE KEEP SAYING THEY CANNOT FIND IT. I DISTINCTLY REMEMBER RECEIVING IT. I NEED HELP TO FIND IT. IT IS LIKE THE ARK OF THE COVENANT, STORED AWAY SOMEWHERE IN THE BOWELS OF THE LIBRARY OF CONGRESS. PLEASE HELP ME.
The interesting thing I get from those statistics is that people who search really don’t know what the hell they really want (at least at the beginning). I think more and more searches are being geared towards generally educating the searcher as opposed to “getting the right answer now”.
Thank you for the interesting info.
If I search for something in Venezuela, I use google.co.ve, but most of the sites about Venezuela that I find are from US companies with no interest in the country, only in making money with AdSense and the like. The US dominates everywhere in this world. I shot the sheriff 🙂
An estimated 36% of searchers were looking for transactional information – what Broder calls “the intent to perform some web-mediated activity.” I’m not sure that this is correct; more statistical information is needed. For example, when people search for a term like “impresora pvc”, only Wayback and one other solution have a way to know the stats, because search engines drive more wanted text.