I met Elizabeth van Couvering while working on the book. She’s published a paper titled Is Relevance Relevant? Market, Science, and War: Discourses of Search Engine Quality.
For your Friday reading pleasure. From the abstract:
Fairness and representativeness, core elements of the journalists’ definition of quality media content, are not key determiners of search engine quality in the minds of search engine producers. Rather, alternative standards of quality, such as customer satisfaction and relevance, mean that tactics to silence or promote certain websites or site owners (such as blacklisting, whitelisting, and index “cleaning”) are seen as unproblematic.
10 thoughts on “Search Paper: Is Relevance Relevant?”
Then how does she explain the rise of Wikipedia in SERPs? Wikipedia is known for trying to promote a “Neutral Point of View,” which is geared directly towards fairness and representativeness. Google and other engines are very aware of this. I bet she would be surprised to find out the actual features that are used in determining site quality. The fact of the matter is that this information is confidential, and she has no idea how they do it, except for snippets taken from patents, etc.
I’ve talked to folks inside Google, and it is my understanding that there is a certain amount of representativeness that they try to achieve in their rankings. In other words, they don’t just show the pages with the highest PageRank. They try to preserve a certain degree of variety and diversity alongside the popularity factor of the PageRank heuristic.
Or so I’ve heard. The problem is that there is so little openness in the whole process that we really have no way of knowing. But I would like to believe that what I have heard is true.
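To make that rumor concrete, here is a toy sketch of what “diversity alongside popularity” could look like: a greedy re-ranker that trades a popularity score off against similarity to results already chosen. Every name, weight, and formula here is purely illustrative — this is not Google’s actual method, just one well-known way such a trade-off can be expressed:

```python
# Illustrative only: a toy greedy re-ranker that balances a popularity
# score (a stand-in for PageRank) against similarity to already-chosen
# results, so the top of the list stays diverse. Nothing here reflects
# any real engine's internals.

def rerank(candidates, similarity, top_k=3, diversity_weight=0.5):
    """candidates: list of (doc_id, popularity) pairs, popularity in [0, 1].
    similarity: function(doc_a, doc_b) -> value in [0, 1]."""
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < top_k:
        def score(item):
            doc, pop = item
            # Penalize documents too similar to what we've already shown.
            max_sim = max((similarity(doc, c) for c, _ in chosen), default=0.0)
            return (1 - diversity_weight) * pop - diversity_weight * max_sim
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return [doc for doc, _ in chosen]

# Toy example: three near-duplicate popular pages and one distinct page.
docs = [("a1", 0.9), ("a2", 0.85), ("a3", 0.8), ("b1", 0.5)]
same_topic = lambda x, y: 1.0 if x[0] == y[0] else 0.0  # "a*" vs "b*"
print(rerank(docs, same_topic))  # → ['a1', 'b1', 'a2']
```

Ranking purely by popularity would return the three near-duplicates first; the diversity penalty pulls the distinct page up into second place.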
The core premise of this paper is, in my view, somewhat elitist. Why should journalists be the ones determining what is “quality media content?” This should be (and in fact is) market-driven – which is precisely why New Media companies like Google have prospered in the last ten years while Old Media companies are getting their clocks cleaned. The bottom line is this: If anybody wishes to create a search engine returning only those search results that have been anointed by journalists as “fair and representative”, they are welcome to try – and let the marketplace judge the success or failure of their venture.
It was an interesting read, but I do disagree with several of Ms. van Couvering’s conclusions. Themes like diversity, fairness, and bias are mentioned in many of our discussions at Google.
Thanks for the comments, and thanks John for blogging. First of all I’d like to say that I don’t think that journalists should be choosing search content! Not at all – I simply think how results are constructed should be a matter for public discussion instead of being quite so confidential. Most people don’t understand at all how results are created, likening them to alphabetized lists, and they trust the results out of proportion to their relevance, as another paper in the same journal issue by Pan et al. discusses. So making the process more transparent is an important issue.
I would like to take issue with Mr Hoskinson because I don’t think that the market operates to guarantee the best results for everyone. Market popularity doesn’t operate on a one person, one vote basis, but on a one dollar, one vote basis. The difficulty is especially apparent when you look outside the US – in countries without a strong advertising market, search engine provision is much weaker. Again, another paper in the same journal (by Vaughan and Zhang) documents the differences in search results in different countries.
I’m glad to hear from Mr Cutts that these issues are being raised within Google. One of the things that I think is most important is that people working within the search field develop a strong ethos as information professionals (such as those we see, for example, in librarians and journalists) and not only as engineers. I heard only a little of that in the interviews I conducted – perhaps it’s changing?
For what it’s worth, Oshoma Momoh, formerly of MSN Search, has also blogged about the article on his blog, My Own Private Radio, and seems to agree with my conclusions.
Sorry about not being able to insert the links to the papers and blog above – I don’t have a TypePad account.
I am going to have to respectfully disagree with Elizabeth on a few of her points. The inner workings of a search service are trade secrets, unless patented, in which case they are already available for worldwide public viewing and comment on uspto.gov (for US patents and published applications). Many inventors patent their innovations, and have thus already made the required public disclosure. Others choose the trade secret route to ensure their competitors do not steal and profit from their ideas.
Changing this process would turn 200 years of intellectual property law on its head with many undesirable side effects, such as reducing the incentives for inventors to create new innovations and venture capitalists to risk money on new breakthrough ideas. It is also unnecessary: there is sufficient transparency, in my view, through patent disclosure. In the case of trade secrets, the consumer is welcome to take his/her business elsewhere if he/she is not getting the desired outcome from a particular search service and doesn’t understand or feel confident in its heuristics for providing results.
Capitalism is not “one dollar, one vote.” Any consumer with Internet access (regardless of net worth) can, in essence, cast a vote by choosing to use an Internet search service such as Google (or not), thereby creating the “click-through” market for advertisers that ultimately fuels these innovations. Switching services is just a few browser clicks away if consumers are in any way dissatisfied with the service’s quality or performance.
The bottom line is this: The current system is working well, and tinkering with it as the author suggests would probably cause more harm than good. The free market does an excellent job of validating good ideas (and rejecting bad ones). Existing patent disclosure laws provide sufficient transparency into the inner workings of innovations like Internet search services while protecting the intellectual property rights of inventors and their assignees and incenting them to create further innovations. Inventors who choose the trade secret route have their investments in intellectual capital protected by reasonable intellectual property laws and policies. Consumers have the ultimate remedy of taking their business elsewhere if they are not satisfied with a single vendor or service.
Did Ms van Couvering define “search engine”? How did she select the people she interviewed? Did she, for example, interview anyone from DoubleClick.COM or YouTube.COM or Hotels.COM or eBay.COM or MySpace.COM or Download.COM or NYTimes.COM? Is she implying that such sites cannot be used to “search” for information? Which inclusion/exclusion criteria did she use? If the above sites were *excluded* from the study, why? Was Baidu.COM considered to be a “search engine” or not? Considering Baidu.COM ranks above Wikipedia.ORG on Alexa, would such an exclusion introduce bias into the results?
Consumers have the ultimate remedy of taking their business elsewhere if they are not satisfied with a single vendor or service.
Your points are well taken. But at the risk of putting words into Elizabeth’s mouth, I don’t think she is calling for the revelation of every single trade secret. All she is calling for is a little more transparency.
Let me make a little analogy. A search engine is like a can or package of food. Let’s say a can of chicken vegetable soup. Essentially what you are saying, Andy, is (1) the soup manufacturer should not have to disclose anything that goes into the can. Intellectual property and trade secrets override any public consumer good of knowing what, exactly, is in the soup. Furthermore, (2) the system is self-correcting and should not be tinkered with, right? To paraphrase you: “the consumer is welcome to take his/her business elsewhere if he/she is not getting the desired outcome from a particular [soup can] and doesn’t understand or feel confident in its [nutritional value or other dodgy contents].” Right? In other words, you would have told Upton Sinclair to go away and mind his own business.
I myself do not have the same confidence in rampant, unchecked capitalism as you do. I see no problem with food and drug laws that require producers to label the contents and nutritional values of the material inside their cans. And the fact that a soup company has to list all its ingredients somehow does not hurt its ability to keep its exact recipe a trade secret.
I don’t see what is so wrong with wanting something analogous for information. Improperly labeled food can kill dozens of people (think: nut allergies). Improperly labeled information can lead entire nations astray and kill thousands (WMD and Iraq, anyone?) Since most of us have come to rely, more and more, upon search engines and the information we get from them to make important decisions in our lives, what is wrong with asking for labels disclosing how this information was arrived at?
Hi JG – I like the chicken soup analogy!
Mr Hoskinson – your argument is thoughtful, which I appreciate. Continuing the discussion, I would just point out that there aren’t many “elsewheres” to take your business to now, given the massive consolidation in the search industry over the past few years. All kinds of media markets, and I would say search is no different, are prone to “market failure” (i.e., the failure of competition to deliver a better product) and the development of oligopolies. There are oligopolies in radio, television, film, and newspapers (although newspapers are in some countries quite competitive). There is a lot of empirical evidence that unregulated media competition in all kinds of media doesn’t really work to deliver a better product.
A second point (I’ll call it the chicken soup point) is that customers don’t really know what’s in the can right now. To take a broadcasting example, if all you ever see is propaganda, how can you be sure what the real news is? (Apologies to the search engine workers who I know are *not* trying to create propaganda-style results – it’s just an analogy).
Finally, media are more than just another product: they provide the invaluable service in a democracy of giving citizens access to important information, which allows them to thoughtfully choose their government. This idealistic position is one of the reasons that many democracies have put in place and supported public service broadcasters, for example. My view is that information providers – like search engines – need to be held to a higher standard than at present.
NMW – My interviewees were chosen on the basis that they were senior engineers in the major search engines of today and occasionally yesterday. That is to say, I specifically targeted engineers at Google, Yahoo, MSN, and Ask, and additionally was interested in talking to people from some of the old giants like AltaVista, Excite, WebCrawler, etc. I make no claim that these are the *only* search engines, but in terms of global traffic to search, the first four are the ones that have the impact today. There are many, many other interesting search engines and search engine-like entities, including Baidu, SeekPort, Nutch, Digg, even del.icio.us. But (with the exception of Baidu in China) most people don’t use them. There are other places people can go for information – Wellman et al.’s piece in the journal describes how people check with their friends first before going online, to give just one example. Unfortunately a researcher is always limited in what she can achieve in a particular project. I also don’t understand your seemingly irritated tone. Is my work the be-all and end-all on search engines? Absolutely not. You are of course welcome to undertake your own research, and perhaps your conclusions will be different from mine. But do I think it’s a topic worth studying? Yes, I do.
Ms. Van Couvering, thank you for that balanced response — and I am sorry if my earlier comment was a little “gruff”.
I find it rather upsetting that so many people are willing to use the same yardstick to measure everything – and hardly anyone seems to recognize that this is a temporary and rather extreme aberration. Instead, people behave almost as if it were sacrilegious to question the results of this or that “one-size-fits-all” search engine.
I much prefer the opinion that a multiplicity of search engines will ultimately make these “major” (“one-size-fits-all”) search engines superfluous (except, perhaps, for the “naive” and/or “novice” user). The results of more targeted, vertical search engines will be far better (WRT relevance and/or reliability) than a handful of “one-size-fits-all” algorithmic search engines. Therefore, I feel that your supposition that “search” is essentially an algorithmic undertaking ought to be questioned (especially in light of the fact that search engine results such as “Googlebombs” are commonly hand-tweaked and/or manipulated to produce “desirable” results).
I commend you for taking on such a large and momentous topic – and look forward to hearing more about your continued progress WRT search and/or information retrieval. I expect that in the coming years you will need to use a more “inclusive” approach – perhaps not jumping right away to the hundreds of millions of websites registered, but at least considering the several hundred or thousand websites that are more promising springboards for “seasoned” and/or “advanced” users (rather than having to wade through more or less randomized advertisements and/or linking schemes).