Boing Boing has a good write-up of the philosophy behind Wikia Search, though it’s light on details. Wikia Search will launch Jan 7, according to this WP article. From the BB post:
But ranking algorithms are editorial: they embody the biases, hopes, beliefs and hypotheses of the programmers who write and design them. What’s more, a tiny handful of search engines effectively control the prominence and viability of the majority of the information in the world.
And those search engines use secret ranking systems to systematically and secretly block enormous swaths of information on the grounds that it is spam, malware, or using deceptive “optimization” techniques. The list of block-ees is never published, nor are the criteria for blocking. This is done in the name of security, on the grounds that spammers and malware hackers are slowed down by the secrecy.
But “security through obscurity” is widely discredited in information security circles. Obscurity stops dumb attackers from getting through, but it lets the smart attackers clobber you because the smart defenders can’t see how your system works and point out its flaws.
Seen in this light, it’s positively bizarre: a few companies’ secret editorial criteria are used to control what information we see, and those companies defend their secrecy in the name of security-through-obscurity? Yikes!
3 thoughts on “Wikia Search: BB Review”
I like Boing Boing, but this discussion seems to have been written without much experience in search technology and IR. The insight that “security through obscurity” is often not the right approach is an important and useful one, but it cannot be blindly applied to all problems in security and beyond. In particular, I do believe that obscurity is of great benefit to existing engines for fighting spam, and I do not see how an open approach would work. Yes, the engines may also have other motives for obscurity (trade secrets, censorship), but I think the argument for some obscurity in ranking to fight spam is sound. An open engine is likely to be overrun by spam, unless it is either too insignificant to be spammed or focuses a large amount of work on keeping clean a limited amount of data.
More realistic and useful than openness might be to try to achieve more diversity in ranking functions, by opening up the playing field so that more organizations can offer their own large-scale search engines. That is, remove the need to invest substantially in hardware and software infrastructure before entering the market. Amazon’s infrastructure might be very useful there, especially if it is augmented by the right kind of software and a good feed of data. Some of those new entrants might then choose an open approach, and we would see how that works. But diversity is the real issue. Concerning Wikia, it is not clear to me at this point that they have any useful insight about search.
BTW, does anybody have a pointer to more details about their technology and ideas? Whatever I can find on the site has a certain “babes in the woods” flavor to it, but I do hope there is something else out there that justifies some of the attention this seems to be getting. I know about Grub a little, which by itself would not justify the attention, so what else is there?
Being a former Search alum, this is the first time I’ve seen someone capture the truth about Google and the broader search market. There is no Church vs. State. It’s all (ranking of pages) controlled by people and programmers with agendas. Now, it’s rare that these people actually do anything beyond kick out spammers, etc. but this isn’t always true. I know of a few cases where engineers didn’t like what was coming up on certain queries so they would “modify” the results in order to make the top three results look better. This, mind you, came from a company that claimed to NEVER do such things.
I like the commentary by both TS and stone — agree wholeheartedly. What about this review does not equally apply to all algorithmic engines? And would anyone really continue visiting a medical practitioner if he/she entered whatever question was asked into a one-size-fits-all search engine? (I doubt it — I might be somewhat reassured if it were some medical database, but I wouldn’t want my own health to depend on some ppc scheme 😉