I had a long chat today with folks from Yahoo about the ongoing “size matters” tempest, and it was once again enlightening. I’m planning a longer post on all this, but the upshot of our conversation was that Yahoo stands by its number, that it agrees with many that size alone does not matter, that any claim that one company can accurately estimate another’s index is simply not defensible, and that, in the end, the proof will be in the results.
Yahoo also acknowledged that it was certainly aware of the PR angle when it made its announcement, and that given Google’s home page claim regarding index size, it was hardly a new tactic to tout that number.
I think there’s more to this story than meets the eye, in terms of a major, multibillion-dollar tussle for the hearts, minds, and pocketbooks of millions of web users. Sure, the math is hard, and the science even harder, but at the end of the day, I think size matters, a lot. Maybe not so much to the ultimate results one gets – that may well be a case of “it’s not the size of the wand, it’s how you wave it” – but in terms of bragging rights and marketing mojo. Perhaps the ultimate end game of all this will be a deeper cultural awareness of what constitutes good search, but then again, no one ever got rich overestimating the public’s taste for nuance.
BTW, several sources contacted me to remind me of a fact we all know to be true: Google’s claimed index size on its home page, roughly 8 billion documents, is pretty out of date. Since they put that up, nearly a year ago (scroll to bottom), I’m pretty sure the discoverable web has grown by, oh, at least a few billion pages, and I’m also pretty sure Google knows about those pages. Recall that Google increased its index by roughly a factor of two back then, as a response, one would presume, to Microsoft’s claim to have trumped Google’s number, which had been reported at about 4 billion. I mean, heck, this new post will create a page, and I bet Google (and Yahoo and everyone else) will have found it within a week, if not a day. Blogs alone are adding millions of pages every week.
Would I be surprised if Google announced shortly that its index was magically up to, oh, 22 billion or so? No, I would not. I think if and when that day comes, the tenor of this debate will change. Clearly, growth like that would not have occurred overnight.
Heck, if engines are going to tout their index sizes anyway, I’d love to see the static numbers on home pages (Yahoo sometimes touts its Image Search index size on that service’s home page, by the way) replaced with a counter that is updated constantly. Kind of like that national debt billboard, but for the overall size of the web as discovered by each engine. Why not? At least it’d be more accurate….