Danny: Screw Size

A fine piece of Jesus Not Again writing from Danny. I’m deep in this as well, as those of you who’ve read my previous posts know. More is coming, but I promise I will be as brief as can be. I’m waiting to talk with a couple more folks. Danny notes that he and Gary will also be posting more later in the week. I agree with Danny that relevance is key, but I think it’s nearly impossible to set a standard for relevance – it’s too subjective. I disagree that size is not important. Once we can figure out how to audit and count size, it’s important, as important as UI, speed, or algorithms. It’s also important in a business sense – it’s a number that folks pay attention to, that marketers know works, and that the mainstream press will parrot. Even if you disagree with the tactics, and I do, it’s still important….

2 thoughts on “Danny: Screw Size”

  1. I’ve written a post pointing out

    Flaws in NCSA Yahoo/Google study

    I’ve dug into some of the study’s data, and written an initial
    quick blog post to point out two bad flaws. The methodology used does
    indeed have a selective bias, towards both:
    1) search-engine spam pages, and 2) large word lists.

    Briefly, searching for random words from a large wordlist
    creates a tendency to select *large* *wordlists*, and also
    gibberish spam pages which happen to contain those words (probably
    derived from the same large wordlists). Moreover, this effect applies
    (to some extent) to *every* *search* *sample*. In fact, many of the
    searches could be repeatedly selecting the *same wordlist file*,
    or similar. Because either Google had more large wordlists indexed, or
    Yahoo eliminated many of them as useless data, this results in an
    extremely misleading conclusion about the relative size of their databases.

    In effect, the outcome is that a relatively small number of
    dubious documents are being repeatedly sampled, rather than any sort
    of comprehensive examination.
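
The selection bias described in this comment can be illustrated with a toy simulation. This is only a sketch under made-up assumptions, not a reconstruction of the study: the corpus sizes, the 90/10 split between common and rare words on ordinary pages, the 80% vocabulary coverage of each wordlist file, and the rule of sampling one random hit per query are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def simulate(vocab_size=20_000, common_words=1_000, num_normal=2_000,
             num_wordlists=5, num_queries=500, seed=0):
    """Toy model: query random vocabulary words, sample one hit per query."""
    rng = random.Random(seed)
    vocab = range(vocab_size)
    index = defaultdict(list)   # word id -> ids of documents containing it
    is_wordlist = []            # is_wordlist[doc_id] -> True for wordlist files

    # Ordinary pages (assumption): 90 common words plus 10 words drawn
    # from the whole vocabulary.
    for _ in range(num_normal):
        doc_id = len(is_wordlist)
        is_wordlist.append(False)
        words = set(rng.sample(range(common_words), 90))
        words.update(rng.sample(vocab, 10))
        for w in words:
            index[w].append(doc_id)

    # Wordlist files (assumption): each contains 80% of the vocabulary.
    for _ in range(num_wordlists):
        doc_id = len(is_wordlist)
        is_wordlist.append(True)
        for w in rng.sample(vocab, int(0.8 * vocab_size)):
            index[w].append(doc_id)

    # Each query is one random word; keep one random matching document.
    sampled = []
    for _ in range(num_queries):
        hits = index.get(rng.randrange(vocab_size), [])
        if hits:
            sampled.append(rng.choice(hits))

    wordlist_samples = [d for d in sampled if is_wordlist[d]]
    return {
        "corpus_share": num_wordlists / len(is_wordlist),
        "sampled_share": len(wordlist_samples) / len(sampled),
        "wordlist_samples": len(wordlist_samples),
        "distinct_wordlists_sampled": len(set(wordlist_samples)),
    }


if __name__ == "__main__":
    print(simulate())
```

In this toy setup the wordlist files are a tiny fraction of the corpus, yet they account for a large share of the sampled documents, and the same handful of files is hit over and over. Real search engines rank and filter results, so the actual magnitude of the effect would differ.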
