Google And Pre-Fetching

Google has found itself in the midst of another tempest; whether this particular one is in a teapot depends on your point of view. The issue has to do with “pre-fetching”, a practice for which Google got some heat back when it introduced its Web Accelerator.



I first saw mention of this on Dave Farber’s IP list. From the original post, by privacy advocate Lauren Weinstein:

…about a month ago, Google started triggering “prefetch” page data for the top listings in search results. This behavior is reportedly currently limited to users on Mozilla-based browsers (Mozilla, Netscape, Firefox…)

The goal of this procedure is to allow users of those browsers to see the top link results faster, since they’d already be cached locally. But there are big downsides to this process.

One obvious problem is that it can distort Web server statistics, by creating “hits” from users who never actually chose to visit the sites in question, but were prefetched when their search listed those sites at the top of results. For some sites, this may be a mere annoyance, for others it could be a significant problem that could affect their revenue patterns. This also has the side-effect of creating a sudden artificial boost in Mozilla-based browser usage statistics.

A much more serious issue is that the prefetching causes users to actually access sites without ever having touched the associated links -- and this includes the receiving of cookies. …..

….This means that your IP address and other typical connection data have *already* been dropped into that site’s logs, even though you never chose to access that site, and you may now already be holding cookies from them as well.

… imagine if an innocent search returned results where the top-listed site contained information you’d never want to be associated with nor access in any way (child porn, browser exploit sucker-bait sites, illicit files -- you name it). Keep in mind that such sites will often use various techniques specifically to boost their rankings in search results….

…Bottom line: Creating a situation where users are “automatically” accessing search-result sites without their having taken explicit actions to do so is very bad policy. This problem is not the fault of Google alone -- the prefetching mechanism has been present in Mozilla-based browsers for quite some time.

However, when the planet’s major search engine begins to routinely use this technique in the manner that Google has done, it at the very least suggests that they did not fully think through the potentially serious anti-privacy ramifications of their actions, when applied on the vast scale of their user base.

Tim posted on this issue as well, offering a clarification that Google only prefetches the first result.

I called Google and spoke to some folks there. They acknowledge that Google does the pre-fetching for Mozilla clients, but they argued that Google is doing it in a fashion that is compliant with web standards, and for a good reason: to speed up the web. Sophisticated webmasters can easily filter out pre-fetches from other kinds of requests, so logs won’t be inflated, and users can turn prefetching off if they want. For more on this issue, Google pointed me to Mozilla’s link prefetching FAQ.
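For the curious, here is roughly what that filtering could look like. Mozilla's link prefetching FAQ says prefetch requests carry an `X-moz: prefetch` request header, so a server that can see request headers can tell them apart from ordinary visits. The Python below is only a sketch under that assumption: the function names `is_prefetch_request` and `count_real_hits` are made up for illustration, and each request is assumed to be available as a plain dictionary of headers.

```python
from typing import Iterable, Mapping


def is_prefetch_request(headers: Mapping[str, str]) -> bool:
    """Return True if the headers mark this request as a Mozilla link prefetch.

    Assumes the browser sends "X-moz: prefetch", as described in Mozilla's
    link prefetching FAQ. Header names are compared case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() == "x-moz" and value.strip().lower() == "prefetch":
            return True
    return False


def count_real_hits(requests: Iterable[Mapping[str, str]]) -> int:
    """Count requests that were not flagged as prefetches (hypothetical helper)."""
    return sum(1 for headers in requests if not is_prefetch_request(headers))


# Illustrative data only: one ordinary visit and one prefetch.
sample_requests = [
    {"User-Agent": "Mozilla/5.0", "Accept": "text/html"},
    {"User-Agent": "Mozilla/5.0", "X-moz": "prefetch"},
]
real_hits = count_real_hits(sample_requests)
```

One caveat: a typical access log format does not record this header unless the server is configured to log it, so in practice the check probably has to happen at request time (or the log format has to be extended) rather than after the fact.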

7 thoughts on “Google And Pre-Fetching”

  1. All the phony page accesses could be a blessing in disguise by making page accesses much less important.

  2. They may be using standards, but they aren’t letting people opt-in or even out, and that’s very, very bad. Identity thieves, such as phishers, are using “standards” too…

  3. Hmm, I’m using FF 1.0.4, and my SERPs don’t have rel="prefetch". What’m I missing?

    Is it a partial rollout for now?

  4. At current prices for Web hosting, it costs 0.003 cents to serve a pageview, so that’s the cost Google is imposing on other websites with this technique.

    On the other hand, let’s assume that the value of a pageview is 1 cent. (Averaged between sites that run low-value generic CPM advertising at maybe 0.1 cents per pageview and high-value B2B sites where a pageview may be worth as much as $1.)

    Of course, there’s only value from a pageview if it’s actually viewed by a human and not just downloaded. Downloads cost, regardless of viewing.

    Thus, the question is whether prefetching will make users view an additional page more than 0.3% of the time (the 0.003-cent cost divided by the 1-cent value). I can’t predict what the exact number will be without a huge, cumbersome study, but there is no doubt that faster response times will lead users to view more pages.

    By the way, WebTV did something similar in 1997: when the user was accessing a sequence of pages (say, paging through an article that was split over multiple pages), the next one in the sequence would be prefetched to the user’s device.

  5. Nice, well-balanced article and comments.

    I agree that one should have to opt in for such things. I also thought it should apply to spam. The CAN-SPAM Act assured just the opposite.

    It seems that the government would rather have us opt out on e-mail spam. The congressmen and senators represented online marketers and not consumers with the passage of that law. I suppose an executive at Google might use that as justification.

    Personally, pre-fetching does not bother me. I have the option of allowing cookies or not, I can turn it off if I wish. I did a quick and dirty poll among some fraternity brothers and classmates. Most want the speed.

    Google is a savvy and wise company when it comes to meeting the needs of the consumer. It could be that they already know that most people are oblivious to it or, possibly, don’t mind. Regardless, if enough people do object, I am pretty sure Google will reverse its decision.

  6. FWIW, another issue with Web Accelerator is with password/subscription sites. If you have Web Accelerator loaded, you will likely find yourself re-entering your username and password many times within a session. I finally got frustrated and uninstalled the app.

    Good blog.

    Thanks.

    Kevin
