I first saw this mentioned on Dave Farber’s IP list. From the original post, by privacy advocate Lauren Weinstein:
…about a month ago, Google started triggering “prefetch”
page data for the top listings in search results. This behavior is
reportedly currently limited to users on Mozilla-based browsers
(Mozilla, Netscape, Firefox…)
The goal of this procedure is to allow users of those browsers to see
the top link results faster, since they’d already be cached locally.
But there are big downsides to this process.
One obvious problem is that it can distort Web server statistics, by
creating “hits” from users who never actually chose to visit the
sites in question, but were prefetched when their search listed those
sites at the top of results. For some sites, this may be a mere
annoyance, for others it could be a significant problem that could
affect their revenue patterns. This also has the side-effect of
creating a sudden artificial boost in Mozilla-based browser usage
statistics.
A much more serious issue is that the prefetching causes users to
actually access sites without ever having touched the associated
links — and this includes the receiving of cookies. …..
….This means that your IP address and other typical connection data
have *already* been dropped into that site’s logs, even though you
never chose to access that site, and you may now already be holding
cookies from them as well.
… imagine if an innocent search returned results where the
top-listed site contained information you’d never want to be
associated with nor access in any way (child porn, browser exploit
sucker-bait sites, illicit files — you name it). Keep in mind that
such sites will often use various techniques specifically to boost
their rankings in search results….
…Bottom line: Creating a situation where users are “automatically”
accessing search-result sites without their having taken explicit
actions to do so is very bad policy. This problem is not the fault
of Google alone — the prefetching mechanism has been present in
Mozilla-based browsers for quite some time.
However, when the planet’s major search engine begins to routinely
use this technique in the manner that Google has done, it at the
very least suggests that they did not fully think through the
potentially serious anti-privacy ramifications of their actions, when
applied on the vast scale of their user base.
Tim posted on this issue as well, offering a clarification that Google only prefetches the first result.
I called Google and spoke to some folks there. They acknowledged that Google does prefetching for Mozilla clients, but argued that Google is doing it in a fashion that is compliant with web standards, and for a good reason: to speed up the web. Sophisticated webmasters can easily distinguish prefetches from other kinds of requests, so logs won’t be inflated, and users can turn prefetching off if they want. For more on this issue, Google pointed me to Mozilla’s link prefetching FAQ.
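For webmasters wondering what that filtering might look like, here is a minimal sketch. It relies on the request header that Mozilla's FAQ says accompanies each prefetch, `X-moz: prefetch`. The function names and the idea of representing each request as a dictionary of headers are my own assumptions for illustration, not anything Google or Mozilla supplied.

```python
def is_prefetch(headers):
    """Return True if the request headers mark a Mozilla link prefetch."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    normalized = {
        name.lower(): value.strip().lower() for name, value in headers.items()
    }
    return normalized.get("x-moz") == "prefetch"


def count_real_hits(requests):
    """Count requests that were not prefetches.

    `requests` is a hypothetical list of header dicts, one per request.
    """
    return sum(1 for headers in requests if not is_prefetch(headers))


# Made-up sample data: two prefetches and one ordinary visit.
sample = [
    {"User-Agent": "Mozilla/5.0", "X-moz": "prefetch"},
    {"User-Agent": "Mozilla/5.0"},
    {"User-Agent": "Mozilla/5.0", "x-MOZ": "Prefetch"},
]
print(count_real_hits(sample))
```

A real setup would more likely apply the same check in the web server's logging or analytics configuration. On the user side, the FAQ describes a browser preference, `network.prefetch-next`, that can be set to false to turn prefetching off.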