NYT Finds An AOL Searcher | John Battelle's Search Blog

NYT Finds An AOL Searcher

By - August 09, 2006


A very important piece of reporting, and a powerful reminder of how much data corporations and the government have access to.

NYT: A Face Is Exposed For AOL Searcher No. 4417749.



From it:



And search by search, click by click, the identity of AOL user No. 4417749 became easier to discern. There are queries for “landscapers in Lilburn, Ga,” several people with the last name Arnold and “homes sold in shadow lake subdivision gwinnett county georgia.”

It did not take much investigating to follow that data trail to Thelma Arnold, a 62-year-old widow who lives in Lilburn, Ga., frequently researches her friends’ medical ailments and loves her three dogs. “Those are my searches,” she said, after a reporter read part of the list to her.

I spoke with one of the authors, Tom Zeller, late yesterday, and when he told me they’d easily found this woman, I was, in an odd sense, thrilled. In a way, this advances one of my goals – the silver lining of a data leak like this is that it allows the culture to have a conversation about what we’re getting into here by tracking all this data (the Times quotes me saying as much). Kudos to the Times.

  • Jean

    Excellent perspective – the chilling effect is needed to move the conversation forward with a clear understanding of the impact and risks. The grandmother in GA put it very elegantly describing AOL as “looking over her shoulder”, and her commitment to cancel AOL also calibrates the potential impact.

    The ability to definitively “find people” is there, but the more chilling scenario is when this kind of data is combined with the social web, which allows group behavior to form “on top of the data”. Think of the ability for a zealous web engineer to infer general characteristics of an individual whose search behavior they dislike, communicate it to a “growing mob” and organize CSI-like forensics to lynch the individual.

    Sobering thoughts that need to be brought to the forefront; congratulations, John on using your voice to propel the dialog. It is very important.

  • http://blogs.commerce360.com Craig Danuloff

    This is going to wake up a whole lotta people as to the risks (to them) and value (to the engines) of your ‘database of intentions’. As a minimum initial step, I think we should get an opt-out delimiter that the engines all standardize on to keep any search out of any permanent tracking systems.

  • McCarthy

    This is why toolbars like G Desktop, Search History, Y! Toolbar are bad ideas. Sure, they provide some productivity, but the downside is pretty steep if the provider is evil or just plain stupid.

    Even if your searches are relatively benign, like Ms. Arnold’s, who wants their neighbors snooping through their personal interests?

    What if the AOL data led us to some closet Neo-Nazi freak who is a well-regarded member of LinkedIn? I pity those connected to him… Digital McCarthyism, anyone?

  • http://sethf.com/ Seth Finkelstein

    Yes – conventional wisdom is changing right before our eyes. The punditoverse will now have a frame of reference for the discussion about log retention, privacy implications of search, data-mining, etc.

  • Gerald

    It could be even simpler to collect this data. The NYTimes got 2462 unique visitors from AOL search. Among them there should be a lot of people who subscribe to one of its services. The NYT uses cookies: take that cookie, match the AOL ID with the user data, and you are in. That’s shocking. Search for a starlet, celebrity or politician, write your story …

    incredible – but not impossible.

  • http://www.webmetricsguru.com Webmetricsguru

    I posted about this story earlier today (knowing that John would also be covering it as he was mentioned in the AOL Story).

    One of the things that has not been looked at yet is that Google and Yahoo’s search partners have control of the search query logs that pass through their own search engines. Sure, Google might be very careful with its query logs, but AOL also owns the part of Google’s query logs powering its search engine; the same goes for the rest of the second-tier search engines, especially those powered by Google or Yahoo.

    While the second-tier search engines don’t represent much of a share of total search usage (maybe 10% max), they are not as committed to privacy or as transparent about what they do with the information – they can do whatever they want – mistake or no mistake – and that’s what happened with AOL. Someone had a bright idea but did not think it through – happens all the time.

    On one side you have companies that want to geo-locate, segment and target searchers so they can better sell to them; you also have the government, which wants to exercise control (for its own mixed reasons). On the other side, you have people who clearly see this as an invasion of their rights.

    It’s hard to say who will win, at the end of it – but if someone puts enough effort into query analysis – and they have the search query logs – they can probably tie the majority of search queries back to the individuals who made them. It’s just a matter of how badly they want to do it and why.

  • Jean

    “If you give the moose a cookie…” (there’s got to be a haiku in here!)

    If you gave the consumer a simple uncluttered opportunity to opt out of having their search logs saved for use by the search engines, I guarantee the overwhelming majority would take it. The “consumer benefit arguments” will absolutely not balance out the consumer’s fundamental desire for privacy and fear of the unknown. This is heading towards the “do not call” momentum – the benefit is aligned to the gain of advertisers and engines and consumer intrusion (via risk and fear) is under-weighed and ignored.

    Which engine will choose to follow the consumer? It’s the philosophy that got Google their juice. Eric says he’s keeping the logs; there’s a potentially BIG opening to listen to the consumer and trump the engineering-centered thinking of Google.

  • http://ourfounder.typepad.com Jim Benson

    Hey John,

    Is there a major search engine that does not save our search results?

    I’ve looked at a few of the browser anonymizers – they work but slow your system down to such a crawl that they’re not worth using on a regular basis.

    Jim

  • http://searchengines.wordpress.com/ Search Engines WEB ۞

    Electronic Frontier Foundation launches AOL privacy help site... but no one is going after Google.

    https://secure.eff.org/site/Advocacy?alertId=243&rid=other.281&pg=logACall

    An excerpt from their Webpage:


    Here’s What to Say:
    First, ask to be informed if you were one of the AOL members affected by the leak.

    Second, say that you’d like AOL to stop keeping these kinds of logs.

    Third, say that AOL should work with Congress to make stronger laws to protect the privacy in data collected by Internet companies.

    Fourth, ask to be contacted when AOL decides to take action on these problems.