August 7, 2006 8:27 AM
AOL: Dooooooh!
AOL has officially responded to the recent ruckus over data released by folks in its research group. The summary: Man, did we screw up.
I emailed my contacts there and got an early draft of the release:
"This was a screw up, and we're angry and upset about it. It was an innocent enough attempt to reach out to the academic community with new research tools, but it was obviously not appropriately vetted, and if it had been, it would have been stopped in an instant.
Although there was no personally-identifiable data linked to these accounts, we're absolutely not defending this. It was a mistake, and we apologize. We've launched an internal investigation into what happened, and we are taking steps to ensure that this type of thing never happens again.
Here was what was mistakenly released:
* Search data for roughly 658,000 anonymized users over a three month period from March to May.
* There was no personally identifiable data provided by AOL with those records, but search queries themselves can sometimes include such information.
* According to comScore Media Metrix, the AOL search network had 42.7 million unique visitors in May, so the total data set covered roughly 1.4% of May search users.
* Roughly 20 million search records over that period, so the data included roughly 1/3 of one percent of the total searches conducted through the AOL network over that period.
* The searches included as part of this data only included U.S. searches conducted within the AOL client software."
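A quick back-of-the-envelope check of the percentages in the release, using only the figures quoted above. The implied total search volume and the per-user average are derived here for illustration; AOL did not state them.

```python
# Figures quoted in the AOL release above.
users_in_dataset = 658_000
may_unique_visitors = 42_700_000   # comScore Media Metrix figure for May
records_in_dataset = 20_000_000    # "roughly 20 million search records"
records_share = (1 / 3) / 100      # "1/3 of one percent" of all searches

# Share of May search users covered by the data set.
user_share_pct = users_in_dataset / may_unique_visitors * 100

# Derived, not stated by AOL: total searches implied by the 1/3-of-1% claim.
implied_total_searches = records_in_dataset / records_share

# Derived, not stated by AOL: average records per user over the three months.
avg_records_per_user = records_in_dataset / users_in_dataset

print(f"Share of May search users: {user_share_pct:.2f}%")
print(f"Implied total searches:    {implied_total_searches:,.0f}")
print(f"Average records per user:  {avg_records_per_user:.1f}")
```

Straight division comes out nearer 1.5% than the quoted 1.4%; the release does not say how its figure was rounded or which base it used.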
- Posted by John Battelle on August 7, 2006 8:27 AM



Comments
Battelle, check out this guy's searches. Fucking nuts, bro. Do you think they are going to release my "how to kill Battelle" and "battelle address" searches!? I'm just kidding, bro. Is your book on CD? It takes me 1 hour to read 20-30 pages, so that's about 10 hours per book, and it only takes about 2 hours on CD. I only listen to audiobooks now.
Weirdo on AOL's searches:
17556639 how to kill your wife
17556639 how to kill your wife
17556639 wife killer
17556639 how to kill a wife
17556639 poop
17556639 dead people
17556639 pictures of dead people
17556639 killed people
17556639 dead pictures
17556639 dead pictures
17556639 dead pictures
17556639 murder photo
17556639 steak and cheese
17556639 photo of death
17556639 photo of death
17556639 death
17556639 dead people photos
17556639 photo of dead people
17556639 www.murderdpeople.com
17556639 decapatated photos
17556639 decapatated photos
17556639 car crashes3
17556639 car crashes3
17556639 car crash photo
Here's what people from the sample set searched for to land at battellemedia.com (I would've removed personally identifiable information, though there wasn't any):
2006 predictions [2006-03-03 21:46:11]
class action lawsuit mircrosoft [2006-03-13 15:17:02]
tatas [2006-03-27 22:36:03]
google real estate [2006-04-25 11:45:12]
goog prospectus [2006-04-24 02:31:51]
earth google [2006-05-01 01:59:59]
first time models [2006-03-01 02:50:29]
predictions for 2006 [2006-03-01 18:11:42]
give me a site that will let me read books onlin for free [2006-05-25 15:39:41]
predictions 2006 [2006-03-13 14:23:42]
predictions 2006 [2006-03-13 14:23:42]
Can anyone point me to where I can still download the dataset, or email it to me? It's now officially offline.
www.aolsearchdatabase.com/
If you do not wish to download the data - someone has created an online search service.
There is personally identifying information there, as every record has a time stamp and a link to the site the user clicked through to. So for the searches above that led here, John could look at his logs, get an IP address and possibly see 1000s of searches the person at that IP has done. And if they happened to leave a comment etc., as Phillip has pointed out, you would have their name.
I'm sure it would be possible to conclusively identify about 5-10% of those AOL users if big websites searched their logs for AOL referral headers. None of the major media seems to realise this is the Personally Identifying Information that AOL denies it has given out. I'm sure the NYT could link thousands of AOL users who have registered with them to searches they have done - how is that not personally identifying? This is not a theoretical problem.
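To make the log-matching idea in the comment above concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the records, the user IDs, the IP addresses (taken from the reserved documentation range), the referrer hosts, the `match_clicks` helper and the five-second window. It also assumes the data set's timestamps and the site's server clock are roughly in sync, which may not hold in practice.

```python
from datetime import datetime, timedelta

# Hypothetical rows in the shape the comment describes:
# (anonymized user id, query, time stamp, site clicked through to)
released_clicks = [
    (1001, "predictions 2006", datetime(2006, 3, 13, 14, 23, 42), "example.com"),
    (1002, "earth google", datetime(2006, 5, 1, 1, 59, 59), "other.example"),
]

# Hypothetical server log rows for example.com: (ip, request time, referrer host)
server_log = [
    ("192.0.2.10", datetime(2006, 3, 13, 14, 23, 44), "search.aol.com"),
    ("192.0.2.77", datetime(2006, 3, 13, 14, 23, 43), "www.google.com"),
    ("192.0.2.99", datetime(2006, 3, 14, 9, 0, 0), "search.aol.com"),
]


def match_clicks(released, log, site, window=timedelta(seconds=5)):
    """Pair anonymized user ids with IPs whose AOL-referred visit to `site`
    falls within `window` of a released click on that site."""
    matches = []
    for user_id, _query, clicked_at, clicked_site in released:
        if clicked_site != site:
            continue
        for ip, requested_at, referrer in log:
            if "aol" in referrer and abs(requested_at - clicked_at) <= window:
                matches.append((user_id, ip))
    return matches


print(match_clicks(released_clicks, server_log, "example.com"))
```

On a busy site several visitors could fall inside the same window, so a match like this is a lead rather than proof; the commenter's 5-10% estimate is his own and is not something this sketch measures.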
Search Engines Web, that website isn't very useful.
Managed to get the DB from:
http://www.gregsadetsky.com/aol-data/
This makes me sick... I personally wouldn't want my entire Google search history in the public domain - anonymous or not.
Not that I'd be affected by something like this; I use Tor so my queries don't all come from the same IP, and thus no one could "profile" my interests, but I'm wondering how all those unsuspecting people out there who were using AOL search feel about this? OMFG is the least of it, especially for the unfortunates whose queries are now a standing joke...
psych: wrong. AOL almost certainly correlates this data based on cookies, which Tor won't help you with.
If you're browsing and accepting cookies, you're not anonymous.
Security is hard. Tor is not a silver bullet.