What Info Does Google Keep?

A reader asked me:

Does Google keep logs of searches correlated with IP address or other personally identifiable information for users who have not logged in?

I knew it kept parts of this data, but was not sure. So I pinged Google PR, which checked in for me (thanks!). The response was to quote Google’s privacy FAQ:

Like most Web sites, our servers automatically record the page requests made when users visit our sites. These “server logs” typically include your web request, Internet Protocol address, browser type, browser language, the date and time of your request and one or more cookies that may uniquely identify your browser.
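
To make that list concrete, here is a minimal Python sketch of what one such log entry might look like once parsed. The tab-separated layout, the `LogRecord` class, the `parse_line` helper and the sample values are all assumptions made for illustration; the FAQ names the fields but does not describe how Google's logs are actually stored.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogRecord:
    """One hypothetical server-log entry holding the fields the FAQ lists."""
    request: str           # the web request, e.g. a search URL
    ip_address: str        # Internet Protocol address of the client
    browser_type: str      # User-Agent header sent by the browser
    browser_language: str  # Accept-Language header sent by the browser
    timestamp: datetime    # date and time of the request
    cookie_id: str         # cookie value that may uniquely identify the browser


def parse_line(line: str) -> LogRecord:
    """Parse one tab-separated line (an invented format, not Google's)."""
    timestamp, ip, request, agent, language, cookie = line.rstrip("\n").split("\t")
    return LogRecord(
        request=request,
        ip_address=ip,
        browser_type=agent,
        browser_language=language,
        timestamp=datetime.fromisoformat(timestamp),
        cookie_id=cookie,
    )


# Sample values are made up; 192.0.2.0/24 is a documentation-only IP range.
sample = "\t".join([
    "2006-01-27T10:15:00+00:00",
    "192.0.2.44",
    "GET /search?q=geocoding",
    "Mozilla/5.0 (example)",
    "en-US",
    "ID=abc123",
])
record = parse_line(sample)
print(record.ip_address, record.request)
```

None of these fields requires the user to be logged in: the browser sends the request, the headers and the cookie, and the server sees the IP address and stamps the time.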

In other words, yes, Google does record this data. But, does it KEEP that data, I asked? The answer:

Yes, we do.

It’s simple to stop this, of course: just set your browser to not accept cookies. But if you do, you lose out on the services that cookies enable. I for one keep my cookies intact. But know that yes, your data is kept by Google and yes, your searches can be correlated to IP data.

26 thoughts on “What Info Does Google Keep?”

  1. As the reader who asked this question, this is, sadly, not the answer I was looking for.

    Ultimately, this is bad for users. If the information is kept, it’s available for request, abuse, or theft.

  2. Also, it’s probably largely irrelevant whether you accept the Google cookie or not, except that doing so will tie your accesses from different computers together. Even without it, the rest of the information is sufficient to uniquely identify certainly a residence or workplace, and most of the time a specific computer. Even if you’re on a dynamic IP connection, your ISP has logs of when you were on a particular IP address. That, coupled with a detailed list of the search terms and access times, may be enough to identify you anyway. The illusion that any of this information is anonymous in any way is a dangerous one.

  3. This process may be flawed… But, I use two different browser applications. One is fully cookied and basically all shields are down.

    The other one rejects darned near everything – cookies, popups, etc. THAT is the one I use for semi sort of anonymous browsing (still presenting my IP and platform, though). Kind of like using a certain email account for catching all the spam you know you’re going to get for having signed up for that certain little something online… So too is my little anonymous browser app. I have absolutely no reason to believe it’s effective for anything other than peace of mind (even if it is misguided). Hope I never have to find out.

  4. Nice clarification, John. Few understand that almost everything except encrypted stuff is stored by Google. To be relevant the debate needs to shift its focus from what the Govt or Google “has on you” to what they “do to you” with that info.

    As processing capabilities increase and “Total Information Awareness” becomes a reality for both Govt and commercial entities we’ll see the potential for massive abuse of privacy. I think the solution is making clear what can and cannot be used against a person rather than what can and cannot be stored…cuz it’s already stored.

  5. You could go one step further, and install TOR/privoxy as the default proxy for your second browser. That would go a long way towards masking your IP address, though it’s still not perfect.

  6. This is an eye-opening exposé for those who are not savvy.

    Thanks for all your efforts!

    This story just HAD to be Digged…


  7. I believe it is going a bit far to ask “do you keep the data” and presume the answer yes means they can also reconstruct search history from IP address. For instance, they may retain all the data but separate the fields. Why not just ask if, given an IP address, they can reconstruct a search query history? That is the real question, and it is not quite the same as the question you asked.

  8. It seems to me that to the extent that they can determine click fraud, they can also reconstruct a clickstream for a given IP, HTTP user agent, cookie, or some subset of the aforementioned.

  9. IP-based tracking could be sidestepped via an IP anonymizer gateway, if you really cared. If some privacy scare went mainstream maybe everyone would start using one in the same way everyone all of a sudden had a firewall on their local networks a year or two ago.

    BUT, if you have a Google personalized homepage or use Gmail or in some other way are logged in to a Google account, they can tie all of your searches to the actual YOU. I would assume they do this and store this data forever.

    Any word on if they store the user account associated with each search in addition to the IP if you are logged in?

  10. I would imagine there is some time frame within which they must store all the data. Otherwise, they could not determine whether or not click fraud or some other type of abuse was occurring. Also, as a publicly-traded company, I believe they must retain the data for a certain amount of time in case they are audited.

  11. Now the question is whether this same thing also applies to all the AdSense ads I see. Because then, Google will know that the Guy that does X via AdSense, gets Y eMails a day, searched for geocoding stuff yesterday, lives in Cologne, has a bank account at Bank Z and blogs at blog.thylmann.net was just here.

    Then again, they should freaking show me some German ads then 🙂 But hey, they probably just have not finished the translation engine that does the context match via translation.

    Anyway, remember Doubleclick. They really got hit over the head for what Google might just be doing here and Google is getting a lot bigger than Doubleclick was.

  12. I’m betting that one of these days, someone will write an app that will (at random times) generate random queries and then click on random results. Rather than anonymize the IP, you can gum up the stored data and make real searches indistinguishable from the junk searches.

  13. While some of the recent Google bashing is justified, this is a bit of old news IMHO … and not specific to Google … all of the search engines keep log data … along with every other web server out there – heck, ask cnn.com if they keep their log records around … in fact, battellmedia.com has logs of the IP address I’m posting this from! 😉

    But yes, because of Google’s size, more “interesting” things can be done with this voluminous data.

    P.S. I’m not implying in any way John that you are/would abuse your log data.

  14. http://www.nzherald.co.nz/section/story.cfm?c_id=1&ObjectID=10398156

    Powerful, intrusive new technology is about to be used to spy on New Zealanders online.

    The software, developed to hunt movie pirates, can track internet searches in what an international privacy watchdog says is an alarming intrusion. It can trace Google searches and other download attempts back to the computer they came from.

    New Zealand anti-piracy investigators used the program in a recent trial, discovering 1153 attempts to illegally download hit children’s movie Chicken Little.

  15. Given that 1/2 of their revenue stream is in danger from botnet programs and that AJAX and FLASH are making their page analysis, Google is in a bad place in the coming 007 year. They are going to want to store as much data as possible.

  16. For how many years do Yahoo and AOL and Google (etc.) keep logs of internet activity history … and in the case of instant messaging, for how long do they keep a record of what you’ve typed in chat?

  17. If you have a website and an ISP that hosts the website, then you also keep information about users (visitors) to your site, and keep their IP addresses in the access logs.

  18. There’s nothing wrong with that. As a webmaster myself I record all kinds of information about the visitor, namely IP address (which can be used to look up your country, and in some cases your city), browser user agent (for browser version, operating system and stuff), screen resolution, screen colors, JavaScript and Flash support and a whole lot more. Of course I always send a “tracking cookie” that will identify that particular browser, that way I’ll be able to predict how many unique visitors my website has.

    Registering this kind of information is standard on the web, even Battellmedia (this website) is doing it, using Google Analytics.

    Notice however that everything that’s recorded is regarded as “public information”. It’s not possible to record your name, home address or anything without your permission. It wouldn’t surprise me however if there’ll be a browser function in the future that will store this kind of information though…

    1. Except that THAT information would have to be on the client device OR it would have to be voluntarily or involuntarily provided by the user’s ISP.

  19. Google is more invasive than Obama, if that’s possible. Dumping Google, Chrome, and anything affiliated.

  20. My personal data is more important than anything else on the internet.

    I use the AdBlock Plus, Disconnect.me and NoScript Firefox add-ons and also configure the Firefox cookie settings to accept only trusted cookies. The only thing I want to do next is use a proxy server or fake-IP software to stay completely anonymous.

    Now I am not seeing any advertisements and am not even being tracked by any websites.

  21. Any website can collect all that data easily without your knowledge using PHP, so why blame Google?
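
Several comments above turn on whether stored logs can be regrouped into a per-visitor search history. As a rough sketch of why that is plausible, the Python below groups a handful of invented log rows by IP address or by cookie ID and orders each group by time. The rows, the field layout and the `search_history` helper are hypothetical; nothing here describes how Google actually stores or queries its logs.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlparse

# Hypothetical log rows: (ISO timestamp, IP address, cookie ID, request path).
# The IP addresses come from documentation-only ranges.
LOG_ROWS = [
    ("2006-01-27T10:15:00", "192.0.2.44", "abc123", "/search?q=geocoding"),
    ("2006-01-27T10:02:00", "192.0.2.44", "abc123", "/search?q=cologne+banks"),
    ("2006-01-27T10:05:00", "198.51.100.7", "", "/search?q=chicken+little"),
    ("2006-01-27T11:30:00", "203.0.113.9", "abc123", "/search?q=privacy+faq"),
]


def search_history(rows, key_index):
    """Group search queries by one log field and order each group by time."""
    history = defaultdict(list)
    # Same-format ISO timestamps sort chronologically as plain strings.
    for row in sorted(rows, key=lambda r: r[0]):
        key = row[key_index]
        if not key:
            continue  # e.g. cookies disabled: no value to group on
        query = parse_qs(urlparse(row[3]).query).get("q", [""])[0]
        history[key].append(query)
    return dict(history)


by_ip = search_history(LOG_ROWS, key_index=1)
by_cookie = search_history(LOG_ROWS, key_index=2)
print(by_ip["192.0.2.44"])
print(by_cookie["abc123"])
```

In this toy data the cookie links a search made from a third IP address to the first two, while the row with no cookie can only be grouped by IP.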
