More On What Google (and Probably A Lot of Others) Know | John Battelle's Search Blog

More On What Google (and Probably A Lot of Others) Know

By - January 30, 2006

Searchblog reader Adam asked me:

1) “Given a list of search terms, can Google produce a list of people who searched for that term, identified by IP address and/or Google cookie value?”

2) “Given an IP address or Google cookie value, can Google produce a list of the terms searched by the user of that IP address or cookie value?”

I put these to Google. To its credit, it rapidly replied that the answer in both cases is “yes.” Just FYI.
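For the technically inclined, both questions boil down to two lookups over the same query log. Here is a minimal sketch, assuming a hypothetical log of (cookie, IP, term) rows; the field names, sample data, and function names are all made up for illustration and say nothing about how Google actually stores or queries its logs.

```python
from collections import defaultdict

# Hypothetical log rows: (cookie_id, ip_address, search_term).
SAMPLE_LOG = [
    ("cookie-a", "192.0.2.1", "whole wheat"),
    ("cookie-b", "192.0.2.2", "peanut butter"),
    ("cookie-a", "192.0.2.1", "peanut butter"),
    ("cookie-c", "192.0.2.2", "jelly"),
]

def build_indexes(log):
    """Index the log by term, and by each identifier (cookie or IP)."""
    searchers_by_term = defaultdict(set)
    terms_by_id = defaultdict(list)
    for cookie, ip, term in log:
        searchers_by_term[term].add((cookie, ip))
        terms_by_id[cookie].append(term)
        terms_by_id[ip].append(term)
    return searchers_by_term, terms_by_id

def who_searched(searchers_by_term, terms):
    """Question 1: every (cookie, IP) pair that searched for any listed term."""
    found = set()
    for term in terms:
        found |= searchers_by_term.get(term, set())
    return found

def what_searched(terms_by_id, identifier):
    """Question 2: terms searched under one cookie value or IP address."""
    return list(terms_by_id.get(identifier, []))
```

With the sample rows, `who_searched` on "peanut butter" returns two cookie/IP pairs, and `what_searched` on "cookie-a" returns that cookie's terms in log order. The point is only that once the log exists, either question is a single index lookup.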

All I will add is this: If you are an agent of the US Government charged with tracking domestic terrorism, might you have an interest in answering questions like the ones posed above?

As the Chinese curse goes (oh, the irony), may we live in interesting times.


18 thoughts on “More On What Google (and Probably A Lot of Others) Know”

  1. Just like any other Server, Google’s has Raw LOG files ….and they probably use hi-tech Server Side traffic analyzers with highly customizable Search query string output…

    They can customize views and return output for ANYTHING that visits THEIR SERVERS – and store the info for as long as they choose.

    Also, with their usage of Click Tracking Technology in ORGANIC SERPs to gauge the relevance of returned sites – they have a practical reason to continuously analyze all entering data.

    But this brings up an extremely important question:

    ARE they using all of their available technology and potential to combat Click Fraud?????

    If Adwords and Adsense are responsible for over 90% of their revenue, is there a point where it does not return an ROI to totally kill Click Fraud – or – perfect Organic SERPs Relevancy????

    hmmmm

  2. nostromo says:

    I hope this doesn’t come across as a non sequitur, but if one were concerned about the visibility of their interests, one could “fog”, as it were, the collective results by searching for all manner of things and effectively blur the presumed interests of the searcher.

    I.e., if one were concerned about their political interests, they could add a number of searches for terms that match the opposite of their presumed ideology: “bush not so bad”, “america is really not the luciferian empire that they appear to be”, “no such thing as bohemian grove” or “franklin cover-up based on pure speculation”. I would guess that these searches would have to match the number of opposing searches, as would their newsgroup activity, posts, etc.

    Other examples would not be hard to imagine. Perhaps if people were truly concerned (and perhaps they should be?) about gulags and “info-crimes” they could use such tactics as to secure themselves freedom from those supposed re-education camps they are supposedly building in the mideast or wherever.

    I wonder if one could create a bot to automatically create opposing searches for every search one makes.

    or perhaps I am just drunk? and I am drunk. I’m not paranoid tho.

  3. Richard says:

    That’s not actually a Chinese expression. It’s just been attributed to them since a 1966 Robert Kennedy speech :)

    It’s natural to assume that anything Google could log they do log, seeing as information is their primary business.

  4. Tarek says:

    What about this follow-up:

    Given a Google cookie value, can Google produce a list of IPs at which Google has seen that cookie, associated with when those cookies appeared?

    (i.e. Where your laptop has been, and when it was there.)

  5. Jason says:

    So – while theoretically anyone can write a search to poll Google’s data in any manner, to yield all kinds of results, the question remains – what type or types of data does Google have an obligation, under the law, to provide the Government – and why do they have an obligation to obscure any of this for us the end user?

    Search type one – where someone presents Google with a search term, or set of terms, and Google then identifies all the people who like “whole wheat” + “peanut butter” + “jelly” – is a fishing expedition. I know at least one major service that turns down government requests for this type of query. It’s a fishing expedition, something I believe is called a “John Doe Limitless” search. The government isn’t looking for a specific person, but more a list of people to go be suspicious of. I’m not sure how refined that search has to get before they (searchcos) are forced to comply as you have ceased to cast a big wide net and are instead looking for a specific group of people. How exactly you need to search for cesium + that other thing to become a potential nuclear criminal, I dunno.

    I doubt this type of search would ever be the starting point for an investigation into say meth labs. It might be corroborating evidence – but I’d go ask Ebay for a list of people who bought X, Y and Z in the last 3 months, that could all be used to build a meth lab. You better believe Ebay has a whole team of people who actively track that internally and proactively work with the government to get bad people to stop being bad. The fact that you read about something isn’t a key piece of evidence, you need to do something with it. (I am sure there are situations where looking at something is a crime, I just don’t know what they are.)

    Search number 2 is more interesting. If you are already under investigation, they know who you are and just want to know more about you – sorry charlie. I doubt anyone can turn down that type of query as it’d be easy to get a warrant. Sure warrants are old fashioned and we no longer really need them – but if there is an ongoing investigation into your beanie baby business the government can quickly get a warrant to find out if you’ve been searching anything to do with renting a factory in mainland China. If the government has your IP (or your specific cookie value) and they can get a warrant, that data is fair game. Again, however, the fact that you read about something that could be illegal if acted upon isn’t proof you did anything illegal. Now if Google shows you clicked on the AdSense link for “Outsourced beanie baby counterfeiting” – that’d make me laugh.

    Now – do you want Google (and Yahoo, MSN, Ebay, Amazon and everyone else with a brain) tracking that type of data? You probably don’t care if you aren’t a criminal. The data pool is too big for anyone to search through it just to mess around with you. If you are up to bad things, you should be worried. If you are, however, concerned about the growing amount of data about your search, purchasing, eating, travel and other habits – I don’t see an out.

    I remember a friend of mine, years ago, telling me she wouldn’t use the supermarket discount program because they track what you buy. She did, however, use credit cards to pay for nearly everything, as cash was a bother. You have to ask “what is the difference?” You’ve also got to ask yourself if you care, and you need to look more deeply than which search engine you use, because tons of these little paper trails are being built.

    End of day I think we inherit an “and when they came for the purple people, I said nothing” kind of story. The threat model (I think) doesn’t show this as ever being invasive for 99.9% of the world, so they won’t change. Stopping the collection of this data breaks a lot of business. Legally detailing when they can and can’t share that data will likely happen in court with precedents being set, rather than through legislation to specifically detail it. This is why Google telling the government no is a good thing: it gets it into court, theoretically. Out of court cooperation with the government to find solutions that make the Search company or Store happy won’t work as well for the customer as the court should. (I hope – I guess I am still showing trust for the “System” in America.)

  6. Hal says:

    I’d like to know a little more. How far back does this information go, and how long will Google retain it? 5 years? 10? 100 (eventually)? A lifetime of data?

    And second, is there any way to “opt out” on a data search? For example, if you sign up for their Search History feature, and then select Pause so that your searches are blocked from going into the history, could they arrange it so that in this mode, they don’t retain search data at all?

    Third, does Google record information on which links you click from their search results page? And as with the above, is this done in a personally identifying way, and how long is the data retained?

    One final point: aside from the many anonymizing web services, http://www.scroogle.org provides a proxy for anonymous Google and Yahoo searches. They seem to be decent people, their proxy is open source and they claim to flush their logs every 7 days.

  7. PaulM says:

    Jason:

    I guess I am still showing trust for the “System” in America

    You’re certainly showing a lot of naivety. The courts have proved ineffective in protecting Americans’ privacy – and the laws are written by the financial institutions and marketers. Just as it took the creation of the SEC to prevent financial abuse, only formal regulatory oversight can protect privacy. Google knows this, too.

  8. David says:

    For happy, safe, anonymous googling, use tor. http://tor.eff.org

  9. We are on our way to “back to the future”: 1984

  10. DIGTech says:

    This topic reminds me of another article I read about whether Google was good or evil. I guess it all depends on the way you look at things.

    http://www.technologyreview.com/InfoTech-Search/wtr_16210,308,p1.html

    Hope you find this interesting as well.

  11. gregbo says:

    Rather than have G expend vast amounts of money trying to kill click fraud, I wish they’d switch to a business model (such as fixed fee ads) that is much harder to defraud.

  12. Kathy says:

    I am absolutely sure that Google tracks the search queries and search results a user chooses, in their words to make “search results more sufficient and effective”. Tools we are using and installing, or that register themselves in our system without our notice, are a kind of spy. The Google toolbar, designed to make our searches easier, the ICQ search field and many others – all of them are aimed at collecting data about our web navigation.

  13. Google was the first search engine to use a cookie that expires in 2038. This was at a time when federal websites were prohibited from using persistent cookies altogether. Now it’s years later, and immortal cookies are commonplace among search engines; Google set the standard because no one bothered to challenge them. This cookie places a unique ID number on your hard disk. Anytime you land on a Google page, you get a Google cookie if you don’t already have one. If you have one, they read and record your unique ID number.
