Privacy: The Frog Boils, Slowly

This article strikes me as another slow drumbeat on an issue that has to be both frustrating and impossible to own for Google. The headline: “Some Web Firms Say They Track Behavior Without Explicit Consent” implies ulterior motives and wrongdoing. In fact, it’s standard operating procedure for companies who run ad networks, and has been for a very long time. However, now that the guv’mint is involved, SOP is no longer AOK. The lede:

Several Internet and broadband companies have acknowledged using targeted-advertising technology without explicitly informing customers, according to letters released yesterday by the House Energy and Commerce Committee.

The kicker:

And Google, the leading online advertiser, stated that it has begun using Internet tracking technology that enables it to more precisely follow Web-surfing behavior across affiliated sites.

Or, put another way, Google bought DoubleClick, and DoubleClick uses tracking cookies. Yawn, right? Except... the rest of the world is catching on to the Database of Intentions, and the dialog as to what it means is just getting under way. The heat is being turned up, slowly but surely, and Google has to be careful to not be seen as the water in a boiling frog syndrome.

Here are the documents from the House Committee investigating online data practices.

26 thoughts on “Privacy: The Frog Boils, Slowly”

  1. Bother! I’m a Pooh of very little brain:

    1. does this have anything to do with why Google supposedly needs to buy Wikipedia?

    2. Why does Google have to be careful?

    2.1. If Google *IS* careful then X

    2.2. If Google is *NOT* careful then Y

    2.3. What are X and Y?

    William James, eat your heart out!

    ;P nmw

  2. This problem has a simple solution. Which Google will never, ever implement.

    The solution is for Google to become as transparent to us, the users, as we the users are transparent to Google.

    Hah, that’ll be the day. But it’d work.

  3. Yep — agree with JG — the solution is remarkably simple. Do just what we do on our ManyWorlds.com site — prominently display the following user selectable button:

    “Learning On/Off”

    If they choose “Learning Off” they will still get recommendations on our site or ads on ad-supported sites, the recommendations/ads will just be less personalized and useful.

    If enough sites do this, then if people don’t see such a button on a site, they should assume they are getting tracked anyway and they might choose to use another more privacy-transparent site. Over time market-driven transparency will then inevitably become the norm if enough people care about it.

  4. Good points, guys — here’s another question: Would increased transparency ultimately lead to the demise of “invisible” links (i.e. put an end to the “nofollow” tag?).

    How many users even know how search engines work at all as is, today?!?

    Do we know how they work? Are we certain that Google doesn’t know that there currently are “Ads by Google” for Banner Advertising (2X), Targeted Advertising, Google Advertising and Google Site Search in the right-hand column of this very page? (presumably Google is paying itself for the “Google Site Search” ad, so I figure they may be “onto something” :O 😉

    I think (from simply following the discussions) that quite a few people might be concerned if the cookie tracking system were “at risk”, but I think it’s actually quite overrated. For example: can cookies track which ads I sneeze at? (note my webcam is turned off 😉

  5. The night after the conference ended, I decompressed in my hotel room with Jonathan Weber, my editorial partner in the Industry Standard, and Steve Ellis, who runs an innovative music company called Pump Audio. Talk turned to what constituted “quality writing” in a journalistic sense. I’m not without a dog in this particular hunt, as it’s been the central premise of both my previous magazine launches, and is at the center of a new venture I’m noodling now that the conference is over. Steve, who is British, asked Jonathan and me if we thought the Wall Street Journal represented the paragon of American newspaper feature writing. And I thought, Jesus, I haven’t read that paper for months. I pay for the online version of the paper, but given how my reading habits have shifted from pull to point*, the Journal simply has not crossed my radar enough to register.

  6. Steve, I’m advocating a little more than “Learning On/Off”, though I agree with you that what you propose is necessary at a bare minimum.

    Google knows a lot about us.. our desires, our hopes, our knowledge, our goals. That’s what Battelle has been saying for years with the “Database of Intentions”.

    What I am saying is that Google needs to turn around, and expose the “Intentions of the Database”. This is more than just knowing when and how data is being collected. We need to be able to understand exactly how the data is being applied, how often, and why. And why not. We need this information in a form that is more than just general guidelines. Because the queries that I type in to Google (the intentions that I added to the database) are not general. They’re specific. So too should the intentions with which the database acts also be made absolutely clear and specific.

  7. nmw: I don’t know the answer to your first question, about “nofollow”.

    As far as your second question, about knowing how search engines work.. yes that is an issue. Just a few days ago in the comments I drew an H.G. Wells parallel, and said that search engines and users need to move away from the current Morlock and Eloi roles that they’re each playing at the moment. There needs to be more balance brought back to both sides, so there isn’t such a schism in knowledge, awareness, and.. intention.

  8. JG — ok, thanks for clarifying your thoughts further regarding desiring not just having transparency and control over tracking, but also having transparency about the “database of intentions.”

    That’s an interesting issue and I agree it would be good to make that transparent as well. Having worked in the bowels of personalization systems, though, it seems to me that it may be tricky in practice to give people something that they can actually get some value out of.

    The reason I say that is basically you have a mass of raw data (clickstreams and other usage behaviors) and then you have complex inferencing engines that use the data and employ highly complex algorithms to make inferences on preferences and interests. You would get nothing out of just seeing the raw data — what you really want is to understand the inferencing logic. But frankly, in any sophisticated system, the logic (which may constitute just huge numbers of matrices that are manipulated) is so complex that you could never make much sense out of it.

    A partial solution we use at ManyWorlds is that in addition to generating a recommendation, we also provide an explanation to you of why you received the recommendation. This gives a simplified but transparent look inside the “mind” of the inferencing engine, and may partially get to what you are ultimately looking for. Frankly, I suspect it is likely, as a practical matter, to be the only workable approach for making the database of intentions (more precisely, inferences of your intentions) transparent. And certainly for ads, explanations of why you received the ad are genuinely useful for both you and the advertiser — so I’ve got a feeling explanation-driven advertising is going to become a significant factor down the road . . .

  9. Steve, you’re doing what I would call “Explanatory Information Retrieval”, and I think it’s a hot, interesting, vital area. And definitely the direction the community as a whole needs to think a lot more about.

    I think I often come across in the comments section as critical of companies like Google, but I am this way only because I see this huge gap between what is and what could be.. with no perceivable effort on the part of Google to cross this gap.

    You’re absolutely correct; this stuff is quite tricky in practice. And requires sophisticated inferencing engines. And explanations of those inferences. But how long has Google been around now? 10 years? And how many employees do they have now? Like 16,000? And that must mean that they have like 6,000 PhDs by now, right? Working day and night on all tough, large scale, fantastic problems like this. Doing things that push forward the organization of the world’s information by leaps and bounds. So why do we not see any of it?

    Why instead, as I’ve said a dozen times, is most of what I see Google doing is releasing new chat widgets? Or online payment systems? Or online versions of MS Outlook (calendar, email, etc)?

    Do I think it’s neat that you can type “doctor appointment next Tuesday” into Google calendar and have the appointment automatically appear on the correct date? Yes, that’s neat. But Joseph Weizenbaum was doing that sort of simple nat. language parsing 45 years ago. Putting it into GCal is not all that new, or envelope-pushing, really.

    The world needs fewer calendar applications, and more Explanatory Search systems.

    And so I see this gap between a company whose sole motto is “information organizing”, and who has thousands of PhDs to do so.. and who has had 10 years to work on “Explanatory Search” or something in that or dozens of other directions. And yet all we get are calendars and chat widgets. Google Sets is probably the most interesting information sensemaking tool that they offer to the average searcher.. and that was released.. what.. 7 or 8 years ago now? What else has there been in the meantime, for the searcher, to help make sense of the 563,000 results that a single search retrieves? (And the 5.5 million results that the 10 searches that you did in the last half hour retrieve.)

    Google does not need to buy Wikipedia. Google needs to do something like Powerset has done, and “process” and “data mine” and “semantically parse” wikipedia to produce aggregated, summarized, and navigable, actionable information about the content in wikipedia. So that users can better make use of the information contained therein. Google needs to organize the Wikipedia’s information, in keeping with Google’s own motto. Not own it.

    It’s like they’ve lost sight of the ball.

    So I’m glad to hear that your company is doing something along these lines.

  10. >> Why instead, as I’ve said a dozen times, is most of what I see Google doing is releasing new chat widgets? Or online payment systems? Or online versions of MS Outlook (calendar, email, etc)?

    A little while before Google opened Zurich, I contacted Urs, and I asked — are you only looking for programmers?

    His answer was “yes”.

    Case closed.

    🙂

  11. In their letters, Broadband providers Knology and Cable One acknowledged that they recently ran tests using deep-packet-inspection technology provided by NebuAd to see whether it could help them serve up more relevant ads, but their customers were not explicitly alerted to the test. Cable One is owned by The Washington Post Co.

  12. A little while before Google opened Zurich, I contacted Urs, and I asked — are you only looking for programmers? His answer was “yes”.

    I also once asked a senior Google exec, probably around the same time you talked to Urs, about programmers versus researchers. His response was: “Google does not do Research and Development. Google does Engineering and Development.”

    In fact, for the longest time, if you were a technical person inside of Google, the only job title you were allowed to have was “Engineer”. I remember reading puff pieces in magazines like Wired, talking about how wonderful and egalitarian Google is, because everyone is an “Engineer”, and what mattered is not your status, but the technical quality of your idea.

    I’m all for egalitarianism and the technical meritocracy of ideas. But what struck me as odd is that, since everyone was an Engineer, they didn’t have any Scientists / Researchers.

    And maybe it’s just my stereotype, but engineers are the type of people who are really good at taking known principles (for example, Force=Mass*Acceleration) and implementing systems that make use of those principles. What they are not so good at is discovering or uncovering new, unknown principles (for example, the weak nuclear force).

    And maybe this is old-fashioned of me, but I’ve always thought good product development requires both scientists and engineers. You need the scientists to think laterally, to explore and to discover new principles, things that have never before been understood, much less implemented. And then you need the engineers to implement those newly discovered principles.

    But you can’t just engineer your way to the discovery of new scientific principles, new ways of understanding the world. Engineering is necessary, of course. Some fantastic engineering was required for the construction of the Large Hadron Collider. But without the scientific principles and thinking behind it, it would have never been designed/constructed in the first place. Engineering, by itself, has no reason to build something like that.

    I think this is true of both physics as well as Information Retrieval (“Organizing the World’s Information”). You can’t just engineer your way into the best systems for organizing the world’s information. If you’re going to do something radically new, interesting, difficult.. such as the “Explanatory Retrieval” that Steve talks about, you need to scientifically discover new principles by which something like that can be done.

    So I am just left struggling to understand, left wondering, what it means to both want to organize the world’s information, but only dedicate a single type of resource (programmer/engineer, rather than scientist or even sociologist!) to the problem.

    You’re probably correct nmw. The reason why we see lots of widgets, but no real fundamentally big new ideas, has a lot to do with only hiring one type of thinker: the programmer.

    Anyone else want to chime in?

  13. Well, they *do* have (at least) 1 researcher: Hal Varian.

    I’ve “studied” Hal Varian — I took advanced microeconomics courses as an undergraduate student (just one of the things you can “pull off” as an exchange student ;), and one of these was a graduate course in the “Economics of Information”… — neat stuff (and my professor was really funny too, which made the theory a lot more palatable ;).

    And I’ve exchanged emails with Professor Varian himself — about “information” (I do not doubt that I know virtually nothing about economics compared to Prof. Varian’s capabilities in this field, but I did *differ* in something which I considered to be a fundamental detail about “counting units of information”) — I think it was many years ago that Prof. Varian had published a report on (something like) “the amount of information” worldwide. In the report, it was explicated that to count the information, the number of bits was simply “added up” — and I wrote to say I thought this was a common misconception of “information” (as being roughly equal to “data”; note that perhaps the term “information technology”, as applied — for example — to telephones, may contribute to this misconception/misperception). This would mean, for example, that taking a 1MB photo of a ballot would create about a million “pieces” of information — and I find this notion absurd. Simply creating more data does not create more information.

    But it has little to do with information retrieval per se. Prof. Varian’s theories (and also George Stigler’s) provide great theoretical insights on information (and also on information’s role in markets), but they do not really have much to do with the human psyche (which is also a kind of “technology”).

    Oh, that reminds me of another researcher who has been involved with Google (in fact, this researcher to some degree may have “fathered” the ideas? 😉 — Terry Winograd. I’ve also studied Terry Winograd’s statistical/computational natural language research… (and some of this may even play into the way Google’s engine works — but it’s hard to tell, really).

    All in all, I think you are right: There was one “idea” to Google (and as I’ve said before, it wasn’t even a “new” idea — indeed: it’s an idea that has actually been considered a folly in academia ;). And the fact that Google apparently does not invest in research severely limits the company’s ability to build “new” and/or “improved” tools. So maybe Google is indeed a “one trick pony”… — and maybe that “one trick” is even… umm, well: maybe it’s simply a “neat” trick?

    What did that guy Barnum use to say — something like: “there’s an investor born every minute”?

    ;D nmw

  14. Oh, that reminds me of another researcher who has been involved with Google (in fact, this researcher to some degree may have “fathered” the ideas? 😉 — Terry Winograd. I’ve also studied Terry Winograd’s statistical/computational natural language research… (and some of this may even play into the way Google’s engine works — but it’s hard to tell, really)..

  15. nmw: In all fairness, I should point out that in relatively recent history (last 2-3 years-ish), they have hired more than just Hal Varian as a researcher, and I believe there is now even a position called “Research Scientist” at Google, where previously there were no such job titles. (So it is no longer technically true that everyone is “Engineer”.. which thankfully now contradicts all those puff pieces in Wired magazine:-)

    But again, this is only a relatively recent development, and is not a well-established pattern throughout most of Google’s 10 year history. I.e. despite a few folks (couple dozen? a hundred? out of 16,000?) with the “Researcher” title, in the core of their DNA, they remain an “Engineer” company. With the exception of a few people here and there, it seems that they are still really only throwing programmers at the problem of organizing the world’s information, rather than research scientists, library scientists, sociologists, behavioral anthropologists, etc. The proof of the pudding is still in the nature of the products that they release.

    They have so many resources, though, and so many good people. I just wish they would do more. That’s really all I’m saying. It’s a pity, really, that they seem to be stuck in a single mindset.

    And, @Joel: Yes, I have also heard that the boiling frog is an urban legend. But the principle, that of slowly going down a path, little by little, rather than all at once, is still a behavior that we can observe all around us, in human society, quite often. Privacy, or any other, violations do not occur all at once. They occur little by little, so that we don’t notice what is slipping away. So I agree with you that the frog analogy is false. But the idea itself is still generally observable.

    @nmw: Yes, I agree with you as well. The information content of a picture is not 1 million pixels/bytes. The correct conversion factor is 1 picture = 1000 words. 😉

    @nmw: By the way, why do I feel like you and I are 2 of only like 5 people that ever read the comments section, and get into this stuff 😉

  16. @JG

    To go with the frog analogy:

    1. Most don’t realize that they’re sitting in the water;

    2. On the Internet, no one can tell if you’re a dog

    ;D nmw

  17. 2. On the Internet, no one can tell if you’re a dog

    Don’t you mean, “On the Internet, no one realizes that they’re actually a frog”? 😉

  18. JG/NMW — make that at least 3 of us that get into this stuff in the comments section . . .

    And one reason, in my case, is that if your business revolves around innovating, it starts with understanding the unfulfilled needs people have. And thoughtful threads like this are a great place to get insights into those kinds of unmet needs.

    Like the need for more transparency with regard to the database of intentions/inferences — I hadn’t thought too deeply about that before. What I had thought deeply about is providing explanations for recommended items (where a recommendation can be an ad, the response to a search query, etc.) to reinforce to the recommendation recipient why the recommendation should really matter to them. But putting the two things together makes the case for explanatory-driven ads/recommendations even clearer because the approach delivers greater transparency, and hence increases trust, plus being more authoritative because of the compelling logic that is exposed.

    And the beauty from a business model standpoint is that explanatory-based ads will not only fulfill unmet needs of ad/recommendation recipients, but also be superior at moving product for advertisers, so both sides of the equation win.

    Now — these few sentences above pretty much lay out an inevitable path forward IMO — how can it not happen? It’s technically feasible and it fulfills unmet needs. But am I worried that a big, established player will read these comments and run off and execute this inevitability? Nope.

    Scale is a wonderful thing for a business to have that can be leveraged in many beneficial ways — but not for creating and implementing the most fundamental innovations. History has shown over and over again that nearly all businesses are, in fact, one trick ponies, and that trick (which, granted, may be a helluva trick) is always developed when they are still foals . . .

  19. >> On the Internet, no one realizes that they’re actually a frog

    That quote alone should move us at least from web 2.4 to web 2.96! 😀

    >> Scale is a wonderful thing for a business to have that can be leveraged in many beneficial ways

    IMHO, nothing scales as well as “natural language” (see http://gaggle.info/miscellaneous/articles/wisdom-of-the-language 😉 — but “natural” language is not “owned” by anyone (even Hitler / Goebbels didn’t manage to fool all of the people all of the time — they merely intimidated them with torture and/or fear of power).

    Our host (John Battelle) presented a very good model for how this could happen (and/or how it has played out) — and he used an age-old analogy: the command line versus the menu (I think his presentation at the first CM conference was very well done). He has argued that Google is offering a “command line” — but apparently (as the latest release shows) this pseudo-command line is merely a facade. When people begin to realize that their queries are being pigeon holed into some “segment” which advertisers might be able to target… — well, will they? (see JG’s quote, above)

    IMHO, Google is moving “one step forward, two steps back”. Ultimately, more and more people will realize that Google is nothing but an ad agency (something I have been arguing for years already 😉 — and “ultimately” is apparently not so far away (check the comments section to the NYT article that John pointed to the other day ;). As this becomes more widely known, people will probably start looking for other sources of news and/or weather reports — and I guess that will make CBS and NBC quite “happy campers”.

    I suggest that we all forget about Google and start paying more attention to the sites that really matter. “Traditional” search engines will probably also continue to change, since the “one size fits all” approach of today’s top search engines will simply no longer suffice to separate the wheat from the chaff — an early indication of future engines is the plain and simple fact that cars.com is already a search engine for cars (in a commercial setting) and/or the fact that hotels.com is already a search engine for hotels (in a commercial setting).

    Another good quote of JG’s is also rather plain and simple: “the proof is in the pudding”.

    ;D nmw

  20. And the beauty from a business model standpoint is that explanatory-based ads will not only fulfill unmet needs of ad/recommendation recipients, but also be superior at moving product for advertisers, so both sides of the equation win.

    Now — these few sentences above pretty much lay out an inevitable path forward IMO — how can it not happen? It’s technically feasible and it fulfills unmet needs. But am I worried that a big, established player will read these comments and run off and execute this inevitability? Nope.

  21. JG/NMW — make that at least 3 of us that get into this stuff in the comments section . . .

    Ok, so it’s you, me, nmw, and the Turkish spammer. Four peas in a pod. 😉
