Google Responds: No, That’s Not How Facebook Deal Went Down (Oh, And I Say: The Search Paradigm Is Broken)

I’ve just been sent an official response from Google to the updated version of my story posted yesterday (Compete To Death, or Cooperate to Compete?). In that story, I reported on 2009 negotiations over incorporation of Facebook data into Google search. I quoted a source familiar with the negotiations on the Facebook side, who told me, “Senior executives at Google insisted that for technical reasons all information would need to be public and available to all,” and “The only reason Facebook has a Bing integration and not a Google integration is that Bing agreed to terms for protecting user privacy that Google would not.”

I’ve now had conversations with a source familiar with Google’s side of the story, and to say the company disagrees with how Facebook characterized the negotiations is to put it mildly. I’ve also spoken to my Facebook source, who has clarified some nuance as well. To get started, here’s the official, on-the-record statement from Rachel Whetstone, SVP Global Communications and Public Affairs:

“We want to set the record straight. In 2009, we were negotiating with Facebook over access to its data, as has been reported. To claim that we couldn’t reach an agreement because Google wanted to make private data publicly available is simply untrue.”

My source familiar with Google’s side of the story goes further, and gave me more detail on why the deal went south, at least from Google’s point of view. According to this source, as part of the deal terms Facebook insisted that Google agree to not use publicly available Facebook information to build out a “social service.” The two sides had already agreed that Google would not use Facebook’s firehose (or private) data to build such a service, my source says.

So what does “publicly available” mean? Well, that’d be Facebook pages that any search engine can crawl – information on Facebook that people *want* search engines to know about. This is compared to the firehose data that was the core asset being discussed between the parties. This firehose data is what Google would need in order to surface personal Facebook pages relevant to you in the context of a search query. (So, for example, if you were my friend on Facebook, and you searched for “Battelle soccer” on Google, then with the proposed deal, you’d see pictures of my kids’ soccer games that I had posted to Facebook).

Apparently, Google believed that Facebook’s demand around public information could be interpreted as applying to how Google’s own search service was delivered, not to mention how it (or other products) might evolve. Interpretation is always where the devil is in these deals. Who’s to say, after all, that Google’s “social search” is not a “social service”? And Google Pages, Maps, etc. – those are arguably social in nature, or will be in the future.

Google balked at this language, and the deal fell apart. My Google source also disputes the claim that Google balked at being able to technically separate public from private data. Conversely, my Facebook source counters that the real issue of public vs. private had to do with Google’s refusal to honor changes in privacy settings over time – for example, if I deleted those soccer pictures, they should also be deleted from Google’s index. There’s a point where this all devolves to she said/he said, because the deal never happened, and to be honest, there are larger points to make.

So let’s start with this: If Facebook indeed demanded that Google not use publicly available Facebook data, it’s certainly understandable why Google wouldn’t agree to the deal. It may not seem obvious, but there are an awful lot of publicly available Facebook pages and data out there. Starbucks, for example, is more than happy to let anyone see its Facebook page, whether you’re logged in or not. And then there’s all that Facebook open graph data out on the public web – tons of sites show Facebook status updates, like counts and so on in a public fashion. In short, asking Google to not leverage that data in anything that might constitute a “social service” is anathema to a company that claims its mission is to crawl all publicly available information, organize it, and make it available.

It’s one thing to ask that Google not use Facebook’s own social graph and private data to build new social services – after all, the social graph is Facebook’s crown jewels. But it’s quite another thing to ask Google to ignore other public information completely.

From Google’s point of view, Facebook was crippling future products and services that Google might create, which was tantamount to an insurance policy of sorts that Google wouldn’t become a strong competitor, at least not one that leverages public information from Facebook. Google balked. If Facebook’s demand could have been interpreted as also applying to Google’s search results, well, that’s a stone cold deal killer.

I certainly understand why Facebook might ask for what they did; it’s not crazy. Google might well have responded by narrowing the deal, saying “Fine, you don’t build a search engine, and we won’t build a social network. But we should have the right to create other kinds of social services.” As far as I know, Google didn’t choose to say that. (Microsoft apparently did.) And I think I know why: The two companies realized they were dancing on the head of a pin. Search = social, social = search. They couldn’t figure out a way to tease the two apart. Microsoft has cast its lot with Facebook; Google, not so much.

When high stakes deals fall apart, both sides usually claim the other is at fault, and that certainly seems to be the case here. It’s also the case with the Twitter deal, which I’ve gotten a fair amount of new information about as well. I hope to dig into that in another post. For now, I want to pull back a second and comment on what I think is really going on here, at least from the perspective of a longer view.

Our Cherished Search Paradigm Is Broken (But We Will Fix It….Eventually)

I think what we have here is a clear indication that the search paradigm we’ve operated under for a decade or so is broken. That paradigm stems from Google’s original letter to shareholders in 2004. Remember this line? “Our search results are the best we know how to produce. They are unbiased and objective, and we do not accept payment for them or for inclusion or more frequent updating.”

In many cases, it’s simply naive to claim Google is unbiased or objective. Google often favors its own properties over others, as Danny points out in Real-Life Examples Of How Google’s “Search Plus” Pushes Google+ Over Relevancy and others have also detailed. But there is a reason: if you’re going to show results from all other possible contenders, replete with their associated UI and functional bells and whistles (as Google does with its own Maps, Pages, Plus etc.), well, it’s nearly impossible now to determine which service is the right answer to a particular person’s query. Not to mention, you need to put a deal in place to get all the functionality of the service. Instead, Google has opted, in many cases, to go with their own stuff.

This is not a new idea, by the way. Yahoo’s been doing it this way from the beginning. The contentious issue is that biasing some results toward Google’s own products runs counter to Google’s founding philosophy.

I have a theory as to why all this is happening, and I don’t entirely blame Google. Back when search wasn’t personalized, Google could defensibly say that one service was better than another because it got more traffic, was linked to more (better PageRank), and so on. Back when everyone got the same results and the web was one homogenous glob of HTML, well, you could claim “this is the best result for the general population.” But personalized search has broken that framework – I lamented this back in 2008 with this post: Search Was Our Social Glue. But That Is Dissolving (more here).

With the rise of Facebook and the app economy, the problem of search has become terribly complicated. If you want to have results from Facebook in your search, well, that search service has to do a deal with Facebook. But what if you want results from your running app (I have hundreds of rides and runs logged on AllSportGPS, for example)? Or Instagram? Or Path, for that matter? Do they all have to do deals with Google and Bing? There are so many unconnected pieces of the Internet now (millions of apps, most of our own Facebook experiences, etc. etc.) that what’s a good personal result for one person is not necessarily good for another. If Google is to stay true to its original mission, it needs a new framework and a massive number of new signals – new glue – to put the pieces back together.

There are several ways to resolve this, and in another post, I hope to explore them (one of them, of course, is simply that everyone should just go through Facebook. That’s the vision of Open Graph). But for now, I’m just going to say this: The issues raised by this kerfuffle are far larger than Google vs. Facebook, or Google vs. Twitter. We are in the midst of a major search paradigm shift, and there will be far more tears before it gets resolved. But resolve it must, and resolve it will.

95 thoughts on “Google Responds: No, That’s Not How Facebook Deal Went Down (Oh, And I Say: The Search Paradigm Is Broken)”

  1. Why not comment on Edelman’s suggestion that Google be bound by the EC directive to Microsoft that users be offered a selection of competing services which become default (in Microsoft’s case, the browser menu)? In this case, let us choose which social network we want personalized results based upon, which map service we want our location searches to use, which restaurant rating service we want to use in searches, etc.

    http://www.benedelman.org/news/022211-1.html

    1. Publicly support open, federated social networks based on XMPP or StatusNet. Google+ is largely based on XMPP, but it’s not open or federated. Google could change that by throwing a couple of switches. Facebook chat and MSN use XMPP, but don’t federate, either. They could easily change that, too, but they won’t.

      The situation is more like e-mail than phones. You don’t have to worry about which “service” or “protocol” your friend is using when you send an e-mail — you just send it off. Social should be like that.

      1. It definitely should be like that, but it won’t be if everyone uses Facebook or some other closed service. I think Google is compelled to build Google+ and use it to force the other businesses to open up. I don’t see how Google could force Facebook to open up unless it has a bargaining chip like Google+.

        But yes, social networks should be more like email. I have no clue why we as a group of people allowed this public goods dilemma to get so bad.

  2. Zooming out.

    1- What is really going on here is two corporate silos are having a public food fight about who owns OUR data.

    2- Search engines are simply getting too big for their own britches. What next, a sudden need to include every file or email we have ever shared with Google+ or Facebook friends? Every technology has its practical cut off point. Search engines have reached that point but in the name of mega power and profit they cannot accept this fact.

    Larry should listen to Steve Jobs’s advice. Focus and discipline are all about knowing what is not worth doing!

    1. I think it’s more nuanced than that. 1. We should have control of our data, yes, and/but 2. We want search to have our data if it is useful to us.

      1. For me, it is not that useful to search everything all lumped together.

        If I want to search social data I can easily bookmark those processes. It is all a matter of what level of granularity constitutes a useful cutoff point to avoid overload. That of course is a subjective call.

      2. Very similar thoughts I have, SubstrateUndertow.

        This no-end-in-sight quest for ever more personalization of search results, as understood by Google and similar players, is something like a service that starts out recommending restaurants to me based on what I want to eat that night; and doing a very fine job.

        At one point the operator of this service decides that’s not enough: after all, cooking that meal for yourself is a viable option as long as you have all the ingredients at home, so — our service needs to know what foodstuff you have at home (strictly in order to provide a better user experience for you).

        Let’s join forces with another service that keeps track of your larder, and work their data into our recommendations.

        Some might say this was never meant to be part of our service (or mission), and our users did not come to us for this.

        That’s not true: our users are clamoring for it (you can take our word for it), and anyhow, these stupid detractors don’t know the dynamics/trends/any-other-important-sounding-factor of our industry.

      3. I think the model needs to be flipped. Instead of the service cutting a deal to get that data, we should have it, and determine if we want to share it.

      4. Perhaps it’s time for the “Attention Trust” marketplace? That effort was founded – now let me think back – OH YAH – at an early Web 2.0 conference, in one of those small meeting rooms. Esther, Kaliya Hamlin, Steve Gillmor, Mary Hodder, Doc, Me – bunch of folks. It was just 8 years ahead of itself.

      5. I entirely agree! Users should own their data. But with everyone using Google and Facebook, how do you propose to do that?

      6. Great discussion this week. Thanks for being thorough!
        Longer term I see Search evolving into a utility that is both portable and personal. How many other “things” went that way in the digital years? But this need is against the expectations of any share holders that don’t tolerate wholesale change. Private companies have a big edge in terms of their ability to pivot & work with (serve) markets in times of social change. Technology really is a small part of this discussion but this view will be silly until it is enforced.

  3. Sir

    What you’re not saying is that the app economy is a closed economy
    and that this paradigm shift is one towards a closed proprietary web
    where Facebook or Twitter think they have it “over us.”

    The only answer to this is to stop using Facebook and Twitter and ONLY use open equivalents.

  4. If you’re concerned about privacy, and you’re a college student or alumni, my invention FreezeCrowd.com is a better social network. My reasoning is not biased because it’s my site, it’s a fact that FreezeCrowd is not currently available to search engines. We’re requiring a school email to join, so that we link you with your academic institution, people who you may more likely be friends with on your campus or connected through a club or group. You can set your privacy settings on who can see your profile page, your photos, comment and more.  This information is not available to the search engines. Furthermore, I came up with this idea before any of the others existed in 1999, so I’ve put a lot of thought into this idea over the years. My site is also a better system because we connect people in a group photo or a team photo with friends.  The data is already in the site, so we control the data with suggested keywords, so that someone can’t put on their profile something that we don’t approve of being real or publicly available.  So, when someone enters their favorite book, this data is either already in the database or a user can suggest it.  Then it’s human edited to make sure this information is accurate and real. So, this within itself strengthens the real world mirrored relationship opposed to the virtual only relationship.  I’m confident that people who use my site will be satisfied, and since we’re in Beta we’re open to more user feedback and improving the user experience.

  5. Although the entire debate that you discuss is gripping for technical, business and political reasons, I also find the following extract of your blog to be fascinating (and, viewed from one perspective, it is a beautiful self-referential synecdoche of the issue as well):

    “I’ve now had conversations with a source familiar with Google’s side of the story, and to say the company disagrees with how Facebook characterized the negotiations is to put it mildly.”
    Google appears to have suggested that Facebook lied to you and by doing so manipulated you into spreading a message that Google cannot be trusted with users’ (in this case Facebook users’) data.

    The reason that this is fascinating to me is that it is an example of how you contributed, as a computational unit. Clearly the new search engine optimization is in part social search engine optimization, where individual journalists are highly linked nodes or powerful computation units in this mechanical turk.

    Google’s SPYW is beautiful to me because it pits an algorithmic answer against a human derived answer in a way that I have not yet seen. And it does this as a remarkable machine: it has a method of ranking the computation units of humans (rel=author, +1, circles etc…); it has a method of implementing a suite of supervised learning algorithms by detecting why people click after a query; it implements directed exploration by allowing the user to circle other individuals related to a query etc…

    How can the endgame not be AI and the rumored Majel powered by a hybrid of humans and silicon (or graphene at some point)? Google, the company, cannot limit the data it has by allowing data silos or its eventual AI will be ignorant. Google, the company, cannot lose users either, because that will decrease the number of CPUs in the mechanical turk.

    It is not yet obvious to me, however, whether or not Google’s SPYW can combat other similar technologies. OpenGraph, for example, also links distributed information and computational resources to a central Facebook processing unit. It is unclear to me what final computational architecture will win, but it is enjoyable to observe how we have so quickly become essential to the design of the future AI.

    (The term Trusted Computing will take on a whole new meaning, as this century progresses…)

    1. Let’s not forget that we can and should be stewards of our own data, determining who can use it, in what context.

  7. Interestingly enough I never feel like searching for anything else than people or business pages on Facebook. Therefore I perceive their search as internal for Facebook purposes. The Bing integration hasn’t changed that at all.

    Stuff people publish on Facebook seems to revolve around status updates and what they were up to on a particular day, what music they were listening to and how many miles they ran. Nothing much you would want to search for really…

    Hence I’m pretty confident that Google at some point will re-neutralize their search and lower the weight they give to G+ results.

  8. Great discussion!

    I can’t help but get the feeling that Google is on an AI path in which answers are presented in anticipation of the question. Oh, Eric has logged in. He wants X, Y & Z. Really I’d like “Y” and a little of “Z” with a helping of A & B. In fact I believe Sergey or Larry stated a desire to predict what searchers want in an interview.

    In my opinion this approach undervalues the entire concept of “search.” An assumption is made that a searcher has an intended destination and that’s not always the case. There are certainly times when personal habits should influence search results. But just as often searches should present new information that would be outside my “life influences” consideration set yet valuable because the results expand our level of awareness. Search shouldn’t be about retrieving that which is known but opening perspective to that which has yet to be experienced.

    Are our published experiences the most valuable? Do 140 characters represent us as a whole? Will the searches we did as a 25 year old be relevant at 35 or 45? I hope there’s a company out there thinking about a search platform where the user controls how historical and/or third party data influences what’s shown on the SERP.
