Will Google Reintroduce Microblogging?

I’m Thinking Out Loud about Blogger and Google and… Twitter. Yeah, I’m writing about Twitter a lot, but hell, it’s the most interesting thing in search in a while, and things keep popping up that spark my mind. This post from Google, for example. It’s titled “Blogger connects to Google Friend Connect”, which isn’t exactly a barn burner of a headline.

But it got me thinking. As the post notes, back in the Fall Google added a “following” feature to Blogger, which I sense doesn’t work quite as well for blogging (not real time) as it does for microblogging (real time).

However, the addition of Friend Connect to Blogger is a clear response to the success of Facebook Connect in the blog world. I’m planning on adding FB Connect to this site (I have to upgrade my platform first), mainly because I’ve seen how much easier it makes commenting for all of you, and how it amplifies those comments into Facebook.

What I wonder is this: Do you all think Google, through Blogger, will (re)introduce a microblogging service to compete with Facebook Live Feed and Twitter? (Oh, the irony…)

Author: John Battelle

A founder of NewCo (current CEO), sovrn (Chair), Federated Media, Web 2 Summit, The Industry Standard, Wired. Author, investor, board member (Acxiom, Sovrn, NewCo), bike rider, yoga practitioner.

23 thoughts on “Will Google Reintroduce Microblogging?”

  1. yeah, I’m writing about Twitter a lot, but hell, it’s the most interesting thing in search in a while

    Ok, I’ll admit to not being the sharpest pair of scissors in the drawer.. but.. how is Twitter “search”?

    Twitter is a networking / communications channel / social connection service, right? How is it a search service? Do you mean that you can ask a question on Twitter, and 214 of your best friends will answer it for you? Or do you mean something else?

  2. @JG

    Twitter acquired summize.com (a company focused on creating a search app for posts to twitter.com aka “tweets”) about a year ago.

    Many apps “search” through the twitter stream of “latest” / “breaking” news, but twitter.com itself holds the firehose.

    Software apps don’t matter (that’s my “IT doesn’t matter” 2.0 ;). What matters? Language matters.

    Language is the most fundamental information technology.

  3. So that’s what John means? When searching for information, he goes on to Twitter to “summize”, or search through its logs, to find answers to his questions?

    John writes: “it’s the most interesting thing in search in a while”

    What makes it interesting? The search technology itself, or the corpus of data that is being searched?

    Asked another way: Is Twitter an interesting search application, because of the underlying search technology, or because of the information that they have available to them, that no one else has?

  4. Twitter search is awesome, a superlative I seldom use. Especially good for real-time reviews, tech glitches, etc.

    When a service goes down (or something else happens), people start twittering. Google takes days to index; Twitter is now.

  5. andilinks: So it’s an event-based comment search? Something happens, and you are looking for information on what someone is saying about that topic, right now? If no one has said anything, yet (e.g. you are the first person to discover a service outage), then your search produces no results?

    If I understand, what impresses you about Twitter as a search service is not the algorithms, but the data? Twitter has that real-time data, and that real-time data is valuable to you, and therefore the search is valuable to you?

    Do you ever have a sense of when the search doesn’t work? How often do you have to reformulate your queries? Are you particularly concerned about missing any pieces of information (e.g. not getting every single tweet on a topic of interest to you), and if you are indeed concerned, do you have any sense of when Twitter fails to find all the information you need? Or do you just take whatever Twitter gives you?

  6. @JG Yes, event-based or process-based. If I’m having problems with something new, there are more likely to be tweets with an answer or link. Google’s kluge factor here is that too many of the top results for any given keyword are years old. Including the numerical year can help, but so many old pages have server-side current copyright dates.

    Yes, I occasionally reformulate my queries both on twitter and Google. I’m still somewhat new to twitter and the search feature has only been working well recently according to an interview with ev, maybe CSPAN or Charlie Rose.

    I don’t do exhaustive research with twitter, more often just a quest for an answer or solution to a specific problem.

    If you’re in an area that is heavy with twitter users, local info is good: reviews, traffic, etc.

    In LA, Chicago, NYC, etc., especially when searching for the perfect Chinese dinner or the perfect pizza… For info on car svc., couriers, you get more direct results from real people instead of just another directory.

  7. @JG

    Twitter is a fine example of the Wisdom of the Language at work. The contributions to twitter are of a specific type. Several months ago, it was primarily PR and marketing types (and also an “early adopter” and/or “web oriented” crowd). “What are you doing?” is a hoax – only noobs actually tweet that they’re eating a sandwich or whatever. Also, now several other “groups” have joined twitter — quite prominently journalists, because they realize that “what’s happening” is now communicated through the open forum that’s all a-twitter.

    Twitter’s “algorithm” is primarily meaningful for detecting current trends. There are also a lot of spammers trying to get their topics on the “leaderboard” — and that’s got the PR and marketing folk a little hot under the collar… (I wonder why? ;)

    As I wrote on Kara Swisher’s recent blog post ( http://kara.allthingsd.com/20090209/does-real-time-search-make-twitter-a-google-killer-its-fanbots-think-so-boomtown-not-quite-yet/#comment-8387 ), twitter is too “one-size fits-all” to be meaningful in the long run — except perhaps to the entertainment industry or something like that. If you can remember the “National Enquirer”, then I think that’s a good approximation of where twitter.com is most likely headed.

    :) nmw

  8. Example from real-life:

    Last night there was a mild earthquake in So. California. If I had Googled “earthquake” the first page would have been filled with results about earthquake research or The Earthquake Cafe, but when I searched “earthquake” on twitter there were hundreds of real-time results from people who had just felt the earthquake and were reporting location and their personal experience.

  9. Blogs that are using the Following feature have automatically been migrated to Friend Connect, so you don’t need to do a thing. And be sure to stay tuned – over the next several months, there are a number of additional features coming to Blogger as a result of today’s Friend Connect integration.
    One note for readers who previously followed one or more sites with Blogger Following and joined one or more sites with Friend Connect: you can decide whether or not to show the sites you’ve joined via Blogger Following and Google Friend Connect in your Blogger profile. By default, we have turned the “Show blogs I follow in my Blogger profile” option off. If you change your mind, here are instructions for making this list of sites public on your Blogger profile.

  10. andilinks, nmw,

    So what you’re saying is that the interesting thing about Twitter search is not the search. Anyone could implement Twitter search with a combination of basic IDF and sort-by-reverse-time algorithms. What’s interesting about Twitter search is simply the data that populates the service?

    If Google had access to all this same data, and slapped a sort-by-reverse-time filter on it, then it would be Google that was interesting again, not Twitter, right?

    Is this not just a small little vertical, inside of the normal Google search interface? Is it not “real time search”, just like Google has blog search and image search, etc.? Google could even have a “Tweet One-Box” show up at the top of your results, with all the real time results before the regular Google results, right?

    One thing, though, that I do find interesting about your example, andilinks, is that you seem to say that you would only be satisfied with having hundreds of personal experience reports, as a result of your search. It sounds like you are interested in recall-oriented search, rather than precision-oriented search.

    So from that standpoint, I am open to being convinced that Twitter search really does represent something different than Google, because Google, for 10 years, has repeatedly demonstrated that they do not know how to do recall-oriented search. They’re only good at precision-oriented search. I don’t even know if they realize that recall-oriented search information needs exist.

    I’ve been commenting about this on John’s blog for 3-4 years now. Here is a recent comment, going into more detail about it:


    So maybe the web community is finally waking up to the need for more recall-oriented searches? It’s not a new idea. It’s decades-old. But maybe the web is finally realizing that a whole new world opens up when you no longer approach web search as “googling” the one, top, best result, but instead try to discover dozens or hundreds of interesting results?
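
The “basic IDF plus sort-by-reverse-time” idea raised in comment 10 can be sketched roughly as follows. This is only a toy illustration under stated assumptions, not how Twitter or Google actually implement search: the `TWEETS` corpus, the timestamps, and the helper names (`tokenize`, `idf_table`, `search`) are all hypothetical, and IDF here is the plain `log(N / df)` form.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (timestamp, text) pairs standing in for tweets.
TWEETS = [
    (100, "felt an earthquake in pasadena just now"),
    (105, "earthquake cafe has great pancakes"),
    (110, "whoa big earthquake shaking in la"),
    (90, "traffic on the 405 is terrible"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def idf_table(docs):
    """Inverse document frequency, log(N / df), for every term in the corpus."""
    n = len(docs)
    df = Counter()
    for _, text in docs:
        df.update(set(tokenize(text)))
    return {term: math.log(n / count) for term, count in df.items()}


def search(query, docs, min_score=0.0):
    """Keep docs whose summed IDF over matched query terms exceeds min_score,
    then return them newest first (sort by reverse time)."""
    idf = idf_table(docs)
    terms = set(tokenize(query))
    hits = []
    for ts, text in docs:
        score = sum(idf.get(t, 0.0) for t in terms & set(tokenize(text)))
        if score > min_score:
            hits.append((ts, text))
    return sorted(hits, key=lambda doc: doc[0], reverse=True)


results = search("earthquake", TWEETS)
for ts, text in results:
    print(ts, text)
```

In this sketch IDF only decides which tweets match at all; the ordering is purely by recency. A term that appears in every tweet has an IDF of zero and so matches nothing, and a keyword match like the cafe tweet still gets through, which is the kind of noise a real system would need to handle.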

  11. @JG:

    Google is meaningless. Twitter.COM is commercial twitter.

    That’s it — end of story.

    You can apply any/all algorithms you want, what matters most is the domain of data being searched.

    Another example: searching classified ads will lead to different results than searching newspaper articles. Google searches all of it and this leads to confused results — GIGO. Google search is just as meaningless as searching for all instances of strings such as “a” OR “the” in a dictionary.

    There is no trade-off between recall and precision — that myth is primarily promulgated by sloppy thought.

    recall = # relevant hits in hitlist / # relevant documents in the collection

    precision = # relevant hits in hitlist / # hits in hitlist

    ( http://www.oracle.com/technology/products/text/htdocs/imt_quality.htm )

    These two statistics are independent of one another.
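
As a small worked example of the two definitions in comment 11, here is a sketch that computes both statistics for one hitlist. The document ids and the `precision_recall` helper are made up for illustration.

```python
def precision_recall(hitlist, relevant):
    """Return (precision, recall) for a list of returned ids and a set of relevant ids."""
    hits = list(hitlist)
    relevant = set(relevant)
    relevant_hits = sum(1 for doc in hits if doc in relevant)
    # precision = # relevant hits in hitlist / # hits in hitlist
    precision = relevant_hits / len(hits) if hits else 0.0
    # recall = # relevant hits in hitlist / # relevant documents in the collection
    recall = relevant_hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: 4 relevant documents exist; the engine returned 5 hits.
relevant_docs = {"d1", "d2", "d3", "d4"}
hitlist = ["d1", "d7", "d2", "d9", "d8"]

p, r = precision_recall(hitlist, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")
```

Both numbers share the same numerator (two relevant hits here) but divide by different totals: the five hits returned versus the four relevant documents that exist.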

  12. nmw: Perhaps I do a poor job explaining. What I mean by “precision-oriented” is that users want to have high-precision at low recall. “Recall-oriented” means that users want to have high precision at high recall.

    It’s a question of whether 1 answer will satisfy an information need, or whether it’ll take 500 answers.

    If I am looking for the address to a local bookstore, then I only need one answer, any answer. I don’t care who gives me that answer. Once I find the answer, I am not going to look at all the other thousands of results, even if they all also contain the correct, relevant answer. I frankly don’t even care if the rest of the results are relevant or non-relevant; I only care about precision at low recall. This is a “precision-oriented” search.

    However, if I, like andilinks above, want to know what happened in a recent So. Cal earthquake, if I want to hear everyone’s personal accounts and varied stories, then my information need is not going to be satisfied after finding only one tweet about that earthquake. I want to find 500 tweets, and I only want to have to slog through 500 results to find those 500 relevant tweets. I don’t want to have to search through 3000 results to find those 500 relevant tweets. I am interested in high precision at high recall. That’s a “recall-oriented” search.

    If you don’t like that terminology, we can change it. But do you at least acknowledge that two different people’s information needs can have differing goals about what it takes to satisfy those needs?

    So, is Twitter finally a realization by the web that recall-oriented needs exist? Google has long believed that they don’t exist.

  13. JG,

    recall and precision are clearly defined terms — and it doesn’t matter whether you search classified ads, newspaper articles, twitter, or random scraps of paper — some results will be considered “relevant” to a certain query, and others will be considered irrelevant. Recall and precision are merely statistics (and they are clearly defined statistics).

    Now depending on the query some types of data will obviously be more relevant than other types of data. For example, searching through classified ads for tomorrow’s weather forecast would probably return close to zero relevant results. However, weather.com (or weather.net, etc.) would probably return much more relevant results.

    Google searches everything every time — and that is why the results are merely mixed up confusion. It doesn’t really matter what orientation Google has — searching with such a wide scope will always lead to GIGO.

  14. @JG Yes, high precision at high recall is an important aspect of twitter but there are other ways to use its unique approach. I have experience with several California earthquakes, in fact was less than a mile from where John now lives on Oct 17 ’89 and still have tapes I recorded from local news during that event. Online utility in fact debuted and gained some fame during that event when for a while The Well in Sausalito was one open channel. I do recall listening intently to talk radio and how precious information was during that event.

    But any real-time event whether a TV program, an election or meteor shower gains magnitudes of significance when you are connecting in that way with people who are strangers yet share something in a significant way.

    But as with any new phenomenon, new ways to use twitter emerge as ever larger numbers of people are using it. I for one am very excited about the potential. I do feel in a way as I did in the early ’90s explaining the online experience a la Compuserve, The Well, and the only GUI service at the time, a nascent AOL.

  15. nmw: Yes, recall and precision are clearly defined terms. But the question is, which is more important to the user?

    You know that precision and recall are often plotted on a curve, right? Ranked lists are evaluated by plotting precision at various levels of recall. So imagine two ranked lists, with the following precision/recall curves:

    [Curve A]
    Recall   Precision
    0.1      0.8
    0.2      0.7
    0.3      0.4
    0.4      0.3
    0.5      0.2
    0.6      0.1
    0.7      0.1
    0.8      0.1
    0.9      0.1
    1.0      0.1

    [Curve B]
    Recall   Precision
    0.1      0.4
    0.2      0.4
    0.3      0.4
    0.4      0.7
    0.5      0.8
    0.6      0.9
    0.7      0.9
    0.8      0.9
    0.9      0.9
    1.0      0.9

    (Note that no interpolation is being done).

    So, no matter what type of data you are searching, which of these two curves, A or B, is better? That’s my question to you. And my answer is two-part: It depends.

    (1) Curve A has 80% precision at the top of the ranked list, whereas Curve B has 40% precision at the top of the list. So a user that is just looking for an answer, any answer, will prefer Curve A. No matter if that answer is weather, classifieds, etc. It doesn’t matter.

    (2) However, if the user is looking for *all* answers, all information, because they are (like andilinks above) trying to follow the So. Cal earthquake story, then that user is going to prefer Curve B. Initially, at the top of the ranked list, the results are worse. That would be like the first page of Google being filled with junk. But once you get past the first page, the rate at which relevant information is found is much higher for Curve B than for Curve A. You can ultimately find more total relevant information under Curve B, despite the fact that it got off to a bad start, than you can under Curve A.

    So which curve is better, A or B? The answer to that question has nothing to do with the type of information you are seeking. It has everything to do with what your actual needs or goals are. As I explained above, if you are looking for an address, you prefer Curve A. If you are trying to follow an earthquake story, you prefer Curve B.

    Google, no matter what type of information it is searching, has optimized its engine to produce Curve A. It has effectively declared that there do not exist any users in the world who have a type-Curve-B information need.

    And all I am trying to say is that it appears that people who are searching Twitter have a type-Curve-B information need. Not (necessarily) because of the data, but because of the needs of the people doing the search.
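
To put rough numbers on the Curve A versus Curve B comparison in comment 15, here is a small sketch. It assumes 500 relevant documents exist (borrowing the figure from the earthquake example in comment 12) and reads each curve point as the precision of the ranked list at the moment that recall level is reached. The function name `results_examined` is hypothetical.

```python
# Precision at each recall level, copied from the two curves in comment 15.
CURVE_A = {0.1: 0.8, 0.2: 0.7, 0.3: 0.4, 0.4: 0.3, 0.5: 0.2,
           0.6: 0.1, 0.7: 0.1, 0.8: 0.1, 0.9: 0.1, 1.0: 0.1}
CURVE_B = {0.1: 0.4, 0.2: 0.4, 0.3: 0.4, 0.4: 0.7, 0.5: 0.8,
           0.6: 0.9, 0.7: 0.9, 0.8: 0.9, 0.9: 0.9, 1.0: 0.9}

TOTAL_RELEVANT = 500  # assumption: number of relevant tweets in the collection


def results_examined(recall, precision, total_relevant):
    """How far down the ranked list a user must read to reach a recall level.

    relevant_found = recall * total_relevant
    precision      = relevant_found / examined
    =>  examined   = relevant_found / precision
    """
    return recall * total_relevant / precision


for level in (0.1, 0.5, 1.0):
    a = results_examined(level, CURVE_A[level], TOTAL_RELEVANT)
    b = results_examined(level, CURVE_B[level], TOTAL_RELEVANT)
    print(f"recall {level:.1f}: curve A ~{a:.0f} results, curve B ~{b:.0f} results")
```

Under these assumptions, a user who stops at the first tenth of the relevant material reads about half as many results under Curve A as under Curve B. A user who wants all 500 relevant tweets has to read about 5,000 results under Curve A but only about 556 under Curve B.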

  16. @andilinks: I’ve no doubt that you (and, once the tipping point hits, the rest of the world) find value in indexing and searching this particular type of data / stream / real-time information.

    The question I was after was whether tweet-data represents not only a different search “vertical” from anything that Google currently offers, but whether it also invites a fundamentally different type of user information need.

    On the web, especially in the interface that Google has designed, search pretty much equals navigation and/or known item finding. You know something is out there; you just don’t know what the URL is. A business has an address. You know that address exists. You just need to find it. The country of Liechtenstein has a gross domestic product. You don’t know what it is, but you know that it exists. You need to find it.

    And once you’ve found that fact, URL, address, location, etc. you’re done. No additional web pages or answers will give you any more information than you already have. You don’t need to find 50 pages with the GDP of Liechtenstein. 1 page will do. You no longer have a need to find anything else, because your information need has been satisfied.

    Finding the 1 page, rather than the 50 pages, is what Google knows. This is what Google excels at. This is what Google has optimized itself for.

    And I was just trying to figure out, from the information need side of things, whether your needs when searching Twitter are similar in spirit to your needs when searching the Web. Not in terms of the data, but in terms of what it takes to satisfy your information need.

    From the way you described things, it sounded like your information need would not be satisfied by navigating to someone’s particular tweet. And your information need would not be satisfied by answering one particular factoid. No. It sounds like what you are really after is the *story*, the whole set of atomic information units that, together, tell a story that is greater than any of the parts. It sounds like you have a recall-oriented information need.

    I was wondering this, because I thought to myself, well, all Google needs to do is get good at crawling tweet data in real time, and then they can use all their current web algorithms and interfaces. But if those web algorithms and interfaces are optimized for navigation and fact finding, then Google’s approach is fundamentally broken, and it’s not just a matter of having access to the data.

    So you agree? Twitter is not just different data, but when searching twitter, you have a categorically different type of information need, and information need satisfaction criterion?

    Because if that’s really true, if Twitter represents the web’s first real foray into recall-oriented, information-oriented, “story”-oriented search, then that is (pardon the caps) HUGE. It means that the competitive field is wide open, because Google doesn’t know how to do that kind of search. It means that there is a brand new, wide open space, just waiting for innovation.

  17. @JG

    re: “You know that precision and recall are often plotted on a curve, right?”

    Lots of stupidity gets plotted on curves — you have to pay attention to the stuff that matters. This sort of stupidity (a hypothetical relationship between recall and precision) doesn’t matter.

    You will never get all results. To understand this, read up on the topic known as the “economics of information”:


  18. Lots of stupidity gets plotted on curves — you have to pay attention to the stuff that matters.

    Absolutely. And what matters in our discussion is the question of whether there exist information needs that require more than one link/document/web page to satisfy. Until we can establish that common understanding, we have nothing else to talk about.

    Do you, or do you not agree that there are information needs that require more than one “result”?

