John Battelle's Search Blog | Thoughts on the intersection of search, media, technology, and more.

Why Data Matters, Another Interesting Signal: Direction Requests

By - May 09, 2011


Greg Linden, a friend to the site back when I was writing the first book, is writing more lately, and he’s got a great post about Google Maps data that highlights why we’ve decided to focus on “The Data Frame” for the Web 2 Summit this year.

Greg notes that Google has a new signal to which it can pay attention, thanks to Google Maps. And while I’m sure Greg could have figured this out on his own, he didn’t have to, because some Googlers have already published their findings in a paper titled “Hyper-Local, Direction-Based Ranking of Places.”

In short, the paper posits that when people signal their intent to go from place A to place B, they are creating the equivalent of a link, or a vote, for the place to which they are requesting directions. Pretty clever. As Greg notes:

…certain very large search engines have massive logs of people asking for directions from A to B, hundreds of millions of people and billions of A to B queries. And, it appears this data may be as or more useful than user reviews of businesses and maybe GPS trails for local search ranking, recommending nearby places, and perhaps local and personalized deals and advertising.
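To make the "directions as votes" idea concrete, here is a minimal sketch that treats every direction request as one vote for its destination and ranks places by vote count. It is only an illustration of the analogy, not the method from the Google paper; the function name `rank_destinations` and the sample log of (origin, destination) pairs are hypothetical.

```python
from collections import Counter


def rank_destinations(direction_requests):
    """Rank places by how many direction requests name them as the destination.

    direction_requests: iterable of (origin, destination) pairs.
    Returns a list of (destination, votes), most requested first.
    """
    votes = Counter(destination for _origin, destination in direction_requests)
    return votes.most_common()


# Hypothetical log of direction requests (made-up data for illustration).
requests = [
    ("home", "cafe"),
    ("office", "cafe"),
    ("home", "museum"),
    ("hotel", "cafe"),
    ("hotel", "museum"),
    ("office", "bakery"),
]

print(rank_destinations(requests))
# [('cafe', 3), ('museum', 2), ('bakery', 1)]
```

A real ranking would presumably weigh things like distance traveled and how recent the request was; the raw count here is just the simplest possible version of the vote.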

What Greg (and I) found surprising is that Google hasn’t been leveraging this new data signal in its current Maps (and other local) products. It’s clearly a strong signal, and one that could inform all sorts of social context as well. Can you imagine finding out others who have asked for similar directions, and then connecting you to them in some way? I sure can.

I’d also love to see a heat map of directions in real time, overlaid in time, space, and social graph.

Data like this mashed up with reviews, real time traffic, and more will be extraordinarily useful. Food for thought.


Building A New Map And I Need Your Help: What Are The Key Categories of Data In Today's Network Economy?

By - May 04, 2011


Many of you probably remember the “Points of Control” Web 2 Summit Map from last year, which was very well received. Hundreds of thousands of folks came to check it out, and the average engagement time was north of six minutes per visitor. It was a really fun way to make the conference theme come to life, and given the work that went into its creation, we thought it’d be a shame to retire it simply because Web 2 has moved on to a new theme.

As I posted last week, this year’s theme is “The Data Frame.” From my updated verbiage describing the theme:

For 2011, our theme is “The Data Frame” - focusing on the impact of data in today’s networked economy. We live in a world clothed in data, and as we interact with it, we create more – data is not only the web’s core resource, it is at once both renewable and boundless.


Consumers now create and consume extraordinary amounts of data. Hundreds of millions of mobile phones weave infinite tapestries of data, in real time. Each purchase, search, status update, and check-in layers our world with more of it. How our industries respond to this opportunity will define not only success and failure in the networked economy, but also the future texture of our culture. And as we’re already seeing, these interactions raise complicated questions of consumer privacy, corporate trust, and our governments’ approach to balancing the two.

How, I wondered, might we update the Points of Control map such that it can express this theme? Well, first of all, it’s clear the game is still afoot between the major players. Some boundaries may have moved, and progress has been made (Bing has gained search share, Facebook and Google have moved into social commerce, etc.), but the map in essence is intact as a thought piece.

Then it struck me – each of the major players, and most of the upstarts, has as a core asset in its arsenal *data*, often many types of it. In addition, most of them covet data that they’ve either not got access to, or are in the process of building out (think Google in social, for example, or in deals, which to my mind is a major play for local as well as purchase data). Why not apply the “Data Frame” to the map itself, a lens of sorts that when overlaid upon the topography, shows the data assets and aspirations of each player?

So here’s where you come in. If we’re going to add a layer of data to each player on the map, the question becomes – what *kind* of data? And how should we visualize it? My initial thoughts on types of data hew somewhat to my post on the Database of Intentions, so that would include:

- Purchase Data (including credit card info)

- Search Data (query, path taken, history)

- Social Graph Data (identity, friend data)

- Interest Data (Likes, tweets, recommendations, links)

- Location Data (ambient as well as declared/checked in)

- Content Data (Journey through content, likes, engagement, “behavioral”)

Those are some of the big buckets. Clearly, we can debate if, for example, identity should be its own category, separate from social, etc, and that’s exactly the kind of argument I hope to spark. I’m sure I’ve missed huge swaths of landscape, but I’m writing this in a rush (have a meeting in five minutes!) and wanted to get the engine started, so to speak.

I’m gathering a small group of industry folks at my home in the next week to further this debate, but I most certainly want to invite my closest collaborators, the readers here at Searchblog, to help us out as we build the next version of the map. Which, by the way, will be open sourced and ready for hacking….

So please dive into comments and tell me, what are the key categories of data that companies are looking to control?

Set The Data Free, And Value Will Follow

By - April 28, 2011


(NB: Much has been written and said on this topic, and this post is in no way complete. We’ll be exploring this issue and many others related to data at the Web 2 Summit this Fall).

Perhaps the largest problem blocking our industry today is the retardation of consumer-driven data sharing. We’re all familiar with the three-year standoff between Google and Facebook over crawling and social graph data. Given the rise of valuable mobile data streams (and subsequent and rather blinkered hand wringing about samesaid) this issue is getting far worse.

Every major (and even every minor) player realizes that “data is the next Intel inside,” and has, for the most part, taken a hoarder’s approach to the stuff. Apple, for example, ain’t letting data out of the iUniverse to third parties except in very limited circumstances. Same for Facebook and even Google, which has made hay claiming its open philosophy over the years.

And this trend is not limited to the large players. I currently have 302 photos locked up in a service called Twitpic. I’d very much like to export them into my iPhoto library, so I can manage them as part of the rest of my photo library. But the only way to do that is to “right click” on each and every one of those photos, copying them to my desktop. That’s several hours of work that most folks simply won’t do. When an enterprising coder wrote an automated script that exported photos from Twitpic to another service called Posterous, Twitpic blocked the program. That was about the time I stopped using Twitpic.

This trend, I predict, will become the petard upon which our industry will hoist itself over the next couple of years. Very well intentioned projects like DataPortability.org and others are working on this issue, but it’s largely hidden from public view and debate, because that debate has been framed as “Us versus Them”, where the “Them” are presumably evil and profit-driven companies who want to leverage our data for their own gain. (See the entire WSJ series as exhibit A in this debate).

So far, the approach companies seem to be taking boils down to this: The data we have is too valuable to let our customers understand it, manage it, and ultimately, do whatever they want with it. We’ll say soothing things, and we’ll let our users take some actions with their data – Facebook will let you authenticate using Facebook Connect on third party sites, for example – but we won’t let you take the data you’ve created on our services, put it in your own pocket (so to speak), and hand it over to other services and platforms such that those platforms can add value to your daily life.

In other words, if information is truly currency in today’s economy, so far the coins in your pocket are all from different countries, and there’s no global exchange mechanism. They’re only worth something in the nation in which they’ve been minted.

For example, you can’t pass your Facebook identity to a third party site so as to enable that site to serve you a better advertising experience. While Facebook insists that your Facebook data is, in fact, *yours*, it turns out it’s not yours if you want to use it to help a third party make money. In other words, it’s not really yours if it has true value to a third party. Which, in essence, means it’s only yours if it’s not valuable to anyone but you. But value is most often a social concept – something has value because a third person values it.

If the true value of the economy we are building is to be unlocked, that value has to flow unchecked from one party to another. Were this to be true, differentiation of services would migrate to a higher level of the stack, so to speak. Services would be considered valuable for what they did with data given to them by consumers, rather than by their ability to lock consumers’ data into their proprietary platform. New models would emerge to reward those services for adding that value, and those models would be both more robust and far larger than the “one ring to rule them all” model currently at play.

As things stand today, our industry’s practices are gaining the attention of dead-serious regulators, spurred to potentially early lock down of how data is used based on an incomplete understanding of how value will flow through future economic models yet to be invented. (More on this in another post).

A generation from now our industry’s approach to data collection and control will seem outdated and laughable. The most valuable digital services and companies will be rewarded for what they do with openly shareable data, not by how much data they hoard and control.

Now, I live in the real world, and I understand why companies are doing what they are doing at the moment. Facebook doesn’t want third party services creating advertising networks that leverage Facebook’s social graph – that’s clearly on Facebook’s roadmap to create in the coming year or so (Twitter has taken essentially the same approach). But as a publisher (and caveat, I am one), I want the right to interpret a data token handed to me by my reader in any way I choose. If my interpretation is poor, that reader will leave. If it adds value, the reader stays, perhaps for a bit longer, and value is created for all. If that token comes from Facebook, Facebook also gets value.

Imagine, for example, if back in the early search days, Google decided to hoard search referrer data – the information that tells a site what the search term was which led a visitor to click on a particular URL. Think of how that would have retarded the web’s growth over the past decade.
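For readers who have never looked at this data, here is a rough sketch of how a site might pull the search term out of the referring URL that arrives with a visit. It assumes the search engine passes the query in a URL parameter named `q`; the function name `search_term_from_referrer` and the sample URL are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def search_term_from_referrer(referrer, param="q"):
    """Return the search term carried in a referring URL, or None if absent.

    Assumes the query lives in the URL parameter named by `param`
    ("q" by default), which is an assumption about the referring site.
    """
    query = parse_qs(urlparse(referrer).query)
    values = query.get(param)
    return values[0] if values else None


# Hypothetical referrer, as a site might see it in the HTTP Referer header.
referrer = "http://www.google.com/search?q=web+2+summit&hl=en"
print(search_term_from_referrer(referrer))
# web 2 summit
```

That one small string, handed over with every click, is the kind of data token a receiving site is free to interpret however it likes.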

Scores of new services are emerging that hope to enable a consumer-driven ecosystem of data. Let’s not lock down data early. Let’s trust that what we’re best at doing is adding value, not hoarding it.   

More on this in my 2007 post The Data Bill of Rights, not to be confused with the “Commercial Data Privacy Bill of Rights,” introduced last week. While well intentioned, this bill does not consider data ownership and portability.

Announcing Web 2 Summit 2011: The Data Frame

By - April 25, 2011


If you’ve been reading my musings these past few months, you may have noticed an increasing fascination with data. Who owns it (the creator, the service, both)? Who has access to it – ISPs? Device makers? Marketers? The government? And how are we as an industry leveraging data to create entirely new classes of services?

Well, expect a lot more musing here, because (finally!) we’re ready to announce the theme for the Web 2 Summit, 2011, and it’s this: The Data Frame. From my overview, just posted on the site:

For Summit 2010, we noted that the Web ecosystem had shifted into something of a battlefield, with both major players and upstarts jockeying for lead positions around key “Points of Control.” Looking back at our theme one year later, it’s clear the game is still in its early phases – most of the major players have held their ground and continue to press into new territory. Meanwhile, the cycle of startup creation has intensified and compressed.

Given all this, we’re tempted to simply declare 2011 “Points of Control, The Sequel.” But we’ve noticed a constant uniting nearly all the battles around these strategic regions. That constant? How companies (and their customers) leverage data.

In our original Web 2.0 opening talk, as well as in Tim’s subsequent paper “What is Web 2.0,” we outlined our short list of key elements defining the emergent web economy. Smack in the middle of that list is this statement: “Data Is the Next Intel Inside.” At the time, most of us only vaguely understood the importance of this concept. Three years ago we noted the role of data when “Web Meets World,” and two years ago, we enlarged upon it with “WebSquared.”

This year, data has taken center stage in the networked economy. We live in a world clothed in data, and as we interact with it, we create more – data is not only the web’s core resource, it is at once both renewable and infinite. No longer tethered to the PC, each of us bathes in a continuous stream of data, in real time, nearly everywhere we go.

In the decade since search redefined how we consume information, we have learned to make the world a game and the game our world, to ask and answer “what’s happening,” “what’s on your mind,” and “where are you?” Each purchase, search, status update, and check-in layers our world with data. Billions of times each day, we pattern a world collectively created by Twitter, Zynga, Facebook, Tencent, Foursquare, Google, Tumblr, Baidu, and thousands of other services. The Database of Intentions is scaling to nearly incomprehensible size and power.

Of course, this fact raises serious issues of consumer privacy, corporate trust, and our governments’ approach to balancing the two. As we learn to leverage this ever-shifting platform called the Internet, we are at once renegotiating our social, economic, and cultural relationships – and we’re doing it in real time. How we interact with each other, how we engage with our government, how we conduct business, and even how we understand our place in the world – all has changed in the short two decades since the dawn of the commercial Internet. And all of this is described through a matrix of data, the power of which our culture is only beginning to recognize.

At the Web 2 Summit 2011, we’ll use data as a framing device to understand the state of the web. We know that those who best leverage data will win. So who’s winning, and how? Who’s behind? In each of our key points of control such as location, mobile platforms, gaming, content, social – who is innovating, and where are the opportunities? What new classes of services and platforms are emerging, and what difficult policy questions loom? And what of the consumer – will users become their own “point of control,” and start to understand the power of their own data?

These are some of the questions we’ll be asking and answering at the 8th annual Web 2 Summit. We look forward to exploring them together.

Web 2 Summit 2011

The Palace Hotel San Francisco

Oct. 17-19, 2011

Registration is now open, and an early lineup of speakers will be announced shortly (we already have ten amazing names, but I’m holding off till we have at least a baker’s dozen). Stay tuned, and join the conversation.

* And yes, we’ll be updating our “Points of Control” Map with a new layer – the Data layer, naturally.



Plato On Facebook

By - April 21, 2011


One of my first “big books” out of college was James Gleick’s Chaos: Making a New Science and it still resonates with me, though it’s been so long I think I’m due for a re-read. In any case, the next book up in my ongoing self-education is Gleick’s The Information: A History, a Theory, a Flood. It’s long. It’s dense. It’s good, so far. In fact, there’s already a passage, a quote from Plato, that has struck me as germane to the ongoing threads I attempt to weave here on this site (even if all I’m really making is a lame friendship bracelet – pun intended, as you will see).

Early in the book, Gleick narrates the birth of the written word, which if you think about it (and he certainly has), is quite an extraordinary event. Turns out Plato, who was literate (and therefore quotable today), was not a fan of the written word. His mentor Socrates, Gleick reminds us, was illiterate. Well, OK, that’s not fair. Socrates wasn’t illiterate, he was, in Gleick’s words, a “nonwriter.” In any case, the passage that struck me is Plato speaking about the written word, quoted in “The Information”:

For this invention will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them. You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of wisdom, not true wisdom.

Nicholas Carr would be proud of Plato. But both would be wrong.

Definitions of wisdom shift as cultures shift. Now, of course, to be wise is to be literate. Then, to be wise was to commit knowledge to memory. Now, it’s the ability to look up (to search, to find, to divine patterns). I’ve called this search literacy in the past, but I think we’re moving toward something larger.

Consider the same passage, liberally edited to be a critique of the new medium of Facebook and social networking, rather than the new medium of the written word.

For this invention will produce disconnection in the minds of those who learn to use it, because they will not practice true relationships between people. Their trust in Facebook, produced by external connections which are no part of themselves, will discourage the use of their own ability to maintain relationships. You have invented an elixir not of relationships, but of reminding one of relationship; and you offer your pupils the appearance of connection, not true connection.

When writing was new, it was strange, and it was hard to imagine a society based on the written word. At the dawn of digital connectivity, the same holds true. Are digital relationships real? Is the grammar of Facebook robust enough to hold all the nuance of true connection?

Probably not yet. But I for one am happy Plato learned to write. And I can also imagine a time – well after these words sink deeply into the sediments of history – when Plato and Facebook are united in a new technology of memory, relationship, and communication that eclipses anything we might debate today.

Book Review: In The Plex

By - April 20, 2011

Last night I had the pleasure of interviewing Steven Levy, an old colleague from Wired, on the subject of his new book: In The Plex: How Google Thinks, Works, and Shapes Our Lives. The venue was the Commonwealth Club in San Francisco, and I think they’ll have the audio link up soon.

Steven’s interview was a lot like his book – full of previously untold anecdotes and stories that rounded out pieces of Google’s history that many of us only dreamt of knowing about. When I was reporting my book, The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture, I had limited access to folks at Google, and *really* limited access to Larry Page and Sergey Brin. Levy had the opposite, spending more than two years inside the company and seeing any number of things that journalists would have killed to see in years past.

The result is a lively and very detailed piece of reporting about the inner workings of Google. But I was a bit disappointed with the book in that Steven didn’t take all that new knowledge and pull back to give us his own analysis of what it all meant. I asked him about this, and he said he made the conscious decision to not editorialize, but rather lay it all out there and let the reader draw his or her own conclusions. I respect that, but I also know Steven has really informed opinions, and I wish he’d give them to us.

What I took away from In the Plex was a renewed respect for the awesome size and scope of Google’s infrastructure, as well as its ambition. Sometimes we forget that Google is more likely than not the largest manufacturer of computers in the world, and runs the largest single instance of computing power in the world. It’s also one of the largest collectors and analyzers of data in the world. All of this has drawn serious scrutiny, but I don’t think even the regulators really grok how significant Google’s assets are. They should all read Steven’s book.

Levy only grazes the surface of Google’s social blindness, unfortunately, and due to timing could only mention Page’s ascendancy to CEO in his epilogue. But his reporting on how the China issue played out is captivating, as are the many details he fills out in Google’s early history. If you’re fascinated by Google, you’ve got to add this one to your library.

Preliminary Agenda Is Live For CM Summit, Sign Up Now, It Always Sells Out…

By -

Federated Media is proud to present the sixth annual Conversational Marketing Summit, June 6-7 at the fabulous Hudson Theater in the Millennium Broadway Hotel in Times Square. The preliminary agenda is now up, more is coming, but you can get a pretty good sense of the lineup – it’s amazing.
This year’s CM Summit will bridge the conversations of FM’s regional Signal conferences, bringing together the topics of content marketing, location services, mobile, data, and the real-time web on one stage.

See our initial agenda, now live on the site.

The rise of digital platforms presents massive opportunities, but one significant challenge: finding the signal in an increasingly noisy ecosystem of sites, apps, and services. Audiences fragmented between usage on Facebook and Twitter are constantly faced with new services like Groupon, Foursquare, Color, and SimpleGeo. How can we, as marketers, help our customers find the signal that’s right for them? At CM Summit we will dive into a day and a half of rapid-fire case studies, insightful one-on-one conversations, and dynamic High Order Bits that will help brands, agencies, and marketers better understand consumer trends, experiences and industry signals.

Join the conversation! This event always sells out.
REGISTER TODAY and get your early-bird pricing, available only until this Friday, April 22. Special thanks to our event sponsors: RIM, AT&T, Google, Quantcast, Demand Media, Facebook, Outbrain, Pandora, R2integrated, Slideshare, Yahoo!, AOL, Mobile Roadie, Spiceworks, Ustream; and our partners: IAB, Mashable, SMAC, and paidContent.
We look forward to seeing you this June 6-7 in New York!
Please visit our site for hotel booking details, a full list of speakers, and more event details.