Joints After Midnight & Rants Archives | John Battelle's Search Blog

A Big Issue: Taking Control of Your Own Identity and Data – Singly Founder Responds

By - October 18, 2011

If there was a theme to Day One at Web 2 Summit, it was this: We have to start taking control of our own identity and data. And this is not just because we might be worried about how the government or large platforms might use our data (though both issues certainly came up in talks with Chris Poole, Senator Ron Wyden, Genevieve Bell, and Sean Parker, among others). But also because of the value and benefits that will accrue to us and to society in a culture that values individual control of data. Problem is, it’s not simple or natural to do so….yet.

This reminded me of a post I did a couple of weeks ago, called I Wish “Tapestry” Existed. It elicited a very thoughtful response from Jason Cavnar, co-founder of the important Lockers Project and Singly, the startup which hopes to drive this trend forward. So for a bit of light reading, go back to that link and peruse my musings, then read this, which Jason was kind enough to write up based on the points I made (in bold) and agree to let me post:

JB: Services don’t communicate with each other; and # of services (apps) we use is skyrocketing
Cavnar: they don’t talk to each other, but what all apps do talk to is you. You should be the protocol around which those things are built and data flows.

Also important: data doesn’t do us justice. This is about LIFE. Our lives. Or as our colleague Lindsay (@lschutte) says — “your story”. Not data. Data is just a manifestation of the actual life we are leading. Our data (story) should be ours to own, remember, re-use, discover with and share.

JB: Cool idea…but Tapestry would be hard to do b/c of policy, not tech
Cavnar: the technology actually isn’t trivial – most startups are spending 3-6+ months just doing data aggregation and cleaning — creating common reference points between data sets (we have talked to 3 dozen+ startups about this, including sophisticated folks like the people down at SRI). More important than data reclamation and organization would be: how it gets stored; where it gets stored; who you trust to hold onto it; ensuring the format is “operable” (can developers do things with that data?) no matter where it lives; etc. The Locker Project (a placeholder name) is a community that will make sure the data structure gets figured out — the standards for “me” data. Singly is going to be the storage and access brand that you trust to store and empower you with your digital life.
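To make Cavnar’s aggregation point concrete, here is a minimal sketch of what “common reference points between data sets” can mean in practice: every service reports timestamps (and everything else) differently, and someone has to normalize them before a single timeline is even possible. The schema and field formats below are my illustrative assumptions, not the Locker Project’s or Singly’s actual standards.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# A hypothetical common schema for "me" data: every event from every
# service reduces to who/what/when plus the raw source record.
@dataclass
class LifeEvent:
    service: str         # e.g. "twitter", "instagram"
    kind: str            # e.g. "post", "photo", "checkin"
    timestamp: datetime  # normalized to UTC: the common reference point
    payload: dict        # the service-specific record, preserved as-is

def normalize_tweet(raw: dict) -> LifeEvent:
    # Twitter-style timestamp, e.g. "Wed Aug 10 21:08:45 +0000 2011"
    ts = datetime.strptime(raw["created_at"], "%a %b %d %H:%M:%S %z %Y")
    return LifeEvent("twitter", "post", ts.astimezone(timezone.utc), raw)

def normalize_photo(raw: dict) -> LifeEvent:
    # Instagram-style timestamp: Unix epoch seconds as a string
    ts = datetime.fromtimestamp(int(raw["created_time"]), tz=timezone.utc)
    return LifeEvent("instagram", "photo", ts, raw)

# One life, many services: merge on the shared timestamp axis.
def merge_timeline(events: list[LifeEvent]) -> list[LifeEvent]:
    return sorted(events, key=lambda e: e.timestamp)
```

Multiply that by every service’s quirks (pagination, rate limits, auth, dirty data) and the 3-6 months Cavnar cites starts to look conservative.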

JB: Tapestry = snapshot of what Dr. J is up to; Dr. J doesn’t use social services b/c value doesn’t exceed time invested
Cavnar: the point about Dr. J using those services more if Tapestry existed is very true and interesting — I wish more people recognized that. Also cool: if Dr. J were assured permanence of the data he is creating, he would likely create more liberally.

JB: I have only 5 social platforms
Cavnar: a ton of the data we create as individuals doesn’t take place on those 5 platforms first. The growth of apps is outpacing the growth of those platforms. Ex: most of my photos on Facebook now originate from Instagram. My listening happens on Rdio/Spotify. My location data takes place at the service provider level (AT&T, Verizon) first. Health data…car data…purchase data, etc.

What I really hear you asking are these questions:

Where do we combine and take with us all of our data?
Where is our data home? (a phrase coined by @mdzimm)
What will be our data address?
Shouldn’t that address be mine?
How is that related to our identity?
Shouldn’t the life I lead wind up with all of its memories stored in my home?
Shouldn’t someone provide me with home security?
Who is watching the kids when they are home alone and someone (app) wants to borrow milk (data)?
Does the proverbial USPS decide who I am? Or do they just ensure I can be found and send/receive?

JB: An option = pour all of this into Facebook
Cavnar: the problem is not just that it isn’t under your control, but that a 3rd party with interests other than solely and objectively empowering us then dictates how that data is structured and re-used, if at all. Should we, as a society, around such an important issue (our lives), trust a single company to decide / perform those functions? We haven’t, as a society, decided to all live within the same planned communities, home models and use the same interior decorators.

Tapestry can only be built if Facebook decides to enable it to develop its own feel/look/value. And you’d only be able to instrument Tapestry to you to the degree that Facebook decided. I.e., not developers, and not the end user. No home remodeling allowed. Facebook wants to empower developers and is grappling with how to create a win-win for developers and FB. As an industry, we’re at a point where we need to start thinking about win-win-wins (companies with data, developers, and you/me/us). Your Tapestry example is one of thousands.

JB: If Tapestry gains traction, I’m worried Facebook would ban it
Cavnar: A few thoughts:

1) Facebook has actually expressed (including this year at f8) their conviction that people own their data (see Mark Zuckerberg’s blog post). John Doerr at KPCB (a Facebook investor) reiterates this belief (37:20). Facebook allows people to download their data because of this belief, and their TOS is a license of your data. And there will be more solutions they can offer people coming into play that will let them live out this belief even more elegantly.

2) Ecosystems win: Given that Dr. J and a lot of other people don’t use Facebook zealously, would Tapestry suffer without Facebook as an experience? And if Tapestry took off, or Dr. J used Facebook more because of Tapestry, won’t it behoove Facebook to be a part of that experience rather than absent from it?

3) Empowerment wins: once each of us has a digital home, and Tapestry is built on top of that data, along with a whole world of useful, personalized apps, this worry fades. What Jeremie experienced with Jabber is not dissimilar. Utility and empowering people to do more, connect more, etc. will win the day, and I don’t see Facebook ignoring the AOL history lesson, especially after they go public. Their leadership is sharp.

4) Inalienable rights win: I refuse to believe we are at a point in history where it is a foregone conclusion that people aren’t fundamentally entitled to the data they create. At the foundation of our country’s heritage is the Lockean notion of “Lives, Liberties and Property which Men have in their Persons as well as Goods”. In a worst-case scenario, this issue goes to Washington. The folks there are deeply aware of people’s rights in this space. Look no further than Aneesh Chopra and Danny Weitzner and you find people who truly “get it”. Not just on a policy level, but on an innovation/economic opportunity/systemic problem-solution level.

5) We’re in this together: the leaders of our industry are decent people. We innovate because we care about people’s stories, and about making the world better through technology. We are all part of a narrative far greater than those spelled out in Terms of Service. Not only has Facebook said people own their data, but of course Google is starting to make that easier (Takeout), and Dick Costolo tonight reaffirmed Twitter’s core belief that people should have a copy of their Tweets; it’s simply a matter of time to get the history off disk.

6) Innovation wins: Nobody in the business of innovation and human advancement/potential would dispute that innovation takes place at the edge of the network. Closest to people. From mainframes to PCs. From landlines to smartphones. The closer to people that you put information, processes (apps), and power (tech), the more creative and economically productive we get. It’s that simple. We need our data, closest to us, and apps running on that data. Building Tapestry shouldn’t be hard. Tapestry existing makes the world a better place. Again, Terms of Service cannot argue with that narrative.

What’s Next:
Let’s suspend disbelief for a minute and assume we all got a digital home. What we then need is:

– a standard way to organize our data (this is why Singly is open source – so structure isn’t a point of control)

– a place to store all of our data (a home) that we trust and that is aligned to protect us, not use our data for other means. This doesn’t have to be a single company, by any means.

– a medium you trust through which you can transmit the data

– a platform that can “address” your data home and mine all the same no matter where we choose to host it, so that Tapestry can have both of us as users and neither of us has to be locked into a single storage choice (see the sketch after this list). Don’t trust Apple anymore? Cool, go to Singly. Don’t trust Singly? Go host your data on your home server. Etc.

– a rich developer ecosystem adding value time and time again both to the underlying core software, as well as at the application layer.
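Reading that list, the “addressing” piece is the one developers would touch first. Here is a hypothetical sketch of how an app like Tapestry might resolve a user’s data address to wherever their home is hosted. The discovery endpoint and record format are invented for illustration (loosely in the spirit of webfinger-style discovery), not anything Singly or the Locker Project has published.

```python
import json
import urllib.request

# Hypothetical: resolve a "data address" (user@host) to the URL of that
# person's data home, wherever they choose to host it. The
# /.well-known/data-home endpoint and the record format are invented
# for illustration; nothing here is a published Singly or Locker API.
def resolve_data_home(address: str) -> str:
    user, host = address.split("@", 1)        # e.g. "drj@example.com"
    url = f"https://{host}/.well-known/data-home?user={user}"
    with urllib.request.urlopen(url) as resp:
        record = json.load(resp)
    return record["locker_url"]               # wherever the user hosts it

# An app never hardcodes a storage provider. Move from Singly to a home
# server and the only thing that changes is the URL this returns:
# locker = resolve_data_home("drj@example.com")
```

The point of the sketch is the indirection: the app talks to an address, not to a vendor, which is what keeps the storage choice swappable.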


I Wish "Tapestry" Existed

By - October 07, 2011


Early this year I wrote File Under: Metaservices, The Rise Of, in which I described a problem that has burdened the web forever, but to my mind is getting worse and worse. The crux:

“…heavy users of the web depend on scores – sometimes hundreds – of services, all of which work wonderfully for their particular purpose (eBay for auctions, Google for search, OpenTable for restaurant reservations, etc). But these services simply don’t communicate with each other, nor collaborate in a fashion that creates a robust or evolving ecosystem.”

I noted that the rise of AppWorld only exacerbates the problem (apps rarely talk to each other or share data).

This must change. Not due to my philosophical problems with a closed web (though I do have that problem) but because yesterday, while driving back from an afternoon in the Valley, I had an idea for a new service, which for now I’ll call Tapestry, for lack of a better name. And then I got depressed: I figured making such a service would be really, really hard to do. And it shouldn’t be. And I hate getting depressed so quickly after having a fun idea.

Read More

Me, On The Book And More

By - October 06, 2011

Thanks to Brian Solis for taking the time to sit down with me and talk both specifically about my upcoming book, as well as many general topics.

Google = Google+

By - September 29, 2011

Earlier this week I participated in Google’s partner conference, entitled Zeitgeist after the company’s annual summary of trending topics. Deep readers of this site know I have a particular affection for the original Zeitgeist, first published in 2001. When I stumbled across that link, I realized I had to write The Search.

The conference reminded me of TED, full of presentations and interviews meant to inspire and challenge the audience’s thinking. I participated in a few of the onstage discussions, and was honored to do so.

I’d been noodling a post about the meaning of Google’s brand*, in particular with respect to Google+, for some time, and I’d planned to write it before heading to the conference, if for no other reason than it might provide fodder for conversations with various Google executives and partners. But I ran out of time (I wrote about Facebook instead), and perhaps that’s for the good. While at the conference, I got a chance to talk with a number of sources and round out my thinking.

I also got the chance to ask Larry Page a question (video is embedded above; the question is at 19:30). In essence, my query was this: For most of Google’s history, when people thought about Google, they’d think about search. That was the brand: Google = search. For the next phase of Google’s life, what does Google equal?

I asked this question with an answer in mind (as I said, I’d been thinking about this for some time), but I didn’t get the answer I had hoped for. What Page did say was this:

“I’d like the brand to represent the things I just spoke about (for that, see the video) … it’s important that people trust the brand…that we’re trustworthy…and I think also it should stand for a beauty and technological purity…innovation, and things that are important to people, driving technology forward.”

The text above doesn’t really do Page’s answer justice, because somehow when he said “beauty” – a word I was surprised to hear – he delivered it with a sincerity that I and others at the conference found…almost Apple-like.

Then again, Page didn’t directly answer the question, at least from a marketing standpoint. In 2009, Google’s brand = search. That kind of clarity and consistency is what every marketer seeks to define in their brand.

At the moment, Google’s brand is a bit confusing. Google equals Chrome. And YouTube. And Android. And Google Docs. And Gmail. And Maps, Places, Voice, Calendar….and self driving cars, and investments in energy research, and antitrust hearings, and Adwords, and of course search. Not to mention Google+.

Oh, and Motorola.

One can forgive the average consumer if he or she is a bit confused about what Google really means.

In conversations with various Google executives over the past few weeks, including leaders in product, marketing, and search, it’s clear that the company is well aware of this problem, and is focused on finding a solution. And while most have seen Google+ as the company’s answer to Facebook’s social graph, I now see it as something far bigger.

In short, Google+ = Google.

Google VP of Product Bradley Horowitz, who I know well enough to know he doesn’t say things without thinking about them a bit, recently told Wired as much, but the context was missing. To wit:

Wired: How was working on Google+ different from working on the company’s previous offerings?

Horowitz: Until now, every single Google property acted like a separate company. Due to the way we grew, through various acquisitions and the fierce independence of each division within Google, each product sort of veered off in its own direction. That was dizzying. But Google+ is Google itself. We’re extending it across all that we do—search, ads, Chrome, Android, Maps, YouTube—so that each of those services contributes to our understanding of who you are.

Horowitz is making an important point, but the interview moved on. It should have lingered. In those conversations with Googlers over the past month, I’ve heard one consistent theme: Larry Page is obsessed with Google+, and not just for its value as a competitor to Facebook. Rather, as I wrote earlier this month, Google+ is the digital mortar between all of Google’s offerings, creating a new sense of what the brand *means*.

So what is that meaning? I’d like to venture a guess: one seamless platform for extending and leveraging your life through technology. In short, Google = the operating system of your life.

At the moment, there are really only three serious players who have the technological, capital, and brand resources to stake such an audacious claim. Of course, they are Apple, Microsoft, and Google (Amazon seems on the precipice of becoming the fourth). Of the three, Apple has the best handle on its brand. And Microsoft made its brand in the operating system world, so it has at least pitched its tent in the right part of our collective mindspace.

But Google? Well, Google’s got some brand work to do. Google’s products don’t all work together in a seamless way, and at first glance, don’t seem to all speak to the same brand experience. Google+ is the company’s attempt to address that problem, such that every experience with Google “makes sense” from a brand perspective. Which is to say, from the customer’s point of view. As a very senior Google marketing executive recently told me: “There’s a reason it’s called Google….plus!”

If this is correct, then the stakes of ensuring that Google+ succeeds are raised, significantly. Google has twice tried to out-social Facebook (Buzz, Orkut), and neither quite worked. But this time, Google’s not just trying to beat Facebook. It’s being far more ambitious – it’s trying to redefine what happens inside your brain when you consider the concept of “Google.” Part of that is social, sure. But far more of it has to do with being the brand to which you entrust nearly every technology-leveraged part of your life.

If that indeed is what the company is trying to do, I’m more certain that Google+ will succeed. Why? Because it means the company is committed in a new way to a singular purpose. It means it will cut new kinds of deals so as to compete (like bringing Cityville to Google+, or undermining Facebook’s Skype partnership through Hangouts, or, soon, bringing media and marketing into Google+). It means tying Google+ to its core promotion engine of search (which it most certainly has). And it means, as Horowitz told Wired, “extending (Google+) across all that we do.” I recently asked Google’s head of local, Marissa Mayer, what percentage of her products were integrated with Google+. Five or so percent, she told me. But she quickly added: That’s going to change, and fast.

At Zeitgeist, when Page answered my question about the brand, he answered mostly with meaning – innovation, trust, beauty. But Larry spoke for twenty or so minutes prior to my asking him that question, and he mentioned Google+ over and over, stressing how important the project was, and how excited he was about it. So come to think of it, maybe his first response to me – “I’d like the brand to represent the things I just spoke about” – was all the answer we really needed.

* And not for the first time. I’ve written about it quite a bit….the precursor to this post is this one: On Google’s Brand. More here.

Facebook As Storyteller

By - September 25, 2011


Recently I was in conversation with a senior executive at a major Internet company, discussing the role of the news cycle in our industry. We were both bemoaning the loss of consistent “second day” storytelling – where a smart journalist steps back, does some reporting, asks a few intelligent questions of the right sources, and writes a longer form piece about what a particular piece of news really means.

Instead, we have a scrum of sites that seem relentlessly engaged in an instant news cycle, pouncing on every tidbit of news in a race to be first with the story. And sure, each of these sites also publishes smart second-day analysis, but it gets lost in the thirty to fifty new stories posted each day. I bet if someone created a Venn diagram of the major industry news sites by topic, the overlap would far outweigh the unique on any given day (or even hour).

This is all throat clearing to say that with the Facebook story last week, I am sensing a bit more of a “pause and consider” cycle developing. Sure, everyone jumped on the new Timeline and Open Graph news, but by day two, I noticed a lot more thought pieces, and most of them were either negative in tone, or sarcastic (or both). Examples include:

Can Facebook Become the Web? (Fortune)

The Facebook Timeline is the nearest thing I’ve seen to a digital identity (and it’s creepy as hell) (benwerd)

Dazed and Confused? Welcome to the Club (PC)

Facebook Just Shifted From Scale to Engagement (AdAge)

Facebook’s terrible plan to get us to share everything we do on the Web. (Slate)

@ F8: Zuckerberg Wants Users’ Whole Lives, But To What End? (PC)

Analysis of F8, Timeline, Ticker and Open Graph (Chris Saad)

All of life has been utterly (Dan Lyons)

Now, I am not endorsing all these pieces as perfect second day posts, but collectively, they do give us a fairly good sense of the issues raised by Facebook’s big news.

I’d like to add one more thought. Perhaps this might be called a “second week” post, given it’s been four or five days since the big news. In any case, the thing I find most interesting about the new approach to sharing and publishing on Facebook lies in what Mark Zuckerberg said his new product would deliver: “The story of your life.”

Now, long time readers know where I stand when it comes to telling the “story of your life.” I’m firmly in the camp that believes that story belongs to you, and should be told on your own domain, your own terms, and with a very, very clear understanding of who owns that story (that’d be you.) And this applies to brands as well: Your brand story should not be located or dependent on any third party platform. That’s the point of the web – anyone can publish, and no one has rights over what you publish (unless, of course, you break established law).

It was our inherent desire to tell “stories of our lives” that led to the explosion of blogging ten or so years ago. And crafting a rich narrative is just that, a craft (some elevate it to art). Yet Facebook’s new timeline, combined with the promiscuous sharing features of the Open Graph and some clever algorithms, promises to build a rich narrative timeline of your life, one that is rife with personal pictures, shared media objects (music, movies, publications), and lord knows what else (meals, trips, hookups – anything that might be recorded and shared digitally).

Now, I don’t find much wrong with this – most folks won’t spend their days obsessing over their timelines so as to present a perfectly crafted media experience. I’m guessing Facebook is counting on the vast majority of its users continuing to do what they’ve always done with Facebook’s curation of their data – ignore it, for the most part, and let the company’s internal algorithms manage the flow.

But our culture has always had a small percentage of folks who are native storytellers, people who do, in fact, obsess over each narrative they find worthy of relating. And to those people (which include media companies and brands falling over themselves to integrate with Open Graph), I once again make this recommendation: Don’t invest your time, or your narrative exertions, building your stories on top of the Facebook platform. Make them elsewhere, and then, sure, import them in if that’s what works for you. But individual stories, and brand stories, should be born and nurtured out in the Independent Web.

I’ve got plenty of philosophical reasons for saying this, which I won’t get into in this post (some are here). But allow me to relate a more economic argument: At present, there’s no way for our storytellers to make money directly from Facebook for the favor of crafting engaging narratives on top of the company’s platform. And from what I can divine, Facebook plans to make a fair amount of money selling advertising next to these new timeline profiles. As they get richer and more multi-media, so will the advertisements. Do you think Facebook intends to cut its 800 million narrative agents into those advertising dollars? I didn’t think so.

Which is just fine, for most folks – for people who don’t see the “stories of their lives” as a way to make a living. But if crafting narrative is your business, or even just a hobby that brings in grocery money, I’d counsel staying on the open web. (BTW, crafting narratives is *every* brand’s business.) For you, Facebook is a wonderful distribution and community building platform. But it shouldn’t be where you build your house.

Maybe There Really Will Only Be Five Computers…

By - September 01, 2011


Thomas J. Watson, legendary chief of IBM during its early decades and the Bill Gates of his time, has oft been quoted – and derided – for stating, in 1943, that “I think there is a world market for maybe five computers.” Whether he actually said this quote is in dispute, but it’s been used in hundreds of articles and books as proof that even the richest men in the world (which is what Watson was for a spell) can get things utterly wrong.

After all, there are now hundreds of millions of computers, thanks to Bill Gates and Andy Grove.

But staring at how things are shaping up in our marketplace, maybe Watson was right, in a way. The march to cloud computing and the rush of companies building brands and services where both enterprises and consumers can park their compute needs is palpable. And over the next ten or so years, I wonder if perhaps the market won’t shake out in such a way that we have just a handful of “computers” – brands we trust to manage our personal and our work storage, processing, and creation tasks. We may access these brands through any number of interfaces, but the computation, in the manner Watson would have understood it, happens on massively parallel grids which are managed, competitively, by just a few companies.*

It seems that is how Watson, or others like him, saw it back in the 1950s. According to sources quoted on Wikipedia, Professor Douglas Hartree, a Cambridge mathematician, estimated that all the calculations required in England could be handled by about three “computers,” each located in a distinct geographical location around the country. The reasoning was pretty defensible: computers were maddeningly complex, extraordinarily expensive, and nearly impossible to run.

Now, that’s not true for a Mac, an iPhone, or even a PC. But very few of us would want to own and operate EC2 or S3.

Right now, I’d wager that the handful of brands leading the charge to win in this market might be Google, Amazon, Microsoft, Apple, and….IBM. About five or so. Maybe Watson will be proven right, even if he never was wrong in the first place.

* Among other things, it is this move to the cloud, with its attendant consequences of loss of generativity and control at the edges, which worries Zittrain, Lanier, and others. But more on that later.

More on Twitter's Great Opportunity/Problem

By - August 10, 2011

In the comments on this previous post, I promised I’d respond with another post, as my commenting system is archaic (something I’m fixing soon). The comments were varied and interesting, and fell into a few buckets. I also have a few more of my own thoughts to toss out there, given what I’ve heard from you all, as well as some thinking I’ve done in the past day or so.

First, a few of my own thoughts. I wrote the post quickly, but have been thinking about the signal to noise problem, and how solving it addresses Twitter’s advertising scale issues, for a long, long time. More than a year, in fact. I’m not sure why I finally got around to writing that piece on Friday, but I’m glad I did.

What I didn’t get into is some details about how massive the solving of this problem really is. Twitter is more than the sum of its 200 million tweets, it’s also a massive consumer of the web itself. Many of those tweets have within them URLs pointing to the “rest of the web” (an old figure put the percent at 25, I’d wager it’s higher now). Even if it were just 25%, that’s 50 million URLs a day to process, and growing. It’s a very important signal, but it means that Twitter is, in essence, also a web search engine, a directory, and a massive discovery engine. It’s not trivial to unpack, dedupe, analyze, contextualize, crawl, and digest 50 million URLs a day. But if Twitter is going to really exploit its potential, that’s exactly what it has to do.
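To give a feel for just the first step of that pipeline, here is a toy version of the canonicalize-and-dedupe stage in Python. A real system would also expand shorteners (t.co, bit.ly) by following redirects and then crawl what it finds; the tracking-parameter list and normalization rules below are my own simplifications.

```python
import urllib.parse

# Canonicalize and dedupe URLs pulled out of tweets, so that trivially
# different strings pointing at the same page count as one signal.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}

def canonicalize(url: str) -> str:
    parts = urllib.parse.urlsplit(url.strip())
    query = [(k, v) for k, v in urllib.parse.parse_qsl(parts.query)
             if k not in TRACKING_PARAMS]
    return urllib.parse.urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/") or "/",
        urllib.parse.urlencode(query),
        "",  # drop fragments: they point inside the same document
    ))

def dedupe(urls: list[str]) -> dict[str, int]:
    counts: dict[str, int] = {}
    for u in urls:
        c = canonicalize(u)
        counts[c] = counts.get(c, 0) + 1  # the count is itself a signal
    return counts

print(dedupe([
    "http://Example.com/story/?utm_source=twitter",
    "http://example.com/story",
]))  # both collapse to the same canonical URL
```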

The same is true of Twitter’s semantic challenge/opportunity. As I said in my last post, tweets express meaning. It’s not enough to “crawl” tweets for keywords and associate them with other related tweets. The point is to associate them based on meaning, intent, semantics, and – this is important – narrative continuity over time. No one that I know of does this at scale, yet. Twitter can and should.
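For contrast, here is the kind of keyword-level association that is *not* enough: a bag-of-words cosine similarity. It is the baseline any engineer would write first, and exactly the thing that misses intent, sentiment, and narrative continuity.

```python
import math
from collections import Counter

# A crude baseline for associating tweets: bag-of-words cosine
# similarity. This is the keyword-level matching that falls short of
# real semantic association, but it shows where the harder work begins.
def vectorize(tweet: str) -> Counter:
    return Counter(tweet.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

t1 = vectorize("google ties search relevance to adwords auctions")
t2 = vectorize("adwords auctions reward search relevance at google")
print(round(cosine(t1, t2), 2))  # high overlap, zero grasp of meaning
```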

Which gets me to all of your comments. In the written comments, on Twitter, and in extensive emails offline, I heard from developers who are working on parts of the problems/opportunities I outlined in my initial post. And it’s true, there’s really quite a robust ecosystem out there. Trendspottr, OneRiot, Roundtable, Percolate, Evri, InfiniGraph, The Shared Web, Seesmic, Scoopit, Kosmix, Summify, and many others were mentioned to me. I am sure there are many more. But while I am certain Twitter not only benefits from its ecosystem of developers but actually *needs* them, I am not so sure any of them can or should solve this core issue for the company.

Several commentators noted, as did Suamil, “Twitter’s firehose is licensed out to at least 10 publicly disclosed companies (my former employer Kosmix being one of them and Google/Bing being the others) and presumably now more people have their hands on it. Of course, those cos don’t see user passwords but have access to just about every other piece of data and can build, from a systems standpoint, just about everything Twitter can/could. No?”

Well, in fact, I don’t know about that. For one, I’m pretty sure Twitter isn’t going to export the growing database around how its advertising system interacts with the rest of Twitter, right? On “everything else,” I’d like to know for certain, but it strikes me that there’s got to be more data that Twitter holds back from the firehose. Data about the data, for example. I’m not sure, and I’d love a clear answer. Anyone have one? I suppose at this point I could ask the company….I’ll let you know if I find out anything. Let me know the same. And thanks for reading.

The Future of The Internet (And How to Stop It) – A Dialog with Jonathan Zittrain Updating His 2008 Book

By - August 06, 2011


As I prepare for writing my next book (#WWHW), I’ve been reading a lot. You’ve seen my reviews of The Information, In the Plex, and The Next 100 Years. I’ve been reading more than that, but those are the ones that have made it into a post so far.

I’m almost done with Sherry Turkle’s Alone Together, with which I have an itch to quibble, not to mention some fiction that I think is informing the work I’m doing. I expect the pace of my reading to pick up considerably through the Fall, so expect more posts like this one.

Last week I finished The Future of The Internet (And How to Stop It), by Harvard scholar Jonathan Zittrain. While written in 2008, it is an ever more important book, for many reasons: it makes a central argument about what we’ve built so far, and about where we might be going if we ignore the lessons we’ve learned as we’ve all enjoyed this E-ticket ride we call the Internet industry.

The book’s core argument has to do with a concept Zittrain calls “generativity” – the ability of a product or service to generate innovation, new ideas, and new services, independent of centralized, authoritative control. It is, of course, very difficult to create generative technologies on a grand scale – it’s a statement of faith and shared values to do such a thing, and it really rubs governments and powerful interests the wrong way over time. Jonathan goes on to point out that truly open, generative systems are inherently subject to the tragedy of the commons – abuses such as malware, bad marketing tactics, hacking, etc. These threats are only growing, and they provide a convenient reason to shut down generativity in the name of safety and order.

The Internet, as it turned out for the first ten or fifteen years, is one of the greatest generative technologies we’ve ever produced. And yes, I mean ever – as in, since we all figured out fire, or the wheel, or … well, forgive me for getting all Wired Manifesto on you, but it’s a very big deal.

But like Lessig before him, Zittrain is very worried that the essence of what has made the Internet special is changing, in particular, as the mainstream public falls deeper in love with services like Facebook and Apple’s iPhone.

His book is a meditation and a lecture, of sorts, on the history, meaning, and implications of this idea. After I read it, I was inspired to email Jonathan. I sent him this note:

“Hi Jonathan -

“Wondering if, to start off an interview process (for my book), you might want to do a back and forth email interview that I’d publish on my site. It’d be mostly related to your book and some questions about how you think things have progressed since it came out. That would be both a good way for me to “review” the book on my site as well as to delve into some of the issues it raises in a fresh light. You game?”

To which he responded:

“Sure!”

And my questions, and his response, in lightly edited form, are below. I think you’ll enjoy his thoughts updating his thesis over the past three years. Really good stuff. I have bolded what I, as a magazine editor, might turn into a “pullquote” were I laying this out on a printed page.

JBAT:

- You wrote the Future of the Internet three years ago. It warned of a lack of awareness with regard to what we’re building, and the consequences of that lack of attention. It also warned of data silos and early lockdown. Three years later, how are we doing? Are things better, worse, the same?

And a follow up. On a scale of one to ten, where one is “actively helping” and ten is “pretty much evil,” how do the following companies rate in terms of the debate you frame in the book?

- Google (you can break this down into Android, Search, Apps, etc)

- Facebook (which was really not at full scale when you published)

- Apple

- Twitter

- Microsoft (again break it down if you wish)

Thanks!

JONATHAN ZITTRAIN:

Sorry this took me so long! I got a little carried away in answering –

- You wrote the Future of the Internet three years ago. It warned of a lack of awareness with regard to what we’re building, and the consequences of that lack of attention. It also warned of data silos and early lockdown. Three years later, how are we doing? Are things better, worse, the same?

It’s the best of times and the worst of times: the digital world offers us more every day, while we continue to set ourselves up for levels of surveillance and control that will be hard to escape as they gel.

That’s because the plus is also the minus: more and more of our activities are mediated by gatekeepers who make life easier, but who also can watch what we do and set boundaries on it — either for their own purposes, or under pressure from government authorities.

On the book’s specific predictions, Apple’s ethos remains a terrific bellwether. The iPhone — released in ’07 — has proved not only a runaway success, but the principles of its iOS have infused themselves across the spectrum. There’s less reason than ever to need a traditional PC, and by that I mean one that lets you run whatever code you want. OS X Lion points the way to a much more controlled PC zone, anyway, as it more and more funnels its software through a single company’s app store rather than from anywhere. I’d be surprised if Microsoft weren’t thinking along similar lines for Windows.

Google has offered a counterpoint, since the Android platform, while including an app store, allows outside code to be run. In part that’s because Google’s play is through the cloud. Google seeks to make our key apps based somewhere within the google.com archipelago, and to offer infrastructure that outside apps can’t resist, such as easy APIs for geographic mapping or user location. It’s important to realize that a cloud-based setup like Google Docs or APIs, or Facebook’s platform, offers control similar to that of a managed device like an iPhone or a Kindle. All represent the movement of technology from product to service. Providers of a product have little to say about it after it changes hands. Providers of services are different: they don’t go away, and a choice of one over another can have lingering implications for months and even years.

At the time of the book’s drafting, the alternatives seemed stark: the “sterile” iPhone that ran only Apple’s software on the one hand, and the chaotic PC that ran anything ending in .exe on the other. The iPhone’s openness to outside code beginning in ’08 changed all that. It became what I call “contingently generative” — it runs outside code after approval (and then until it doesn’t). The upside is that the vast creativity of outside coders has led to a software renaissance on mobile devices, including iPhones, from the sublime to the ridiculous. And Apple’s gatekeeping has seemed to be applied with a light touch; apps not allowed in the store pale in comparison to the torrents of stuff let through. But that masks entire categories of applications that aren’t allowed — namely, anything disruptive to Apple’s business model or that of its partners or regulators. No p2p, no alternate email clients, and only browsers with limited functionality.

More important, the ability to limit code is what makes for the ability to control content. More and more we see content, whether a book, or a magazine subscription, represented in and through an app. It’s sheer genius for a platform maker to demand a cut of in-app purchases. Can you imagine if, back in the day, the only browser allowed on Windows was IE, and further, all commerce conducted through that browser — say, buying a book through Amazon — constituted an “in-app purchase” for which Microsoft was due 30%?

A natural question is why competition isn’t the answer here — or at least reason to not worry about the question. If people thought the iPhone made for a bad deal, why would they want one? The reason they want one is the same thing that made the Mac so appealing when it first came on the scene: it was elegant and intuitive and it just worked. No blue screen of death. Consistency across apps. And, as viruses and worms naturally were designed for the most common platform, Windows, those 5% with Macs weren’t worth the trouble of corrupting.

We’ve seen a new generation of Mac malware as its numbers grow, and in the meantime a first defense is that of curation: the app store provides a rough filter for bad code, and accountability against its makers if something goes wrong even after it’s been approved. So that’s why the market likes these architectures. I’ll bet few Android users actually go “off-roading” with apps not obtained through the official Android app channels. But the fact that they can provides a key safety valve: if Google were to try the same deal as Apple with content providers for in-app content, the content providers could always offer their wares directly to Android users. I’m worried that a piece of malware could emerge on Android that would cause the safety valve of outside code to be changed, either formally by Google, or in practice as people become unwilling to drive outside the lanes.

So how about competition between platforms? Doesn’t that keep each competitor honest, even if all the platforms are curated? I suppose: the way that Prodigy and CompuServe and AOL competed with one another to offer different services as each chased subscribers. (Remember the day when AOL members couldn’t email CompuServe users and vice versa?) That was competition of a sort, but the Internet and the Web put them all to shame — even as the Internet arose from no business plan at all.

Here’s another way to think about it. Suppose you were going to buy a new house. There are lots of choices. It’s just that each house is “curated” by its seller. Once you move in, that seller gets to say what furnishings can go in, and collects 30% of the purchase price of whatever you buy for the house. That seller has every reason to want a reputation for being generous about what goes in — but it still doesn’t feel very free when, two years after you’ve moved into the house, a particular coffee table or paint color is denied. There is competition in this situation — just not the full freedom that we rightly associate with inhabiting our dwellings. A small percentage of people might elect to join gated communities with strict rules about what can go inside and outside each house — but most people don’t want to have to consult their condo association by-laws before making choices that affect only themselves.

[I guess the Qs below (about each company) are answered above!]

—-####—-

I guess now my question is, what kind of place are we going to build next?

Thanks for your thoughts, Jonathan! What do you all think?

Twitter and the Ultimate Algorithm: Signal Over Noise (With Major Business Model Implications)

By - August 05, 2011

Note: I wrote this post without contacting anyone at Twitter. I do know a lot of folks there, and as regular readers know, have a lot of respect for them and the company. But I wanted to write this as a “Thinking Out Loud” post, rather than a reported article. There’s a big difference – in this piece, I am positing an idea. It’s entirely possible my lack of reporting will make me look like an uninformed boob. In a reported piece I’d posit the idea privately, get a response, and then report what I was told. Given I’m supposedly on a break this week, and I’ve wanted to get this idea out there for some time, I figured I’d just do so. I honestly have no idea if Twitter is actually working on the ideas I posit below. If you have more knowledge than me, please post in the comments, or ping me privately. Thanks!

—-

I find Twitter to be one of the most interesting companies in our industry, and not simply because of its meteoric growth, celebrity usage, founder drama, or mind-blowing financings. To me what makes Twitter fascinating is the data the company sits atop, and the dramatic tension of whether the company can figure out how to leverage that data in a way that will ensure it a place in the pantheon of long-term winners – companies like Microsoft, Google, and Facebook. I don’t have enough knowledge to make that call, but I can say this: Twitter certainly has a good shot at it.

My goal in this post is to outline what I see as the biggest challenge/opportunity in the company’s path. And to my mind, it comes down to this: Can Twitter solve its signal to noise problem?

Many observers have commented on how noisy Twitter is: that once you follow more than about fifty or so folks, your feed becomes unmanageable. If you follow hundreds, like I do, it’s simply impossible to extract value from your stream in any structured or consistent fashion. Twitter’s answers to this issue have been anemic. One product manager even insisted that your Twitter feed should be viewed as a stream you dip into from time to time, using it as a thirsty person might use a nearby water source. I disagree entirely. I have chosen nearly 1,000 folks who I feel are interesting enough to follow. On average, my feed gets a few hundred new tweets every ten minutes. No way can I make sense of that unassisted. But I know there’s great stuff in there, if only the service could surface it in a way that made sense to me.

You know – in a way that feels magic, the way Google was the first time I used it.

I want Twitter to figure out how to present that stream in a way that adds value to my life. It’s about the visual display of information, sure, but it’s more than that. It requires some Really F*ing Hard Math, crossed with some Really Really Hard Semantic Search, mixed with more Super Ridiculous Difficult Math. Because we’re talking about some super big numbers here: 200 million tweets a day across hundreds of millions of accounts. And that’s growing bigger by the hour.
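To be clear about what “surfacing” even means, here is one toy guess at the shape of such a scorer: mix author affinity, engagement, and freshness, then sort. Every feature and weight below is invented for illustration; the hard part Twitter faces is learning signals like these, per user, across 200 million tweets a day.

```python
import math
import time

# A toy signal-surfacing scorer: affinity x engagement x freshness.
# All features and weights here are invented for illustration.
def score(tweet, now=None):
    now = now or time.time()
    age_hours = (now - tweet["created_at"]) / 3600.0
    freshness = math.exp(-age_hours / 6.0)   # tweets go stale fast
    engagement = math.log1p(tweet["retweets"] + tweet["replies"])
    affinity = tweet["author_affinity"]      # 0..1, learned from my history
    return affinity * (1.0 + engagement) * freshness

tweets = [
    {"created_at": time.time() - 1800, "retweets": 2, "replies": 1,
     "author_affinity": 0.9},   # fresh, from someone I care about
    {"created_at": time.time() - 86400, "retweets": 500, "replies": 80,
     "author_affinity": 0.05},  # viral, but a day old and low-affinity
]
ranked = sorted(tweets, key=score, reverse=True)  # signal first
```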

A mini industry has evolved to address this issue – I use News.me, Paper.li, TweetDeck (recently purchased by Twitter), Percolate and others, but the truth is, they are not fully integrated, systemic solutions to the problem. Only Twitter has access to all of Twitter. Only Twitter can see the patterns of usage and interest and turn meaningful insights and connections into algorithms which feed the entire service. In short, it’s Twitter that has to address this problem. Because, of course, this is not just Twitter’s great problem, it is also Twitter’s great opportunity.

Why? Because if Twitter can provide me a tool that makes my feed really valuable, imagine what it can do for advertisers. As with every major player that has scaled to the land of long-term platform winners (as I said, Google, Microsoft, Facebook), product comes first, and business model follows naturally (with Microsoft, the model was software sales of its OS and apps, not advertising).

If Twitter can assign a rank, a bit of context, a “place in the world” for every Tweet as it relates to every other Tweet and to every account on Twitter, well, it can do the same job for every possible advertiser on the planet, as they relate to those Tweets, those accounts, and whatever messaging the advertiser might have to offer. In short, if Twitter can solve its signal to noise problem, it will also solve its revenue scale problem. It will have built the foundation for a real time “TweetWords” – an auction-driven marketplace where advertisers can bid across those hundreds of millions of tweets for the right to position relevant messaging in real time. If this sounds familiar, it should – this is essentially what Google did when it first cracked truly relevant search, and then tied it to AdWords.
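For the flavor of what a “TweetWords” auction might look like mechanically, here is a toy version of an AdWords-style auction, where rank is bid times relevance and the winner pays just enough to hold its slot. All names and numbers are invented; nothing here describes an actual Twitter system.

```python
# A toy AdWords-style auction for "TweetWords": rank by bid times
# relevance, and the winner pays the minimum needed to keep its slot
# (a second-price flavor). All names and numbers are invented.
def run_auction(bids: dict, relevance: dict):
    ranked = sorted(bids, key=lambda a: bids[a] * relevance[a],
                    reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    # Pay just enough to out-rank the runner-up's bid x relevance score
    price = bids[runner_up] * relevance[runner_up] / relevance[winner]
    return winner, round(price, 2)

bids = {"acme": 2.00, "globex": 1.50, "initech": 0.75}    # $ per click
relevance = {"acme": 0.4, "globex": 0.9, "initech": 0.5}  # 0..1 scores
print(run_auction(bids, relevance))  # ('globex', 0.89): relevance beats raw bid
```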

Now, I do know that Twitter sees this issue as core to its future, and that it’s madly working on solving it. What I don’t know is how the company is attacking the problem, whether it has the right people to succeed, and, honestly, whether the problem is even soluble regardless of all those variables. After all, Google solved the problem, in part, by using the web’s database of words as commodity fodder, and its graph of links as a guide to value. Tweets are more than words, they comprise sentiments, semantics, and they have a far shorter shelf life (and far less structure) than an HTML document.

In short, it’s a really, really, really hard problem. But it’s a terribly exciting one. If Twitter is going to succeed at scale, it has to totally reinvent search, in real time, with algorithms that understand (or at least replicate patterns of) human meaning. It then has to take that work and productize it in real time to its hundreds of millions of users (because while the core problem/opportunity behind Twitter is search, the product is not a search product per se. It’s a media product.)

To my mind, that’s just a very cool problem on which to work. But I sense that Twitter has the solution to the problem within its grasp. One way to help solve it is to throw open the doors to its data, and let the developer community help (a recent move seems to point in that direction). That might prove too dangerous (it’s not like Google is letting anyone know how it ranks pages). But it could help in certain ways.

Earlier in the week I was on the phone with someone who works very closely in this field (search, large scale ad monetization, media), and he said this of Twitter: “There’s definitely a $100 billion company in there.”

The question is, can it be built?

What do you think? Am I off the reservation here? And who do you know who’s working on this?

"The Information" by James Gleick

By - July 21, 2011

Even before I was a few pages into The Information, a deep, sometimes frustrating but nonetheless superb book by James Gleick, I knew I had to ask him to speak at Web 2 this year. Not only did The Information speak to the theme of the conference this year (the Data Frame), I also knew Gleick, one of science’s foremost historians and storytellers, would have a lot to say to our industry.

Now that I’ve finished the book (and by no means will it be the last time I read it) I can say I’m positively brimming with questions I’d like to ask the author. And perhaps most vexing is this: “What is Information, anyway?”

If you read The Information for the answer to this question, you may leave the work a bit perplexed. It may be in there, somewhere, but it’s not stated as such. And somehow, that’s OK, because you leave the book far more ready to think about the question than when you started. And to me, that’s the point.

When I was a kid, and fancied myself smarter than someone who might be in the room at the time, I’d ask them to explain to me where space ended. How far out? Often, and this was the trick, a youngster (we were six or seven, after all) would posit that there must be a wall at some point, an ending, a place where the universe no longer existed. “Oh yeah?!” I’d say, exultant that my trick had worked. “Then what’s on the other side?!”

I think the answer is information. Perhaps others would say God, but if that be true, then both are, and the truth is that both understanding God and understanding information are quests that are more about the narrative than the ending. At least, I think so.

Gleick’s book tells the story of how, over the past five thousand or so years, mankind has managed to create symbols which abstract meaning and intent into forms that are communicable beyond time and space. I too am fascinated with this (hence the focus and title of the new book I just announced – What We Hath Wrought). While my book will attempt to be a narrative history of the next 30 or so years of information’s impact on our culture, Gleick’s is a history of the past 5,000 or more years – and it manages, for the most part, to stay focused just on the theory of information itself, rather than its political or social impacts. It’s ambitious, it’s heady, and at times, it’s nearly impossible for a lay person such as myself to understand.

Gleick traces the narrative of information from the first stirrings of alphabet-based communication to the explosion of academic excitement that accompanied the rise of “Information Science” and “Information Theory” in the mid to late 20th century. Nearly all the geek heroes take a star turn in this work, from Ada Lovelace and Charles Babbage to Lord Kelvin, Claude Shannon, and Marshall McLuhan (Wired’s patron saint, in case you younger readers have forgotten…). Einstein, Borges, and scores of other folks who make you feel smart just for reading the book also make cameos.

The work really picks up speed as it describes the rise of early telecommunications, the role of information in mid century warfare, and the birth of both genetic sciences and the computing industry. In the end, Gleick seems to be arguing, it’s all bits – and I think most of us in this industry would agree. But I think Gleick’s definition of “bit” may differ from ours, and while it may be esoteric, it’s there I want to really focus when he visits Web 2 in October.

Reviews of The Information are mostly raves, and I have to add mine to the pile. But as with his earlier work (Chaos cemented my desire to be a technology journalist, for example, and may as well be viewed as a precursor to The Information), this most recent book is sometimes a rather dry tick-tock of various academics’ journeys through difficult problems, often accompanied by descriptions of insights that, I must admit, escaped me the first two or three times I read them*. While I thought I knew it, I had to look up the definition of “logarithm” at least twice, and honestly, as it’s used in some passages, I had to just give up and hope I didn’t miss too much for my ignorance of Gleick’s nuanced use. (Given his larger point, that the core information is that which can be reduced to its essence, I think I got the point. I think.)

I guess what I’m saying is that I had to work hard through parts of this book – for example, in understanding how randomness relates to the essence and amount of information in any given object. But I found the work worth it. I’m also still getting my head around the relationship of randomness to entropy (Maxwell’s Demon helps…).
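For readers who, like me, needed the refresher: the standard Shannon formulation ties those threads together, though the notation below is the textbook version, not necessarily Gleick’s.

```latex
% Self-information of an outcome x, and entropy of a source X (Shannon):
I(x) = -\log_2 p(x) \ \text{bits}, \qquad
H(X) = -\sum_x p(x)\,\log_2 p(x)
% Example: a fair coin has p = 1/2 per side, so each flip carries
% exactly 1 bit, and H(X) = 1 bit. A maximally random source has
% maximal entropy and cannot be compressed, which is the link between
% randomness and the amount of information in an object.
```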

But isn’t that the point of a great book? In the end, I feel far more prepared to be a participant in what we’re making together in this industry, more rooted in the history that got us here, and more….yeah, I’ll say it, more reverent about the implications of our work moving forward. For that, I thank Gleick and The Information.

—–

Previous books I’ve reviewed as I prepare for What We Hath Wrought: In the Plex. Next up: Jonathan Zittrain’s The Future of the Internet (And How to Stop It), which I am finishing this week.

*This, for example, is a typical footnote: “The finite binary sequence S with the first proof that S cannot be described by a Turing machine with n states or less is a (log2 n + cp)-state description of S.” My blogging software doesn’t even have the right scientific notation capabilities to do that phrase justice, but I think you get the point I’m making….
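For what it’s worth, here is my best reconstruction of that footnote’s notation in LaTeX; the subscript is my reading of the flattened original (“cp” as a constant c_p), so treat it as a guess:

```latex
% Reconstruction of the quoted footnote (subscripts are my reading):
% The finite binary sequence S, with the first proof that S cannot be
% described by a Turing machine with n states or fewer, is a
(\log_2 n + c_p)\text{-state description of } S
```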