Twitter and the Ultimate Algorithm: Signal Over Noise (With Major Business Model Implications)

John Battelle

13 years ago

Note: I wrote this post without contacting anyone at Twitter. I do know a lot of folks there, and as regular readers know, have a lot of respect for them and the company. But I wanted to write this as a “Thinking Out Loud” post, rather than a reported article. There’s a big difference – in this piece, I am positing an idea. It’s entirely possible my lack of reporting will make me look like an uninformed boob. In the reported piece I’d posit the idea privately, get a response, and then report what I was told. Given I’m supposedly on a break this week, and I’ve wanted to get this idea out there for some time, I figured I’d just do so. I honestly have no idea if Twitter is actually working on the ideas I posit below. If you have more knowledge than me, please post in the comments, or ping me privately. Thanks!

—-

I find Twitter to be one of the most interesting companies in our industry, and not simply because of its meteoric growth, celebrity usage, founder drama, or mind-blowing financings. To me what makes Twitter fascinating is the data the company sits atop, and the dramatic tension of whether the company can figure out how to leverage that data in a way that will ensure it a place in the pantheon of long-term winners – companies like Microsoft, Google, and Facebook. I don’t have enough knowledge to make that call, but I can say this: Twitter certainly has a good shot at it.

My goal in this post is to outline what I see as the biggest challenge/opportunity in the company’s path. And to my mind, it comes down to this: Can Twitter solve its signal to noise problem?

Many observers have commented on how noisy Twitter is: once you follow more than about fifty or so folks, your feed becomes unmanageable. If you follow hundreds, like I do, it’s simply impossible to extract value from your stream in any structured or consistent fashion (see image from my stream at left). Twitter’s answer to this issue has been anemic. One product manager even insisted that your Twitter feed should be viewed as a stream you dip into from time to time, using it as a thirsty person might use a nearby water source. I disagree entirely. I have chosen nearly 1,000 folks who I feel are interesting enough to follow. On average, my feed gets a few hundred new tweets every ten minutes. No way can I make sense of that unassisted. But I know there’s great stuff in there, if only the service could surface it in a way that made sense to me.

You know – in a way that feels magic, the way Google was the first time I used it.

I want Twitter to figure out how to present that stream in a way that adds value to my life. It’s about the visual display of information, sure, but it’s more than that. It requires some Really F*ing Hard Math, crossed with some Really Really Hard Semantic Search, mixed with more Super Ridiculous Difficult Math. Because we’re talking about some super big numbers here: 200 million tweets a day across hundreds of millions of accounts. And that’s growing bigger by the hour.

A mini industry has evolved to address this issue – I use News.me, Paper.li, TweetDeck (recently purchased by Twitter), Percolate and others, but the truth is, they are not fully integrated, systemic solutions to the problem. Only Twitter has access to all of Twitter. Only Twitter can see the patterns of usage and interest and turn meaningful insights and connections into algorithms which feed the entire service. In short, it’s Twitter that has to address this problem. Because, of course, this is not just Twitter’s great problem, it is also Twitter’s great opportunity.

Why? Because if Twitter can provide me a tool that makes my feed really valuable, imagine what it can do for advertisers. As with every major player that has scaled to the land of long-term platform winners (as I said, Google, Microsoft, Facebook), product comes first, and business model follows naturally (with Microsoft, the model was software sales of its OS and apps, not advertising).

If Twitter can assign a rank, a bit of context, a “place in the world” for every Tweet as it relates to every other Tweet and to every account on Twitter, well, it can do the same job for every possible advertiser on the planet, as they relate to those Tweets, those accounts, and whatever messaging the advertiser might have to offer. In short, if Twitter can solve its signal to noise problem, it will also solve its revenue scale problem. It will have built the foundation for a real time “TweetWords” – an auction driven marketplace where advertisers can bid across those hundreds of millions of tweets for the right to position relevant messaging in real time. If this sounds familiar, it should – this is essentially what Google did when it first cracked truly relevant search, and then tied it to AdWords.
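To make the “TweetWords” idea a little more concrete, here is a minimal sketch of how a relevance-weighted auction for a single ad slot might work. To be clear, this is my own illustration and not anything I know Twitter (or Google) to be running: the `Bid` fields, the `relevance` score, and the `run_auction` function are all hypothetical. The ranking rule (bid times relevance, with the winner paying just enough to beat the runner-up) is a simplified second-price scheme that I am assuming for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str    # hypothetical advertiser name
    amount: float      # the most the advertiser will pay, in dollars
    relevance: float   # assumed 0..1 score of how well the ad fits the tweet


def run_auction(bids):
    """Rank bids by amount * relevance; the winner pays the smallest
    price that would still keep it ahead of the runner-up."""
    ranked = sorted(bids, key=lambda b: b.amount * b.relevance, reverse=True)
    if not ranked or ranked[0].relevance <= 0:
        return None  # nothing relevant enough to show
    winner = ranked[0]
    if len(ranked) == 1:
        # No competition; a real marketplace would presumably set a reserve price.
        return winner.advertiser, 0.0
    runner_up = ranked[1]
    price = runner_up.amount * runner_up.relevance / winner.relevance
    return winner.advertiser, round(price, 2)


bids = [
    Bid("coffee_shop", amount=2.00, relevance=0.9),
    Bid("car_dealer", amount=5.00, relevance=0.2),
    Bid("bookstore", amount=1.50, relevance=0.8),
]
result = run_auction(bids)
print(result)
```

Note what happens here: the car dealer bids the most money but loses, because its message has little to do with the tweet. That is the whole point. The better the relevance signal, the more the marketplace rewards advertisers who actually fit the conversation.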

Now, I do know that Twitter sees this issue as core to its future, and that it’s madly working on solving it. What I don’t know is how the company is attacking the problem, whether it has the right people to succeed, and, honestly, whether the problem is even soluble regardless of all those variables. After all, Google solved the problem, in part, by using the web’s database of words as commodity fodder, and its graph of links as a guide to value. Tweets are more than words: they comprise sentiments and semantics, and they have a far shorter shelf life (and far less structure) than an HTML document.
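One way to picture the “shelf life” difference is a score that fades with age. The sketch below is purely an assumption on my part: the `authority` number stands in for whatever graph-derived value a tweet might earn (the way links confer value on a web page), and the one-hour half-life is an arbitrary figure chosen for illustration, not anything Twitter has described.

```python
def decayed_score(authority, age_seconds, half_life_seconds=3600.0):
    """Halve a tweet's assumed authority for every half-life that has passed."""
    return authority * 0.5 ** (age_seconds / half_life_seconds)


# Hypothetical tweets: (label, authority, age in seconds)
tweets = [
    ("old_but_important", 8.0, 3 * 3600),
    ("fresh_and_minor", 1.5, 60),
    ("fresh_and_notable", 4.0, 1800),
]

# Order the stream by decayed score, highest first.
ranked = sorted(tweets, key=lambda t: decayed_score(t[1], t[2]), reverse=True)
top_labels = [label for label, _, _ in ranked]
print(top_labels)
```

A web page can hold its rank for years. Under a rule like this one, a tweet that mattered three hours ago has already lost most of its standing, which is part of why the ranking has to be recomputed continuously.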

In short, it’s a really, really, really hard problem. But it’s a terribly exciting one. If Twitter is going to succeed at scale, it has to totally reinvent search, in real time, with algorithms that understand (or at least replicate patterns of) human meaning. It then has to take that work and productize it in real time to its hundreds of millions of users (because while the core problem/opportunity behind Twitter is search, the product is not a search product per se; it’s a media product).

To my mind, that’s just a very cool problem on which to work. But I sense that Twitter has the solution to the problem within its grasp. One way to help solve it is to throw open the doors to its data, and let the developer community help (a recent move seems to point in that direction). That might prove too dangerous (it’s not like Google is letting anyone know how it ranks pages). But it could help in certain ways.

Earlier in the week I was on the phone with someone who works very closely in this field (search, large scale ad monetization, media), and he said this of Twitter: “There’s definitely a $100 billion company in there.”

The question is, can it be built?

What do you think? Am I off the reservation here? And who do you know who’s working on this?
