I’ve been quiet here on Searchblog these past few months, not because I’ve nothing to say, but because two major projects have consumed my time. The first, a media platform in development, is still operating mostly under the radar. I’ll have plenty to say about that, but at a later date. It’s the second where I could use your help now, a project we’re calling Mapping Data Flows. It’s the research effort I’m spearheading with graduate students from Columbia’s School of International and Public Affairs (SIPA) and Graduate School of Journalism, a project examining what I call our “Shadow Internet Constitution,” driven by corporate Terms of Service.
Our project goal is simple: To visualize the Terms of Service and Data/Privacy Policies of the four largest companies in US consumer tech: Amazon, Apple, Facebook, and Google. We want this visualization to be interactive and compelling – when you approach it (it’ll be on the web), we hope it will help you really “see” what data, rights, and obligations both you and these companies have reserved. To do that, we’re busy turning unintelligible lines of text (hundreds of thousands of words, in aggregate) into code that can be queried, compared, and visualized. When I first imagined the project, I thought that wouldn’t be too difficult. I was wrong – but we’re making serious progress, and learning a lot along the way.
I’ll never forget a meal I had with a senior executive at Facebook many years ago, back when I was just starting to question the motives of the burgeoning startup’s ambition. I asked whether the company would ever support publishers across the “rest of the web” – perhaps through an advertising system competitive with Google’s AdSense. The executive’s response was startling and immediate. Everything anyone ever needs to do – including publishing – can and should be done on Facebook. The rest of the Internet was a sideshow. It’s just easier if everything is on one platform, I was told. And Facebook’s goal was to be that platform.
Those words still ring in my ears as we celebrate the 30th anniversary of the web today. And they certainly should inform our perspective as we continue to digest Facebook’s latest self-involved epiphany.
This is an edited version of a series of talks I first gave in New York over the past week, outlining my work at Columbia. Many thanks to Reinvent, Pete Leyden, Cap Gemini, Columbia University, Cossette/Vision7, and the New York Times for hosting and helping me.
If predictions are like baseball, I’m bound to have a bad year in 2019, given how well things went the last time around. And given how my own interests, work life, and physical location have changed of late, I’m not entirely sure what might spring from this particular session at the keyboard.
But as I’ve noted in previous versions of this post (all 15 of them are linked at the bottom), I do these predictions in something of a fugue state – I don’t prepare in advance. I just sit down, stare at a blank page, and start to write.
So Happy New Year, and here we go.
1/ Global warming gets really, really, really real. I don’t know how this isn’t the first thing on everyone’s mind already, with all the historic fires, hurricanes, floods, and other related climate catastrophes of 2018. But nature won’t relent in 2019, and we’ll endure something so devastating, right here in the US, that we won’t be able to ignore it anymore. I’m not happy about making this prediction, but it’ll likely take a super Sandy or a king-sized Katrina to slap some sense into America’s body politic. 2019 will be the year it happens.
2/ Mark Zuckerberg resigns as Chairman of Facebook, and relinquishes his supermajority voting rights. Related, Sheryl Sandberg stays right where she is. I honestly don’t see any other way Facebook pulls out of its nosedive. I’ve written about this at length elsewhere, so I will just summarize: Facebook’s only salvation is through a new system of governance. And I mean that word liberally – new governance of how it manages data across its platform, new governance of how it works with communities, governments, and other key actors across its reach, and most fundamentally, new governance as to how it works as a corporate entity. It all starts with the Board asserting its proper role as the governors of the company. At present, the Board is fundamentally toothless.
3/ Despite a ton of noise and smoke from DC, no significant federal legislation is signed around how data is managed in the United States. I know I predicted just a few posts ago that 2019 will be the year the tech sector has to finally contend with Washington. And it will be…but in the end, nothing definitive will emerge, because we’ll all be utterly distracted by the Trump show (see below). Because of this, unhappily, we’ll end up governed by both GDPR and California’s homespun privacy law, neither of which actually forces the kind of change we really need.
4/ The Trump show gets cancelled. Last year, I said Trump would blow up, but not leave. This year, I’m with Fred, Trump’s in his final season. We all love watching a slow motion car wreck, but 2019 is the year most of us realize the car’s careening into a school bus full of our loved ones. Donald Trump, you’re fired.
5/ Cannabis for the win. With Sessions gone and politicians of all stripes looking for an easy win, Congress will pass legislation legalizing cannabis. Huzzah!!!! Just in time, because…
6/ China implodes, the world wobbles. Look, I’m utterly out of my depth here, but something just feels wrong with the whole China picture. Half the world’s experts are warning us that China’s fusion of capitalism and authoritarianism is already taking over the world, and the other half are clinging to the long-held notion that China’s approach to nation building is simply too fragile to withstand democratic capitalism’s demands for transparency. But I think there may be other reasons China’s reach will exceed its grasp: It depends on global growth and optimistic debt markets. And both of those things will fail this year, exposing what is a marvelous but unsustainable experiment in managed markets. This is a long way of backing into a related prediction:
7/ 2019 will be a terrible year for financial markets. This is the ultimate conventional wisdom amongst my colleagues in SF and NY, even though I’ve seen plenty of predictions that Wall St. will have a pretty good year. I have no particular insight as to why I feel this way, it’s mainly a gut call: Things have been too good, for too long. It’s time for a serious correction.
8/ At least one major tech IPO is pulled, the rest disappoint as a class. Uber, Lyft, Slack, Pinterest et al are all expected this year. But it won’t be a good year to go public. Some will have no choice, but others may simply resize their businesses to focus on cash flow, so as to find a better window down the road.
9/ New forms of journalistic media flourish. It’s well past time those of us in the media world take responsibility for the shit we make, and start to try significant new approaches to information delivery vehicles. We have been hostages to the toxic business models of engagement for engagement’s sake. We’ll continue to shake that off in various ways this year – with at least one new format taking off explosively. Will it have lasting power? That won’t be clear by year’s end. But the world is ready to embrace the new, and it’s our jobs to invest, invent, support, and experiment with how we inform ourselves through the media. Related, but not exactly the same…
10/A new “social network” emerges by the end of the year. Likely based on messaging and encryption (a la Signal or Confide), the network will have many of the same features as the original Facebook, but will be based on a paid model. There’ll be some clever new angle – there always is – but in the end, it’s a way to manage your social life digitally. There are simply too many pissed off and guilt-ridden social media billionaires with the means to launch such a network – I mean, Insta’s Kevin Systrom, WhatsApp’s Jan and Brian, not to mention the legions of mere multi-millionaires who have bled out of Facebook’s battered body of late.
So that’s it. On a personal note, I’ll be happily busy this year. Since moving to NY this past September, I’ve got several new projects in the works, some still under wraps, some already in process. NewCo and the Shift Forum will continue, but in reconstituted forms. I’ll keep up with my writing as best I can; more likely than not most of it will focus on the governance of data and how it affects our national dialog. Thanks, as always, for reading and for your emails, comments, and tweets. I read each of them and am inspired by all. May your 2019 bring fulfillment, peace, and gratitude.
If you’ve read my rants for long enough, you know I’m fond of programmatic advertising. I’ve called it the most important artifact in human history, replacing the Macintosh as the most significant tool ever created.
So yes, I think programmatic advertising is a big deal. As I wrote in the aforementioned post:
“I believe the very same technologies we’ve built to serve real time, data-driven advertising will soon be re-purposed across nearly every segment of our society. Programmatic adtech is the heir to the database of intentions – it’s that database turned real time and distributed far outside of search. And that’s a very, very big deal. (I just wish I had a cooler name for it than “adtech.”)”
But lately, I’m starting to wonder if perhaps adtech is failing, not for any technical reason, but because the people leveraging it are complicit in what might best be called a massive failure of imagination.
I’m about to go on a rant here, so please forgive me in advance.
But honestly, who else out there is sick of being followed by ads so stupid a fourth grader could do a better job of targeting them?
Case in point is the ad above. I took this screen shot from my phone this past weekend while I was reading a New York Times article. The image – of a robe Amazon wanted me to buy – was instantly annoying, because I had in fact purchased a robe on Amazon several days before. Why on earth was Amazon retargeting me for a product I just bought?!
But wait, it gets worse! As I perused the next Times article, this ad shows up:
You might think this ad makes more sense. If the dude buys a robe, makes sense to try to sell him a new pair of slippers, no? Well, sure, but only if that same dude didn’t buy a new pair of slippers two weeks ago. Which, in fact, I did just do.
So, yeah, this ad sucks as well. Not only is it not useful or relevant, it’s downright annoying. The vast machinery of adtech has correctly identified me as a robe-and-slippers-buying customer. But it’s failed to realize *I’ve already bought the damn things.*
Is it possible that adtech is this stupid? This poorly instrumented? I mean, are programmatic buyers simply tagging visitors who land on ecommerce pages (male robe intender?) without caring about whether those visitors actually bought anything?
Are the human beings responsible for setting the dials of programmatic just this lazy?
I’ve been a critical observer of adtech over the past ten or so years, and one consistent takeaway is this: If there’s a way for a buyer to cut corners, declare an easy win, and keep doing things the way they’ve always been done, well, they most certainly will.
But why does it have to be this way? Digging into the examples above yields an extremely frustrating set of facts. Consider the data the adtech infrastructure either got *right* about me as a customer, or could have gotten right:
I am a frequent ecommerce customer, usually buying on Amazon
I recently purchased both a robe and some slippers
I am reading on the New York Times site as a logged-on (i.e., data-rich) customer of the Times‘ offerings
These are just the obvious data points. My mobile ID and cookies, all of which are available to programmatic buyers, certainly indicate a high household income, a propensity to click on certain kinds of ads, a rich web browsing history reflecting a thickly veined lode of interest data, among countless other possible inputs.
Imagine if a programmatic campaign actually paid attention to all this rich data? Start with the fact I just purchased a robe and slippers. What are products related to those two that Amazon might show me? Well, according to its own “people who bought this item also bought” algorithms, folks who bought men’s robes also bought robes for the women in their life. Now there’s a cool recommendation! I might have clicked on an ad that showed a cool robe for my wife. But no, I’m shown an ad for a product I already have.
I’ve got a few calls in to verify my hunch, but I suspect the ugly truth is pure laziness on the part of the folks responsible for buying ads. Consider: The average cost for a thousand views (CPM) of a targeted programmatic advertisement hovers between ten cents (yes, ten pennies) and $2. With costs that low, the advertising community can afford to waste ad inventory.
Let’s apply that reality to our robe example. Let’s say the robe costs $60, and yields a $20 profit for our e-commerce advertiser, not including marketing costs. That means that same advertiser can spend upwards of $19.99 per unit on advertising (more, if a robe purchaser turns out to be a “big basket” e-commerce spender). So what does our advertiser do? Well, they set up a retargeting campaign aimed at anyone who ever visited our erstwhile robe’s page. With CPMs averaging around a buck, that robe’s going to follow nearly 20,000 folks around the internet, hoping that just one of them converts.
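To make that math concrete, here’s the same back-of-envelope calculation as a few lines of Python. All the figures come from the example above; real CPMs and margins obviously vary:

```python
# Back-of-envelope math for the robe retargeting example.
# All figures are illustrative, taken from the scenario in the text.

robe_price = 60.00   # what the robe sells for
unit_profit = 20.00  # profit per robe, before marketing costs
cpm = 1.00           # cost per thousand impressions, in dollars

# The advertiser can spend up to (almost) the full unit profit on ads
# and still break even on a single sale -- the ~$19.99 in the text.
max_ad_spend_per_sale = unit_profit

# How many impressions does that budget buy at a $1 CPM?
impressions = max_ad_spend_per_sale / cpm * 1000
print(f"{impressions:,.0f} impressions per sale")  # 20,000

# Break-even conversion rate: just one sale in those 20,000 views.
breakeven_rate = 1 / impressions
print(f"break-even conversion rate: {breakeven_rate:.4%}")  # 0.0050%
```

One converted robe buyer in 20,000 impressions keeps the campaign in the black, which is why nobody bothers to suppress people who already bought.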
Put another way, programmatic advertising is a pure numbers game, and as long as the numbers show one penny of profit, no one is motivated to make the system any better. I’ve encountered many similar examples of ad buyers ignoring high-quality data signals, preferring instead to “waste reach” because, well, it’s just easier to set up campaigns on one or two factors. Inventory is cheap. Why not?
This is problematic. What’s the point of having all that rich (and hard won) targeting data if buyers won’t use it, and consumers don’t benefit from it? An ecosystem that fails to encourage innovation will stagnate and lose share to walled gardens like Facebook, Google, and others. If the ads suck on the open web (and they do), then consumers will either install ad blockers (and they are), or abandon the open web altogether (and they are).
Well, Walmart vs. Amazon is all about big business – a platform giant (Amazon) disrupting an OldBigCo (Walmart and its kin). Over the past two decades, Amazon bumped Walmart out of the race to a trillion-dollar market cap, and the OldCo from Bentonville had to reset and play the role of the upstart. The Token Act levels the playing field, forcing both to win where it really matters: In service to the customer.
But while BigCos are sexy and well known, it’s the small and medium-sized business ecosystem that determines whether or not we have an economy of mass flourishing. So let’s explore the Token Act from the point of view of a small business startup, in this case, a new neighborhood restaurant. I briefly touched upon this idea in my set up post, Don’t Break Up The Tech Oligarchs. Force Them To Share Instead. (If you haven’t already, you might want to read that post before this one, as I lay out the framework in which this scenario would play out.) What I envision below assumes the Token Act has passed, and we’re at least a year or two into its adoption by most major data players. Here we go…
Fresh off her $2,700 win from Walmart, Michelle decides she’s ready to lean into a lifelong dream: Starting a restaurant in her newly adopted neighborhood of Chelsea in New York City. Since moving to the area from California, she’s noticed two puzzling trends: First, a dearth of interesting mid- to high-end dinner spots walking distance from her new place, and second, what appears to be higher-than-average vacancy rates for the retail storefronts in the same general area. It appears to be a buyer’s market for retail restaurant space in Chelsea. So why aren’t new places launching? She read the Times’ piece on vacancies a few years ago (before the Token Act passed) and was left just as puzzled as before – seems like there’s no rhyme or reason to the market.
Michelle wants to start a high end American gastro pub – the kind of place she loved back when she lived in Northern California (she’s fond of Danny Meyer’s Gramercy Tavern, pictured above, but it’s a bit too far away from her new place). She has a strong hunch that such a place would be a hit in her new neighborhood, but she’s not sure her new neighbors will agree.
Now starting a restaurant requires a certain breed of insanity – they say the best way to make a small fortune in the business is to start with a large one. The truth is, launching restaurants has historically been a crap shoot – you might find the best talent, the best designer, and the best location – but if for some reason you don’t bring the je ne sais quoi, the place will fail within months, leaving you and your partners millions of dollars poorer.
It’s that je ne sais quoi that Michelle is determined to reveal. The tools she will leverage? The newly liberated resources of data tokens.
Before we continue, allow me to draw your attention back to the rise of search, indeed, the very era which begat Searchblog in the early 2000s. Google AdWords launched in 2000, and within a few years, the media world had been turned upside down by what I termed The Database of Intentions. As if by magic, people everywhere could suddenly ask new kinds of questions, finding themselves both surprised and delighted by the answers they received.
A Gates-Line compliant ecosystem quickly developed on top of this new platform, driven by an emerging industry of search engine marketing and optimization. SEO/SEM sprung into existence to help small and medium-sized businesses take advantage of the Google platform – by 2006 the industry stood at nearly $10 billion in spend, growing more than 60 percent year on year. AdWords grew from zero to millions of advertisers by connecting to a long tail of small businesses that took advantage of an entirely new class of revealed information: The intents, desires, and needs of tens of millions of consumers, who relentlessly poured their queries into Google’s placid and unblinking search box.
Were you a limo service in the Bronx looking for new customers? It paid huge dividends to purchase AdWords like “car service bronx” and “best limo manhattan.” Were you a dry cleaner in West LA hoping to expand? Best be first in line when customers typed in “best cleaners Beverly Hills.” Selling heavy machinery to construction services in the midwest? If you didn’t own keywords like “caterpillar dealer des moines” you’d lose, and quick, to whoever did optimize to phrases like that.
My point is simply this: AdWords was a freaking revolution, but it ain’t nothing compared to what will happen if we unleash data tokens on the world.
Ok, back to Michelle and her new restaurant. Of course Michelle will leverage AdWords, and Facebook, and any other advertising service to help her new business grow. But none of those services can help her figure out her je ne sais quoi – for that, she needs something entirely novel. She needs a new question machine. And the ecosystem that develops around data tokens will offer it.
Thanks to her Walmart experience, Michelle has become aware of the power of personal data. She’s also read up on the Token Act, the new law requiring all data players at scale to allow individuals to create machine-readable data tokens that can be exchanged for value as directed by the consumer. After doing a bit of research, she stumbles across a startup called OfferExchange, which manages “Token Offers” on behalf of anyone who might want to query TokenLand. OfferExchange is a spinout from ProtocolLabs, a pioneer in secure blockchain software platforms like Filecoin. It’s still early in TokenLand, so an at-scale Google of the space hasn’t emerged. OfferExchange works more like a bespoke yet platform-based research outfit – the firm has a sophisticated website and impressive client list. It uses Facebook, Twitter, LiveRamp, and Instagram to identify potential token-creating consumers, then solicits those individuals with offers of cash or other value in exchange for said tokens.
Michelle does a Crunchbase search for OfferExchange and sees it’s backed by Union Square Ventures and Benchmark, which gives her some comfort – those firms don’t fund fly-by-night hucksters. And OfferExchange’s site is impressive – in less than five minutes, it guides her through the construction of an elegant query. Here’s how the process works:
First, the site asks Michelle what her goal is. “Starting a restaurant in New York City,” she responds. The site reconstructs around her answer, showing suggested data repositories she might mine. “Restaurants, New York City,” reads the top layer of a directory-like page. Underneath are several categories, each populated with familiar company names:
Restaurant Reservation and Review Services
OpenTable Google Resy Yelp Eat24 Facebook (more)
Food Delivery Services
GrubHub Uber Eats PostMates InstaCart (more)
Ride Hailing Services
Uber Lyft Juno Via (more)
Real Estate Services (Commercial)
LoopNet DocuSign CompStak (more)
Location Services
Foursquare Uber Lyft Google NinthDecimal (more)
Payment Services
American Express Visa Mastercard Apple Pay Diners Club (more)
And so on – if she wished, Michelle could dig into dozens of categories related to her initial “restaurant New York City” search.
Michelle’s imagination sparks – the kinds of queries she could ask of these services is mind blowing. She could limit her query to people who live within walking distance of her neighborhood, asking her *actual neighbors* for tokens that tell her what restaurants they eat at, when they eat there, the size of their checks, related reviews, abandoned reservations, the works. She might discover that folks like Indian takeout on Mondays, that they rarely spend more than $100 on a meal on Tuesdays, but that they splurge on the weekends. She could discover the percentage of diners in Chelsea who travel more than two miles by car service to eat out at a place similar to the one she has in mind, and what the size of the check might be when they do. She can also check historical average rents for restaurants in her zip code, over time, which will certainly help with negotiating her lease. The possibilities are endless.
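To give a flavor of what one of those questions might look like as an actual query, here’s a minimal sketch. OfferExchange is fictional, and every field name and filter below is invented for illustration; this is the shape of the question, not a real API:

```python
# Hypothetical sketch of a token query like those described above.
# Each dict stands in for one neighbor's dining-data token; all
# field names and values are invented for illustration.

tokens = [
    {"zip": "10011", "restaurant": "Indian takeout", "day": "Mon", "check": 42.00},
    {"zip": "10011", "restaurant": "Gastro pub", "day": "Sat", "check": 145.00},
    {"zip": "10001", "restaurant": "Gastro pub", "day": "Tue", "check": 88.00},
]

# Limit the query to Michelle's neighbors (Chelsea-area zip codes)
# who splurge on weekends -- one of the questions posed in the text.
chelsea = {"10001", "10011"}
weekend_splurges = [
    t for t in tokens
    if t["zip"] in chelsea and t["day"] in {"Sat", "Sun"} and t["check"] > 100
]
print(len(weekend_splurges))  # 1
```

The real thing would run over thousands of consumer-permissioned tokens rather than a hand-built list, but the logic is the same: filter by neighborhood, day, and check size, then count.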
Put another way, with OfferExchange’s services, Michelle can litigate the merde out of her je ne sais quoi.
This post is getting long, so I’ll stop here and pull back for a spot of Thinking Out Loud. I could continue the story, imagining the process of the token offer Michelle would put out through OfferExchange’s platform, but suffice to say, she’d be willing to pay upwards of $5-20 per potential customer for their data. The marketing benefit alone – alerting potential customers in the neighborhood that she’s exploring a new restaurant in the area – is worth tens of thousands already. And of course, OfferExchange can offer anyone who contributes their tokens to Michelle’s new project a discount on their first meal at the restaurant, should it actually launch. Cool!
But let’s stop there and consider what happens when local entrepreneurs have access to the information currently silo’d across thousands of walled garden services like Uber, LoopNet, Resy, and of course Facebook and Google. While better data won’t ensure that Michelle’s restaurant will succeed, it certainly increases the odds that it won’t fail. And it will give both Michelle and her investors – local banks, savvy friends and family members – much more conviction that her new enterprise is viable. Take this local restaurant example and apply it to all manner of small business – dry cleaners, hardware stores, bike shops – and this newly liberated class of information enables an explosion of efficiency, investment, and, well, flourishing in what has become, over the past four decades, a stagnant SMB environment.
Is this Moneyball for SMB? Perhaps. And yes, I can imagine any number of downsides to this new data economy. But I also believe the benefits would far outweigh the downsides. Under the Token Act as I envision it, co-creators of the data – the services like Uber, OpenTable, or Facebook – have the right to charge a vig for the data being monetized. Sure, it’d be possible for an entrepreneur to steal customers via tokens, but I’m going to guess the economic value of allowing your customers to discover new use cases for their data will dwarf the downside of possibly losing those customers to a new competitor. Plus, this new competitive force will drive everyone to play at a higher level, focusing not on moats built on data silos, but instead on what really matters: A highly satisfied customer. That’s certainly Michelle’s goal, and the goal of every successful local business. Why shouldn’t it also be the goal of the data giants?
Social conversations about difficult and complex topics have arcs – they tend to start scattered, with many threads and potential paths, then resolve over time toward consensus. This consensus differs based on groups within society – Fox News aficionados will cluster one way, NPR devotees another. Regardless of the group, such consensus then becomes presumption – and once a group of people presume, they fail to explore potentially difficult or presumably impossible alternative solutions.
This is often a good thing – an efficient way to get to an answer. But it can also mean we fail to imagine a better solution, because our own biases are obstructing a more elegant path forward.
This is my sense of the current conversation around the impact of what Professor Scott Galloway has named “The Four” – the largest and most powerful American companies in technology (they are Apple, Amazon, Google, and Facebook, for those just returning from a ten-year nap). Over the past year or so, the conversation around technology has become one of “something must be done.” Tech was too powerful, it consumed too much of our data and too much of our economic growth. Europe passed GDPR, Congress held ineffectual hearings, Facebook kept screwing up, Google failed to show up…it was all of a piece.
The conversation evolved into a debate about various remedies, and recently, it’s resolved into a pretty consistent consensus, at least amongst a certain class of tech observers: These companies need to be broken up. Antitrust, many now claim, is the best remedy for the market dominance these companies have amassed.
It’s a seductive response, with seductive historical precedent. In the 1970s and 80s, antitrust broke up AT&T, ultimately paving the way for the Internet to flourish. In the 90s, antitrust provided the framework for the government’s case against Microsoft, opening the door for new companies like Google and Facebook to dominate the next version of the Internet. Why wouldn’t antitrust regulation usher in #Internet3? Imagine a world where YouTube, Instagram, and Amazon Web Services are all separate companies. Would not that world be better?
Perhaps. I’m not well read enough in antitrust law to argue one way or the other, but I know that antitrust turns on the idea of consumer harm (usually measured in terms of price), and there’s a strong argument to be made that a free service like Google or Facebook can’t possibly cause consumer harm. Then again, there are many who argue that data is in fact currency, and The Four have essentially monopolized a class of that currency.
But even as I stare at the antitrust remedy, another solution keeps poking at me, one that on its face seems quite elegant and rather unexplored.
The idea is simply this: Require all companies who’ve reached a certain scale to build machine-readable data portability into their platforms. The right to data portability is explicit in the EU’s newly enacted GDPR framework, but so far the impact has been slight: There’s enough wiggle room in the verbiage to hamper technical implementation and scope. Plus, let’s be honest: Europe has never really been a hotbed of open innovation in the first place.
But what if we had a similar statute here? And I don’t mean all of GDPR – that’s certainly a non-starter. But that one rule, that one requirement: That every data service at scale had to stand up an API that allowed consumers to access their co-created data, download a copy of it (which I am calling a token), and make that copy available to any service they deemed worthy?
Imagine what might come of that in the United States?
I’m not a policy expert, and the devil’s always in the details. So let me be clear in what I mean when I say “machine-readable data portability”: The right to take, via an API, what is essentially a “token” containing all (or a portion of) the data you’ve co-created in one service, and offer it, with various protections, permission, and revocability, to another service. In my Senate testimony, I gave the example of a token that has all your Amazon purchases, which you then give to Walmart so it can do a historical price comparison and tell you how much money you would save if you shopped at its online service. Walmart would have a powerful incentive to get consumers to create and share that token – the most difficult problem in nearly all of business is getting a customer to switch to a similar service. That would be quite a valuable token, I’d wager*.
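As a thought experiment, such a token could be little more than signed, structured data. Everything below is my own hypothetical sketch – the field names, the scope vocabulary, the revocable grant – not any real service’s format:

```python
import json
from datetime import datetime, timezone

# A hypothetical, machine-readable "data token": a consumer-directed
# copy of co-created data, with scope, permissions, and revocability.
# Every field name here is invented for illustration.
token = {
    "subject": "user-1234",          # the consumer the data is about
    "source": "amazon.example",      # service the data was co-created with
    "scope": "purchase-history",     # which slice of the data is shared
    "granted_to": "walmart.example", # who may read it
    "revocable": True,               # the consumer can withdraw the grant
    "issued_at": datetime.now(timezone.utc).isoformat(),
    "records": [
        {"item": "Men's robe", "price": 60.00, "date": "2019-01-02"},
        {"item": "Slippers", "price": 35.00, "date": "2018-12-20"},
    ],
}

# The receiving service (Walmart, in the example) could then run a
# simple historical price comparison over the shared records:
total_spend = sum(r["price"] for r in token["records"])
print(json.dumps(token["scope"]), f"total spend: ${total_spend:.2f}")
```

A production version would add a cryptographic signature and an expiry, but the point is how little machinery the core idea actually requires.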
Should be simple to do, no? I mean, don’t we at least co-own the information about what we bought at Amazon?
Well, no. Not really. Between confusing terms of service, hard-to-find dashboards, and confounding data reporting standards, The Four can both claim we “own our own data” while at the same time ensuring there’ll never be a true market for the information they have about us.
So yes, my idea is easily dismissed. The initial response I’ve had to it is always some variation of: “There’s no way The Four would let this happen.” That’s exactly the kind of bias I refer to above – we assume that The Four control the dialog, that they either will thwart this idea through intensive lobbying, clever terms of service, and soft power, or that the idea is practically impossible because of technical or market limitations. To that I ask….Why?
Why is it impossible for me to tokenize all of my Lyft ride data, and give it for free to an academic project that is mapping the impact of ride sharing on congestion in major cities? Why is it impossible for a small business owner to create an RFP for all OpenTable, Resy, and other dining data, so she can determine the best kind of restaurant to open in her neighborhood? I’m pretty certain she’d pay a few bucks a head for that kind of data – so why can’t I sell that information to her (with a vig back to OpenTable and Resy) if the value exchange is there to be monetized? Why can’t I tokenize and sell my Twitter interactions to a brand (or more likely, an agency or research company) interested in understanding the mind of a father who lives in Manhattan? Why can’t I tokenize and trade my Spotify history for better recommendations on live shows to see, or movies to watch, or books to read? Or, simply give it to a free service that’s sprung up to give me suggestions about new music to check out?
Why can’t an ecosystem of agents, startups, and data brokers emerge, a new industry of information processing not seen since the rise of search optimization in the early aughts, leveraging and arbitraging consumer information to create entirely new kinds of businesses driven by insights currently buried in today’s data monopolies?
Such a world would be fascinating, exciting, sometimes sketchy, and a hell of a lot of fun. It’d be driven by the individual choices of millions of consumers – choosing which agents to trust, which tokens to create, which trades felt fair. There’d be fails, there’d be fraud, there’d be bad actors. But over time, the good would win over the bad, because the decision making is distributed across the entire population of Internet users. In short, we’d push the decision making to the node – to us. Sure, we’d do stupid things. And sure, the hucksters and the hustlers would make short term killings. But I’ll take an open system like this over a closed one any day of the week, especially if the open system is governed by an architecture empowering the individual to make their own decisions.
It’d be a lot like the Internet was once imagined to be.
I’ve been noodling on such an ecosystem, and I’m convinced it could dwarf our current Internet in terms of overall value created (and credit where credit is due, The Four have created a lot of value). It’d run laps around The Four when it comes to innovation – tens of thousands of new companies would form, all of them feeding off the newly liberated oxygen of high-quality, structured, machine-readable data. Trusted independent platforms for value exchange would arise. Independent third-party agents would munge tokens from competing services, verifying claims and earning the trust of consumers (will Walmart really save you a thousand bucks a year?! We can prove it, or not!). Huge platforms would develop for the processing, securitization, permissioning, and validation of our data. Man, it’d feel like…well, like the incumbent, boring old Internet was finally exciting again.
There’s no technical reason why this world doesn’t exist. The progenitors of the Web have already imagined it; heck, Tim Berners-Lee recently announced he’s working pretty much full time on creating a system devoted to the foundational elements needed for it to blossom.
But until we as a society write machine-readable data portability into law, such efforts will be relegated to interesting sideshows. And more likely than not, we’ll spend the next few years arguing about breaking up The Four, and let’s be honest, that’s an argument The Four want us to have, because they’re going to win it (more money, better lawyers, etc. etc.). Instead, we should just require them – and all other data services of scale – to free the data they’ve so far managed to imprison. One simple new law could change all of that. Shouldn’t we consider it?
*In another post, I’ll explore this example in detail. It’s really, really fascinating.
A theme of my writing over the past ten or so years has been the role of data in society. I tend to frame that role anthropologically: How have we adapted to this new element in our society? What tools and social structures have we created in response to its emergence as a currency in our world? How have power structures shifted as a result?
Increasingly, I’ve been worrying over a hypothesis: Like a city built over generations without central planning or consideration for much more than fundamental capitalistic values, we’ve architected an ecosystem around data that is not only dysfunctional but possibly antithetical to the core values of democratic society. Houston, it seems, we really do have a problem.
I know, it’s been a while since I’ve written here, and most of my recent stuff has focused on Facebook. I’ve been on the road the entire summer, and preparing to move from the Bay Area to NYC (that’s another post). But before you roll your eyes in anticipation of yet another Facebook rant, no, this post is not about Facebook, despite that company’s continued inability to govern itself.
No, this post is about the business of health insurance.
Last week ProPublica published a story titled Health Insurers Are Vacuuming Up Details About You — And It Could Raise Your Rates. It’s the second in an ongoing series the investigative unit is doing on the role of data in healthcare. I’ve been watching this story develop for years, and ProPublica’s piece does a nice job of framing the issue. It envisions “a future in which everything you do — the things you buy, the food you eat, the time you spend watching TV — may help determine how much you pay for health insurance.” Unsurprisingly, the health industry has developed an insatiable appetite for personal data about the individuals it covers. Over the past decade or so, all of our quotidian activities (and far more) have been turned into data, and that data can be, and is being, sold to the insurance industry:
“The companies are tracking your race, education level, TV habits, marital status, net worth. They’re collecting what you post on social media, whether you’re behind on your bills, what you order online. Then they feed this information into complicated computer algorithms that spit out predictions about how much your health care could cost them.”
HIPAA, the regulatory framework governing health information in the United States, only covers and protects medical data – not search histories, streaming usage, or grocery loyalty data. But if you think your search, video, and food choices aren’t related to health, well, let’s just say your insurance company begs to differ.
Lest we dive into a rabbit hole about the corrosive combination of healthcare profit margins with personal data (ProPublica’s story does a fine job of that anyway), I want to pull back and think about what’s really going on here.
The Tragedy of the Commons
One of the most fundamental tensions in an open society is the potential misuse of resources held “in common” – resources to which all individuals have access. Garrett Hardin’s 1968 essay on the subject, “The Tragedy of the Commons,” explores this tension, concluding that the problem of human overpopulation has no technical solution. (A technical solution is one that can be achieved through the application of science and engineering, rather than requiring a shift in human values or morality – i.e., a political solution.) Hardin’s essay has become one of the most cited works in social science – the tragedy of the commons is a facile concept that applies to countless problems across society.
In the essay, Hardin employs a simple example of a common grazing pasture, open to all who own livestock. The pasture, of course, can only support a finite number of cattle. But as Hardin argues, cattle owners are financially motivated to graze as many cattle as they possibly can, driving the number of grass munchers beyond the land’s capacity, ultimately destroying the commons. “Freedom in a commons brings ruin to all,” he concludes, delivering an intellectual middle finger to Adam Smith’s “invisible hand” in the process.
So what does this have to do with healthcare, data, and the insurance industry? Well, consider how the insurance industry prices its policies. Insurance has always been a data-driven business, built on actuarial risk assessment – a statistical method that predicts the probability of a certain event happening. Creating and refining these risk assessments lies at the heart of the insurance industry, and until recently, the amount of data informing actuarial models has been staggeringly slight. Age, location, and tobacco use are pretty much how policies are priced under Obamacare, for example. Given this paucity, one might argue that it’s actually a *good* thing that the insurance industry is beefing up its databases. Right?
Perhaps not. When a population is aggregated on high-level data points like age and location, we’re essentially being judged on a simple shared commons – all 18-year-olds who live in Los Angeles are treated essentially the same, regardless of whether one person has a lurking gene for cancer and another will live without health complications for decades. In essence, we’re sharing the load of public health in common – evening out the societal costs in the process.
But once the system can discriminate on a multitude of data points, the commons collapses, devolving into a system rewarding whoever has the most profitable profile. That 18-year-old with flawless genes, the right zip code, an enviable inheritance, and all the right social media habits will pay next to nothing for health insurance. But the 18-year-old with a mutated BRCA1 gene, a poor zip code, and a proclivity to sit around eating Pringles while playing Fortnite? That teenager is not going to be able to afford health insurance.
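To make the arithmetic concrete, here’s a minimal sketch of the pooling effect described above. All the numbers are invented for illustration – the point is only that coarse rating averages costs across the bucket, while granular rating prices each person individually:

```python
# Hypothetical illustration: how granular data fragments a shared risk pool.
# Expected costs below are invented numbers, not real actuarial figures.

people = [
    {"name": "A", "expected_cost": 500},    # flawless genes, low predicted risk
    {"name": "B", "expected_cost": 9500},   # BRCA1 mutation, high predicted risk
]

# Coarse rating (age + location only): everyone in the bucket pays the
# pool's average expected cost -- the load is shared "in common".
pooled_premium = sum(p["expected_cost"] for p in people) / len(people)

# Granular rating: each person pays their own predicted cost -- the
# commons collapses into individualized prices.
individual_premiums = {p["name"]: p["expected_cost"] for p in people}

print(pooled_premium)       # 5000.0 -- both pay the same
print(individual_premiums)  # {'A': 500, 'B': 9500}
```

Under the pooled scheme, person A quietly subsidizes person B; under the granular scheme, that subsidy disappears, which is exactly the collapse of the commons the text describes.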
Put another way, adding personalized data to the insurance commons destroys the fabric of that commons. Healthcare has been resistant to this force until recently, but we’re already seeing the same forces at work in other aspects of our previously shared public goods.
A public good, to review, is defined as “a commodity or service that is provided without profit to all members of a society, either by the government or a private individual or organization.” A good example is public transportation. The rise of data-driven services like Uber and Lyft has been a boon for anyone who can afford these services, but the unforeseen externalities are disastrous for the public good. Ridership, and therefore revenue, falls for public transportation systems, which then fall into a spiral of neglect and decay. Our public streets become clogged with circling rideshare drivers, roadway maintenance costs skyrocket, and – perhaps most perniciously – we become a society of individuals who forget how to interact with each other in public spaces like buses, subways, and trolley cars.
Once you start to think about public goods in this way, you start to see the data-driven erosion of the public good everywhere. Our public square, where we debate political and social issues, has become 2.2 billion data-driven Truman Shows, to paraphrase social media critic Roger McNamee. Retail outlets, where we once interacted with our fellow citizens, are now inhabited by armies of Taskrabbits and Instacarters. Public education is hollowed out by data-driven personalized learning startups like AltSchool, Khan Academy, or, let’s face it, YouTube how-to videos.
We’re facing a crisis of the commons – of the public spaces we once held as fundamental to the functioning of our democratic society. And we have data-driven capitalism to blame for it.
Now, before you conclude that Battelle has become a neo-luddite, know that I remain a massive fan of data-driven business. However, if we fail to re-architect the core framework of how data flows through society – if we continue to favor the rights of corporations to determine how value flows to individuals absent the balancing weight of the public commons – we’re heading down a path of social ruin. ProPublica’s warning on health insurance is proof that the problem is not limited to Facebook alone. It is a problem across our entire society. It’s time we woke up to it.
So what do we do about it? That’ll be the focus of a lot of my writing going forward. As Hardin writes presciently in his original article, “It is when the hidden decisions are made explicit that the arguments begin. The problem for the years ahead is to work out an acceptable theory of weighting.” In the case of data-driven decisioning, we can no longer outsource that work to private corporations with lofty sounding mission statements, whether they be in healthcare, insurance, social media, ride sharing, or e-commerce.
Today I had a chance to testify to the US Senate on the subject of Facebook, Cambridge Analytica, and data privacy. It was an honor, and a bit scary, but overall an experience I’ll never forget. Below is the written testimony I delivered to the Commerce committee on Sunday, released on its site today. If you’d like to watch, head right here; I think it’ll be up soon. Forgive the way the links work, I had to consider that this would be printed and bound in the Congressional Record. I might post a shorter version that I read in as my verbal remarks next…we’ll see.
Honorable Committee Members –
My name is John Battelle. For more than thirty years, I’ve made my career reporting, writing, and starting companies at the intersection of technology, society, and business. I appreciate the opportunity to submit this written and verbal testimony to your committee.
Over the years I’ve written extensively about the business models, strategies, and societal impact of technology companies, with a particular emphasis on the role of data, and the role of large, well-known firms. In the 1980s and 90s I focused on Apple and Microsoft, among others. In the late 90s I focused on the nascent Internet industry, the early 2000s brought my attention to Google, Amazon, and later, Twitter and Facebook. My writings tend to be observational, predictive, analytical, and opinionated.
Concurrently I’ve been an entrepreneur, founding or co-founding and leading half a dozen companies in the media and technology industries. All of these companies, which span magazines, digital publishing tools, events, and advertising technology platforms, have been active participants in what is broadly understood to be the “technology industry” in the United States and, on several occasions, abroad as well. Over the years these companies have employed thousands of staff members, including hundreds of journalists, and helped to support tens of thousands of independent creators across the Internet. I also serve on the boards of several companies, all of which are deeply involved in the technology and data industries.
In the past few years my work has focused on the role of the corporation in society, with a particular emphasis on the role technology plays in transforming that role. Given this focus, a natural subject of my work has been on companies that are the most visible exemplars of technology’s impact on business and society. Of these, Facebook has been perhaps my most frequent subject in the past year or two.
Given the focus of this hearing, the remainder of my written testimony will focus on a number of observations related generally to Facebook, and specifically to the impact of the Cambridge Analytica story. For purposes of brevity, I will summarize many of my points here, and provide links to longer form writings that can be found on the open Internet.
Facebook broke through the traditional Valley startup company noise in the mid 2000s, a typical founder-driven success story backed by all the right venture capital, replete with a narrative of early intrigue between partners, an ambitious mission (“to make the world more open and connected”), a sky-high private valuation, and any number of controversial decisions around its relationship to its initial customers, the users of its service (later in its life, Facebook’s core customers bifurcated to include advertisers). I was initially skeptical about the service, but when Sheryl Sandberg, a respected Google executive, moved to Facebook to run its advertising business, I became certain it would grow to be one of the most important companies in technology. I was convinced Facebook would challenge Google for supremacy in the hyper-growth world of personalized advertising. In those early days, I often made the point that while Google’s early corporate culture sprang from the open, interconnected world wide web, Facebook was built on the precept of an insular walled garden, where a user’s experience was entirely controlled by the Facebook service itself. This approach to creating a digital service not only threatened the core business model of Google (which was based on indexing and creating value from open web pages), it also raised a significant question of what kind of public commons we wanted to inhabit as we migrated our attention and our social relationships to the web. (Examples: https://battellemedia.com/archives/2012/02/its-not-whether-googles-threatened-its-asking-ourselves-what-commons-do-we-wish-for ; https://battellemedia.com/archives/2012/03/why-hath-google-forsaken-us-a-meditation)
In the past five or so years, of course, Facebook has come to dominate what is colloquially known as the public square – the metaphorical space where our society comes together to communicate with itself, to debate matters of public interest, and to privately and publicly converse on any number of topics. Since the dawn of the American republic, independent publishers (often referred to as the Fourth Estate – from pamphleteers to journalists to bloggers) have always been important actors in the center of this space. As a publisher myself, I became increasingly concerned that Facebook’s appropriation of public discourse would imperil the viability of independent publishers. This of course has come to pass.
Of course, the potent mix of News Feed and a subset of independent publishers combined to deliver us the Cambridge Analytica scandal, and we are still grappling with the implications of this incident on our democracy. But it is important to remember that while the Cambridge Analytica breach seems unusual, it is in fact not – it represents business as usual for Facebook. Facebook’s business model is driven by its role as a data broker. Early in its history, Facebook realized it could grow faster if it allowed third parties, often referred to as developers, to access its burgeoning trove of user data, then manipulate that data to create services on Facebook’s platform that increased a Facebook user’s engagement on the platform. Indeed, in his early years as CEO of Facebook, Mark Zuckerberg was enamored with the “platform business model,” and hoped to emulate such icons as Bill Gates (who built the Windows platform) or Steve Jobs (who later built the iOS/app store platform).
However, Facebook’s core business model of advertising, driven as it is by the brokerage of its users’ personal information, stood in conflict with Zuckerberg’s stated goal of creating a world-beating platform. By their nature, platforms are places where third parties can create value. They do so by leveraging the structure, assets, and distribution inherent to the platform. In the case of Windows, for example, developers capitalized on Microsoft’s well-understood user interface, its core code base, and its massive adoption by hundreds of millions of computer users. Bill Gates famously defined a successful platform as one that creates more value for the ecosystem that gathers around it than for the platform itself. By this test – known as the Gates Line – Facebook’s early platform fell far short. Developers who leveraged access to Facebook’s core asset – its user data – failed to make enough advertising revenue to be viable, because Facebook (and its advertisers) would always preference Facebook’s own advertising inventory over that of its developer partners. In retrospect, it’s now commonly understood in the Valley that Facebook’s platform efforts were a failure in terms of creating a true ecosystem of value, but a success in terms of driving ever more engagement through Facebook’s service.
For an advertising-based business model, engagement trumps all other possible metrics. As it grew into one of the most successful public companies in the history of business, Facebook nimbly identified the most engaging portions of its developer ecosystem, incorporated those ideas into its core services, and became a ruthlessly efficient acquirer and manipulator of its users’ engagement. It then processed that engagement into advertising opportunities, leveraging its extraordinary data assets in the process. Those advertising opportunities drew millions of advertisers large and small, and built the business whose impact we now struggle to understand.
Another misconception is that Facebook does not “sell” its data to third parties. While Facebook may not sell copies of its data to these third parties, it certainly sells leases to that data, and this distinction bears significant scrutiny. The company may not wish to be understood as such, but it is most certainly the largest data broker in the history of the data industry.
Lastly, the Cambridge Analytica scandal may seem to be entirely about a violation of privacy, but to truly understand its impact, we must consider the implications relating to future economic innovation. Facebook has used the scandal as an excuse to limit third party data sharing across and outside its platform. While this seems logical on first glance, it is in fact destructive to long term economic value creation.
So what might be done about all of this? While I understand the lure of sweeping legislation that attempts to “cure” the ills of technological progress, such approaches often have their own unexpected consequences. For example, the EU’s adoption of GDPR, drafted to limit the power of companies like Facebook, may in fact only strengthen that company’s grip on its market, while severely limiting entrepreneurial innovation in the process. (Example: https://shift.newco.co/how-gdpr-kills-the-innovation-economy-844570b70a7a )
As policy makers and informed citizens, we should strive to create a flexible, secure, and innovation-friendly approach to data governance that allows for maximum innovation while also ensuring maximum control over the data by all affected parties, including individuals, and importantly, the beneficiaries of future innovations not yet conceived or created. To play forward the current architecture of data in our society – where most of the valuable information is controlled by an increasingly small oligarchy of massive corporations – is to imagine a sterile landscape hostile to new ideas and mass flourishing.
Instead, we must explore a world governed by an enlightened regulatory framework that encourages data sharing, high standards of governance, and maximum value creation, with the individual at the center of that value exchange. As I recently wrote: “Imagine … you can download your own Facebook or Amazon “token,” a magic data coin containing not only all the useful data and insights about you, but a control panel that allows you to set and revoke permissions around that data for any context. You might pass your Amazon token to Walmart, set its permissions to “view purchase history” and ask Walmart to determine how much money it might have saved you had you purchased those items on Walmart’s service instead of Amazon. You might pass your Facebook token to Google, set the permissions to compare your social graph with others across Google’s network, and then ask Google to show you search results based on your social relationships. You might pass your Google token to a startup that already has your genome and your health history, and ask it to munge the two in case your 20-year history of searching might infer some insights into your health outcomes. This might seem like a parlor game, but this is the kind of parlor game that could unleash an explosion of new use cases for data, new startups, new jobs, and new economic value.”
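As a thought experiment, the “magic data coin” in the passage above might look something like this minimal sketch. Everything here is hypothetical – no such token API exists today; the class, method, and scope names are invented purely to make the permission-control idea tangible:

```python
from dataclasses import dataclass, field

@dataclass
class DataToken:
    """Hypothetical portable bundle of one user's data, plus a
    'control panel' of permissions only the owner can change."""
    owner: str
    data: dict                                   # e.g. {"purchase_history": [...]}
    grants: dict = field(default_factory=dict)   # grantee -> set of granted scopes

    def grant(self, grantee: str, scope: str) -> None:
        """Owner allows a named service to read one slice of the data."""
        self.grants.setdefault(grantee, set()).add(scope)

    def revoke(self, grantee: str, scope: str) -> None:
        """Owner withdraws a previously granted permission."""
        self.grants.get(grantee, set()).discard(scope)

    def read(self, grantee: str, scope: str):
        """A service can read a slice only while permission is in force."""
        if scope not in self.grants.get(grantee, set()):
            raise PermissionError(f"{grantee} lacks scope '{scope}'")
        return self.data.get(scope)

# The Walmart example from the passage, in miniature:
token = DataToken(owner="me", data={"purchase_history": ["toaster", "books"]})
token.grant("walmart", "purchase_history")
print(token.read("walmart", "purchase_history"))  # ['toaster', 'books']
token.revoke("walmart", "purchase_history")       # permission withdrawn; reads now fail
```

The design choice worth noticing is that permissions live with the data and travel with the token, rather than being buried in each platform’s Terms of Service – which is the architectural inversion the passage argues for.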
It is our responsibility to examine our current body of legislation as it relates to how corporations such as Facebook impact the lives of consumers and the norms of our society overall. Much of the argument around this issue turns on the definition of “consumer harm” under current policy. Given that data is non-rivalrous and services such as Facebook are free of charge, it is often presumed there is no harm to consumers (or by extension, to society) in its use. This also applies to arguments about antitrust enforcement. I think our society will look back on this line of reasoning as deeply flawed once we evolve to an understanding of data as equal to – or possibly even more valuable than – monetary currency.
Most observers of technology agree that data is a new class of currency in society, yet we continue to struggle to understand its impact, and how best to govern it. The manufacturing of data into currency is the main business of Facebook and countless other information age businesses. Currently the only participatory right in this value creation for a user of these services is to A/engage with the services offered and B/purchase the stock of the company offering the services. Neither of these options affords the user – or society – compensation commensurate with the value created for the firm. We can and must do better as a society, and we can and must expect more of our business leaders.
God, “innovation.” First banalized by undereducated entrepreneurs in the aughts, then ground to pablum by corporate grammarians over the past decade, “innovation” – at least when applied to business – deserves an unheralded etymological death.
This will be a post about innovation. However, whenever I feel the need to peck that insipid word into my keyboard, I’m going to use some variant of the verb “to flourish” instead. Blame Nobel laureate Edmund Phelps for this: I recently read his Mass Flourishing, which outlines the decline of western capitalism, and I find its titular terminology far less annoying.
So flourishing it will be.
In his 2013 work, Phelps (who received the 2006 Nobel in economics) credits mass participation in a process of innovation (sorry, there’s that word again) as central to mass flourishing, and further argues – with plenty of economic statistics to back him up – that it’s been more than a full generation since we’ve seen mass flourishing in any society. He writes:
…prosperity on a national scale—mass flourishing—comes from broad involvement of people in the processes of innovation: the conception, development, and spread of new methods and products—indigenous innovation down to the grassroots. This dynamism may be narrowed or weakened by institutions arising from imperfect understanding or competing objectives. But institutions alone cannot create it. Broad dynamism must be fueled by the right values and not too diluted by other values.
Phelps argues the last “mass flourishing” economy was the 1960s in the United States (with a brief but doomed resurgence during the first years of the open web…but that promise went unfulfilled). And he warns that “nations unaware of how their prosperity is generated may take steps that cost them much of their dynamism.” Phelps further warns of a new kind of corporatism, a “techno nationalism” that blends state actors with corporate interests eager to collude with the state to cement market advantage (think Double Irish with a Dutch Sandwich).
These warnings were proffered largely before our current debate about the role of the tech giants now so dominant in our society. But they set an interesting context and raise important questions. What happens, for instance, when large corporations capture the regulatory framework of a nation and lock in their current market dominance (and, in the case of Big Tech, their policies around data use)?
I began this post with Phelps to make a point: The rise of massive data monopolies in nearly every aspect of our society is not only choking off shared prosperity, it has also blinkered our shared vision for the kind of future we could possibly inhabit, if only we architect our society to enable it. But to imagine a different kind of future, we first have to examine the present we inhabit.
The Social Architecture of Data
I use the term “architecture” intentionally; it’s been front of mind for several reasons. Perhaps the most difficult thing for any society to do is to share a vision of the future, one that a majority might agree upon. Envisioning the future of a complex living system – a city, a corporation, a nation – is challenging work, work we usually outsource to trusted institutions like government, religions, or McKinsey (half joking…).
But in the past few decades, something has changed when it comes to society’s future vision. Digital technology became synonymous with “the future,” and along the way, we outsourced that future to the most successful corporations creating digital technology. Everything of value in our society is being transformed into data, and extraordinary corporations have risen which refine that data into insight, knowledge, and ultimately economic power. Driven as they are by this core commodity of data, these companies have acted to cement their control over it.
This is not unusual economic behavior; in fact, it’s quite predictable. So predictable, in fact, that it’s developed its own structure – an architecture, if you will, of how data is managed in today’s information society. I’ve a hypothesis about this architecture – unproven at this point (as all hypotheses are), but one I strongly suspect is accurate. Here’s how it might look on a whiteboard:
We “users” deliver raw data to a service provider, like Facebook or Google, which then captures, refines, processes, and delivers that data back as services to us. The social contract we make is captured in these services’ Terms of Service – we may “own” the data, but for all intents and purposes, the power over that information rests with the platform. The user doesn’t have a lot of creative license to do much with that data he or she “owns” – it lives on the platform, and the platform controls what can be done with it.
Now, if this sounds familiar, you’re likely a student of early computing architectures. Back before the PC revolution, most data, refined or not, lived on a centralized platform known as a mainframe. Nearly all data storage and compute processing occurred on the mainframe. Applications and services were broadcast from the mainframe back to “dumb terminals,” in front of which early knowledge workers toiled. Here’s a graph of that early mainframe architecture:
This mainframe architecture had many drawbacks – a central point of failure chief among them – but perhaps its most damning characteristic was its hierarchical, top-down architecture. From a user’s point of view, all the power resided at the center. This was great if you ran IT at a large corporation, but suffice to say the mainframe architecture didn’t encourage creativity or a flourishing culture.
The mainframe architecture was supplanted over time with a “client server” architecture, where processing power migrated from the center to the edge, or node. This was due in large part to the rise of the networked personal computer (servers were used for storing services or databases of information too large to fit on PCs). Because they put processing power and data storage into the hands of the user, PCs became synonymous with a massive increase in productivity and creativity (Steve Jobs called them “bicycles for the mind”). With the PC revolution, power transferred from the “platform” to the user – a major architectural shift.
The rise of networked personal computers became the seedbed for the world wide web, which had its own revolutionary architecture. I won’t trace it here (many good books exist on the topic), but suffice to say the core principle of the early web’s architecture was its distributed nature. Data was packetized and distributed independent of where (or how) it might be processed. As more and more “web servers” came online, each capable of processing data as well as distributing it, the web became a tangled, hot mess of interoperable computing resources. What mattered wasn’t the pipes or the journey of the data, but the service created or experienced by the user at the point of that service delivery, which in the early days was of course a browser window (later on, those points of delivery became smartphone apps and more).
If you were to attempt to map the social architecture of data in the early web, your map would look a lot like the night sky – hundreds of millions of dots scattered in various constellations across the sky, each representing a node where data might be shared, processed, and distributed. In those early days the ethos of the web was that data should be widely shared between consenting parties so it might be “mixed and mashed” so as to create new products and services. There was no “mainframe in the sky” anymore – it seemed everyone on the web had equal and open opportunities to create and exchange value.
This is why the late 1990s through mid-aughts were a heady time in the web world – nearly any idea could be tried out, and as the web evolved into a more robust set of standards, one could be forgiven for presuming that the open, distributed nature of the web would inform its essential social architecture.
But as web-based companies began to understand the true value of controlling vast amounts of data, that dream began to fade. As we grew addicted to some of the most revelatory web services – first Google search, then Amazon commerce, then Facebook’s social dopamine – those companies began to centralize their data and processing policies, to the point where we are now: Fearing these giants’ power over us, even as we love their products and services.
An Argument for Mass Flourishing
So where does that leave us if we wish to heed the concerns of Professor Phelps? Well, let’s not forget his admonition: “nations unaware of how their prosperity is generated may take steps that cost them much of their dynamism.” My hypothesis is simply this: Adopting a mainframe architecture for our most important data – our intentions (Google), our purchases (Amazon), our communications and social relationships (Facebook) – is not only insane, it’s also massively deprecative of future innovation (damn, sorry, but sometimes the word fits). In Facebook, Tear Down This Wall, I argued:
… it’s impossible for one company to fabricate reality for billions of individuals independent of the interconnected experiences and relationships that exist outside of that fabricated reality. It’s an utterly brittle product model, and it’s doomed to fail. Banning third party agents from engaging with Facebook’s platform ensures that the only information that will inform Facebook will be derived from and/or controlled by Facebook itself. That kind of ecosystem will ultimately collapse on itself. No single entity can manage such complexity. It presumes a God complex.
So what might be a better architecture? I hinted at it in the same post:
Facebook should commit itself to being an open and neutral platform for the exchange of value across not only its own services, but every service in the world.
In other words, free the data, and let the user decide what to do with it. I know how utterly ridiculous this sounds, in particular to anyone reading from Facebook proper, but I am convinced that this is the only architecture for data that will allow a massively flourishing society.
Now this concept has its own terminology: Data portability. And this very concept is enshrined in the EU’s GDPR legislation, which took effect one week ago. However, there’s data portability, and then there’s flourishing data portability – and the difference between the two really matters. The GDPR applies only to data that a user *gives* to a service, not data *co-created* with that service. You also can’t gather any insights the service may have inferred about you based on the data you either gave or co-created with it. Not to mention, none of that data is exported in a machine-readable fashion, essentially limiting its utility.
But imagine if that weren’t the case. Imagine instead you could download your own Facebook or Amazon “token,” a magic data coin containing not only all the useful data and insights about you, but a control panel that allows you to set and revoke permissions around that data for any context. You might pass your Amazon token to Walmart, set its permissions to “view purchase history,” and ask Walmart to determine how much money it might have saved you had you purchased those items on Walmart’s service instead of Amazon. You might pass your Facebook token to Google, set the permissions to compare your social graph with others across Google’s network, and then ask Google to show you search results based on your social relationships. You might pass your Google token to a startup that already has your genome and your health history, and ask it to munge the two in case your 20-year history of searching might yield some insights into your health outcomes.
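To make the token idea concrete, here is a minimal sketch of what such a permissioned data object could look like in code. Everything here – the `DataToken` class, its `grant`/`revoke`/`query` methods, and the scope names – is a hypothetical illustration of the model described above, not any real platform’s API:

```python
from dataclasses import dataclass, field

@dataclass
class DataToken:
    """A hypothetical user-owned data token with a permission control panel."""
    owner: str
    data: dict                                   # e.g. {"purchase_history": [...]}
    grants: dict = field(default_factory=dict)   # service -> set of allowed scopes

    def grant(self, service: str, scope: str) -> None:
        """The owner permits `service` to read one category of data."""
        self.grants.setdefault(service, set()).add(scope)

    def revoke(self, service: str, scope: str) -> None:
        """The owner withdraws a previously granted permission."""
        self.grants.get(service, set()).discard(scope)

    def query(self, service: str, scope: str):
        """A service may read only the scopes it has been granted."""
        if scope not in self.grants.get(service, set()):
            raise PermissionError(f"{service} lacks '{scope}' permission")
        return self.data[scope]

# Usage mirroring the Walmart example: share purchase history, then take it back.
token = DataToken(owner="alice", data={"purchase_history": ["kettle", "book"]})
token.grant("walmart", "purchase_history")
print(token.query("walmart", "purchase_history"))
token.revoke("walmart", "purchase_history")      # Walmart can no longer read it
```

The essential design point is that permissions live with the user’s token, not with the platform – the service asks the token for data, and the owner can revoke access at any time.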
This might seem like a parlor game, but this is the kind of parlor game that could unleash an explosion of new use cases for data, new startups, new jobs, and new economic value. Tokens would (and must) have privacy, auditing, trust, value exchange, and the like built in (I tried to write this entire post without mentioning blockchain, but there, I just did it), but presuming they did, imagine what might be built if we truly set the data free, and instead of outsourcing its power and control to massive platforms, we took that power and control and, just like we did with the PC and the web, pushed it to the edge, to the node…to ourselves?
I rather like the sound of that, and I suspect Mssr. Phelps would as well. Now, how might we get there? I’ve no idea, but exploring possible paths certainly sounds like an interesting project…