Mapping Data Flows: Help Us Ask the Right Questions

I’ve been quiet here on Searchblog these past few months, not because I’ve nothing to say, but because two major projects have consumed my time. The first, a media platform in development, is still operating mostly under the radar. I’ll have plenty to say about that, but at a later date. It’s the second where I could use your help now, a project we’re calling Mapping Data Flows. This is the research effort I’m spearheading with graduate students from Columbia’s School of International and Public Affairs (SIPA) and Graduate School of Journalism. It examines what I call our “Shadow Internet Constitution,” the one driven by corporate Terms of Service.

Our project goal is simple: To visualize the Terms of Service and Data/Privacy Policies of the four largest companies in US consumer tech: Amazon, Apple, Facebook, and Google. We want this visualization to be interactive and compelling – when you approach it (it’ll be on the web), we hope it will help you really “see” what data, rights, and obligations both you and these companies have reserved. To do that, we’re busy turning unintelligible lines of text (hundreds of thousands of words, in aggregate) into code that can be queried, compared, and visualized. When I first imagined the project, I thought that wouldn’t be too difficult. I was wrong – but we’re making serious progress, and learning a lot along the way.

One of the most interesting of the early insights is how vague these documents truly are. The conditional (“might,” “could,” “may,” etc.) seems to be their favorite verb mood. It likely comes as no surprise to dedicated readers, but despite the last two years of public outrage, tech companies can pretty much do anything they want with your data, should they care to. Another interesting takeaway: The sheer amount of information that *can* be collected is staggering. A third insight: Even if you can find the data dashboards that give you control over how your data is used, cranking them to their fullest powers often won’t limit data collection and use, but rather will limit their application in very specific use cases. It’s all about the metadata. Lastly, it’s fascinating to see how similar these documents are across the top four companies, and how Apple, for example, has pretty much exactly the same rights to use your data as, say, Facebook.
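For the curious, here is a rough sketch of how one might put a number on that vagueness. It is not our project’s actual code. The word list holds only the three modals mentioned above, and the function names and sample sentence are made up for illustration. It simply counts hedging words and reports how many turn up per thousand words.

```python
import re
from collections import Counter

# Assumption: only the three modals named in the post; a real analysis
# would likely need a longer, carefully chosen list.
HEDGE_WORDS = ("may", "might", "could")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_hedges(text: str) -> Counter:
    """Count whole-word occurrences of each hedge word."""
    return Counter(word for word in tokenize(text) if word in HEDGE_WORDS)


def hedges_per_thousand(text: str) -> float:
    """Return the number of hedge words per 1,000 words of text."""
    words = tokenize(text)
    if not words:
        return 0.0
    return 1000 * sum(count_hedges(text).values()) / len(words)


# Made-up sample sentence, not quoted from any real policy.
sample = (
    "We may collect information about your device. "
    "We might share it with partners, and we could retain it "
    "for as long as we may need."
)

if __name__ == "__main__":
    print(count_hedges(sample))
    print(hedges_per_thousand(sample))
```

A count this naive will also pick up the month of May and can’t tell a hedge from a grant of permission, so treat any number it produces as a starting point for reading, not a verdict.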

I could go on, but what we really want to know is what *you* wish you understood about these companies’ data practices. That’s why we’ve built a very short, very subjective survey that we’re hoping you’ll take to give us input and feedback as we start to actually build our visualization.

I’ve buried the lead, but here’s the ask: Will you please take a minute to give us your input? Here’s the link, and thanks!
