On Small, Intimate Data

Part of the research I am doing for the book involves trying to get my head around the concept of “Big Data,” given the premise that we are in a fundamental shift to a digitally driven society. Big data, as you all know, is super hot – Facebook derives its value because of all that big data it has on you and me, Google is probably the original consumer-facing big data company (though Amazon might take issue with that), Microsoft is betting the farm on data in the cloud, Splunk just had a hot IPO because it’s a Big Data play, and so on.

But I’m starting to wonder if Big Data is the right metaphor for all of us as we continue this journey toward a digitally enhanced future. It feels so – impersonal – Big Data is something that is done to us or without regard for us as individuals. We need a metaphor that is more about the person, and less about the machine. At the very least, it should start with us, no?

Elsewhere I’ve written about the intersection of data and the platform for that data – expect a lot more from me on this subject in the future. But in short, I am unconvinced that the current architecture we’ve adopted is ideal – where all “our” data, along with the data created by that data’s co-mingling with other data – lives in “cloud” platforms controlled by large corporations whose terms and values we may or may not agree with (or even pay attention to, though some interesting folks are starting to). And the grammar and vocabulary now seeping into our culture is equally mundane and bereft of the subject’s true potential – the creation, sharing and intermingling of data is perhaps the most important development of our generation, in terms of potential good it can create in the world.

At Web 2 last year a significant theme arose around the idea of “You Are the Platform,” driven by people and companies like Chris Poole, Mozilla, Singly, and many others. I think this is an under-appreciated and important idea for our industry, and it centers around, to torture a phrase, the idea of “small” rather than Big Data. To me, small means limited, intimate, and actionable by individuals. It’s small in the same sense that the original web was “small pieces loosely joined” (and the web itself was “big.”)  It’s intimate in that it’s data that matters a lot to each of us, and that we share with much the same kind of social parameters that might constrain a story at an intimate dinner gathering, or a presentation at a business meeting. And should we choose to share a small amount of intimate data with “the cloud,” it’s important that the cloud understand the nature of that data as distinct from its masses of “Big Data.”

An undeveloped idea, to be sure, but I wanted to sketch this out today before I leave for a week of travel.

24 thoughts on “On Small, Intimate Data”

  1. Reminds me of something Vinod Khosla said at Disrupt SF last year. He said he was investing in big data, but also “data reduction”. That term stuck with me. When people say “big data”, it seems they are partly talking about the incredible progress we’ve made in creating, storing and processing huge volumes of data. But I think they also mean (but don’t always say) that its real magic or usefulness is when it’s not big but small, as you say, and actionable.

  2. My brother is a pharmacogenomist at Mayo and his colleagues have a saying that the old 5,000-person clinical trial is the luxury of those research institutions that don’t have to innovate. Instead, much of the more innovative research is focused on the Drug for One – a drug for your specific DNA. A whole different approach is used – much smaller data sets, the use of avatar mice, the genomes of both the tumor and the person are sequenced, etc.

    Big data is not “human” enough and the insights are still historical, and barely predictive.

    All that to say, agree 100 percent with the small data notion … look forward to seeing where you go with it.

  3. One element missing in this mix is time, i.e., historical versus current data. Usually historical data is big data, while current data is small, intimate, and volunteered by the individual for some specific goal. Historical data tries to estimate or guess the goal, while small, current data more likely defines it.

  4. touché!
    Landscapes evolve… but at some point that evolution becomes more akin to a phase change than simply a modification.

    Technologies have always modified landscapes… but very few (and realized only long after-the-fact) bring about sudden shifts in evolutionary direction.

    The birth of agriculture was certainly one… so is the development of the Internet and the rather sudden leveling of the global transaction landscape.

    The globalized direct P2P sharing of pictures, news, opinions and information unburdened by questions of proximity (whether physical, linguistic, cultural, etc.) are all examples of disencumbering transaction.

    The last time a human being could transact easily with all those within the social organism that constituted the ‘decision body’ most concerned with his survival was as hunter-gatherers.

    Scaling the relationship between the individual and that new landscape is problematic but necessary. Errors in the development of its architecture could have catastrophic impact.

    The birth of agriculture saw the birth of authoritarianism… sadly, this was a ‘natural’ development (like cancer is natural) which resulted from a failure to understand the Altruism dilemma. It took a few thousand years to move a bit beyond that. And we’re still dealing with it.

    The Internet… like the birth of agriculture… isn’t going away.

    But I’m not so sure we’re doing such a great job of addressing (or even recognizing) the perils. However, recognizing the need for keeping our eyes open to new models is a good idea…

    So… again… touché

    And as nervous as we all get about it… it’s necessary to look at money, credit, banking and P2P transaction in relation to this landscape. I’m not trying to start a revolution but there are a lot bigger issues than just being able to wave your phone in the air to pay for a mocha latte. And the speech-related micro-transaction is the equivalent of regaining the ability to spit across the campfire… it changes the relationship of the individual to society and decision.

  5. The timing of this post couldn’t be better for me. I recently arrived in the Bay area to pursue exactly this topic, so the topic and comments reinforce that I may have made the right move.

    Although you’ve been in and around this topic forever (starting with your ‘Search’ book), and it’s of obvious interest to readers of this blog, Small Data is only now receiving the recognition it deserves among a wider swath of online users. I believe you mentioned in a previous post the World Economic Forum report – Personal Data: The Emergence of a New Asset Class – which served to legitimize this area to the general public.

    But two recent publications – one by Doc Searls, The Intention Economy: When Customers Take Charge, and the other by Joseph Turow, The Daily You: How the New Advertising Industry Is Defining Your Identity and Your Worth – take two distinct approaches and arrive at the same conclusion (among others): consumers are not getting their fair share of the value of their purchasing intent information, and the way to correct that imbalance is to take a demand-driven approach to sales. In other words, consumers would be ‘intentcasting’ their purchasing aspirations to merchants, in stark contrast to the inherently inefficient one-to-very-many broadcasting methods now in use. Toward that end, Searls advocates VRM (Vendor Relationship Management) for customers and the formation of a ‘4th party’: tools or organizations that are unquestionably on the consumer’s side (analogous to the ubiquitous 3rd parties that assist vendors).

    Also worth reviewing is the Personal Data Ecosystem http://pde.cc/ which summarizes the worldwide community addressing this topic, and also publishes a monthly journal that highlights the goings-on in this now dynamic area.

    All the above signal that this is the time for commercializing what is perhaps the largest pool of wealth in the history of commerce. But not with predatory business models, as is now the case in seller-dominated commerce, especially online. It is important to point out that Small Data advocates are not proposing a reversal of the buyer-seller relationship that puts buyers in dictatorial charge, but enough change that both parties to a sales transaction are in a *balanced* relationship, one where *both* genuinely benefit.

    My startup (presently in ‘pre-traction’ mode), Office Tower2, proposes to reach that goal by focusing on the specific demographic most likely to embrace the idea of trading their purchasing intent data for cash rewards – web-fluent and well-paid people who work in office buildings.

  6. Conceptually, it’s a galvanizing idea – creating a distinct intimate data structure! The obvious hurdle, how do you convince the big data revenue agents that this particular partition is not central to their competitive advantage?

    Look forward to your sketch…

  7. Another question would be, how do we get large businesses and entities to see the role that small, intimate data plays in their transactions? Building a loyal and defined community around a brand would require some knowledge of small data. 

  8. Wasn’t I talking about this sort of thing half a decade ago?  
    http://battellemedia.com/archives/2006/09/google_personalized_search_who_owns_the_profiles.php#comment-335337004

    Put the user at the center and let the user do his or her own aggregation, sharing, etc. Rather than putting the web company at the center, and letting Big Data do the aggregation, etc. Users have orbited web companies for too long. Web companies need to orbit users.

    Well, even if I were talking from this perspective half a decade ago, Doc Searls and others introduced it a decade ago or more. So I’m not claiming novelty. I’m just saying that I’ve wanted things like this for a long, long time. But rather than becoming more small-data orbiting, the web has gone the opposite direction, into Big Data. I appreciate that this theme might have a couple of web2.0 conference small data adherents. But in the past 5 or 10 years, the trend has gone strongly in the opposite direction, toward Big Data. And I don’t see that trend reversing.

    So what are we to do?

    1. I think this question is going to resolve, because it centers on control, and at least here in the US, we tend to wake up and realize what’s what, even if it’s a bit late.

      1. But george (below) has a really good point, when he mentioned the revenue streams.  If somehow we could convert these large web companies into service providers for which we paid a monthly fee, then a lot of these issues would clear up.  It’s the dependence on the advertising model that leads to the need for big data in the first place.  imho.
