The Database of Intentions

With nothing really new in the news today, I wanted to take a graf or two and explain what I mean by The Database of Intentions, referred to in this post. That way I can use it again and again and just link the phrase to this post. Hey, we love the web, Ted Nelson lives….

The Database of Intentions is an idea central to the book I’ve been working on for the past year or so, which is tentatively titled “The Search: Business and Culture in the Age of Google” (Penguin/Putnam/Portfolio 2004). As with many in this industry, it all started with the Macintosh. Back in the mid-80s I was an undergraduate in Cultural Anthropology, and I had a class, taught by the late Jim Deetz, which focused on the idea of material culture – basically, interpreting the artifacts of everyday life. It took the tools of archaeology – usually taught only in the context of civilizations long dead – and merged them with the tools of Cultural Anthropology, which interpreted living cultures. He encouraged us to see all things modified by man as expressions of culture, and therefore as keys to understanding culture itself. I began to see language, writing, and most everyday things in a new light – as reflecting the culture which created them, and fraught with all kinds of intent, controversies, politics, relationships. It was a way to pick up current culture and hold it in your hand, make sense of it, read it.

At the same time I was making extra money beta testing some software on a brand spanking new Mac, vintage 1984. Anthropology and technology merged, and I became convinced that the Mac represented mankind’s most sophisticated and important artifact ever – a representation of the plastic mind made visible. (Yeah, college – exhaaaaale – wasn’t it great!).

Anyway, the idea that a graphical user interface and, later, a network connecting many GUIs, could provide a medium between many minds drove much of my fascination with reporting on technology, from MacWeek to Wired to The Standard to now. The “Mac as the greatest artifact” meme became one of my standard riffs, from discussions with potential writers at Wired, to discussions with partners at The Standard. The idea that we could better understand ourselves by looking at how we employ technology was and remains the driving force of my work as a journalist.

This is all a long-winded way of saying, I’ve now come to the conclusion that humankind has created a far more fascinating and important artifact, one that surpasses the Macintosh (and its badly drawn descendant, Windows). And before you roll your eyes and say “Oh God, not the Internet…”, no, it’s not the Internet. It’s something that is a product of the Internet, what I call the Database of Intentions.

The Database of Intentions is simply this: The aggregate results of every search ever entered, every result list ever tendered, and every path taken as a result. It lives in many places, but three or four places in particular hold a massive amount of this data (e.g., MSN, Google, and Yahoo). This information represents, in aggregate form, a placeholder for the intentions of humankind – a massive database of desires, needs, wants, and likes that can be discovered, subpoenaed, archived, tracked, and exploited to all sorts of ends. Such a beast has never before existed in the history of culture, but is almost guaranteed to grow exponentially from this day forward. This artifact can tell us extraordinary things about who we are and what we want as a culture. And it has the potential to be abused in equally extraordinary fashion.

Once I grokked this idea (late 2001/early 2002), my head began to hurt. This was A Big Idea, one that certainly was not new (I edited this piece in 1995), but my comprehension of it was new; and it explained the recent surge in paid listings as a successful advertising vehicle – the first truly robust commercial exploitation of the Database of Intentions.

So I decided to focus on this idea in book form. I started looking around for folks who understood these ideas better than I did, and I found many – an entire industry of people devoted to search, and a subset of academics, writers, entrepreneurs and visionaries engaged in the exploration of this idea and its implications. The goal of my book, then, is to tell the story of this idea, how it drove the rise of computing and the internet (and the bubble), and where it might be going. Clearly Google will play an important role, but I’m not out to write yet another boring and opportunistic book about today’s hot company. I’m just starting the writing, and reporting continues apace. And if you’ve managed to read all the way down to this point, I hope you will join the conversation.

87 thoughts on “The Database of Intentions”

  1. Hi there.
    I came across your blog by clicking through a series of links, starting with Matt’s newsletter. What is going to improve search in a dramatic way is something that sounds mundane at first: the storage market is moving toward interoperability, after eons of pundits bitching about the lack thereof. (One of my recurring freelance gigs that isn’t on newsstands is with an outfit called The Storage Reporter, so I’m right in the trenches on this topic.) Search has suffered from the technical difficulties stemming from the fact that most companies’ data now doubles in volume ANNUALLY (three years max, however). The data doesn’t go away, it just piles up in disparate silos; CRM and search vendors have been touting products that allegedly pull data from multiple sources, but such software is stymied by the limitations of the hardware architectures underlying it. That’s why the interoperability efforts of the Storage Networking Industry Association and the new SMI-S specs are such a big deal. Feel free to drop me a line if you want to learn more, but I have a feeling you might be on this already.
    Cheers,
    Jackie Cohen 🙂

  2. John,

    For the sake of definition, let’s say that search engines index digital representations of “things”: a blog entry, an image, a CD review, and so on. In this context, the Database of Intentions (as I understand your explanation) comprises an unprecedented view of how people relate to things, as well as how things relate to each other. Maybe it also includes derivatives like how people relate to other people due to their common interest in certain things.

    If I’ve got your definition right, why aren’t the players with the highest potential leverage the big ISPs rather than the big search engines? Set aside who has so far played the game better–Google clearly wins that–and consider what happens if a big ISP gets serious about building a Database of Intentions. We’ll take AOL as an example.

    AOL can track every clickstream by each of its millions of users, wherever they go on the Web. That includes whatever they do on search engines: the words they search for, the results they get back. Thus, AOL has exactly the same search-behavior information as the search engines themselves, except AOL has that information across *all* search engines. On top of that, AOL sees what a user does after he/she clicks to a result, which has massive implications for the fidelity of the database.

    So if the point is to generate information that drives better knowledge about “who we are” and–of particular commercial interest–“what we want,” then ISPs have proximity to far better raw materials. If the technical and privacy issues of mining those raw materials can be managed, duplicating any other bits of the search engines’ infrastructure (like PageRank-style link analysis) should be possible.

    Of course, the question of “big ISP” versus “big search engine” is messy because companies like AOL, Microsoft, and Yahoo all have anti-Google internal search efforts, and those companies also own or have close partnerships with ISP units. But that just says the ingredients are already at hand; it’s a question of who puts them together first.

    –Steve

  3. As a quick comment to Steve, your comments appear to rely on the idea of a single user per ISP account for their power. What about corporate ISP accounts, which account for a huge portion of data? My office has a thousand different people using search engines on the same line, and IP addresses vary daily, so aggregation can’t really be done at that level.

    And just an FYI to John: people use search engines in a lot of different ways that could easily confound your goals (without ever giving evidence of the inaccuracy of your results). As an example: I rely on search engines to provide me with certain links, rather than bookmarking them. One reason is that I use many different computers and have no single bookmark file, but the reason is unimportant. My point is that the links I truly use/need/like don’t go through search engines because I have them bookmarked and/or know them by heart. On the other hand, for a site I reference once every other week, I may use Google to look up every time, because I know how and where it appears on the results list. Thus, the information I use *less* often is actually much more highly represented in search engines than the pages I use all the time…

    I guess in general I am wary of notions that aspire to “explain it all” and your choice of wording (Database of Intentions) sets off that alarm in my head. Examine your premises very, very carefully.

  4. Doesn’t building a Database of Intentions violate user rights? Marketers would jump on this like vultures on carrion, and that would encourage the database managers to clamp down on the information and auction it off. This widens the information (and thus power) gap between marketers and users.

  5. My thought is that the DBoI is already “built”- it’s how it gets used that’s the interesting question. It’s an astounding artifact, from an academic standpoint, and also quite a haystack for the spooks of the world. I agree, Euphrosyne, that one idea can’t “explain it all.” Bias exists, results can be misinterpreted. But the fact is, this thing exists. And that alone means it has power, and is worth examining.

  6. John,
    I came across this web log following your link from an e-mail. I’m a tech biz reporter at the San Fernando Valley Business Journal, who thought he would venture into news and investigative stuff (still will probably), but some days I reminisce over my younger years … reading Jules Verne like there’s no tomorrow. I read Michael Crichton all the time too. To summarize (here’s my lede) I simply am a science fiction addict-cum-journalist, although I really want to just be a journalistic jack-of-all-trades.

    Anyway, just thought I’d drop a note. The Google book idea is brilliant, and I hope it materializes.

  7. The DBofI is a good idea. But it relies primarily on the fact that people’s online footprints/mindprints are more easily traceable than their offline counterparts, and the offline ones are more telling.

  8. Perhaps the Zeitgeist info (popular searches) on Google tells us nothing more than just that – pop culture trends. It does not really reveal the common man’s thought. Let me build an analogy for a search here. If I were curious about Einstein and I were in a local library, I could borrow just a few books, not all, as I have limited time. The fellow next to me would probably not be there for the same thing, but I can bet that we are both aware of who is running for president, and if there were a Time magazine nearby we both might sneak a peek at it. But our real search intent and purpose is separate and unique to each of us (not about the election), and it will not get a tally or popularity score in some rating. That intent gets lost when the metric is counts, which really operate on a superficial level.

  9. Another question is who drives what. If I see that Britney Spears is on top of the list, it gets me curious too, so it’s the chicken-or-egg question. What people get curious about is certainly driven by outside (marketing) influences, and reveals more about the marketing engine than about the intrinsic individual. It’s just an electronic tally of what’s popular, plain and simple. Far from intention, which is self-driven.

  11. Corporate information: per Jackie Cohen, “most companies’ data now doubles in volume ANNUALLY”. Some of it is really great data of potential long-term value. Legislation like Sarbanes-Oxley (SOX) has led to deliberate obfuscation: removal of search functionality, new layers of password access. In my experience the information piling up in all that expensive storage media might as well be stored in vaults on the moon for all the good it is doing the corporations.

  13. Hello, if I hear of chickens me badly the bird flu to have we of China get now the Chinese to have from the USA google gotten. Unfortunately neither of them gives a vaccine, unfortunate or for. I would like to mention still one the bird flu to have we soon in the grasp

  14. Hello, great to see you on 60 Minutes last night, speaking on the corporate culture at Google. Really enjoyed your perspectives on the future of search. Keep up the great work.

  15. Google is a mighty tool. So they made the price. Look at the other search tools. According to my website statistics, 99% of my traffic comes from Google.

  18. Google search is number one; however, everyone tries to reserve the front seats through page ranking. It gets littered with pages built only for Google’s search engine, not for the Internet.

  19. What I found most interesting in the original article was the stated cultural anthropological focus. The technology and who controls it or not relates simply, for me, to the mechanics of memes. The reality is that I’m glad you’ve done the work you have, and glad I found here the idea of tracking and managing such an important subject.

    Keep it going. If the book is available, I’ll buy it.

  20. Good luck with the book, and please let us know when it can be read online. At least a short abstract.
    Concerning the Database of intentions, this actually is what google does, by incorporating the history of all your search activity, chats, calendar, websites, gmail, and all the google services (including the local history, on the personal computer). What would then be the result? Well, it can be a huge “thing” that knows everyone, that can serve someone exactly what he is looking for (based on the previous searches and history), so… we’ll just sit and relax, search and order things…
    I am still wondering though if there could be someone using this database… with some evil intentions. What if this comes up?

  21. Google is number one among the search engines and also assigns the page ranking by which a page is rated. Whether a page’s rating depends only on keywords and backlinks is Google’s secret.

  23. Hi John,

    Some interesting thoughts …
    But how does Google handle the pages of Chinese sites for the Chinese authorities?
    Maybe Google’s slogan is “filter information to make money” and not “don’t be evil” …

    regards
    timm

  25. Great information, and really scary, the omnipotent Google awareness! I am getting about 80% of my visitors from Google. I would prefer it if there were more strong engines, for fair competition and more traffic all over 😉

  26. Okay – I’m a little slow. Just getting to reading “The Search”. Absolutely fascinating that I can find who is linking to my site by just typing in link: and then my website.

  27. So many previous commenters have said “95% of my traffic comes from Google”, etc., but what happens to your traffic when Google decides to tweak its algorithms? Like you said in ‘Search’, John, the Google Dance is always a concern for small business owners (like the big feet shoe seller). What does Google care if your website listing doesn’t show up in a results list, if you’re not spending money for paid placement or paid inclusion? What’s to stop them from making your site unreachable by the masses? (Albeit a rhetorical question)….NOTHING!
    Google has far too much control over what people see on the internet, and for that, to me, their corporate motto fails. Google IS evil.

  28. hi John,
    My question is not about Google, but about your MT install. I notice you have the latest version (3.34) and have FastCGI implemented. I’m wondering if you followed the instructions from Six Apart and if you had any errors, specifically with the StyleCatcher and Widget plugins. The missing Google API was another error I had listed several times in my site’s error file.
    Just curious. I’ve had so many problems that I nearly gave up. I’ve ended up disabling the offending plugins and the errors have quietened, but that isn’t the ideal scenario.

  29. In general I am wary of this; it’s a big information gap between marketers and users.

    I rely on search engines to provide me with certain links, rather than bookmarking them.

    Great Article, lots of information
