December 2003 - Page 2 of 8 - John Battelle's Search Blog

Parts of Patriot II Slipped Into Law While No One Was Looking…

By - December 28, 2003

This is a very big deal. Not just because of the law itself – it’s heinous – but because of the way it was passed by the Bush administration – on a Saturday, during Saddam’s capture celebration, after an unaccountable voice vote on Thanksgiving with no debate. And of course the media did not pay attention, and of course, I hope, we will. I’ll summarize the effect of this later – it gives the government extraordinary new powers of search (which is one reason it relates to this site) – but for now, please give your lawmakers hell.

Find O' The Day: WordNet

By - December 27, 2003

If you’re a linguistic geek, or just like stoning out on how words work, check out WordNet. I was told of this site in an email discussion with a reader; it’s an ongoing academic research project based out of Princeton.

This site (it’s also available as downloadable software) is basically a database of interconnected word meanings. The site says: “WordNet® is an online lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory.” Er, in other words, it’s a neat way to see the various “senses” a particular word might have. The online site has a Word Search function. Type in any word, say… “search”… and you’ll see it has five senses as a noun, and four as a verb. You can then explore various aspects of the word’s senses, including synonyms, derivations, and – really cool – hypernyms (“Search is a kind of…”) and hyponyms (“… is a kind of search”).

I’ll admit, this was my first time stumbling across the terms hypernym and hyponym and actually understanding why they matter. So why does this matter to Search writ large? Because one way to think about improving Search is for an engine to drill down on a particular query based on what sense of the word the searcher intended. In other words, when you type “jaguar” into the query box, which sense did you mean – the cat, the car, the team, the software? If a search engine can create “senses” of words on the fly, it might be able to create smart responses to difficult and high-results queries (AltaVista and others do something similar with clustering, but this technology has not been blessed by everyone, Google included, as relevant enough). Think of Google’s spell checker, but with “senses” of words, instead of spellings of words. “Did you mean the cat?…” etc. Now, I have no idea if this particular implementation would be useful to a search engine; it probably has all sorts of problems. But it’s interesting to think about nevertheless.
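To make the “Did you mean the cat?” idea concrete, here is a minimal sketch in Python. It is not how WordNet or any search engine actually works: the `SENSES` table, the `Sense` class, and the `did_you_mean` function are all hypothetical, and the glosses are hand-written for illustration rather than taken from WordNet. Each sense carries a hypernym (“jaguar is a kind of…”), and that hypernym becomes the label in the clarifying prompt.

```python
from dataclasses import dataclass


@dataclass
class Sense:
    hypernym: str  # what this sense "is a kind of"
    gloss: str     # short hand-written description (not WordNet data)


# Hypothetical, hand-built sense inventory for illustration only.
SENSES = {
    "jaguar": [
        Sense("cat", "a large spotted wild cat"),
        Sense("car", "a make of automobile"),
        Sense("team", "a football team"),
        Sense("software", "an operating system release"),
    ],
}


def did_you_mean(query: str) -> list[str]:
    """Return one clarifying prompt per sense, or [] if the query is unambiguous or unknown."""
    senses = SENSES.get(query.strip().lower(), [])
    if len(senses) < 2:
        return []
    return [f"Did you mean the {s.hypernym}? ({s.gloss})" for s in senses]


if __name__ == "__main__":
    for prompt in did_you_mean("jaguar"):
        print(prompt)
```

The design choice worth noticing is that the prompt is built from the hypernym, not the word itself: the word is the same in every sense, so only the “is a kind of” relation tells the senses apart.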

Thanks to Steve K for this pointer and the conversation that provoked it.

(An aside – I welcome email if a particular thought of yours is more comfortable in that medium than in the site’s comments area. I’ve learned a lot from such exchanges.)

China's Answer to Google

By - December 26, 2003

In the English language version of People’s Daily (take it for what it’s worth…) is a rather exuberant announcement for the launch of the “world’s largest Chinese search engine”, known officially in English as “China Search Online”. The page is reasonably clutter-free, compared to most Chinese portals I’ve seen. (I co-taught a course on weblogs and China last semester; the product of that course is a cool weblog called China Digital News.)
In any case, the folks behind the engine, HII (who went public earlier this month, see here), are compared to Google; they even have a no-human-editors-have-touched-this news product to boot.

On Invisible Tabs (and Hands)

By -

In an email conversation, Danny Sullivan (he of Search Engine Watch fame) and I recently were discussing last week’s post on Froogle. Danny disagreed with my premise that Google’s actions were inconsistent. In fact, he believes they may well be consistent with a new and evolving interface approach that he calls “invisible tabs.” He explains the idea here. The gist: search engines will intuit what you are looking for behind the scenes, and deliver to you the results most consistent with that intuition, making the tab format redundant in the first place.

As Danny put it in an email to me:

The real departure is going to be if Google finally makes the jump and gives you back 10 product/Froogle results at some point, and suggest that you might also search the web, for some queries, rather than the web dominance we get now. That will be them fully putting into play this whole invisible tabs concept that I’ve been talking about recently.

Danny points out that Google already does this with News. Try searching for “George Bush,” for example. You’ll see News results at the top. Google is intuiting that you wanted news on George Bush, or at the very least, that news about George Bush is relevant to your search.

Same thing for Froogle results, Danny explains: “They’re hitting the Froogle database in automated fashion, and if the automated system feels confident enough, you get Froogle results displayed. No different really in look, feel and operation than searching for “iraq” and getting news results.”

Well, yes…and no. What I find interesting is this part of the idea: “If the automated system feels confident enough, you get Froogle results displayed.” No matter what, code = architecture, and architecture = politics. Somebody had to code that Froogle algorithm to determine its confidence/intuition with regard to your search. Google, and any other search engine worth its shareholders’ money, will never tell you how it makes those decisions. They are the Invisible Hands of the automated search process. The men behind the curtains.

And therein lie the interesting bits.

Regardless, we should all give Google a lot of credit for having neither paid inclusion nor referral fees in their shopping engine. That is leaving a lot of money on the table toward a greater end, and an indication of the philosophy which guides them.
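The “confident enough” test at the heart of invisible tabs can be sketched in a few lines of Python. This is purely illustrative: no search engine publishes how it makes this call, so the scoring functions, the keyword sets, and the 0.7 threshold below are all made-up assumptions. The point is only that someone has to pick the scorers and the threshold, and those choices decide what you see first.

```python
from typing import Callable


# Hypothetical scorers: each returns a confidence in [0, 1] that the
# query belongs to that vertical. Real engines do not disclose theirs.
def news_score(query: str) -> float:
    newsy = {"iraq", "george bush"}
    return 0.9 if query.lower() in newsy else 0.1


def shopping_score(query: str) -> float:
    products = {"digital camera", "dvd player"}
    return 0.8 if query.lower() in products else 0.05


VERTICALS: dict[str, Callable[[str], float]] = {
    "news": news_score,
    "shopping": shopping_score,
}


def invisible_tabs(query: str, threshold: float = 0.7) -> list[str]:
    """Return result blocks in display order: confident verticals first, then web."""
    scored = [(scorer(query), name) for name, scorer in VERTICALS.items()]
    confident = [
        name
        for score, name in sorted(scored, reverse=True)
        if score >= threshold
    ]
    return confident + ["web"]


if __name__ == "__main__":
    for q in ("iraq", "digital camera", "wordnet"):
        print(q, "->", invisible_tabs(q))
```

Lowering the assumed threshold makes the vertical results appear for more queries; raising it leaves most searches as plain web results. Either way the searcher never sees the dial.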

The Health of Magazines: Blame Cable As Much As Internet

By -

More and more I’m noticing my cable lineup looks like a magazine rack. Used to be, television was a scarce resource. As late as five years ago, it was still being programmed for large audiences – at least a million, if not more. If folks wanted well-produced niche content, they had to go to magazines. Now they can go to the internet as well, but until recently, I thought magazines could still compete for a smaller audience’s attention if they stood out as a voice for a particular community.

But I now believe magazines as we understand them are eroding, succumbing to the twin tides of niche cable and what might be called the second wave of Internet publishing.

First, TV. Cable seems to have finally realized that in a 500-channel universe, not every channel can garner a 20 rating. Hence a willingness to do focused, niche content that aspires to just several hundred thousand viewers at a time. This strategy can produce breakout mini-hits like Trading Spaces and Queer Eye, but in general, it seems cable has figured out how to make money selling audience sizes based on metrics quite similar to those of magazines. Thumbing up and down my cable menu, I feel like I’m at the magazine rack at Barnes & Noble – there are 25 different sports titles, scores of shelter books (that’s the home/hearth category for you non-magazine folk out there), plenty of music/pop culture plays, even programmatic equivalents of “Guns & Ammo.” None of these shows, save perhaps the pop culture stuff, do more than 500K in audience on any given day. In other words, TV has managed to segment audiences into the same demographic/psychographic buckets that once were the sole province of magazine land. PVRs only accelerate this trend, adding the convenience of search and storage to the magazine rack concept. Add in the fact that the average cable bill in the US is more than $40, and you have a subscription+ad model, just like magazines. I should also note that the advertising business has shifted in kind: production costs have been driven down by technology, and buyers now understand how to buy spot and niche cable. End game: TV wins head to head against print. Just ask the publishers of Life.

Now, the Internet. I’ve always thought you could create great magazines if you stayed away from competing solely on audience demographics/psychographics, and focused on the ineffable quality of publishing that might be called community. Because they serve deep and subtle content, magazines can create and/or declare community, a badge that folks wear proudly, a club in which they claim membership. Well, while it’s quite difficult for a cable show to hold this rather ephemeral quality, the Internet has it in spades. Strike two.

If my beloved magazine readers are getting their high-bandwidth niche experience from cable, and their community succor from the web, what is left for magazines to do? Is Battelle saying magazines are dead?

Well, yes and no. We have to rethink what a magazine is. Again. After all, I teach magazine development, for goodness sake, so I can’t very well believe magazines are a dying breed. There are exciting things to be done with the idea of magazines, if we can reinterpret them. For one, make them for smaller audiences, and compete on that point, rather than be ashamed of it. Two, figure out how to make magazines sing in the online world (nearly all attempts to date are awful). And three, figure out a way to get around traditional approaches to the twin evils of circulation marketing and distribution. I’ll post more on my thoughts as to how at a later date, but I wanted to get that cable TV-as-magazine-rack meme out there, and see what you all thought.

A Software Wish

By - December 24, 2003

As long as I’m skipping down memory lane and all, I wanted to leave one tech/site related request – late though it is – for Santa this Christmas Eve. Or maybe I should say, for Ben and Mena over at Six Apart (the folks who brought us the software that makes this site, and many others, possible). I wish for QuarkXPress for the Web. I know this has been something of a pipe dream for many (and I’m not really tied into the discussion on this topic, so perhaps it’s on its way), but now that I’ve mastered a few of the basics of Movable Type, I really wish I could play with the software the way I did when desktop publishing was young, back in the late 1980s, when Quark and others made history. One of my first gigs was as a paginator for MacWeek, which claimed to be the first desktop-published four-color national weekly magazine in history. It was just really, really cool to be a beta tester of software (Quark version .9) that you knew was going to change publishing forever. As great as MT is, it’s still too hard to tweak the sites, to make them look better and perform better from a reader’s point of view. I know making this stuff is extra hard. But I’d sure love to beta test the GUI version of MT, were it ever to come down the chimney….

And Happy Holidays to you all. Thanks for reading these past couple of months, and giving me so much to chew on. I look forward to 2004.

Why Yahoo, Interactive, and Google Love Local Search …

By - December 23, 2003

Because it’s poised to grow to nearly $3 billion in revenues by 2008, up from about $1 billion now. And because the mass of small business owners who currently don’t use search would certainly switch if presented a compelling solution that actually brings in customers. Can you imagine your corner grocery store or dry cleaner buying keyword search? Me too. Move over, Yellow Pages….

Corporate Search Is Sexy

By -

Before there was the web, there was the corporate database. Remember those days? Back in the mid to late 80s, when the Local Area Network was the Next Big Thing, when everyone was madly installing client-server databases, when applications like dBase III and NetWare ruled the roost? You don’t? Sigh. I must be showing my age. I was a cub reporter back then, covering the relatively new beat of “networking” as well as the corporate database market. Yup. Somehow I found that stuff fascinating. I thought this whole idea of connecting disparate networks of information was a hoot.

Anyway, to the point. About 1987 or so a new class of applications developed. Called Executive Information Systems (EIS), these were essentially interfaces to data, designed to live on top of corporate databases and cull the stuff Really Important Executives needed to know so as to make Really Important Decisions. The coolest part of the spec was the fact that the data was queried from the desktop – EIS promised easy and intuitive access to those unintelligible databases the geeks kept buying. The idea was sexy, but the category never really took off. The design was too rules based, too top down. For them to work, you had to literally redesign your entire infrastructure. Oh, and the Executives in question had to give a shit.

Fast forward to now. As most of the world remains fascinated with search’s more public face, a significant shift seems to be occurring in the corporate data world. I’m not saying EIS is back, exactly, but the overwhelming presumption of webwide search on your desktop is certainly rewiring how corporations think about their more private databanks. A robust market has grown up around “enterprise search” (some companies, such as FAST, were spun off from consumer search companies, and Google maintains a unit focused on the market). There’s a crop of interesting startups to boot, including Tim Bray’s company, Antarctica. It’s entirely possible that some of the next big ideas in search will be developed in this more focused, less public field. Any readers out there have suggestions of cool companies in this space I may be overlooking?

The Mayo Database

By - December 22, 2003

Wired News reports on a massive database project from Mayo which has interesting, scary, and rather exciting implications for diagnosis and treatment. Genetic information will eventually be included. Excerpts:

During an office visit, a medic will be able to do enough quick data mining to ensure the most accurate diagnoses and most effective treatments while the patient waits, de Groen said. “Ideally the computer would query both our own database of patients (and) the complete medical literature.” ….

Health-care professionals look forward to the eventual addition of patients’ genetic information to databases like the MCLSS — a field known as clinical genomics — as a major advance in medicine. Among other things, such access would allow doctors to divine with great speed and accuracy what drugs have worked best on a certain type of person with a certain illness. …

“It’s really about applying the knowledge from many to the benefit of one,” said Dr. Anne-Marie Derouault, director of alliances and distribution channel management for IBM Life Sciences, who heads the IBM teams working on the project. “Without genomics, it would be very hard to do that. Putting that kind of information with traditional information is potentially going to bring medicine to another level.”

Thoughts on 2004

By - December 21, 2003

I am not sure why all of a sudden I am struck with the urge to prognosticate, but all weekend long I’ve been thinking about what might happen next year in the search/tech/media nexus. I think it has something to do with the book – my plan is to finish it by about mid year, then pray that nothing major changes for another six months while the manuscript wends its way through the vagaries of the publishing process. It’s either that, or Jeremy envy.

So I’ve been thinking about a number of things, some small, some not so small, which might happen in the next twelve months. Given that I’m writing this on the eve of Winter’s Solstice, I give you Battelle’s First Annual Solstice Hopes and Predictions for 2004. I refuse to say which are hopes, and which predictions. This way, I can claim to be right next year one way or another. Take it for what it cost you on the way in…. (see list via link below)
