Last Friday I had a chance to stop by the Palo Alto offices of Topix, in many ways a classic internet startup – Valley-based, run by a serial entrepreneur, good buzz – but it didn’t take long for me to sense that something was different this time. Before I get into that, let me first give you a few thoughts on the service itself, and the broader role it plays in the search business.
Background: Topix was founded by six guys, four of whom went to high school together in Pittsburgh. No, I’m not making that up. Most of them are IT/Valley vets. CEO Rich Skrenta founded NewHoo, sold it to Netscape a mere six months afterwards, then morphed it into the now famous Open Directory Project. But Netscape was sold to AOL, and after a while Rich got bored (I assume) and left with the intent of starting a company he could “work into my 40s on.” I like the sound of that.
Topix is an internet media play. More specifically, it’s a local advertising media play. The service takes a crawl-and-index approach to a vast array of internet news sources, then runs the resultant stew through a metadata engine which tags every news story with location and subject data. Topix then builds more than 150,000 topic- and location-specific pages, pages that live comfortably between the great gunky mass of search results, on the one hand, and the impersonal morass of most news sites on the other.
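To make the tag-then-group idea concrete, here is a minimal sketch in Python. It is not Topix's actual engine, which I have not seen; the gazetteer, the keyword lists, the `Story` class, and the naive substring matching are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical lookup tables; a real metadata engine would be far richer.
GAZETTEER = {
    "kentfield": "Kentfield, CA",
    "palo alto": "Palo Alto, CA",
}
SUBJECT_KEYWORDS = {
    "Wireless": {"wireless", "cellular"},
    "Search Engines": {"search engine", "google"},
}


@dataclass
class Story:
    headline: str
    body: str
    locations: list = field(default_factory=list)
    subjects: list = field(default_factory=list)


def tag_story(story):
    """Attach location and subject tags using naive substring matching."""
    text = f"{story.headline} {story.body}".lower()
    story.locations = [name for key, name in GAZETTEER.items() if key in text]
    story.subjects = [
        subject
        for subject, keywords in SUBJECT_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]
    return story


def build_pages(stories):
    """Group tagged stories into one page (a list of headlines) per tag."""
    pages = defaultdict(list)
    for story in stories:
        tag_story(story)
        for tag in story.locations + story.subjects:
            pages[tag].append(story.headline)
    return dict(pages)
```

Feed `build_pages` a list of crawled stories and each location or subject tag ends up with its own list of headlines, which is the rough shape of a topic page. The hard part, presumably, is doing the tagging accurately across thousands of sources.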
Skrenta likes to call Topix a “150,000-facet diamond,” at least one facet of which should appeal to most news consumers. But step back a few thousand feet and look at Topix’s approach, and you start to see something else at work, something instructive to anyone interested in next-generation approaches to search.
At the risk of getting mired in academic debate, one could argue that Topix is a proof point in the semantic web. Topix is not interested in every web result that might be rendered for a search “news kentfield,” for example. Instead, it searches a limited set of web pages – in this case thousands of news sites – and then annotates the content of those pages with semantic tags – latitude and longitude, for example. It then machine-generates something of an “encyclopedia page” for Kentfield, cobbling together news, weather, advertising, police blotters – a local newspaper in real time. (Skrenta pointed out that Topix utilizes a newspaper-like layout on the site, because… it seems to work for the reader. Imagine that: we don’t have to throw out everything we learned over the past 200 years.) Topix also creates pages by subject; Skrenta argues that in fact many of their industry-related pages – Wireless, for example, or Search Engines – are among the best sources of business information on the free web.
If you take this approach to the web – mediating SERPs with subject-related “landing pages” – you could imagine a broad scale search engine which manages a machine-created ontology of subject pages. WebFountain comes to mind, orthogonally. In fact, such an approach has been attempted by any number of companies; I have heard that Excite was working on such a project before it fell apart in 2001. (There are others. Readers, can you chime in?)
Skrenta does not shy from the semantic tag. In fact, he is one of many I’ve spoken with over the course of reporting the book who agree that the web is failing to scale, and that well-documented “neighborhoods” of semantic order will help bring the web back into focus. “Whole categories are dead on Google now,” Skrenta told me, referring to the ongoing arms race between relevance and search spam. He’s right, of course. But here’s where the story turns to Topix’s distinction.
The inevitable next question for Rich is, “Sure, OK, broad web search engines like Google are starting to fail under the sheer weight and scale of the web. But Google, Yahoo, MSN, AOL – they’re all hard at work on this problem. Even more, they all have news products already, and are very focused on integrating local search. What makes you think a small startup like yours can make it in the face of such pressure?”
A Valley entrepreneur will usually respond with one of two answers: 1. Don’t Worry, I’ll Sell While Small, Make A Good Sum, And Get A Good Job At An Established Company (as Rich did with NewHoo; Blogger and Kaltix also come to mind), or 2. I’ll Take VC Money, Take the Execution Risk, Sell Later And Get Silly Rich (the current path of, say, Friendster). In other words, when the market is dominated by large, entrenched competitors, your best shot at succeeding as a startup is to sell, either at the beginning of your life (when the LargeCo is basically buying the talent and nascent market opportunity), or after you’ve scaled to the point of inflection on the build/buy curve. It costs money to get to that scale, which is where the VCs come in.
I found it wonderful to hear that Rich didn’t want to take either of these options. Instead, his goal was to simply start a neat service, get to the point of self-sustaining revenue (with six employees and an office over a trophy shop, that won’t take long), and grow it slowly. In other words, no VC money, no dreams of world domination. Rich just wants to build a nice media business, without having to either sell it or sell out. Rich met with the VCs about Topix, he told me. “After five minutes, every one of them would tell me their vision of what I had to do to win in the market,” he said. “If you take their money, you have to buy that vision.” Better to fund it small, with angels and friends, and let it grow to its own place in the sun. Could it be that the post-bubble web, version 2.0, will allow for such companies to succeed?
I certainly hope so. May a thousand such flowers bloom – and I see them springing up, over at Nick’s Gawker Media, for example, or the granddaddy of them all, craigslist. I’ve heard pretty much the exact same philosophy from both Nick and Craig: I want to run my little media company, VCs and exit strategies be damned. Welcome to the club, Rich. May you work into your forties at Topix.
UPDATE: Rich emailed me as I posted this with this news:
“I wanted to let you know that, as of this morning, Topix.net is
now crawling over 6,000 news sources, up from 3,600. Here is an
approximate breakdown of the kinds of sources we are crawling:
24% Daily newspapers
19% AM & FM news radio stations
15% Weekly newspapers
15% B2B and consumer magazines
12% TV stations
9% College newspapers
5% Government websites
and frames, which has increased the yield and coverage of our news
We’re also including the news source name on the attribution line,
instead of just the domain name as we were previously.”