Oh, I know, I’m not big on enterprise search, it puts me to sleep. But to be honest, it was the enterprise that got me into this game, nearly 20 years ago, at a now defunct Macintosh weekly called MacWeek. We covered “MVBs” – Macintosh Volume Buyers, and my best sources were big corporate buyers at Anderson and the University of Texas. These guys saw all the cool shit early, and then blabbed about it to me. Our fearless executive editor was a fellow named Dan Farber, who now runs editorial at ZDNet. Anyway, Dan emailed me yesterday and asked what I thought of enterprise search, which is clearly one of the most overlooked stories in search. (His view on it is now up, here). It made me think, and I realized that in fact, enterprise search will probably rise again, and end up being one of the coolest things in search in the next few years. Why? Because it sucks so badly now, fixing it will be the kind of 10X revelation we had when we moved from Yahoo to Google in 1998-99.
Germane to that, here’s an interview in the E-commerce Times with Google enterprise search chief Dave Girouard that’s interesting.
“The funny part is it’s easier to find box scores from the 1957 World Series than it is to find last quarter’s sales presentation in the enterprise. While Web search has gotten really good, enterprise search has stagnated, and that’s why we really believe it’s a problem that needs to be solved and that Google has a unique set of capabilities to solve it.”
When Google goes public, and it seems that this is most certainly a when, rather than an if, it will have to grow. And once it’s hit the plateau of consumer-facing businesses, it will turn to the corporate IT market (it’s already focused on the problem and is cranking up that focus). That market is still nascent, and there are buckets of money there (just ask Microsoft or FAST). Mark my words, boring as it might seem, corporate search will be a big deal. And…there will be interesting implications w/r/t transparency and the like once all those corporate documents are discovered by the internal crawler.
7 thoughts on “Enterprise Search (Yaawwwwnnn)”
Enterprise search is actually interesting because it’s not the Web, and there’s more room to do creative things in an environment where http is not the only protocol. For instance, most users could benefit from a combined file/search server, and I think we’ll see more search-enabled SMB devices in the future. There’s also a lot of rich data competing with the unstructured morass. Things like P2P search are also much more likely to work in this environment.
Google doesn’t seem to have a particularly good offering in this area. It looks more like a one-size-fits-all branding approach than a deeply thought out solution.
Some of the gaps in their feature set (based on reading their sales literature; I may be wrong about some points):
Crawl: Content must be on http servers. Back when I had a job, most content was on SMB shares, and I would think a crawler for those could be devised. Likewise, a shared dropbox directory for submissions seems more sensible for users than httpd.
Index: A static index based on periodic crawls. You can’t edit your document, push “save”, and expect to see the changes in the index. The index parses only a fixed set of file types.
Search: Based on, guess what, PageRank. There is some doubt as to whether this is the best ranking algorithm for intranets. No integration with other data sources, e.g., an ODBC driver.
To address these issues, the ultimate solution is going to look more like a network filesystem with a dynamic index than this device. The door is, for better or worse, open for Microsoft to move into this area.
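To make the static-versus-dynamic distinction above concrete, here is a minimal sketch of an index that is updated the moment a document is saved, rather than on the next periodic crawl. It is a toy in-memory inverted index, not how any particular product works; the `DynamicIndex` class, its method names, and the file paths are all made up for illustration.

```python
import re
from collections import defaultdict


class DynamicIndex:
    """Toy inverted index that re-indexes a document on every save."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of document paths
        self._terms_by_path = {}           # path -> terms currently indexed

    @staticmethod
    def _tokenize(text):
        # Lowercase alphanumeric tokens; real systems parse many file types.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def remove(self, path):
        """Drop every posting for one document, if it was indexed."""
        for term in self._terms_by_path.pop(path, set()):
            self._postings[term].discard(path)
            if not self._postings[term]:
                del self._postings[term]

    def save(self, path, text):
        """Replace the old version of a document with the new one at once."""
        self.remove(path)
        terms = self._tokenize(text)
        self._terms_by_path[path] = terms
        for term in terms:
            self._postings[term].add(path)

    def search(self, query):
        """Return the sorted paths that contain every term in the query."""
        terms = self._tokenize(query)
        if not terms:
            return []
        hits = [self._postings.get(term, set()) for term in terms]
        return sorted(set.intersection(*hits))


# Hypothetical file-share paths, purely for illustration.
index = DynamicIndex()
index.save("//fileserver/sales/q3.ppt", "Q3 sales presentation")
index.save("//fileserver/hr/holidays.txt", "Holiday schedule")
before = index.search("sales presentation")

# Edit the document and push "save": the index reflects it immediately.
index.save("//fileserver/sales/q3.ppt", "Q3 revenue forecast")
after_old_terms = index.search("sales")
after_new_terms = index.search("revenue")
```

In a real deployment the `save` call would presumably be triggered by the file server itself, which is why the combined file/search server idea is attractive.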
Kendall: I agree wholeheartedly that enterprise search is vastly more interesting than whole-web searching. The greatest common factor of web content is the hyperlink (which no-one before Google really realized, or took advantage of). And that’s why your other point, that PageRank is not optimal for enterprise search, is right on. No matter how heterogeneous your document set, no matter how sloppy your content creation workflow or haphazard your categorization, chances are, when your content is all created internally, you’re going to be able to extract more meaningful metadata from your content than what the Google appliance can derive from its hyperlinks (which, in an enterprise context, can’t reasonably be considered a “vote”). I recently worked on a project evaluating enterprise search options for my company. The Google appliance made the list only for its CYA factor: we could easily convince internal critics that we’d bought a good search engine, because it was Google.
Ah yes, I
There is a significant high-end market for enterprise “search”, if you call it document management, classification, textual analysis, database, or some other technical term. For instance, Autonomy, Texis, etc.
I suppose this validates your points: higher-quality content and metadata beget higher-level tools.
Enterprise search, both on public sites and intranets, is a fascinating problem.
On public sites, the new big thing is faceted metadata search and browse, exposing database or metadata attributes in a usable way. Prof. Marti Hearst of UC Berkeley has done the research to show how it is successful in both Information Retrieval and User Experience — http://flamenco.berkeley.edu.
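For anyone who hasn't seen faceted search in action, the idea can be sketched in a few lines: filter the documents by the facet values the user has picked, then count the remaining values of each facet so the interface can show them as browsable refinements. This is only an illustration under my own assumptions, not the Flamenco implementation; the `facet_search` function, the field names, and the sample documents are invented.

```python
from collections import Counter


def facet_search(docs, selected):
    """Filter docs by the selected facet values and count what remains.

    docs     -- list of dicts; every key except "title" is treated as a facet
    selected -- dict of facet name -> required value
    """
    matches = [
        doc for doc in docs
        if all(doc.get(facet) == value for facet, value in selected.items())
    ]
    counts = {}
    for doc in matches:
        for facet, value in doc.items():
            if facet == "title":
                continue  # the title is display text, not a facet
            counts.setdefault(facet, Counter())[value] += 1
    return matches, counts


# Invented sample metadata, purely for illustration.
docs = [
    {"title": "Q3 sales deck", "department": "sales", "type": "presentation"},
    {"title": "Q3 forecast", "department": "sales", "type": "spreadsheet"},
    {"title": "Holiday schedule", "department": "hr", "type": "memo"},
]

matches, counts = facet_search(docs, {"department": "sales"})
titles = [doc["title"] for doc in matches]
```

After picking `department = sales`, the counts tell the interface there is one presentation and one spreadsheet left to choose from, which is the "browse" half of search and browse.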
On intranets, there is so little of the traditional research and so many simple navigational questions: timesheets, software downloads, holiday schedules, etc. Installing a giant KM system is overkill; save that for specific areas of research. On the other hand, because there’s so much more known about the internal content, you want to have control over the indexing and relevance ranking. It shouldn’t just be a black box. Link and authority ranking doesn’t work as well as it does on the WWW, because there is so little meaningful hypertext on intranets: most links are just navigation, not recommendations.
You (and Danny Sullivan) say “yawn”, but I’m a librarian and I think enterprise search is incredibly interesting.
Whoa, Macweek! There’s a flash from the past. It was one of my favorite magazines as a high school kid, along with Macintosh Today (which became defunct earlier than Macweek).
I remember lying on those qualification forms to qualify for the subscriptions.
I’m still waiting for the holographic storage promised by Macweek 🙂