Crawler Available, Will Work for Free | John Battelle's Search Blog

Crawler Available, Will Work for Free

January 08, 2004

From Boing Boing I learn that the Internet Archive is releasing its crawler for free under an LGPL license. Why is this news? As I’ve argued in the past, it’s not cheap or easy to innovate in the search space, but the search space desperately needs innovation. If key components like crawlers can be snapped into place relatively easily, new ideas heretofore unthinkable become possible. I also like the philosophy behind the crawler, which is named Heritrix: “Heritrix (sometimes spelled heretrix, or misspelled or missaid as heratrix/heritix/heretix/heratix) is an archaic word for inheritess. Since our crawler seeks to collect the digital artifacts of our culture (my emphasis/link) for the benefit of future researchers and generations, this name seemed apt.”

Way to go, Brewster!

