Sphere (Beta) - Blog Search Done Better | John Battelle's Search Blog

Sphere (Beta) – Blog Search Done Better

By - October 25, 2005


Spurred by early looks given to Jeremy and Om, I prodded Sphere CEO Tony Conrad to walk me through the beta of Sphere, his new blog search product (still in closed beta, but coming out in a month or two). It comes from a team out of Oddpost. (Oddpost was planning to launch this itself, but then got bought by Yahoo. Oddpost founder Toni Schneider remains an advisor to Sphere, though he now works at Yahoo. Other advisors include WordPress dude Matt Mullenweg and the irrepressible Mary Hodder.)

The financials: Tony has taken angel money from some impressive folks – notably KP partners Will Hearst and Kevin Compton, among others. I think perhaps the most important decision for Sphere, having seen it, is what to do with the money that folks will want to throw at it, but more on that later.

Sphere works better than any other blog search I’ve seen, plain and simple. Why? It combines several factors into a more robust ranking of blogs and posts. It pays attention to the ecology of relationships between blogs, for example, and it gives more weight to links that carry more authority. This should mean, for example, that when a Searchblog author goes off topic and rants about, say, Jet Blue, the rant will probably not rank as high for “Jet Blue” as a post by a reputable blogger who regularly writes about travel, even if that Searchblog author has a lot of high-PageRank links into his site.
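
To make the authority idea concrete, here is a minimal Python sketch. It is not Sphere's algorithm, which I haven't seen. The `Blog` class, the 0-to-1 authority scale, the `topic_share` numbers, and the blog named TravelDaily are all made up for illustration. The only assumption is that a blog's inbound link weight gets discounted by how much of its writing is actually about the topic being searched.

```python
from dataclasses import dataclass


@dataclass
class Blog:
    name: str
    # Authority of each blog linking in (hypothetical 0..1 scale).
    inbound_authority: list
    # Fraction of the blog's past posts on each topic (hypothetical).
    topic_share: dict


def topical_score(blog: Blog, topic: str) -> float:
    """Total inbound link authority, discounted by how on-topic the blog is."""
    link_weight = sum(blog.inbound_authority)
    return link_weight * blog.topic_share.get(topic, 0.0)


# Invented sample data: a heavily linked search blog and a modest travel blog.
searchblog = Blog("Searchblog", [0.9, 0.8, 0.9, 0.7], {"search": 0.95, "travel": 0.01})
travel_blog = Blog("TravelDaily", [0.5, 0.4], {"travel": 0.9})

ranked = sorted(
    [searchblog, travel_blog],
    key=lambda b: topical_score(b, "travel"),
    reverse=True,
)
print([b.name for b in ranked])
```

With these invented numbers the travel blog sorts first for "travel" despite having fewer and weaker inbound links, which is the behavior described above.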

Sphere also looks at metadata about a blog to inform its ranking – how often does the author post, how long are the posts, how many links on average does a post get? Sphere surfaces this information in its UI, and I have to say, it was something to see that each Searchblog post gets an average of 21 links. Cool!
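
For the curious, the three metadata signals mentioned here are simple to compute once you have a blog's posts in hand. The sketch below is my own illustration, not Sphere's code: the `blog_stats` function, the tuple layout, and the sample posts are all hypothetical.

```python
from datetime import date
from statistics import mean


def blog_stats(posts):
    """posts: list of (publish_date, word_count, inbound_link_count) tuples."""
    dates = [p[0] for p in posts]
    # Guard against a zero-day span when all posts share one date.
    span_days = max((max(dates) - min(dates)).days, 1)
    return {
        "posts_per_week": len(posts) / span_days * 7,
        "avg_words": mean(p[1] for p in posts),
        "avg_links_per_post": mean(p[2] for p in posts),
    }


# Invented sample data, not real Searchblog numbers.
sample = [
    (date(2005, 10, 1), 400, 18),
    (date(2005, 10, 8), 650, 24),
    (date(2005, 10, 15), 300, 21),
]
stats = blog_stats(sample)
print(stats)
```

How a ranking system would weigh these numbers against link authority is a separate question that this sketch does not try to answer.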


Lastly, Sphere uses semantic analysis of content to help determine rank. This helps defeat one kind of spam (blogs that simply say “Tickets tickets tickets” over and over again), but it does not defeat the kind Joel wrote about. That kind of spam, however, is defeated by the ecology of links – the system will not rank a blog well unless it is part of a larger ecosystem of linking.
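
Real semantic analysis is well beyond a blog post, but the “Tickets tickets tickets” case can be caught with something much cruder. The sketch below is a stand-in I wrote for illustration, not what Sphere does: it flags text in which a single word makes up too large a share of all words. The function names, the 0.3 threshold, and the three-word minimum are arbitrary assumptions.

```python
import re
from collections import Counter


def top_word_share(text: str) -> float:
    """Share of all words taken up by the single most frequent word."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    most_common_count = Counter(words).most_common(1)[0][1]
    return most_common_count / len(words)


def looks_like_keyword_spam(text: str, threshold: float = 0.3, min_words: int = 3) -> bool:
    """Flag text dominated by one repeated word (threshold is an arbitrary guess)."""
    words = re.findall(r"[a-z']+", text.lower())
    if len(words) < min_words:
        return False  # too short to judge
    return top_word_share(text) > threshold


print(looks_like_keyword_spam("Tickets tickets tickets"))
print(looks_like_keyword_spam(
    "Jet Blue delayed my flight for three hours and nobody at the gate could say why"
))
```

A check like this does nothing against spam written in natural-looking prose, which is why link-based signals still matter.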

In short, spam falls to the bottom of the rankings, and that’s a great thing. Tony forwarded me early research his company has done which shows his results are markedly better than those of any other blog search engine out there. From my initial use of the system, I can say they most certainly are. This is not to knock Technorati or Feedster, but I calls em as I sees em, and in any case, both of those companies have their own differentiation: tagging and RSS, to be specific.

The service gives you ways to filter your results, which I like very much: by date, by relevance, and by language. I also got to see a new version of the service which had some neat interface hacks along the lines of a time axis. More on that as it comes live. As for business models, Sphere also surfaces related content on the right side, and Conrad is thinking of negotiating deals with those publishers, as well as taking the time-honored sponsored links approach (they’d probably use Overture, as Toni is over at Yahoo…)

So back to the larger question. Great blog search is sure useful to folks like me, and to readers of Searchblog, who according to the survey you took a while back are a pretty bloggy crew (I’ll have a link to those results soon). But will normal folks want blog search, and if so, how will it be delivered (Yahoo, for example, thinks it’s within news results…)? Tony admitted that to scale his service he needs two things – many more machines (solved by money) and a lot more distribution (solved by a deal with, or a sale to, Yahoo, or Google, or AOL, or…). So what to do with his company? In short, should he pull a Flickr or Oddpost, and sell if the right offer comes? Or should he take some VC money and try to make it on his own, like Six Apart and Technorati?

Ah, to have such problems. Good luck Tony, the first look is promising.


8 thoughts on “Sphere (Beta) – Blog Search Done Better”

  1. Greg Linden says:

    Hi, John. Sounds interesting. Can you tell us how Sphere’s relevance rank compares to the “sorted by relevance” in Google Blog Search?

    Are they similar in quality? Or does it appear to you that Sphere’s relevance rank is noticeably better when you compare the two side-to-side?

  2. Chris Tolles says:

    Tony made an effort to walk around Silicon Valley and chat with *lots* of people prior to launching sphere, which I thought was a pretty decent thing to do — So, after Web 2.0, I interviewed Tony on NextStuff Now. I talked to Tony for about 45 minutes about Sphere, what he’s up to with the company, etc.

    Here’s the link to the page where the audio stream lives:

    http://www.webmasterradio.fm/episodes/index.php?showId=19

  3. Hey Greg – It just feels more relevant, in that I am finding what I think I want to find quicker. I think it’s because it’s not an automatic sort by date, or by sheer number of links.

  4. I wonder if this one honors robots.txt and robots meta tags. Google’s BlogSearch is iffy about that and was particularly bad about LiveJournal blogs when it first came into beta.

  5. Rob M says:

    No, it does not respect robots.txt! I am in the process of sending them an angry letter, and found links to this site from sphere.com. Their site is so lame they can’t even be bothered to put up any contact info or explanation of what they are, they just have links to some “reviews” like this one. I am not even running a blog! They are trying to crawl my picture gallery, which is NOT intended to be public. So very very lame.

  6. wally says:

    great blog hope the engine comes out soon ty!

  7. Eric T says:

    Hi,

    My website was just hit by the “Sphere Scout” beta spider and I’ve temporarily blocked it, not only in robots.txt but also in my web server security module.

    The reason being: the crawl was too intensive (not emptying its mouth and drawing a breath before taking the next bite). It also repeatedly pulled robots.txt, i.e. before every single mouthful.

  8. meeero says:

    a high pagerank doesn’t mean that the site is good too…
