Posting may be lighter than usual, but if you are at the conference, please come and say hello!
Announced about a month ago, Become.com is a shopping search engine that its creators claim is vastly superior to its competitors. These guys can put some wood behind that particular arrow: collectively, they were responsible for MySimon (now owned by Cnet) and Wisenut (now owned by LookSmart).
I spoke to Michael Yang, Become CEO, and Yeogirl Yun, the CTO. The founders have developed a new ranking technology – they call it “Affinity Index Ranking,” or AIR – which applies a unique combination of math and human editing. Before it does any math, Become puts people in the process of determining relevance for particular shopping-related search topics. A team of editors contextualizes pages based on how they relate to each other, then those pages are crawled, and Become’s AIR algorithm is applied.
I can’t really grok how AIR works, but this is from a draft release on AIR: “AIR identifies exceptional web pages by understanding the level of interconnection between valuable sites from within specific fields of interest. AIR evaluates a web page based on what other “knowledgeable” sites in that specific field say about the page, and also evaluates the page based on what the page says about other “knowledgeable” sites in the specific field.”
“Unlike Become.com’s AIR, Google’s PageRank estimates the popularity of a given web page by looking only at links into the page and doing so without any understanding of context. Become.com’s AIR, on the other hand, considers a site to be valuable if 1) it receives links from valuable sites within a similar topic of interest and 2) if it provides links to other valuable sites within a similar topic of interest (while minimizing links to off-topic sites).”
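Since AIR itself is unpublished, the best anyone outside the company can do is sketch the general idea in the release. What follows is not AIR. It is a toy, HITS-style iteration, written under my own assumptions, in which a page earns score from on-topic pages that link to it and from on-topic pages it links to, with the outbound part scaled down by the share of its links that wander off topic. The function name, the scoring formula, the page names and the topic labels are all hypothetical.

```python
from math import sqrt

def topical_scores(links, topic, iterations=50):
    """Toy mutual-reinforcement score restricted to same-topic links.

    links: dict mapping each page to the list of pages it links to.
    topic: dict mapping every page to a topic label (assumed hand-assigned).
    """
    pages = set(links) | {q for targets in links.values() for q in targets}

    # Collect, for every page, the same-topic pages that link to it.
    inbound = {p: [] for p in pages}
    for src, targets in links.items():
        for dst in targets:
            if topic[src] == topic[dst]:
                inbound[dst].append(src)

    score = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new = {}
        for p in pages:
            out = links.get(p, [])
            on_topic = [q for q in out if topic[q] == topic[p]]
            # Share of this page's links that stay on topic (0.0 if it links nowhere).
            focus = len(on_topic) / len(out) if out else 0.0
            received = sum(score[q] for q in inbound[p])
            given = focus * sum(score[q] for q in on_topic)
            new[p] = received + given
        # Rescale so the numbers do not blow up between rounds.
        norm = sqrt(sum(v * v for v in new.values())) or 1.0
        score = {p: v / norm for p, v in new.items()}
    return score

# Hypothetical mini-web: a, b, c are camera sites that cite each other;
# e is a camera page linking half off topic; s links only off topic.
links = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "e": ["a", "d"], "s": ["d"], "d": ["s"],
}
topic = {"a": "cameras", "b": "cameras", "c": "cameras",
         "e": "cameras", "s": "cameras", "d": "cars"}
scores = topical_scores(links, topic)
```

In this made-up graph the tightly interlinked camera sites come out on top, the page that splits its links between on-topic and off-topic targets lands well below them, and the two pages connected only across topics score nothing. A plain count of inbound links, with no notion of topic, would not make that distinction.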
I pressed Yang and Yun for more details – PageRank is published, after all. But they were mum, save adding that their inspiration was Applied Physics and Engineering Dynamics – two fields in which I must confess I am not very keen. I chided them a bit – calling your new algorithm AIR but not publishing it might just open you up to some jokes – but they do have the right to protect trade secrets.
The proof is in the use of the engine itself. It’s in a registration-based beta, so you’ll need to sign up. I used it, although cursorily, and I did like how it seems to understand the intent behind a shopping query – it’s not a product search engine like Froogle; instead, it seems to give you a lot of information that helps you in your process of buying. Yang added that a comparison feature is coming.
Yang and Yun hope to take Become and AIR across many vertical search areas – health, people, travel, etc. Given these guys’ backgrounds, it’s worth checking out.
Yippee, Joe Kraus has posted on the long tail Web 2.0 meme! And wouldn’t you know it, the framing is all about search.
Referring to Excite’s search stats, he notes:
In fact, the frequency of the average query was 1.2. That means if you wrote each of the millions of queries on a slip of paper, put them all in a fish bowl and grabbed one at random, there was a high likelihood that this query was asked only once during the day. Of ten-plus million queries a day, the average search was nearly unique.
The most interesting statistic however, was that while the top 10 searches were thousands of times more popular than the average search, these top-10 searches represented only 3% of our total volume. 97% of our traffic came from the “long tail” – queries asked a little over once a day.
You know the real reason Excite went out of business? We couldn’t figure out how to make money from 97% of our traffic. We couldn’t figure out how to make money from the long tail – from those queries asked only once a day.
Joe goes on to show how this applies to software and his new business, JotSpot.
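The two numbers in Joe's post, the average frequency per distinct query and the share of volume taken by the top searches, are easy to compute from any query log. Here is a minimal sketch; the function name and the toy log are made up for illustration and have nothing to do with Excite's actual data.

```python
from collections import Counter

def long_tail_stats(queries, top_n=10):
    """Return (mean frequency per distinct query, share of volume in the top_n queries)."""
    counts = Counter(queries)
    if not counts:
        raise ValueError("query log is empty")
    total = sum(counts.values())
    mean_frequency = total / len(counts)
    head_volume = sum(count for _, count in counts.most_common(top_n))
    return mean_frequency, head_volume / total

# Toy log: two popular queries plus 950 queries that each appear exactly once.
log = ["weather"] * 30 + ["news"] * 20 + [f"rare query {i}" for i in range(950)]
mean_freq, head_share = long_tail_stats(log, top_n=2)
```

In this toy log the two head queries are 20 to 30 times more popular than a typical query, yet they account for only 5% of the volume, and the mean frequency per distinct query sits just above 1. That is the same shape Joe describes, at a much smaller scale.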
USA Today rounds up the usual suspects – Danny, Jason, me – as well as some unusual ones in this overview of AdSense.
From this Cnet story:
Now Yahoo plans to launch its own advertising option for small publishers, a source familiar with the plan said. Like Google’s service, Yahoo’s self-serve product will display text ads deemed relevant to the content of specific Web pages. Advertisers pay only when a reader clicks on their ad. Yahoo and publishers will split the fees.
A couple of weeks ago I got to talk with Steve Levine, the founder of Transparensee, a neat technology that lives on top of structured search. The model is to sell it to other sites as a custom install. Think of it as a smart layer of search on top of database-driven applications like dating, home or car buying, or, in the example Steve took me through, Fodor’s.
Transparensee’s “Discovery Search Engine” seeks to address the “stupid computer” problems which plague most structured databases. You most likely have experienced some variant of this: you put in a set of parameters meant to find just what you are looking for – for example, on Fodor’s, you want French bistros in Chelsea priced at $35 with a food rating of 20 or above – and you get no results, or only one or two. You have a sneaking suspicion that the results are missing an entire set of possibilities which are “close enough” to what you want, but you’ve been limited by the parameters you chose – if you open it up too much, you get a bunch of stuff you don’t want. What to do?
Transparensee uses “fuzzy search” algorithms to scour a database and offer on-the-fly weighting based on any parameter you choose. Presto, what you want to see is at hand. It’s hard to describe, but it’s an “aha” when you see it in action. For example, there may be the perfect French bistro for you, but because it’s one block away in another section of town, it does not get found. With Transparensee, you’d see it at the top of the list, because it matches on so many of the other weights.
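To make the bistro example concrete, here is a rough sketch of the general idea: score every record on how close it comes to each parameter, weight the parameters, and sort, rather than throwing out anything that misses on one field. This is not Transparensee's algorithm, which I have not seen. The restaurants, field names, tolerances and weights below are all invented for illustration.

```python
def closeness(value, target, tolerance):
    """1.0 for an exact match, falling linearly to 0.0 at `tolerance` away."""
    return max(0.0, 1.0 - abs(value - target) / tolerance)

def fuzzy_score(place, wanted, weights):
    """Weighted average of per-parameter closeness, between 0.0 and 1.0."""
    parts = {
        "cuisine": 1.0 if place["cuisine"] == wanted["cuisine"] else 0.0,
        # blocks_away is the distance from the wanted neighborhood (0 = inside it).
        "location": closeness(place["blocks_away"], 0, 10),
        "price": closeness(place["price"], wanted["price"], 20),
        "food": 1.0 if place["food_rating"] >= wanted["min_food_rating"]
                else closeness(place["food_rating"], wanted["min_food_rating"], 10),
    }
    return sum(weights[k] * parts[k] for k in parts) / sum(weights.values())

# Hypothetical listings.
places = [
    {"name": "Bistro A", "cuisine": "French", "blocks_away": 1, "price": 35, "food_rating": 24},
    {"name": "Bistro B", "cuisine": "French", "blocks_away": 0, "price": 60, "food_rating": 18},
    {"name": "Diner C", "cuisine": "American", "blocks_away": 0, "price": 35, "food_rating": 21},
]
wanted = {"cuisine": "French", "price": 35, "min_food_rating": 20}
weights = {"cuisine": 3, "location": 1, "price": 2, "food": 2}

# The blunt, structured query: every condition must hold exactly.
strict = [p for p in places
          if p["cuisine"] == wanted["cuisine"] and p["blocks_away"] == 0
          and p["price"] <= wanted["price"]
          and p["food_rating"] >= wanted["min_food_rating"]]

# The fuzzy version: rank everything by weighted closeness.
ranked = sorted(places, key=lambda p: fuzzy_score(p, wanted, weights), reverse=True)
```

The strict filter comes back empty, because each listing misses on exactly one thing. The weighted ranking puts the bistro that sits one block outside the neighborhood first, since it matches on everything else. Changing the weights on the fly reorders the list without rerunning a different query.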
This is powerful stuff when you think about it, and it solves a core database search issue, at least for me: you know there is the right answer for the query you are entering, but damned if it isn’t escaping you, due to the blunt nature of structured search. Think of such a tool for Expedia, or Lexis Nexis, for example. No, I can’t point you to examples quite yet, but the site has some that you can peruse via PDF files.
It sort of reminds me of collaborative filtering, but for more types of datasets. After all, it’s hard to imagine a collaborative filtering application for home buying – “people who bought this home, also bought these homes…”.
I asked Steve what his plans were for the technology, and he said “to prove it out with as many clients as possible.” Is he open to Transparensee finding a home at one of the majors, or does he want to become the Switzerland of structured search? Too early to tell, Levine said. He’s still in early startup mode. But this looks promising, and I hope the idea spreads.
This just in… you can now customize Google News.
I have a riff brewing – but it ain’t quite fermented – about ads and tagging. Some of this has been spurred on by conversations with folks like Andy at Waxy. There’s something there, and recent developments, like comments on ads, are starting to point that way. Adding to the meme, Jeff Jarvis, who has been my posting partner on the whole PDA/Sell Side advertising concept, is already riffing on ads and tags. This is a brewing area, more to come…
Today’s Times has a longish piece on search titled “Search Engines Build a Better Mousetrap.” The article reviews alternatives to Google.
I am quoted in it, but the reporter misheard one key detail: I said “millions” not “billions” in the quote below…
John Battelle, who maintains a Web log about search technology (Searchblog, at battellemedia.com), said innovations like “Block View” showed how dynamically the search companies were taking advantage of new technologies – and new economies.
“In 1997 you would have had to spend tens of billions, and it wouldn’t have made any sense,” Mr. Battelle said. “Now, you can strap a camera and G.P.S. on a computer and drive down the street taking pictures. It’s a neat idea, and it didn’t cost the farm to try. Now imagine that across the whole Web – that’s what’s happening.”
Even back in the late 90s, it would have been tough to spend billions on a feature like Block View! Also, the quote is … condensed. My point is simply this: Innovations like Block View are now sprouting up everywhere, throughout the web, not just in search. Why? Because they can: the ecosystem now supports it. That’s the Web 2.0 meme in action…
Along those lines, I did finally get a response on the issue of the “keyword hint” feature that had apparently been beefed up for publishers like Boing Boing (see the post here). Here is Google’s “official response” to my query, which was essentially this: is this a new feature, previously unannounced, or are you selling something that doesn’t exist? Why is a Google rep cold calling me with this feature, and promising it to me as something to draw me into using AdSense?
“The keyword tool is a limited test and only available to premium publishers at this point. We continue to work on making more tools available for all of our publishers.”
Sigh. Transparency, guys. Upfront, and on the backend. It works, really, I swear. And admitting mistakes. Two things that are hard to do, but pay huge dividends.
Fact is, the Google rep who called me didn’t tell me this is a “limited test only available to premium publishers.” Nope, he made it seem like it was a normal feature, ready for me if only I signed up. What did he think I was going to do with the information that Google had new tools that might make AdSense work better? Keep it to myself? I’m a ***publisher*** after all.
Anyway. End of rant. It’s hard being number one. But it’s easier if you are in conversation with those that put you there.
Update: I hadn’t noticed, but a reader tipped me – Google linked to me in a recent corporate blog post. That’s a very good sign – it was one of the things that seemed off about their blog – that they never pointed to anyone else. Cool.