
About John Battelle

You are browsing the December 2004 category

December 23, 2004

Tech Review: New Form of Advertising

I am breaking my holiday silence as a few readers have reminded me that I forgot to post a link to a column I wrote up for Technology Review magazine. It's basically a rewrite of my Sell Side Advertising post, but to make the concept a bit more approachable (or perhaps to ruin it) I changed the name to Publisher Driven Advertising. In any case, as always I owe a debt to Ross Mayfield and many others for the ideas contained within. And it's my hope that in 2005 we can take this idea and see where it might run.

December 22, 2004

A Look Ahead

Here we are again, the end of the year. Last year I did pretty well with my prognostications, mainly because I chose carefully. This time, I'm feeling a bit more reckless. A year from now, I am sure I'll be scratching my head - what was I thinking? - but then again, that's not such a bad place to be.

So in no particular order, here are some things that I believe have a reasonable chance of occurring in 2005 with regard to the intersection of media, technology, and search.

1. We will have a goat rodeo of sorts in the blogging/micropublishing/RSS world as commercial interests push into what many consider a "pure medium." I've seen this movie before, and it ends OK. But it's important that the debate be full throated, and so far it looks to be shaping up that way. I'm already seeing these forces at work over at Boing Boing, and I am sure they will continue. We'll all work on figuring out ways to stick to our principles and get paid at the same time. However, I expect that things might get more contentious before they get better, and 2005 may be a more fractious year in the blogosphere as we evolve through this process.

2. Along those lines, things will not go as swimmingly as we'd like with regard to "monetization." As the majors get into the space and start throwing around their weight and lucre, some folks will make bad decisions, and others will freeze and make no decisions at all. It will get harder to innovate before it gets easier. We'll all be surprised by the lack of what we consider "progress" in the RSS/Blogging world, and expectations of major publishing revenues will not materialize as quickly as perhaps we think they should. However, we'll in fact be making huge strides in understanding the path forward; it just won't seem like it. By the end of the year, the world will begin to realize that "blogs" are in fact an extraordinarily heterogeneous ecosystem composed of scores, if not hundreds, of different "types" of sites.

3. There will be two to five major new sites that emerge from "nowhere" to become major cultural influencers along the lines of the political bloggers of 2004. One of them will be sold to a major publisher/aggregator for what seems like a large sum of money, driving the abovementioned #2 and #1.

4. Meanwhile, the long tail will become the talk of the "old line" media world. To capture some of that value, we'll see a slew of deals and new publishing projects from the established brands that seek to capture the idea of community journalism, affiliate commerce sales, and collaborative content creation.

5. Google will do something major with Blogger. I really have no idea what, but it's overdue. Six Apart will grow quickly but face a crisis in its implementation as its core users demand more features that are "unbloglike" like customer databases and robust publishing support tools. This (and other things) may drive Six Apart or one of its competitors into the arms of Yahoo or AOL or even - gasp - Quark or Adobe or Macromedia.

6. Ask will continue to consolidate traffic by buying smaller search sites.

7. Yahoo and Google will both test systems that combine local merchant inventory information with search, so that merchants can use search as a direct sales channel. By the end of the year, there will be no question that the search companies are in direct competition with the ecommerce companies, but it won't matter - there's room for them all. Paul Ford will continue to get droves of readers to his related, and very prescient, three year old post on how Google takes over the world.

8. Microsoft will lose search share before they gain it back later in the year when the integration of MSN search starts to scale with new versions of Office and IE. Net net, however, MSFT will gain in total search sessions from last year, and its technology will get much, much better.

9. Firefox will near 15% of total browser share. Firefox faithful will wonder why it's not much much higher. But MSFT will release a very good upgrade of IE, see #8.

10. A third party platform player with major economies of scale (i.e., eBay or Amazon) will release a search related innovation that blows everyone's mind, and has everyone buzzing about how it redefines what's possible in search.

11. The China question will become a critical issue to the search community. Defining the China question will in itself be a major task of 2005. How do search companies go in without being "evil"? Is the tradeoff worth it?

12. By the end of the year, there will be no question that search is a media business, and that the major players in search are major players in the content business.

13. Something major will finally happen at Tivo. We all hope that it's a sale to Apple, but if it is a sale, it will more likely be to Comcast or DirecTv.

14. All year, Apple will be rumored to launch a video iPod, but it won't - it's still too early. By the end of 2005, we will just be starting to see traction in the video over IP market and its connection to search. Google will introduce Video search at some point in 05, but it will stay in Labs.

15. Mobile will finally be plugged into the web in a way that makes sense for the average user and a major mobile innovation - the kind that makes us all say - Jeez that was obvious - will occur. At the core of this innovation will be the concept of search. The outlines of such an innovation: it'll be a way for mobile users to gather the unstructured data they leverage every day while talking on the phone and make it useful to their personal web (including email and RSS, in particular). And it will be a business that looks and feels like a Web 2.0 business - leveraging iterative web development practices, open APIs, and innovation in assembly - that makes the leap. (More on this when I start posting again).

16. Perhaps most recklessly...I will finish my book. The reviews will be mixed, as my attempt to satisfy both the exacting audience of Searchbloggers and the more general audience of a major trade hardcover may fall flat. Many will say I tried to do too much, others that I didn't do nearly enough (how's that for airing my deepest fears in public?!). However, I'll be happy with the effort, and the book will do OK, thanks mainly to the support of this community. So, ahead of time, thanks for your support this past year. I learned more from this process than I ever thought possible, and I owe it all to you, who grace my site with your time and input.

17. Lastly, I will be involved in starting a new business in the field of media and technology. It will start very slowly, and I'll screw up as much as I possibly can in the early stages, before imposing it on the rest of the world. Hopefully, you'll all be there to keep me honest as I try to figure out a few ideas I've been simmering for the past year or so.

Unless there's a major story which breaks in the next week or so, I'm signing off for the year, and look forward to resuming posting in 2005. Have a wonderful holiday, and a prosperous, healthy New Year. Oh, and please add your thoughts on 2005 below - I know I missed a lot....

December 20, 2004

A Look Back

One year ago, I made a bunch of predictions. Today I call myself out and see how I did. Tomorrow (or maybe a little later in the week) I'll post some predictions for 2005. That will pretty much round out my posts for 2004 - during the holidays I'm focusing on writing the book. But I couldn't resist this little exercise first.

So...How did I do? Not bad, all told. Possibly because my predictions were facile, but still, we all have our own bars, don't we? I managed to limbo under mine pretty well. So, to the specifics.

My first prediction: "The Web becomes a platform (again)." I think with the buzz around desktop search and the Google OS, the strong performance of the major platform players like Amazon, eBay, and Yahoo, the happy and well attended buzz around Web 2.0, this seems to be coming to fruition.

My second: "Blog ecologies of like-minded folks will garner increasing cultural and social power." With the rise of political bloggers, the rising communities of Scoble, Zawodny, and others, and Time Magazine even considering naming "bloggers" as their person of the year (and the dictionary folks making "blog" their word of the year), I'd say this one was pretty much dead on.

Third prediction: "The Dutch auction/OpenIPO model will be validated." Check.

Fourth: "We'll see a major IPO ($100 million+ sold to public) in search that isn't Google." Well, I sort of got this one, and sort of did not. We had Tom Online in March, and Navteq in October. Whether Navteq is search related is debatable, but certainly search drove its valuation. The company raised $880 million. We also had Marchex, but it didn't make my $100 million in proceeds limit (though it did very well...). Perhaps I could argue that Shopping.com, which raised $137 million, is search related (they sure would not have gone out without search...) Also of note: there was Websidestory, $42 million, and Gurunet, which raised $8 million. I probably missed a couple here...

Fifth prediction: "There will be a "Tylenol Scare" in search...There will be much harrumphing, then everyone will calm down, learn from the incident, and move on. " Yep, that about tells the Gmail story.

Sixth: "Once a month, a new search player will be crowned in the press as 'the next Google.'" This one was way too easy to predict. More like once a week...the examples are too profuse to mention...

Seventh: "Second generation blog/RSS aggregation sites will come close to combining directory functions with LinkedIn- and recommendation-engine-like features - think Amazon+Yahoo for the blogosphere...." Yup. Bloglines, Rojo and others did just that.

Eighth: " ...at about the same time Yahoo, AOL, MSN, and Google will build or buy second-generation blog/RSS aggregation sites." Well, I was mostly wrong here. But Yahoo is sure doing some interesting things with RSS, and MSN did introduce MSN Spaces and got very serious about blogging, among other things.

Ninth: "The world will realize the importance of our digital artifacts, and takes further steps to preserve them." I think Google Print is a step towards this (once we realize that books are worth saving, we might also realize that websites are as well), and the Internet Archive did a world of good this year, but we're still in the digital dark ages, for the most part. This was wishful thinking on my part.

Tenth: "Cable companies will control more than 75% of the PVR market, but a backlash/new TiVo-like device (possibly from Apple) will develop by the end of the year." This is on its way to becoming true, but the cable companies have been slower than I expected in rolling out PVR services. Meantime, there was a new launch, Akimbo, but alas, no Apple moves, yet.

Eleventh: "Microsoft will have a surprise hit product that has nothing to do with Office or Longhorn, causing a minor fire drill in Redmond." I think Halo 2 qualifies as a major hit that has nothing to do with Office or Longhorn...

And finally: "I'll finish my book, try to stop writing this blog, but find it impossible to do so. Meanwhile, a deeply cool, once-in-a-decade-magazine-I wish-I-had-thought-of will launch." Well, I'm close to finishing, but I'd say I missed this one by two or three chapters. I sure didn't stop blogging, and as far as the magazine launch - I'm sort of happy to report, I was wrong. There were plenty of new magazines, but none that made me wish for my old job back. As for cool new magazines that look promising but are not in my personal sweet spot, I was really pleased to see the launch of Chow, from former TIS editor Jane Goldman, and Breathe, from pals Lisa Haines and Deanna Brown. But most of the new magazines were market-driven pap. As for that deeply cool new magazine which I wished I thought of, I've got my money on Make, launching early next year. At least I'm involved in a small way!

Well, 2005 promises to be a very interesting year. Happy Holidays, everyone!

"We'll figure out how to monetize it later"

That's a quote in today's USA Today from Lars Perkins, product manager for Picasa, Google's recently acquired photo software. It's a sentiment that rests deep in Google DNA - make the product first, figure out the business case later. It worked for the original Google service, and it's clearly guiding Print, Orkut, Froogle, and News (though some of those of course are supported by advertising). I don't have the answer to this question, but it's worth raising - how long can this approach to the world stand? It's certainly a wonderful luxury to have - make a useful product, then figure out if/how it might make money. It reminds me of my preferred approach to publishing - make great editorial, then figure out how to sell it later. The only difference - with editorial, there was always a model to fall back on - advertising and subscription. As I've pointed out before, I'm not so sure that advertising alone can foot the bill for all of Google's innovations.

Yahoo Releases Update to Index

Last year Google released a major update to its index right before Christmas, and all hell broke loose. This year Yahoo did the same, but the reaction has been more muted - even positive, in some cases. It's always hard to draw conclusions from index updates, but this one seems to be moving a lot of SEO links around the SERPs...

Update: According to SEORoundtable, an update is happening over at Google as well, though opinions are mixed as to whether it's major, or simply ongoing maintenance...

The End of Friends

Joi Ito has reached the limit of Orkut friendship - the service cut him off at 1000.

Pell, Geico, and Paid Search

Dave Pell recounts his excursion into buying AdWords based on the Geico keywords, and summarizes some new marketing realities:

My experience does clearly point to the fact that we have opened up a whole new marketing frontier that will require ad buyers and small businesses to be a lot more creative with their marketing plans. It's not as simple as coming up with the most obvious search terms. As the market grows, those will become prohibitively expensive for most buyers. Ad buyers will need to predict what their potential buyers might be interested in and then try to get in front of them as they're on the way to finding it. If you want to get in front of a few thousand potential orthodontics patients, you might have to figure out something more creative than the words teeth and braces. And in many cases, your marketing plan may only last for a few days (or even a few hours) at which time you'll need to add new search terms to the mix.

Google Desktop Security: Welcome to the Software Biz

As I noted when GDS first came out, once you start providing serious PC-based software and integrate it with an internet service, you can become a target of hackers. The Times today writes about the security flaw initially discovered by Rice researchers. Google has already posted an updated version of GDS.

Cindy - An Appreciation

Now that the news is out that Cindy McCaffrey is leaving Google, I can post an appreciation. I first met Cindy when I was a cub reporter for MacWeek in 1987. She handled PR for a portion of Apple, and it was my job to try to get anything I could on the company, no matter what. It was Cindy who would call me, exasperated, when I acquired a pre-release version of Apple's new Mac IIci and published a photo of its motherboard on the front page.

And it was Cindy who campaigned internally on my behalf when I came up with the idea, 15 years later, of writing a book that featured search as its subject and Google as a major narrative actor. With Cindy at the helm of communications and marketing, Google has enjoyed perhaps an unprecedented run of good press in modern corporate history. (Cindy also sidestepped the marketing excesses of the bubble era, a decision that was not easy to take in 1999-2000). She's been at Google since the middle of 1999, and certainly deserves the break she plans to take (I believe sailing for a few weeks with her husband is the first item on her agenda). She told me recently that she's looking forward to reconnecting with family, friends, and "cooking her own dinners." I wish her well, and expect it won't be long before we hear from her again. She's too good - and too restless - to retire forever.

Comment Spam and Search

Anyone with a blog has come across the bane of comment spam. Recently it's gotten to near epidemic proportions for folks who use Movable Type, as I do (I think this is because MT users tend to have high PageRank sites, but that's just a guess).

Why do comment spammers do what they do? Simple: for the ranking juice. A spammer's link inserted into the comment field confers this site's authority, such as it is, to the spammer's target site. Jeremy Zawodny, who is working with the Search team over at Yahoo, posts an interesting commentary on this problem, and suggests a solution.

If you assume the following:

1. 80% of blogs are hosted by or produced on one of the more popular blogging platforms
2. 80% of people don't significantly tweak the default templates available in their blogging software
3. those people are the least likely to be actively fighting spam and, as a result, have more spam than the 20% of blogs where the owner is more defensive

Then a partial solution is fairly clear. I've heard and seen others discuss it over the past few months. The search engines need to be smarter about reading and indexing content. ...the software needs to be able to recognize the difference between links produced by the blog owner(s) and those contributed by readers and spambots. Once you can identify the difference between those two types of links, you simply stop using the second type of link when calculating rank. Sure, you can still count them for the purpose of providing link counts--just don't factor them into the ranking.

Jeremy's suggestion has elicited a lot of commentary. As one of the 20% who actively fight comment spam, I'm hoping that some kind of solution is in the works. But I'm not sure this is it - often I appreciate the links that are left in my comment fields, and I wouldn't want to discourage anyone who is well intentioned from continuing the practice. On the other hand, comment spam is a major problem, and it might be worth losing a bit of juice to save the ecosystem from the parasites.
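
To make the proposal concrete, here is a minimal sketch in Python of the split Jeremy describes: every link still shows up in the link count, but only links placed by the blog owner are allowed to vote in the ranking. The Link record, the in_comments flag, and the tally function are hypothetical names invented for illustration. How an engine would actually detect the comment area of a blog template is the hard part, and nothing here attempts it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    target: str        # URL the link points to
    in_comments: bool  # assumption: the crawler already knows where the link sits


def tally(links):
    """Return (link_counts, ranking_votes) keyed by target URL.

    link_counts includes every link; ranking_votes ignores links
    that were left in the reader-comment area.
    """
    link_counts = Counter(link.target for link in links)
    ranking_votes = Counter(
        link.target for link in links if not link.in_comments
    )
    return link_counts, ranking_votes


# Illustrative data only.
links = [
    Link("example.org/essay", in_comments=False),
    Link("example.org/essay", in_comments=True),
    Link("spam.example/pills", in_comments=True),
    Link("spam.example/pills", in_comments=True),
]
counts, votes = tally(links)
```

Note the cost I worry about above: the well-intentioned comment link to example.org/essay loses its vote along with the spam.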

December 17, 2004

WayBack Machine

Guess what year I'm writing about (for the book) today...

Google1998

Froogle's Product Reviews

As many have noted, Froogle has begun to aggregate snippets of product reviews from the web at large. This marks the Google News-ification of Froogle. When will the service jump the shark and start making money on vigs from sales? Or will it? Will publishers like Cnet revolt? Wait, here's Cnet coverage...

The service, which is similar to the company's aggregated site for news around the Web, highlights Google's ambition to bring more content to its own site with the use of its "spidering" technology.

Huh. "Bring more content." That's an interesting way to put it. Indeed.

Job Search, Indeed

Indeed.com is a new service which scrapes jobs from scores of services and then wraps a familiar interface around the entire thing - a search interface. I like the ability to refine searches and the ability to search by region. It's fun to play around with. Needs to deal with the duplication issue, as this search for "blogger" shows, but it supports RSS and you can sort by date. Neat. The site's founders have a (nascent) blog as well.

Update: Paul Forster, one of the founders, tells me of the business model the site will pursue: "We'll start with a contextual advertising system similar to Google Adwords or Overture. We won't be accepting payments for improved job positioning in our main search results and our paid ads will be clearly distinguished from our main search results" And reader Otis G. tells me that Indeed is based on Lucene. Cool!

December 16, 2004

On Desktop Search

Gary/Search Engine Watch has posted a review of Ask's new desktop search tool, and reading it reminded me of a conversation I had with Ask's Jim Lanzone earlier today. Jim was a bit crabby - after all, Ask bought Tukaroo a long time ago and deserves credit for seeing the importance of the space way back then. However, as he pointed out, desktop search is simply one arrow that has to be in every serious search player's quiver, so it's not that big a deal that everyone hustled to get on board. He has a point. The torrent of news has all of us atwitter about desktop search, but in the end, it's simply another necessary building block toward good search services.

And, by the way, I got pinged by the folks at Lycos, who want to remind us all that they were in this game really early....with a HotBot desktop search tool.

Get Yer Traffic Reports Here...

Yahoo's been busy: today it also announced that it is overlaying traffic data on top of its Maps product. The picture shows the route from my house to Yahoo, with traffic highlights.

This is a capability that John Hanke showed with Keyhole at Web 2.0, so I expect we'll see something similar from Google shortly...Release in extended entry.

YAHOO! BECOMES FIRST ONLINE MAPS PROVIDER TO OFFER REAL-TIME TRAFFIC SOLUTION

Sunnyvale, Calif. – December 16, 2004 – Yahoo! Inc., a leading global Internet company, today announced the launch of a service that lets consumers view live local traffic information on their online maps and driving directions. Yahoo! is the first online site to provide speed conditions and dynamic traffic information nationwide.

The new mapping feature will initially be integrated in Yahoo! Search, Yahoo! Local, and Yahoo! Maps — and is available to consumers at http://maps.yahoo.com. In the coming months, Yahoo! will continue to develop additional enhancements that increase the functionality and power of its mapping and traffic technology to deliver on its vision of delivering the best local search experience online.

Online map usage has grown 60 percent during the last two years (Nielsen NetRatings, October 2004), and products are becoming more advanced with technology like Yahoo!’s innovative SmartView service. In line with this increase in consumer demand online, the new Yahoo! traffic solution provides users with the best way to reach their destinations by incorporating reliable information for over 70 metropolitan areas in the U.S.

With this new mapping service, consumers can now

Customize their Yahoo! Maps display and layer traffic information on top of driving directions or maps
Scroll over the traffic icons - an extension of Yahoo!’s SmartView technology - to find out the road conditions on their journey ahead including the time the incident occurred and the estimated time that it will be resolved
Find traffic accident reports and road construction information in over 70 metropolitan areas
Find real-time driving speed data in over 20 top metropolitan areas
Pan and zoom to find the traffic information most relevant to them
Provide the best way to reach their destinations by incorporating reliable information from embedded road sensors, traffic cameras, police scanners/reports, and traffic helicopters.


For example, if a consumer uses Yahoo! Local to search for a restaurant, he/she can now click on the map or driving directions to see the traffic conditions and determine how long it will take to reach the destination. Or, if a consumer types the phrase Atlanta Traffic into Yahoo! Search, he/she will receive a link on top of the page that would enable him/her to see a real-time map of the traffic conditions in Atlanta.

“Yahoo! continues to improve our users’ online experiences by delivering new and innovative local services,” said Paul Levine, general manager of Yahoo!’s Local Services. “We’re pleased to be the first major mapping site to offer a robust, nationwide traffic service. This is an exciting extension of our local strategy and another example of how we can leverage Yahoo!’s content and technology to create a truly differentiated search experience.”


As an extension of Yahoo!’s SmartView product, this real-time traffic service brings content into a mapping interface to offer users the ability to visualize traffic information in a whole new way. With current integration with Yahoo! Search and Yahoo! Local, this mapping product will offer deeper integration across the network in the coming months.

About Yahoo!
Yahoo! Inc. is a leading provider of comprehensive online products and services to consumers and businesses worldwide. Yahoo! is the No. 1 Internet brand globally and the most trafficked Internet destination worldwide. Headquartered in Sunnyvale, Calif., Yahoo!'s global network includes 25 world properties and is available in 13 languages.

Yahoo Video Search, Second Day

So word is out on Yahoo's video search, and many have noted its similarity to previous incarnations from Yahoo acquisitions AlltheWeb and AltaVista. A post on Yahoo's Search Blog clarifies that those sites now have Yahoo's video search improvements rolled in, so the new product is in fact an improved version. The original post on the video beta release is here.

What I find interesting about this new product is the extensions Yahoo is proposing for RSS - "Media RSS." With it, Yahoo is attempting to address a major problem with indexing video - that of metadata, or more directly, the lack thereof. From Jeremy's post:

----

As Marc Canter has noticed, we could all benefit from a bit more metadata to go with this growing pool of media. Who published this video? What formats are available? How is it licensed?

From our point of view, it means we can build a much better video search. You might want to filter results based on some of that metadata (title, actor, file format, etc). But it also opens up so many more doors. For example, your news aggregator might use your preferences to figure out which videos to download: Windows Media or Quicktime? High bandwidth or low? Heck, we can see entirely new rich media aggregators and tools being built--something like the popular iPodder currently used for podcasting. And when they are, this metadata becomes all the more important.

To get this started, we're suggesting an optional set of metadata extensions that we've been calling "Media RSS" (yes, we're so creative with names). They're aimed at publishers who'd like to provide a rich set of metadata about the media being published. Our video search system will also support these Media RSS extensions in addition to video enclosures (see the FAQ and the draft spec).
-----
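
Jeremy's aggregator example (Windows Media or Quicktime? High bandwidth or low?) is easy to picture in code. What follows is only a rough sketch in Python, under stated assumptions: the dictionary keys (url, format, kbps), the pick_version function, and the example.com URLs are all made up for illustration and are not taken from the Media RSS draft spec, which I have not tried to reproduce here.

```python
def pick_version(versions, preferred_formats, max_kbps):
    """Choose which version of a video to download.

    versions          -- list of dicts with hypothetical keys url, format, kbps
    preferred_formats -- formats the user accepts, most preferred first
    max_kbps          -- highest bitrate the user's connection should fetch
    Returns the chosen dict, or None if nothing matches.
    """
    candidates = [
        v for v in versions
        if v["format"] in preferred_formats and v["kbps"] <= max_kbps
    ]
    if not candidates:
        return None
    # Most preferred format wins; within a format, take the highest bitrate.
    return min(
        candidates,
        key=lambda v: (preferred_formats.index(v["format"]), -v["kbps"]),
    )


# Illustrative data only.
versions = [
    {"url": "http://example.com/clip-low.wmv", "format": "wmv", "kbps": 100},
    {"url": "http://example.com/clip-high.wmv", "format": "wmv", "kbps": 700},
    {"url": "http://example.com/clip-high.mov", "format": "mov", "kbps": 700},
]
choice = pick_version(versions, preferred_formats=["mov", "wmv"], max_kbps=800)
```

The point is that none of this works unless the publisher supplies the format and bandwidth information up front, which is the gap the proposed extensions are meant to fill.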

Yahoo is using its power as a major distribution player to feed what it hopes will be a major play in video distribution. It may not seem like a big deal now, but as the web increasingly becomes a native environment for video, it may well prove to be one of the most forward looking things the company has done this year. And by the way, it's always fun to see what the top search is for "dancing monkey." Hey, that looks like Steve Ballmer....

Search Paper Fun: Most Cited

I sent a query to Lee Giles, the guru at Penn State behind CiteSeer (with Steve Lawrence, who is now at Google) asking him which search-related papers are the most cited. I was struck by the near parity between Page and Brin's original paper on Google and Jon Kleinberg's paper on Hubs and Authorities. Giles did a bit of fiddling with Google Scholar and responded:

For web related work these are well cited in the Google Scholar using the query “web”:

 [PDF] The Semantic Web
T Berners-Lee, J Hendler, O Lassila - View as HTML - Cited by 1347
... May 17, 2001. The Semantic Web. A new form of Web content that is meaningful to
computers will unleash a revolution of new possibilities. ... Web: A Research Agenda. ...
Scientific American, 2001 - www-personal.si.umich.edu

 [PDF] The anatomy of a large-scale hypertextual Web search engine
S Brin, L Page - View as HTML - Cited by 1087
Abstract In this paper, we present Google, a prototype of a large-scale search
engine which makes heavy use of the structure present in hypertext. Google ...
Computer Networks and ISDN Systems, 1998 - kulturinformatik.uni-lueneburg.de - firstrate.co.nz - net.cs.pku.edu.cn - scalab.uc3m.es - all 69 versions   

However, this one can’t be ignored:

 [PDF] Authoritative sources in a hyperlinked environment
J Kleinberg… - Cited by 1059
Abstract. The network structure of a hyperlinked environment can be a rich
source of information about the content of the environment, provided we ...
Journal of the ACM, 1999 - portal.acm.org - nan.dhs.org - cs.cmu.edu - mathe.tu-freiberg.de - all 73 versions

 This book is the first to discuss the web in any detail:

 [PS] Modern Information Retrieval
R Baeza-Yates, B Ribeiro-Neto, R Baeza-Yates - View as HTML - Cited by 1198
Page 1. Modern Information Retrieval. Ricardo Baeza-Yates. Berthier Ribeiro-Neto.
ACM Press New York. ... 1.1.2 Information Retrieval at the Center of the Stage . . ...
Addison Wesley, 1999 - dcc.ufmg.br - sunsite.dcc.uchile.cl - sims.berkeley.edu - portal.acm.org - all 7 versions

All worthy reads!

December 15, 2004

Yahoo Video Search

Sigh. Again, I find myself in this odd space. I'm under embargo on this information (Yahoo briefed me and others), but a reader just sent me this link out of the blue (my readers are so damn dialed in, first Google Library, now this...). So you guys go look for yourselves, please comment here as to what you think, and I'll write about this on Thursday, as I have holiday stuff to do tonight and can't write it up now. Yahoo Video Search.

Google Wins Key Portion of Geico Case

Reuters just came out with this:

ALEXANDRIA (Reuters) - A federal judge on Wednesday dismissed a key element of insurer GEICO's trademark infringement case against online search engine Google Inc.

U.S. District Judge Leonie Brinkema ruled that there was not enough evidence of trademark violation to bar Google from displaying rival insurers when computer users search the word "GEICO."

From the AP:

Geico claimed that Google's AdWords program, which displays the rival ads under a "Sponsored Links" heading next to a user's search results, causes confusion for consumers and illegally exploits Geico's investment of hundreds of millions of dollars in its brand.

"There is no evidence that that activity alone causes confusion," Brinkema said, in granting Google's motion for summary judgment on that issue.

But Brinkema said the case would continue to move forward on one remaining issue, whether ads that pop up and actually use Geico in their text violate trademark law.

More as this develops...

PS - Watch GOOG. It was down before the news broke but is trending back up...

A Blitzkrieg of Zeitgeistian Data

AOL's year end list, Lycos' year end list, Yahoo's Holiday shopping list....

Rosenberg Chimes In

Echoing other conversations I've seen around the web (wait for 12/14 to post)... Scott pings the Big Concept that while the folks building Google now are clearly well intentioned, together we are creating an asset in Google that someday may be out of our collective control. In Salon's blog, Scott Rosenberg comments:

But Google is a public company. The people leading it today will not be leading it forever. It's not inconceivable that in some future downturn Google will find itself under pressure to "monetize" its trove of books more ruthlessly.

Today's Google represents an extremely benign face of capitalism, and it may be that the only way to get a project of this magnitude done efficiently is in the private sector. But capitalism has its own dynamic, and ad-supported businesses tend to move in one direction -- towards more and more aggressive advertising.

Since we are, after all, talking about digitizing the entire body of published human knowledge, I can't help thinking that a public-sector effort -- whether government-backed or non-profit or both -- is more likely to serve the long-term public good. I know that's an unfashionable position in this market-driven era. It's also an unrealistic one given the current U.S. government's priorities.

December 14, 2004

Ferguson on Google: Platform? Yes. Single Platform? No.

Charles Ferguson writes a lengthy and clearly considered piece on Google for Tech Review, focusing on the Microsoft angle and concluding that the only way Google can truly "win" is by controlling a new architecture of computing through the time-honored approach of proprietary APIs. Ferguson argues that the search wars are about to enter a major battle for control of standards that simplify the increasingly heterogeneous world of search, and that in such a battle, Microsoft is far better suited.

I enjoyed reading this piece, and I am sure I will read it again and again, to more fully consider its argument. But I find myself disagreeing with the premise - why, in this world of the web, do we need to be bound by this winner-takes-all approach to the world? It works in a resource-constrained world of homogeneous PCs - once a consumer has purchased his Windows box, he's not going to easily purchase an emerging competitor - but somehow, it really doesn't strike me as the right metaphor for a Web 2.0 world. I do agree that Google would be well served to make its service more of a platform, and that APIs are the way to go. But I'd really be interested in what Tim O'Reilly has to say about this piece, or Tim Bray, or any number of other folks. I'll keep my eye out...meanwhile, do read the piece. It's a worthy provocation.

Other POVs on this piece: TechDirt, Linden, SEW, Silicon Beat

Print Implications: Google As Builder

Some folks have been calling me, and together we've been pondering the implications of the Google Print announcement. And one drop-dead obvious thing dawned on me during the conversations.

This is so obvious as to be almost embarrassing to restate, but this program marks a major departure in Google's overall approach to search. After all, what has been the presumptive model till now? If it's on the web and publicly available, it's in the index. That's why we called it web search, after all. But Gary Price and Chris Sherman, among many others, have reminded us how vast and darkly lit the invisible web is - all that information trapped in the amber of password-protected databases, or crumbling film libraries, or ....books.

Now other companies have taken significant steps toward illuminating these dark corners of the world's knowledge web - Yahoo with its CAP program, Amazon with A9 and Search Inside the Book. And Google has long claimed that its mission was to go beyond the web and crawl the world's information, wherever it lay.

But Google was, until now, the world's purest web search engine. What, I wonder, are the implications of tens of millions of book pages entering this once pure space? (Google has announced that the results will be included in the index, not separated out in a vertical book search engine.)

Why am I on about this? Well, it comes down to the essence of what - so far - has made Google Google: the ranking paradigm. Here's a sketch from the book I am working on:

In essence, academic publishing is a flawed but useful system of peer review incorporating ranking, citation, and annotation as core concepts. Fair enough. So what?

Well, in short, it was Tim Berners-Lee’s attempt to address the drawbacks of this system (through network technology and hypertext) that led to his creation of the World Wide Web (4), and it was Larry Page and Sergey Brin’s attempts to make Berners-Lee’s World Wide Web better that led to Google.

Which brings us back to Page, and his original research work focusing on backlinks. He reasoned that the entire web was loosely based on the premise of citation and annotation – after all, what was a link but a citation, and what was the text describing that link but annotation?

The point I'm making is this: Google was born of, by, and in the web, as an extremely clever algorithm which noticed the relationships between links, and exploited those relationships to create a ranking system which brought order and relevance to the web. Google's job was not to build the web, its job was to organize it and make it accessible to us.

But all this new Print material, well, it's never been on the web before. It's Google who is actively bringing it to us. How, therefore, does Google rank it, make it visible, surface it, and, importantly, monetize it? If a philanthropist were to drop the entire contents of the Library of Congress onto the web, Google would ultimately index it, and as folks linked to the content, that content would rise and fall as a natural extension of everything else on the web. But in this case, Google itself is adding content to the web, and is itself surfacing the content based on keywords we enter. This is a new role - one of active creator, rather than passive indexer.

This means, in short, that Google is making editorial decisions about how to surface this new content, decisions it can't claim are based on the founding principle of its mission - PageRank. Sure, there are straightforward keyword matching techniques, and over time the web will deep link those book pages - each page in Print has a unique URL. But really, the magic of what made Google Google - the existing link structure of the web - is entirely non-existent with these newly surfaced print pages. By extension, the same will be true for any new media brought into the index - be it movies, music, radio, television, photos, you name it. That's why I'm so interested in what role Google will play in monetizing this content (see here and here) and why I am so fascinated with this media v. technology angle.
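For the code-minded, here's a rough sketch of why link structure matters so much. This is the simplified, textbook form of PageRank, not Google's production ranking, and everything in it (the function name, the toy web, the page names) is made up for illustration. The thing to notice is the page nobody links to: it gets only the baseline score, which is exactly the spot a freshly scanned book page finds itself in.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank via power iteration.

    `links` maps each page to the list of pages it links to.
    This is an illustrative sketch, not Google's actual ranking.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Pages with no outgoing links share their rank with everyone.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        # Each page passes its rank, split evenly, to the pages it cites.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy web: three pages that cite each other, plus one
# newly added page that nothing links to yet.
toy_web = {
    "blog": ["paper", "news"],
    "news": ["paper"],
    "paper": ["blog"],
    "scanned_book_page": [],
}

ranks = pagerank(toy_web)
lowest = min(ranks, key=ranks.get)  # the page with no inbound links
```

In this toy web the unlinked page ends up at the bottom of the ranking no matter what it contains. Until the web starts linking to it, something other than link analysis has to decide when it surfaces.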

I guess the net net of all this is that this move by Google, which I think is monumental, marks a shift in who the company is in the world. It's no longer simply an indexer of the world's knowledge web. Google Print is a clear declaration that it's a builder of it as well.

December 13, 2004

Ask Jeeves: Do We Need Another Desktop Search?

Yes, we do, says Andy Beal. This one's pretty good, according to early reviews. Release in extended entry.

Ask Jeeves Introduces Desktop Search Application
Emeryville, CA, December 15, 2004 - Ask Jeeves®, Inc. (Nasdaq: ASKJ), a
leading provider of information retrieval brands, today introduced a beta
desktop search application. Fast, relevant, and flexible, Ask Jeeves Desktop
Search(tm) makes it easy for people to find information on their computers
or the Internet. The product is available free for download at
http://download.ask.com/desktop.
Ask Jeeves Desktop Search helps people overcome the challenges of
information overload by enabling them to quickly and easily search the
hundreds, even thousands of files, word-processing documents, presentations,
spreadsheets, photos, music and video files, applications and email messages
on their computers.
"Ask Jeeves Desktop Search extends our world-class technologies and
user-centric approach to search beyond the Web, to the information located
on people's computers," said Jim Lanzone, senior vice president of search
properties at Ask Jeeves. "Ask Jeeves Desktop Search will complement our
recently-introduced MyJeeves personal search service and is an important
step in our personalization strategy. We look forward to receiving feedback
on the beta release, as we continue to develop the product in line with
people's needs."
Product highlights include:
- Fresh, Full-Text Index: Upon installation of the small (750K) application,
Ask Jeeves Desktop Search creates an index of the information stored on a
person's computer. This process enables users to search by file name, as
well as by file content. The application currently supports a wide range of
file types, including Microsoft Office files (Word, Excel, PowerPoint),
simple text files, Microsoft Outlook email messages, and image, music, and
video files. Ask Jeeves Desktop Search constantly monitors the computer
(with minimal impact on system resources) for new and deleted files and
email messages to ensure the index is always kept up to date.
- Fast, Flexible Search: Like searching the Internet, users simply type key
words into a search box to let Ask Jeeves Desktop Search scour their
computer for matching results. Users can narrow their searches by selecting
categories like Office Documents, Music, Pictures, MyJeeves, News and
others, or they can sort results by a variety of parameters. Meanwhile,
controls are also provided for users to define how much of their computers
they want Ask Jeeves Desktop Search to index, as well as the speed (and thus
the amount of bandwidth devoted) with which they want it indexed.
- User-Centric Design: Ask Jeeves Desktop Search is very simple to learn and
use. The program takes search beyond the paradigm of 10 blue links and
returns results in a two-panel interface where previews are displayed for
easy review. (This is especially useful for browsing photos and email
messages.) Users will also find a search box conveniently added to common
Windows dialog boxes, such as Insert Attachment or File/Open, where the
process of finding files is frequently required.
Ask Jeeves expects to add new functionality prior to the formal launch of
the product in 2005. Some of these features include expanded support for
Outlook, integration of desktop and Web search results, and PDF support. A
Feedback menu is provided directly on the application interface to make it
easy for users to submit comments and requests for new features.
Ask Jeeves Desktop Search works with Windows 2000 or XP, Office 2000 or
higher, and Outlook 2003. The program requires a minimum of a Pentium III
computer running at 400MHz with 128 MB RAM (1 GHz and 256 MB RAM
recommended).
About Ask Jeeves, Inc.
Ask Jeeves, Inc. provides consumers and advertisers with information
retrieval products across a diverse portfolio of Web sites, portals and
desktop search applications. Ask Jeeves' search and search-based portal
brands include: Ask Jeeves (Ask.com and Ask.co.uk); Ask Jeeves Japan
(Ask.jp, a joint venture); Ask Jeeves for Kids (AJKids.com); Excite
(excite.com); iWon (iwon.com); My Search (mysearch.com); My Way (myway.com);
My Web Search (mywebsearch.com) and Teoma (teoma.com). Ask Jeeves also owns
the search technology Teoma, proprietary natural language processing
technology, as well as portal and ad serving technologies. In addition to
powering several of the Ask Jeeves brands, the Company syndicates its
technologies to help companies increase revenue through powerful search. Ask
Jeeves' advertising division, AJinteractive, provides advertisers with
targeted tools to reach a broad base of valuable customers. Ask Jeeves, Inc.
is headquartered in Emeryville, California, with offices throughout the
United States, as well as in London, England and Dublin, Ireland. For more
information, visit http://www.Ask.com or call
510-985-7400.

Google Library: Talk About a Long Tail...

The NYT now reports on Google's program to digitize some of the world's most important libraries, and it is truly an amazing project. Google was founded at Stanford in partial association with that university's digital library effort, so this must be a pretty proud day for Stanford, which is a participant, as well as the original Googlers. John Markoff spoke to Larry Page:

Mr. Page said yesterday that the project traced to the roots of Google, which he and Mr. Brin founded in 1998 after taking a leave from a graduate computer science program at Stanford where they worked on a "digital libraries" project. "What we first discussed at Stanford is now becoming practical," Mr. Page said.

The details: Google is working with Stanford, the University of Michigan, Harvard, Oxford, and the New York Public Library to make millions of books available in its index. For now the project is in pilot phase, but there are hopes and expectations this will go big in the next few years. A source told me the project was originally named Google Library, but for now it will exist under the Google Print moniker. An example of Google Print is here. The screenshot at left is what I was provided by Google for today's launch.

The implications here are significant. First, the idea that the world's knowledge, as held through books and libraries, is opening up to all via a web browser cannot be overstated. It's one thing to have an original copy of The Origin of Species on the shelves, where students and interested parties have to travel to find it. It's another to have it available to everyone via a search index and your web browser. Second, this move clearly puts Google in the category of innovator when it comes to adding information to their index. But it also raises significant business model questions, ones that are both exciting and unanswered. I brought them up in an earlier post:

A very interesting case will be Google Print. As that program expands, and it's rumored that it will, dramatically, a number of questions arise. How will Google monetize out-of-copyright books? If it indeed does bring tens of thousands of out-of-print books onto the web and into its index, will it allow others to access and index that new treasure trove, or will it act more like a traditional media company, which would "own" that resource for itself? How will it choose what it brings into the index - those that might sell? Those that somehow are the most "in demand" by some measurable standard? With regard to books that are in print, will it limit itself to being solely an organizational tool supported by AdWords, or will it start to take a vig for books that are sold via the Google Print service (in fact, maybe it does already and I'm simply unaware of it - any publishers out there, let me know!)? And will the print model scale to television and movies or music?

Google Print already monetizes a selection of in-copyright books via advertising, and shares some of those revenues with the publishers. But it's a very short distance between that and, say, an affiliate link to Amazon or any other bookseller for a cut of an in-copyright sale. It's also a very short route to the on-demand publishing of an out-of-print and out-of-copyright book with a company that is set up to do such a deal, and I am aware of at least one that is about to launch that will provide just such a service. Of course, if you want an ebook, that can be arranged as well. For out-of-copyright books, the tail is extraordinarily long, and quite possibly very, very profitable. In other words, this could well be a step toward diversifying Google's revenue streams away from advertising and into direct sales and/or subscriptions - i.e., the content business. As one source who is familiar with the industry tells me, Google is not doing this only out of the kindness of its heart - there is a lot of money to be made in selling books, in particular books with no copyright.

I did ask Adam Smith, a manager of the Print program at Google, how Google will decide which books get scanned first. He said quite forthrightly that he did not have a good answer for me on that yet. I've heard from others that for now it's pretty random, but the question is important. As to whether Google will allow anyone else to index the books they scan, I am pretty sure the answer is no. After all, Amazon is also scanning books, and I am sure they aren't letting others in on their hard work. I'll repost if that turns out to be inaccurate. And of course there are other efforts, including Project Gutenberg and the Internet Archive. But now, we have a commercial giant who has both a mission-based reason (organize the world's information and make it accessible) and a commercially viable reason to bring this information to the world. As David Hayes, a copyright lawyer at Fenwick who worked on this deal and who I've known from my own work with his firm, put it: "This will create a revolutionary new information location tool that should be a benefit to the whole world." I for one applaud the effort - it's an example of enlightened capitalism, and I hope it thrives.

More here and here.

Update: I originally posted the wrong image