August 2006 - Page 6 of 7 - John Battelle's Search Blog

Google Shares Some Data

By - August 06, 2006

No, not the kind that might help you predict earnings, but the kind that might help researchers around the world play with massive sets of word phrases and figure out all kinds of new applications based on the core concept of n-grams (don’t ask me, read this). Massive on the order of trillions, that is. On Friday, Google’s research blog announced it would be releasing such a trove. From the blog post:

We believe that the entire research community can benefit from access to such massive amounts of data. It will advance the state of the art, it will focus research in the promising direction of large-scale, data-driven approaches, and it will allow all research groups, no matter how large or small their computing resources, to play together. That’s why we decided to share this enormous dataset with everyone. We processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times. There are 13,653,070 unique words, after discarding words that appear less than 200 times.
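For readers who, like me, want the n-gram idea made concrete, here is a minimal sketch of the counting-and-thresholding step the quote describes. It is not Google's pipeline: the toy sentence, the whitespace tokenization, the helper names `ngram_counts` and `keep_frequent`, and the tiny thresholds (2 instead of 40 and 200) are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every sequence of n consecutive tokens in the list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def keep_frequent(counts, min_count):
    """Keep only entries seen at least min_count times."""
    return {key: c for key, c in counts.items() if c >= min_count}


# Hypothetical toy corpus; splitting on whitespace is a simplification.
text = "the cat sat on the mat and the cat sat on the rug"
tokens = text.split()

# Analogous to "unique words, after discarding" the rare ones.
vocabulary = keep_frequent(Counter(tokens), min_count=2)

# Analogous to publishing counts for sequences above a frequency cutoff
# (two-word sequences here, five-word sequences in the quoted dataset).
bigrams = keep_frequent(ngram_counts(tokens, n=2), min_count=2)

print(vocabulary)
print(bigrams)
```

The same two functions work for five-word sequences by passing `n=5`; the hard part at Google's scale is the trillion words of input, not the counting logic.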

It’s good to see Google giving something back to the research community, in particular given the thread about this very topic on Searchblog earlier. But I’m going to guess that this will only whet the appetite of folks in pure R&D who’d love to see even more information shared – more complex patterns across data, for example – the very same information, unfortunately for them, that is the basis for competitive differentiation, and is not likely to be shared anytime soon.

Update: Yow. AOL released actual search data from half a million users, according to this post. Wow….

Scoble: Hey Microsoft, Optimize This!

By - August 04, 2006


Scoble writes a nice post-Microsoft rant about what’s wrong with his former company, and what the company should pay attention to. The answer: attention data. I agree. Positing a scenario in which he’s looking for an office chair for less than $500, Scoble writes:

When I search on “Office Furniture” why is the first thing I see stores? I don’t wanna see freaking corporate info. I wanna know what HUMANS like to use in their offices.

None of the big search companies have figured out that it’s the humans who “optimize” the Web.

They just wanna collect the big company paychecks.

I’m hearing that too here at Podtech. It’s all bunk. If there is no audience, there is no advertising. I’m not an “eyeball” to be tracked, or optimized.

I’ll be looking for who lets me get to the other humans the fastest….

…Remember Active Desktop and Channels? Microsoft could have OWNED the blog world and RSS. Why did that fail? Cause when we looked at it all we saw were big companies.

If you optimize for them you’ll fail.

Finding your search buddies

By - August 03, 2006

There’s a new social service that pairs search users in part by their similar queries, as well as pages visited in web browsing and preferred interests. Others Online stands out from many social sites with its browser toolbar that, when activated, passively records demonstrated interests.

While a user is browsing, they can check the Others toolbar to see who else is reading or interested in the topic or site, and a dropdown provides their IM or email contact details, including a link to their MySpace profile or other social website.

So what this means is…. Every time you search Google, you see the people who relate to those same keywords, plus their Web pages, and you can connect with them instantly by IM or email.

There aren’t too many details on the site about how they track and weight users’ online movements, but it seems to cache much like Google Toolbar and feed users’ background profiles by keyword and URL. Although users can clear their search cache, there’s no option to delete selectively.

Others Online will run contextual ads (based, of course, on its users’ web history) and says it offers companies a chance to build their brands by retaining contact even after users have clicked away. via

Amr: Google Is Slowing Down

By -

Amr Awadallah, a Yahoo employee famous for predicting Google’s earnings (though not always correctly), posted yesterday that he thinks Google’s revenue growth will slow down.

2006 is showing the early signs of slow down for four reasons:

1. Once you have tons of revenue, its hard to keep high Year-over-Year (YoY) growth rates

2. The search marketplace is slowing down a bit due to saturation in the US and European markets (still plenty of growth in Asia though, but Yahoo is stronger there).

3. Google launched almost all the tricks in the bag during 2003, 2004 and 2005, the only remaining tricks are visual placement tricks and looser matching (i.e. more, less-relevant, ads on top of web results).

4. None of Google’s other products, other than web search that is, have decent “money” marketshare. Google’s Image Search is actually pretty large, but they have no ads there (will that change in Q3? possibly).

Overly Sensitive?

By -

The East Bay Express – an alternative SF Bay Area publication – today published a piece about Google’s advertising filters and the impact they have – and potentially have – on independent news coverage. It’s an interesting read. From it:

Earlier this year, Salon signed a small advertising contract with Google, and employees quickly discovered that whenever a story dealt with sex too explicitly, the search engine would automatically pull its ads. Salon ran stories about a Senate hearing on the effects of pornography, a study on the effect of sex on stress levels, and British attitudes toward rape victims; Google pulled its ads for each of these articles. “What we found in working with Google was that because some of our content violated its ‘family-safe policy,’ as a result we had to work with other partners such as Yahoo,” says Kathryn Surso, Salon’s vice president of business development….

…Few bloggers rely on ad revenue to pay their bills, and Salon’s advertising base is sufficiently diversified that dropping the occasional Google ad doesn’t hurt it. But for smaller Web news outfits, losing Google revenue is much more serious. According to the publisher of a prominent news Web site who agreed to speak only if granted anonymity, his company recently signed a premium Google advertising contract that now accounts for a third of his site’s revenue. A few months ago, his Web site ran a series of stories about a major bombing in Iraq. Within hours, he says, Google’s ads vanished from his home page, and so did all the revenue they generated. “They said we had the word ‘kill’ on our site, and that killed the ads,” the publisher said. “I wrote them and said that would be very difficult for a news site, which would often use the word ‘kill.’ They said, ‘Those are the rules.'”

…When the publisher contacted Google and asked for explicit guidelines about what constitutes illicit content, company representatives refused. “I asked them for a set of keywords, and they wouldn’t give me one,” he says. “I don’t know what the words are; we just have to approach it by toning down the language in our articles. … It’s just ridiculous. I don’t think the [advertisers] are going to have a problem with us reporting the news. … But they’re Google, and we’re a small site. So we’ll have to conform to their regulations if we want their money.”

Mitch Does Search

By -

Mitch Kapor is a legend in the IT world (he’s the guy behind Lotus) and he’s always interested in new models (he’s an investor in FM, for example, which certainly influences my view of the guy). In any case, Mitch is starting another company, Foxmarks, which focuses, in Mitch’s words, “on innovation at the intersection of search and social production.” Richard at ReadWriteWeb has a writeup here. I’m a bit confused, however, about whether it’s related to this Foxmarks, which lives in a similar vein. I’m pinging Mitch to find out…

Update: Mitch sez:

First we created a Firefox extension called Foxmarks that synchronizes bookmarks.  The URL you sent is for the main page of the web site, which happens to be a Wiki.

We used the bookmark corpus we collected to create the proof-of-concept system for the new startup. We will keep building on this, but the extension is separate from the web site we are going to build.

Triumvirate against click-fraud

By - August 02, 2006

Google, Yahoo, and Microsoft announce they are joining forces to combat the click-fraud storm (both reality and accusations). The big three search engines will use their shared expertise, touching 86% of the game. The competitors plan to create common guidelines for click fraud, starting with a definition and then tackling the complexities of tracking it.

AP: John Slade, senior director of Yahoo’s defense against click fraud, predicted the alliance’s guidelines “will be a game-changing step in measuring and fighting click fraud.” It may take more than a year before the guidelines are finalized, said Greg Stuart, chief executive of the Interactive Advertising Bureau. The decision to develop the guidelines reflects the Internet industry’s “commitment to being the most accountable advertising medium and providing marketers with the highest level of transparency,” Stuart said.

(Slashdot, AP)

round up

By -

Live Spaces

Windows Live Spaces launches. TechCrunch notes: …Live Spaces is taking over MSN Spaces completely – MSN Spaces pages now redirect to Live Spaces URLs. This is no small decision, because MSN Spaces is currently the largest blogging platform with over 100 million unique monthly visitors.

Toolbar packaged

Google Toolbar bundles with Firefox into a new multi-year package with RealPlayer: Real regularly distributes more than 2 million pieces of software a day worldwide. When users install RealPlayer, they will be given the option also to install either the Google Toolbar or Firefox.

Yahoo’s new domains

It seems Google isn’t the only one on a domain spree, as Yahoo has 11 new domains of its own. Including (via ResourceShelf)

Google’s Washington counsel talks

Alan Davidson, head of Google’s government affairs office in DC, talks about internet regulation and other policy issues. Admits Davidson, “As a lobbyist, we’re getting our butts kicked in Washington.” (podcast at MIT)

(via ResourceShelf)

Know Search? Get a Job!

By -

Outsell, a research company I’ve come to know over the past year, is looking for an analyst in the search space. CEO Anthea Stratigos was kind enough to ask me to speak at their conference earlier this year, and I found the folks there smart and engaged. From their job posting:

Outsell is looking for an experienced, technologically savvy, and energetic Vice President & Lead Analyst to create innovative research and analysis about the Search, Aggregation & Syndication segment and players in the information industry and to work with our business development team to drive business in the space. As a core member of our market research and advisory business, the successful candidate will work in a collaborative environment and deliver executive-level analysis, including tracking and analyzing SAS companies and their customers and users, and providing strategy analysis and decision support to CEOs, COOs, and Marketing executives and their teams.

To whoever gets that job, I’m looking forward to getting to know you! (And hey, maybe I should start a job board here!)