June 2005 | Page 8 of 8 | John Battelle's Search Blog

Speaking of the ChangingWeb…

By - June 04, 2005

Google clearly understands the importance of grokking change. Thursday it launched an “experiment” it calls Google Sitemaps, an XML-based open standard that allows webmasters to inform Google of website changes. Slashdot thread here. Danny on this here.

From its blog post:

We’re undertaking an experiment called Google Sitemaps that will either fail miserably, or succeed beyond our wildest dreams, in making the web better for webmasters and users alike. It’s a beta “ecosystem” that may help webmasters with two current challenges: keeping Google informed about all of your new web pages or updates, and increasing the coverage of your web pages in the Google index.

Interesting to note:

This project doesn’t just pertain to Google, either: we’re releasing it under the Attribution/Share Alike Creative Commons license so that other search engines can do a better job as well. Eventually we hope this will be supported natively in webservers (e.g. Apache, Lotus Notes, IIS).
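For the curious, here is a rough sketch of the kind of XML file a webmaster might generate to tell a search engine which pages exist and when they last changed. The element names and namespace are an assumption borrowed from the public sitemaps.org 0.9 schema, not from the announcement itself, and the URLs, dates, and the `build_sitemap` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the public sitemaps.org 0.9 schema,
# not something specified in the blog post quoted above.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap document from (url, last_modified_date) pairs.

    Hypothetical helper: returns the XML as a string.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # page address
        ET.SubElement(url, "lastmod").text = lastmod  # date of last change
    return ET.tostring(urlset, encoding="unicode")


# Made-up example pages.
example_pages = [
    ("http://example.com/", "2005-06-04"),
    ("http://example.com/archives/june.html", "2005-06-02"),
]
sitemap_xml = build_sitemap(example_pages)
print(sitemap_xml)
```

The point is simply that the site, not the crawler, states what changed and when; a real deployment would regenerate the file whenever a page is published or edited.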


Grokking PubSub and Data Lock In

By - June 02, 2005

Earlier this week I spent some time on the phone with Bob Wyman, CTO and founder of PubSub. Over the past year Bob has been heckling me for focusing on “retrospective search” (Google, Yahoo, et al.) and not paying attention to his offering of “prospective search,” or searching what he calls the “GrayWeb” – that part of the web which is available and open, but is rarely seen because our view of the web is so dependent on traditional approaches to search. Wyman focuses on that portion of the GrayWeb that changes rapidly – the “ChangingWeb” where the future hits the present, where the unique element of the dataset is the fact of its newness. That window – when the information is knowable, but before it becomes forever eternalized in The Index – is where PubSub lives.

In short, PubSub crawls (mostly) blog feeds and offers a service that allows you to stay abreast of topics you choose as new information breaks. (PubSub just announced a political cut of this kind of data, for example). To me, PubSub felt a lot like Google or Yahoo news alerts on steroids, a Feedster clone. But after talking to Bob, I came away convinced that there’s more to PubSub than meets the eye.

PubSub is named for “publish/subscribe” – a well-traveled piece of IT theory that has, at its core, the assumption of structured data. Back in the earlier days of the computer biz, Apple, DEC, and others realized the need for users to be alerted when things change – in a database publishing model, for example, a new rev of a document would create an alert. These companies invented publish-subscribe models that, for the most part, really never took off. Why? I think the code was overspecified, and the user interface cumbersome. Wyman worked on pubsub apps at DEC – in fact, he built the pubsub piece of AllInOne, a Notes-like application that had a brief moment in the sun in the late 80s, if memory serves.

A few years ago Wyman found himself wondering if it were possible to apply the publish and subscribe model to the entire world wide web. That’s a pretty audacious idea, but focusing on blogs was a good way to start, because blogs have a wealth of feed-based structured data around each post (timestamp, author, title, often a category). Wyman claims to have figured out algorithms which allow PubSub to process the ChangingWeb rapidly and “at internet scale.”
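To make the retrospective/prospective distinction concrete, here is a toy sketch of the idea in Python: the queries are stored first, and each new feed item is checked against them the moment it arrives. This is emphatically not PubSub's actual algorithm (which I haven't seen); the class names, fields, and naive keyword matching are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class FeedItem:
    """Structured data a blog feed carries with each post (hypothetical shape)."""
    title: str
    author: str
    timestamp: str
    category: str = ""


@dataclass
class Subscription:
    subscriber: str
    keywords: set  # lowercase terms the subscriber cares about


class ProspectiveIndex:
    """Toy matcher: standing queries wait for documents, not the reverse."""

    def __init__(self):
        self.subscriptions = []
        self.inbox = {}  # subscriber name -> list of matched FeedItems

    def subscribe(self, subscriber, *keywords):
        terms = {k.lower() for k in keywords}
        self.subscriptions.append(Subscription(subscriber, terms))
        self.inbox.setdefault(subscriber, [])

    def publish(self, item):
        # Naive matching: any overlap between title/category words and keywords.
        words = set(item.title.lower().split()) | {item.category.lower()}
        for sub in self.subscriptions:
            if sub.keywords & words:
                self.inbox[sub.subscriber].append(item)


index = ProspectiveIndex()
index.subscribe("alice", "sitemaps")
index.subscribe("bob", "politics")
index.publish(FeedItem("Google launches Sitemaps", "jb", "2005-06-04T09:00", "search"))
index.publish(FeedItem("Light posting today", "jb", "2005-06-02T08:00", "misc"))
print(len(index.inbox["alice"]), len(index.inbox["bob"]))
```

A loop over every subscription for every post obviously won't work “at internet scale”; presumably that is where the clever part of the real system lives.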

I’m not in a position to judge those claims, but I like the theory behind Bob’s intentions. He plans to create tools that allow bloggers to easily tag their posts with category-like information – “this is a book review” or “this is an event announcement.” He’s already built plug-ins for WordPress and is looking to continue his work with other platforms like MT, which have similar widgets that so far are not aligned around a particular standard.

In theory anyway, Bob is onto something here. It’s yet another attempt to build the semantic web from the bottom up, and it suffers from all the foibles of such an effort, but the intent is good – let the individual publishers build data structures which, in aggregate, create a fuzzy kind of value that developers can tap into. Were enough of these kinds of structured and tagged data sets to become available (“this is a job posting,” “this is something for sale”), we might well see services evolve which are built on the premise of freely available data – in other words, a new kind of publishing model, one where value comes from what you do with the data, as opposed to who owns access to the data. That may not seem like a big change, but in fact it would be – eBay, Monster, Yahoo, et al. are all based on the idea of owning the environment in which structured data lives. More on this shortly, but for now, check out PubSub and let me know what you think.

Light Posting – Web 2.0 Ho

By -

I have a big Web 2.0 meeting later today and will be in the city for much of the day, so posting will be light. Friday or Monday I will be posting a longish riff on PubSub, and also a call for input and feedback on early ideas I have on the Web 2.0 program, which is really shaping up. Talk to you then…

An Example of Search Image Manipulation Via Blogs

By -

Mark Glaser has filed a column on questionable SEO practices in the OJR. It focuses on Quixtar, the online arm of multi-level marketing king Amway. From the piece:

The company, a revamped online version of Amway, has had trouble with critics online and decided to fight them by unloading an arsenal of search engine optimization (SEO) techniques that go against accepted marketing techniques and into the muddy world of Web page spam, also known as link farms and Google bombing.

To put it simply, Quixtar enlisted various people to help create dozens of Weblogs that linked to each other and were filled with positive stories and key words. The idea is to help put these newer blogs at the top of search results for phrases such as “Quixtar success” and “Quixtar opportunity,” while more critical sites such as Quixtar Blog and Amquix.info would drop down.

It’s interesting to see how a company like Amway/Quixtar is using search to squelch its critics and promote a positively skewed vision of its practices. Not unlike Scientology, Amway is a controversial topic, one that has adamant detractors (it’s a scam, a cult, a house of cards) and true believers (it changed my life, it really works, you too can get rich…). It also has plenty of folks who are just working the plan and not really taking sides.

Net net: Another proof point in the motto You Are What The (first few links in the) Index Says You Are….

Google Update "Bourbon"

By - June 01, 2005

Now here’s a Google algorithm update they must have named just for me. Nicknamed Bourbon (I guess this is better than Florida), the update is ongoing. According to “GoogleGuy,” a Google employee who hangs out in search-related forums and assuages the fears of webmasters (he’s been known to show up here too…):

Here’s the advice that I’d give now: take a break from checking ranks for several more days. Bourbon includes something like 3.5 improvements in search quality, and I believe that only a couple are out so far. The 0.5 will go out in a day or so, and the last major change should roll out over the next week or so. Then there will still be some minor changes after that as well. So my “weather report” along the lines of http://www.ysearchblog.com/archives/000095.html would be a recommendation that rankings may still change somewhat over the next several days.

Hat tip: Threadwatch

Patriot 2: Scarier, Spookier

By -

I am not a fan of the Patriot Act, as you can see here. The Times has an editorial about the next rev of the act, many parts of which are due for renewal this fall. The first act redefined many key legal terms so as to make your search history and the like far more susceptible to government snooping without notification, but this second rev sounds scarier. From the editorial:

One of the most common complaints about the Patriot Act is that rather than addressing the real but narrow problems with existing law, it was a wish list of powers law enforcement officials had yearned for over the years that Congress had rightly resisted conferring. Now the Bush administration and its Senate allies have come up with another: a proposal to let F.B.I. agents write their own “administrative subpoenas,” without the need to consult prosecutors or judges, in demand of all manner of records, from business to medical and tax data. There is no serious evidence that agents have been hamstrung by the lack of such wide authority.

Freeing agents from getting a judge’s sign-off is an invitation to overreaching and abuse, as is a proposal to let the F.B.I. ignore postal law restraints when antiterrorism agents choose to monitor someone’s letter envelopes and package covers.

Hell, it’s not the post office we should be worried about, it’s Google Desktop and its ilk. Remember my ephemeral to eternal riff? Uh huh.