Comment Spam and Search

Anyone with a blog has come across the bane of comment spam. Recently it’s reached near-epidemic proportions for folks who use Movable Type, as I do (I think this is because MT users tend to have high-PageRank sites, but that’s just a guess).

Why do comment spammers do what they do? Simple: for the ranking juice. A spammer’s link inserted into the comment field confers this site’s authority, such as it is, on the spammer’s target site. Jeremy Zawodny, who is working with the Search team over at Yahoo, has posted an interesting commentary on this problem and suggests a solution.

If you assume the following:

1. 80% of blogs are hosted by or produced on one of the more popular blogging platforms

2. 80% of people don’t significantly tweak the default templates available in their blogging software

3. those people are the least likely to be actively fighting spam and, as a result, have more spam than the 20% of blogs where the owner is more defensive

Then a partial solution is fairly clear. I’ve heard and seen others discuss it over the past few months. The search engines need to be smarter about reading and indexing content. …the software needs to be able to recognize the difference between links produced by the blog owner(s) and those contributed by readers and spambots. Once you can identify the difference between those two types of links, you simply stop using the second type of link when calculating rank. Sure, you can still count them for the purpose of providing link counts–just don’t factor them into the ranking.

Jeremy’s suggestion has elicited a lot of commentary. As one of the 20% who actively fight comment spam, I’m hoping that some kind of solution is in the works. But I’m not sure this is it – often I appreciate the links that are left in my comment fields, and I wouldn’t want to discourage anyone who is well intentioned from continuing the practice. On the other hand, comment spam is a major problem, and it might be worth losing a bit of juice to save the ecosystem from the parasites.
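To make the idea of telling owner links from reader links a little more concrete, here is a rough Python sketch of how an indexer might split the two. It is only an illustration, not how any search engine actually works. It leans on Jeremy's second assumption (most people keep the default templates), so it assumes the reader comments sit inside an element with a known class name. The class name `comments` and the names `LinkClassifier` and `classify_links` are made up for this example.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class LinkClassifier(HTMLParser):
    """Split a page's links into owner links and reader (comment) links.

    Assumption: comments live inside one container element whose class
    attribute includes `comment_class`. Real templates vary, and markup
    with unclosed non-void tags would confuse the depth counter.
    """

    def __init__(self, comment_class="comments"):
        super().__init__()
        self.comment_class = comment_class
        self.depth = 0          # nesting depth inside the comment container
        self.owner_links = []   # links that could count toward ranking
        self.reader_links = []  # links to leave out of ranking

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            bucket = self.reader_links if self.depth else self.owner_links
            bucket.append(attrs["href"])
        if tag in VOID_TAGS:
            return
        classes = (attrs.get("class") or "").split()
        if self.depth or self.comment_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1


def classify_links(html, comment_class="comments"):
    """Return (owner_links, reader_links) for one page of HTML."""
    parser = LinkClassifier(comment_class)
    parser.feed(html)
    parser.close()
    return parser.owner_links, parser.reader_links
```

A ranking step built on this would use only the first list when passing authority along, while still being free to report both lists in link counts. The hard part in practice is the assumption itself: the 20% of blogs with customized templates would need some other signal.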

6 thoughts on “Comment Spam and Search”

  1. Being relatively new to having a blog that both has a high PageRank and is powered by Movable Type, I started to freak when we got our first comment spam. I’d delete a comment only to see three pop up in its place.

    We ended up getting gothamist.com’s backlog of banned links generated from the MT Blacklist plugin. When we upgraded to 3.1, the ‘de-spam’ link for comments was included in comment notification emails. Since each of my contribs is essentially responsible for monitoring their own comments, this took an incredible amount of workload off me personally as the editor and distributed it relatively evenly among all the staff, with no training on my part.

    Get a spam comment? Click the de-spam link, the MT Blacklist page comes up, click ‘Do My bidding,’ and boom, ch3ap v!@gr@ and t3*@s h0ld ‘3m links are no more. On top of that, comments with links won’t even appear on the site until I approve them, unless they are entered by a TypeKey user. We haven’t had a bit of spam on our public site in months.

  2. I accidentally discovered the answer when I launched my blog. Haloscan.com provides a third-party comment applet that uses JavaScript to launch and is hosted with Haloscan.

    Together, these two elements make it hard for robots to paste comment spam, and there is no PageRank leak.

  3. Peatey, for what it’s worth, the reason that many Blogger sites don’t have comment spam is that they require authentication to comment. You can do the same by enabling TypeKey on a Movable Type blog. On TypePad, I believe we catch over 95% of comment spam and automatically delete it. Google’s got a pretty good policy about not giving special treatment to Blogger sites. In addition, Movable Type has had redirects in comment links for over a year, but that fact doesn’t generally stop spammers.

    If you want more specific information, please feel free to get in touch.

    John, we’re going to be talking a lot more about the various tactics out there today, along with some pros and cons of each tactic. We’ve put a lot of thought into this, and with TypePad we probably have the largest base of comment-enabled blogs that don’t use authentication by default for commenters. That’s given us a lot of raw data to look at, and we’ll be sharing some of those insights as well.

  4. Sounds like a reasonable suggestion. It saddens me that automated processes like comment spam once again threaten to snuff out an innovative form of online exchange.

    I suspect the 80% of blog owners who are not yet fully versed in fighting comment spam would be open to some coaching on the fine art of being more proactive.

    Like anti-virus, anti-malware, and all other anti-bad-stuff issues that have come before, collective efforts to mitigate the threat are enhanced when the majority of end-users comes on board.

    The challenge I see lies in getting that expertise, the lowdown on best practices, into the hands of the average joes and josephinas out there who are currently oblivious. More than anything, it’s a communications nightmare. But it has to happen, otherwise the scum who threaten to ruin it for the rest of us will ultimately prevail.
