John Battelle's Search Blog F’in Spam

10 thoughts on “F’in Spam”

Rob says:

October 6, 2007 at 12:10 pm

Two options:

(1) Simply bypass MT and use your web server (Apache) to ban the IP addresses, or limit the number of requests that can be made at any one time.

(2) Use a CAPTCHA rather than your current solution.
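To make option (1) concrete, here is a minimal sketch of per-IP banning plus a sliding-window request limit. It is written in Python at the application level rather than as Apache configuration, so it only illustrates the idea. The class name `RateLimiter`, its parameters, and the example addresses are all hypothetical, not anything MT or Apache provides.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Hypothetical per-IP limiter: a ban list plus a sliding time window."""

    def __init__(self, max_requests=5, window_seconds=60.0):
        self.max_requests = max_requests      # assumed limit, tune to taste
        self.window_seconds = window_seconds  # assumed window length
        self.banned = set()
        self._hits = defaultdict(deque)       # ip -> timestamps of accepted requests

    def ban(self, ip):
        """Permanently refuse requests from this address."""
        self.banned.add(ip)

    def allow(self, ip, now=None):
        """Return True if a request from `ip` should be accepted."""
        if ip in self.banned:
            return False
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = RateLimiter(max_requests=2, window_seconds=10)
limiter.ban("198.51.100.9")
print(limiter.allow("203.0.113.7", now=0))   # first request in the window
print(limiter.allow("198.51.100.9", now=0))  # banned address
```

Note that a limiter like this only slows down a single address; it does nothing against a spammer who rotates IPs.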

Matt Cutts says:

October 6, 2007 at 12:11 pm

“Sohbet” means “chat” in Turkish. I was talking to someone about Turkish spam this past week. You see a lot of old-school stuff in that market that you don’t see in a lot of other languages. “Toplists” are really common, for example.

nmw says:

October 6, 2007 at 2:33 pm

Polish seems to currently be hot in the running, too.

It *is* a nuisance. Maybe publishing the precise date and IP address could help organizations that fight spam “capture” such people (?) — and/or if these people knew that they are running a risk of being identified and/or prosecuted, that might be a good deterrent (and it would be less of a burden on people who are not doing anything wrong).

Bill Burnham says:

October 6, 2007 at 5:03 pm

I get their spam all the time too. Seems like they use actual humans to post their spam so there’s really no defense. I’ve tried banning their IP addresses but they always use different ones. I am not sure which is more evil: Kim Jong Il or Sohbet.

Hollywood says:

October 6, 2007 at 9:24 pm

Never turn comments off under any circumstances.

The comments on this blog and O’Reilly’s are some of the most erudite on the social Web.

The comments are just as valuable as the posts – in fact they add an extra dimension and new perspectives because of the quality of the readership.

nmw says:

October 7, 2007 at 2:09 am

How about “enlisting” the readership to help out by flagging spam? I’m not talking about the readers deciding what gets posted or not, but just to help out.

Also: I find it noteworthy that this blog — being about conversations and all that jazz — does not have the quintessential “community thing”: a login (note that SearchMob seems to have a different focus, so I’m not counting that). Would that help — only being able to post as a “member” — with the “risk” of having your membership revoked if you start spamming or something like that?

My gut feeling is that this would be very effective.
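A rough sketch of the flagging idea, in Python: readers flag a comment, and once enough distinct readers have done so it is surfaced to the moderator, who still makes the final call. The `FlagQueue` class, the threshold of 3, and the IDs are hypothetical choices for illustration only.

```python
from collections import defaultdict


class FlagQueue:
    """Hypothetical reader-flagging queue; the moderator still decides."""

    def __init__(self, threshold=3):
        self.threshold = threshold       # assumed number of distinct flaggers
        self._flags = defaultdict(set)   # comment_id -> set of reader ids

    def flag(self, comment_id, reader_id):
        """Record one reader's flag; repeat flags from the same reader count once."""
        self._flags[comment_id].add(reader_id)
        return self.needs_review(comment_id)

    def needs_review(self, comment_id):
        """True once enough distinct readers have flagged the comment."""
        return len(self._flags.get(comment_id, ())) >= self.threshold

    def review_queue(self):
        """Comment ids the moderator should look at, in sorted order."""
        return sorted(cid for cid in self._flags if self.needs_review(cid))


queue = FlagQueue(threshold=2)
queue.flag(101, "reader-a")
queue.flag(101, "reader-a")   # duplicate, ignored
queue.flag(101, "reader-b")
print(queue.review_queue())
```

Counting distinct readers rather than raw flags keeps one person from pushing a comment into review alone; with a login, the reader IDs could be member accounts.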

SorenG says:

October 7, 2007 at 8:01 pm

Agree with Norbert (nmw) above. A simple registration process may be able to help — and a way to tag spam.

nmw says:

November 4, 2007 at 8:58 am

On another site, I have been a vehement advocate for “transparent” feedback — and I still feel this is a valid concept (indeed, I think a large part of digg’s success lies in how adeptly they “manage” feedback issues — and Mr. Burka also recently posted a slideshow about this, which I have linked to on http://how-to-help.com )

Perhaps digg might someday “syndicate” its feedback expertise (oh no — get ready for it: “as a service” ;P)….

Since I have seen my comments quite frequently suppressed, I get the feeling that another kind of “syndicate” is also forming and/or growing — and since link-based and/or purely “algorithmic” search has been more/less written off (returning only stale “news” from ages ago), I feel we find ourselves at a watershed moment.

In order to be a strong brand in search, it must be based on the kind of quality characteristics we observe with http://digg.com — funny logos on the homepage do not a reliable “search engine” make (actually, every website is a “search engine” — if you really think about it [even Google admits that its results are only recommendations — and there are quite a few who believe that the results are “manipulated” by Google’s management — as in the case of “miserable failure”]).

In order to get at the “truly” relevant results (Google’s censorship of “miserable failure” and/or other similar value judgments show that the company is increasingly returning value judgments as “results” and that they have apparently abandoned algorithmic search themselves [but have yet to come up with a “rational” reason why Google’s value judgment ought to be considered better than yours or mine]), we may need to return to the notion of the web as a “flat” system.

What is a reliable indicator of quality? Certainly, digg’s model works quite well. Facebook’s model, I would say, still remains to be proven. Google’s model of wholesale distribution, however, is almost certainly a folly. Don’t get me wrong: I’m not saying that a lot of people won’t sink a lot of money into the “open social” idea — all I’m saying is that it will be about as exciting as the “hohum” SERPs we have all already gotten accustomed to from what used to be a quite good “search engine”.

I think in the end, those sites which (like http://digg.com ) actually go beyond simply embracing the transparency and openness of the web — and actually build upon this model (and thereby even underscore “non-anonymity” — or at least “separate the wheat from the chaff” by doing more to differentiate between “fly-by” quackery and arguments that are “backed up” by some kind of sign-on-the-dotted-line scenario) — those sites will become web 3.0

And I also strongly feel (hope, expect) that such sites will be created by communities (and their lifeblood will indeed be the conversations within [note: not “among” or “across”] those communities) — communities that care about this/that issue or topic….

It is the “community participants” that gather at such focused (and/or “targeted”) sites who are most motivated to collaboratively create high quality information.

So far, we can only see “bits” and “pieces” of such topically focused communities forming — on some sites, such as hotels.COM, not much has been done to promote conversation — yet. On other sites, such as help.COM, not much has been done to “bubble up” the best “answers” to a given problem/issue (or even to “bubble up” the “most important” problem/issue) — yet. On other sites, such as work.COM, not much has been done to validate the expertise (or even the authenticity) of persons “voting” for the “most useful” guides — yet.

The big question is, I guess: How long will it take before they do?

Because when they do, you might be lucky to find someone who is still willing to pay $7 for a share in the stock of some “old-fashioned” link-based algorithm…

;D nmw

chat.net.in says:

January 26, 2008 at 3:53 am

John,

I don’t know how far along you are with this now, but it seems like the current situation is penalizing the wrong people. I still see plenty of “sohbet” spam, but some users are complaining about not being able to post links (for example, see battellemedia.com/archives/004216.php#comment_128128 — which sought to link to a very interesting article by Jay Westerdal [ blog.domaintools.com/2008/01/google-to-kill-domain-tasting ], but the poster was apparently unable to do so; I found Jay’s article stupendous — immediately dugg it and posted it on various forums to get the word out about the “good news”… and it is indeed now one of the top-ranking articles in “Tech Industry” news on digg.com).

I think it would be good if you could keep us posted on the progress you are (or aren’t) making about the spam issue. It seems that some of your most ardent users are frustrated — and that, I feel, is not a good state. I hope a better solution can be found soon — but even if this remains difficult, it would be good to keep the conversation going (so to say ;).

Thanks!

🙂 nmw

John Battelle says:

January 26, 2008 at 12:00 pm

The “human detector” I put in place is working, but I do admit it seems to randomly cut off some people, for reasons I don’t understand. The spam we see coming through is hand rolled, not automated, so it’s relatively easy for me to pick off as it comes in (though it does pile up when I am on the road…)
