Help Me Stop HubAdverts Dot Com! | John Battelle's Search Blog

Help Me Stop HubAdverts Dot Com!

By - October 19, 2012

I’ve been working with my site design partner Blend to try to track down a spammer who has taken my entire site and repurposed it as their own, replete with tons of ads and a clear intent to draft off Searchblog’s quality content (if I do say so myself) and, most likely, its pagerank as well.

The site is “hubadverts.com” and no, I’m not going to link to it. Each of my posts is ripped off as a URL including that domain – if you click on the domain, you get a scammy feeling ecommerce site. But at “hubadverts.com/on-data/” for example, you will see a recent post of mine, scraped in its entirety.

The funny thing about this site is that it scrapes my full text RSS feed, then rebuilds my site. Then it has spammy sites trackback to the rebuilt site, and leave comments there. Oddly, those trackbacks and comments are emailed to me as if I were the WordPress administrator of the site. Of course, the last thing I am going to do is try to log into the back end of the site, because that would give the spammers access to the back-end login information of my own site. It’s phishing and blackhat SEO all rolled into one!

The “news hub” where my ripped-off posts reside includes an ad urging folks to “Unblock the Pirate Bay,” which concerns me, because just writing this post probably is inviting a DDOS attack. But I don’t think ripping off my site and damaging my reputation is defensible, and I’m speaking up about it.

I emailed Toni Schneider, the CEO of Automattic (the company behind WordPress.com), for advice, and he suggested I change my RSS feeds so the scrape includes attribution. I did so, and sure enough, now the spam site attributes Searchblog and links back to it. (I am very fortunate to have Toni as a colleague!) However, while this proved the site was scraping my RSS feed, it doesn’t solve the problem. Toni suggested some other remedies, which we are looking into, but he also suggested I do what I’m doing now: public shaming. After all, the site is violating my non-commercial Creative Commons license, and quite possibly damaging my own pagerank – Google doesn’t like it when spammy sites are seen as linking to you, and it hates duplicate content.

So I want it stopped. But a lookup of the site’s owners shows the listing is private – I don’t have anyone to go after. And the site itself is an endless mousetrap of scammy ecommerce sites, among other things.

Hence, I’m asking you, the Searchblog readers, who are always smarter than I, to help me figure out a way to make this right. Any ideas?

  • http://twitter.com/grumpyhawk Chris M. Carroll

    seems to be redirecting to thepartsdirect dot com now.

    • johnbattelle

      Yeah, it’s pretty slippery.

  • Ian

    If they are using the same ip for each rss request or trackback this is easy: dynamically insert a uuid in each served RSS response, log the uuid-client_ip pairings, write a spider that crawls the rip-off site to collect the uuids of stolen posts, dynamically block the client ip from the rss server.

    Presumably they are not, so you’ll end up building a roster of their rss client network, which might be effective over time, but doesn’t yield an immediate or decisive result. If they are an aggregation service, the IP address of their client should reveal that and you can work with them to identify the service user ID.
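
    The approach Ian describes could be sketched roughly as below. This is only an illustration under assumptions: the function names, the hidden `span` marker, and the in-memory `log` dictionary are all hypothetical, and a real feed server would persist the token-to-IP pairings somewhere durable. It shows the two halves of the idea: stamp every served feed response with a fresh UUID tied to the requesting IP, then scan scraped pages for those UUIDs to learn which IPs fed the scraper.

```python
import re
import uuid
from typing import Dict, Iterable, List, Set

# Matches the canonical lowercase text form produced by str(uuid.uuid4()).
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def serve_feed(items: List[str], client_ip: str, log: Dict[str, str]) -> List[str]:
    """Return feed item bodies stamped with one fresh token for this request.

    `log` maps token -> client IP (hypothetical storage; a dict for the sketch).
    """
    token = str(uuid.uuid4())
    log[token] = client_ip
    # Assumption: an empty span survives a verbatim scrape of the item body.
    return [f'{body}<span data-wm="{token}"></span>' for body in items]


def ips_to_block(scraped_pages: Iterable[str], log: Dict[str, str]) -> Set[str]:
    """Collect the client IPs whose tokens show up in the scraped pages."""
    found: Set[str] = set()
    for page in scraped_pages:
        for token in UUID_RE.findall(page):
            if token in log:
                found.add(log[token])
    return found
```

    As Ian notes, if the scraper rotates addresses this yields a growing roster of its clients rather than a one-shot block.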

    • johnbattelle

      You lost me at “write a….” but I’ll make sure the right people see this thank you!

  • http://twitter.com/l1mb0 Eric Black

    They are hosted by Cloud Flare which is right here in SF. I’d start with abuse@cloudflare.com. You can also go for the source. Hubadverts admin and technical contact is Usman Farooq in Essex info@computershub.co.uk… Looks like they’re ripping off Slate and some other folks too.

    • johnbattelle

      Thanks Eric. I will send Cloudflare a .. flare! And I’ll send a note to that email, though I’m guessing that he’ll be unresponsive!

      • http://pencoyd.com/clock/ John B. Roberts

        John, I work at CloudFlare, and we got your first “flare” last week, and my colleagues responded with our abuse process, which is informed by http://blog.cloudflare.com/thoughts-on-abuse (long blog post from our CEO).

        Two notes:
        1) CloudFlare’s core security will let you block offending IP addresses, and more, at all levels.
        2) We have an app called ScrapeShield which will let you track content re-use very effectively, even in an RSS feed (though that may require an extra step, depending on how the feed is generated).

        Both of those are no charge options. CloudFlare started with the goal of giving webmasters control over who visited their site, though we’re doing lots more than that now.

    • http://www.maxminzer.com/ Max Minzer

      Emailing them will be useless, I think.
      I would actually avoid contacting them, because they could later work out who did this harm (below) to them and try to retaliate.
      They hacked your site – there’s no reason for you to contact them.

      I’d go with contacting hosting and request a take down, like Eric said.
      Also file a DMCA notice on top of that for the infringing content. Do that for both domains.
      Try to take their hosting down to make sure they lose ALL their sites there. I hope they have like 50 so all of them are gone with that one hosting account.
      Gather all the evidence and details you can.

      I really hope you’ll hunt them down and have it all resolved, John.

      • johnbattelle

        Turns out, CloudFlare is not the host. Filing a DMCA takedown? Ugh.

        • http://www.maxminzer.com/ Max Minzer

          They infringed your content – that you can prove for sure.
          Use that to do whatever it takes.

          I’ve never been in a situation like this myself. Just sharing what I know.
          And tried to spread the message to have someone else help.

          I’m sure you know a couple of good friends who know a lot about this. Even someone in G who works closely with DMCA and such.
          Don’t hesitate to ask someone for help directly.
          You’re very valuable to our industry.

        • http://www.maxminzer.com/ Max Minzer

          If you need quick answers from legal experts, John, feel free to ask:
          https://plus.google.com/108833983815923249955/posts/HCmfF1vzb9y

          • johnbattelle

            Thank you!

    • http://electrojams.com/ Jordan Meeter

      Cloudflare is not a host, they are more of a proxy / CDN.

      • johnbattelle

        Yep

  • http://www.nicheoptimizer.com/justin-lewis/ Justin Lewis

    Good call on not linking to the site ;) Don’t give them the credibility that you are connected to them in any way.

  • John Larkin

    John, perhaps you could use your .htaccess file to combat this. This website describes a method you can use to block an IP address: http://blamcast.net/articles/block-bots-hotlinking-ban-ip-htaccess I had trouble with referrer spam some years back and I had to resort to these types of methods. Cheers, John
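
    For readers wondering what such an .htaccess rule looks like, here is a minimal sketch that generates one. It is an assumption-laden illustration, not the method from the linked article: the helper name `htaccess_block` is hypothetical, the addresses in the usage note are reserved documentation addresses, and the output uses the Apache 2.4 `Require` syntax (Apache 2.2 used `Order`/`Deny from` lines instead).

```python
import ipaddress
from typing import Iterable


def htaccess_block(ips: Iterable[str]) -> str:
    """Build an Apache 2.4 access block that denies the given IP addresses."""
    lines = ["<RequireAll>", "    Require all granted"]
    # dict.fromkeys drops duplicates while keeping the original order.
    for raw in dict.fromkeys(ips):
        # ip_address() raises ValueError on anything that is not a valid IP,
        # so a typo never ends up in the server configuration.
        ip = ipaddress.ip_address(raw)
        lines.append(f"    Require not ip {ip}")
    lines.append("</RequireAll>")
    return "\n".join(lines)
```

    Calling `htaccess_block(["203.0.113.7", "198.51.100.2"])` returns text that could be pasted into an .htaccess file, one `Require not ip` line per address.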

    • johnbattelle

      Thanks! I’ve sent your and other suggestions to my site ninja, who actually knows what you are talking about…

  • http://www.googlingsocial.com/ Chris Lang

    One last thought: if his WhoIs details are fake, you can go after the domain registration, since that is illegal by ICANN standards. And he is registered through GoDaddy :]

    • johnbattelle

      Thanks Chris. It’s such a mess. My IT pro found his hosting company, and US IP provider. If anyone wants to help, info below:

      • http://www.googlingsocial.com/ Chris Lang

        What surprised me is that most did not know what WhoIs was.

        I would just have my attorney deliver a cease and desist to his door in the UK. Then begin legal proceedings there if you really want to kick him where it hurts.

        Civil suits start at $15,000 to defend in the US. Depends on how mad you are :}

        • johnbattelle

          We’ll see.

  • http://electrojams.com/ Jordan Meeter

    @johnbattelle:disqus are you familiar with Google’s new “disavow links” feature? This may help at least tell Google that the links coming your way are not authorized or condoned.

    Link: http://googlewebmastercentral.blogspot.com/2012/10/a-new-tool-to-disavow-links.html
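
    The disavow tool takes a plain text file with one entry per line: either a full URL or a `domain:` entry covering a whole site, with `#` marking comment lines. A small sketch of building such a file follows; the helper name `build_disavow` and the `scraper.example` domain in the usage note are hypothetical placeholders.

```python
from typing import Iterable


def build_disavow(domains: Iterable[str], urls: Iterable[str] = (), note: str = "") -> str:
    """Return the text of a disavow file: optional comment, domains, then URLs."""
    lines = []
    if note:
        lines.append(f"# {note}")
    # A "domain:" entry asks for every link from that site to be ignored.
    lines.extend(f"domain:{d.strip().lower()}" for d in domains)
    # Individual URLs are listed as-is, one per line.
    lines.extend(u.strip() for u in urls)
    return "\n".join(lines) + "\n"
```

    For example, `build_disavow(["scraper.example"], note="scraped copies of my posts")` produces a two-line file that could be saved and uploaded through the tool.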

    • http://electrojams.com/ Jordan Meeter

      @johnbattelle:disqus just wondering if you tried this approach and what you thought of it?

      • johnbattelle

        Not yet but will try soon

  • Peter Pottinger

    Why not just block him using .htaccess or a firewall? It should be easy to get a pattern from the logs. You could (or I could easily) program a bot to auto-block IPs based on this pattern of scraping, so as to head off any future attempts.
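
    A rough sketch of the log-pattern idea, under stated assumptions: the access log is in the common Apache/NGINX log format, the feed lives under `/feed`, and "scraping" simply means many feed requests from one address. The function name, the path prefix, and the threshold are all hypothetical choices for illustration.

```python
import re
from collections import Counter
from typing import Iterable, List

# Captures the client IP and the requested path from a common-log-format line.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD|POST) (\S+)')


def suspicious_ips(log_lines: Iterable[str], path_prefix: str = "/feed",
                   threshold: int = 100) -> List[str]:
    """Return IPs that requested paths under `path_prefix` at least `threshold` times."""
    hits: Counter = Counter()
    for line in log_lines:
        match = LOG_RE.match(line)
        if match and match.group(2).startswith(path_prefix):
            hits[match.group(1)] += 1
    return sorted(ip for ip, count in hits.items() if count >= threshold)
```

    The resulting list could then be fed into whatever blocking mechanism is in use, whether .htaccess rules or a firewall. A legitimate feed reader that polls often would also trip a simple count like this, so the threshold needs care.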

    • johnbattelle

      I think we got it nailed a month or so ago. Thanks for the advice!