
chinese wiki spam

-- BoudRoukema - 14 Apr 2005

Since about February or March 2005 I've had Chinese spam registrations on my own TWiki (a list of spammer usernames, together with some IP info, is at http://adjani.astro.uni.torun.pl/cgi-bin/twiki/view/Main/SerweryChinski), and it seems to be common. The goal appears to be to improve Google rank rather than to do any serious vandalism. I couldn't find any references to this on twiki.org, so this page is an attempt to collect observations of the phenomenon, attempts to communicate with the spammer(s), and suggested solutions.

various experiences of the spam

chongqed wiki - summary of different antiwikispam strategies

  • http://wiki.chongqed.org//WikiSpam
  • http://wiki.chongqed.org//AntiSpamRecommendations
    • "noindex nofollow" on old revision pages
      • twiki (debian 20040902-1) seems to have noindex on old pages, but not nofollow
    • Content banning - regular expressions - blacklist
      • Would this be easy on twiki?
    • Surge protection
      • Probably not too difficult to implement??? (But maybe also not a priority.)
    • specific wiki engines: as of 15 April 2005, there doesn't seem to be any TWiki-specific antispam strategy listed there. Under "Specific Recommendations" the page says: "It's difficult to be specific without talking about specific wiki software. We are attempting to build a set of specific recommendations / instructions for the various wiki engines out there:
      OddMuse, UseMod, MediaWiki, MoinMoin, DokuWiki, Wakka, Wikka, PmWiki
      However this is a work in progress. We recommend you also take a look on the homepage of your wiki engine. You should find antispam recommendations there, although some engine developers are dragging their feet with this."
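
To make the "content banning" idea in the list above more concrete, here is a minimal sketch in Python of a regular-expression blacklist check that could run before an edit is saved. It is not TWiki code and not the chongqed implementation; the pattern list and the function names `find_banned_content` and `accept_edit` are made up for illustration, and a real list would have to be maintained by the wiki admin.

```python
import re

# Hypothetical blacklist patterns (assumptions for illustration only).
BLACKLIST_PATTERNS = [
    r"https?://[^\s<>]*\.example-spam\.invalid",  # any URL on a banned domain
    r"cheap\s+server\s+hosting",                   # a banned phrase
]

# Compile once; matching is case-insensitive.
_BLACKLIST = [re.compile(p, re.IGNORECASE) for p in BLACKLIST_PATTERNS]


def find_banned_content(text):
    """Return the blacklist patterns that match the submitted text."""
    return [rx.pattern for rx in _BLACKLIST if rx.search(text)]


def accept_edit(text):
    """Accept an edit only if no blacklist pattern matches it."""
    return not find_banned_content(text)
```

A save handler would call `accept_edit(new_text)` and refuse the save when it returns `False`. Since the spammer described further down this page is a human, a blacklist like this can only block known URLs and phrases; it does not stop someone who changes the text.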

wikimedia solution - remove the motivation with rel=nofollow

http://en.wikibooks.org/wiki/Wikibooks%3AVandalism_in_progress Chinese link spam

Note that this has subsided since before Christmas 2004. The Wikimedia software has been updated so that the exploit which ranks pages higher (Google results order is determined by the number of links containing search text to the page) by adding many links, known as "Linkspam", no longer works. The page source of each wiki page contains a certain "nofollow" command which now keeps the vandalism from succeeding in its desired effect. (cited 14 April 2005)

I couldn't find an exact example, but it seems to be something like:

<a href="http://some.url" rel="nofollow">please buy time on my nice chinese server</a>

The rel=nofollow tells Google and other search robots not to give the linked page any ranking credit for the link, so the link no longer helps the spammer. In the long term the spammer(s) may realise this and give up spamming, if they really check the Google rankings that result for individual sites.

brutal solutions

  • close off open registration :-(
  • block the IPs of one billion people living in a not-very-free country who need as much internet access as possible :-(

automatic solutions fail since the spammer(s) pass the Turing test (they are humans)

  • http://lathi.net/diary/Internet/twiki_spam.writeback
    • Yesterday, I got tired of it and thought I'd try to fix it. Assuming this is some 'bot that's registering and then spamming, I edited my TWikiRegistration page to rename the register script to something silly and create a completely hidden form that submitted to the now-bogus register script.
      Imagine my surprise when I had more spam this morning. It turns out this spammer is creating these pages by hand! Checking the access log, I can see he's googling for "TWikiUsers", following the link, then clicking on the link to register and filling out the form. So renaming the register script doesn't do any good! No Turing test is going to fool this guy and still allow real humans to register because he is a real human!

direct contact

  • http://chongq.blogspot.com/2005/03/spam-huntress-talks-with-friendly.html
    • Wednesday, March 23, 2005 Spam Huntress talks with friendly spammer - Spam Huntress has been discussing things with a "friendly spammer." Well, I am not sure he is actually friendly, but Halz used that term so I will go with it. This admitted spammer thinks he has the solution to solve blog spam and wants to share it with antispammers.
      I suspect he thinks he has some real good ideas to prevent spam, but just because he is a spammer doesn't mean he is an expert at knowing how to prevent spam. There are a lot of great solutions to web spam, but most are inconvenient or totally unwanted by users. CAPTCHAs are annoying and only work against automated spamming. Redirects and nofollow mean that even legitimate links do not receive any PageRank influence. I don't really see a big problem with that on blog comments, but apparently many bloggers do. Most wiki users hate it too, though Wikipedia is using it.
      What he doesn't get is spammers are morons. Maybe he isn't, but he also isn't the usual Chinese spammer who probably gets paid a few dollars for a day's worth of spamming. We have lots of spammers hit our wiki and many work really hard to get around our spam prevention methods. Usually by posting a text only URL which is not useful for PageRank or getting hits. posted by Joe @ 3:54 PM

more thoughts

Will the wikimedia method work? Are there any ideas for twiki solutions?

The wikimedia method will work, but it's a solution that cures the disease by killing the patient. By using the nofollow in your links not only are you stopping the spammer from gaining page rank, you are stopping yourself from gaining page rank as well [this is not correct, see below - SH]. If you don't care about your page rank, why have the website on Google at all? If it is a private site, get rid of the domain name altogether and access it directly by IP address. So no, the wikimedia method will not work (for me), because the success of my (portal wiki) sites depends on their being seen by the public. The only method I can find is to remove the spam manually as soon as you find it (before Google gets a chance to crawl the page). If these clowns don't get the satisfaction they are seeking, hopefully they will move on to other things.

  • If the spamming grows exponentially it will at some point get beyond what any human admin can handle. A few spammers per week can be handled by a human, but 100 spams per day would mean several hours' work doing nothing but anti-spamming. -- BoudRoukema - 15 Apr 2005

If somebody can provide a better solution, I am all ears.

  • I added some more links above (the chongqed wiki suggestions); there's quite a good list there, with summaries of the different approaches. I put the section near the top of this page because it seems to be a good wiki synthesis. -- BoudRoukema - 15 Apr 2005

-- TravisBarker - 15 Apr 2005
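
On the volume concern in the comment above: the "surge protection" item from the chongqed list could be pictured roughly as a per-client sliding-window limit. The Python sketch below is only an illustration under stated assumptions, not TWiki code; the class name `SurgeProtector`, the default limits, and the idea of keying on an IP address or username are all hypothetical.

```python
from collections import defaultdict, deque


class SurgeProtector:
    """Allow at most max_edits edits per client within window_seconds."""

    def __init__(self, max_edits=5, window_seconds=60.0):
        self.max_edits = max_edits
        self.window_seconds = window_seconds
        # client id (e.g. IP address or username) -> times of recent edits
        self._history = defaultdict(deque)

    def allow(self, client_id, now):
        """Record and allow an edit at time `now` (in seconds), or refuse it."""
        recent = self._history[client_id]
        # Forget edits that have fallen out of the time window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_edits:
            return False  # too many edits in the window; refused, not recorded
        recent.append(now)
        return True
```

A save handler would call something like `protector.allow(remote_ip, time.time())` before accepting an edit. This only caps how much one client can post in a burst; it does nothing against a patient human who stays under the limit.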

"by using the nofollow in your links not only are you stopping the spammer from gaining page rank, you are stopping yourself from gaining page rank as well"

That is not correct. You only add nofollow to external links; all the internal links still get followed, and links from other sites into the site still get followed, so there shouldn't be any loss of pagerank.
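
As a rough illustration of "only add nofollow to external links", here is a minimal Python sketch of a link renderer. It is not the Wikimedia implementation; the function name `render_link` and the `site_host` parameter are assumptions made for this example.

```python
from html import escape
from urllib.parse import urlparse


def render_link(url, text, site_host):
    """Render an HTML link, adding rel="nofollow" only if it leaves the site."""
    host = urlparse(url).hostname  # None for relative URLs such as /view/Main
    is_external = host is not None and host.lower() != site_host.lower()
    rel = ' rel="nofollow"' if is_external else ""
    return '<a href="{}"{}>{}</a>'.format(escape(url, quote=True), rel, escape(text))
```

With `site_host="wiki.example.org"`, a link to `/view/Main/WebHome` or to `http://wiki.example.org/...` is rendered without `rel`, while a link to `http://some.url` gets `rel="nofollow"`. Links from other sites into the wiki are not touched by this at all, since they are rendered elsewhere.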

See my comment on DakarAntiSpamMeasures for what I think is the best method to combat spam, particularly registration spam.

-- SamHasler - 15 Apr 2005

TopicClassification TWikiDevQuestion
TopicSummary documenting chinese spam and suggested solutions
InterestedParties twiki admins, twiki developers

Topic revision: r4 - 2005-04-15 - SamHasler
Copyright © 1999-2018 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.