In order to help combat wiki spam I have coded the following measures into DakarRelease. The measures can't stop people spamming your wiki, but they can stop spam links from being followed by robots, such as the Google spider.

In TWiki.cfg there are two new options:

# Options added to external links (links to URLs that do not match
# {AntiSpam}{Clean}). Public sites should set this to 'rel="nofollow"'
# to prevent wiki spammers gaining any benefit from spamming.
$cfg{AntiSpam}{Options} = '';

# Regular expression that must match the start of any external links
# that are _not_ to have the {AntiSpam}{Options} added. The default
# is to leave links to twiki.org and to the server site untouched.
$cfg{AntiSpam}{Clean} = qr,(http://(www\.)?twiki\.org\b|$cfg{DefaultUrlHost}|\W),io;
and a new function in TWiki.pm:
---++ StaticMethod spamProof( $text ) -> $text

Find all explicit links (<a etc) in $text and apply anti-spam measures
to them. This method is designed to be called on text just about to be printed to the
browser, and needs to be very fast.

Links to URLs that match $cfg{AntiSpam}{Clean} are left untouched. All
other links have $cfg{AntiSpam}{Options} added.
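
The behaviour described above could be sketched roughly as follows. This is not the DakarRelease code, only a minimal illustration under some assumptions: the host in $cfg{DefaultUrlHost} is a placeholder, the settings live in a local %cfg hash, and only links whose first attribute is href are handled.

```perl
use strict;
use warnings;

# Assumed settings for a public site, mirroring the two options above.
# The host name is a placeholder, not a real installation.
my %cfg;
$cfg{DefaultUrlHost}    = 'http://wiki.example.com';
$cfg{AntiSpam}{Options} = 'rel="nofollow"';
$cfg{AntiSpam}{Clean}   =
    qr,(http://(www\.)?twiki\.org\b|$cfg{DefaultUrlHost}|\W),i;

# Sketch only: add {AntiSpam}{Options} to each <a href="..."> whose URL
# does not start with something matching {AntiSpam}{Clean}.
sub spamProof {
    my ($text) = @_;
    my $options = $cfg{AntiSpam}{Options};
    return $text unless length $options;    # nothing to add
    my $clean = $cfg{AntiSpam}{Clean};
    $text =~ s{(<a\s+href=(["'])(.*?)\2)}{
        # Copy the captures first: the inner match below overwrites $1.
        my ( $tag, $url ) = ( $1, $3 );
        $url =~ /^$clean/ ? $tag : "$tag $options";
    }gei;
    return $text;
}

print spamProof('<a href="http://spam.example.net/">buy now</a>'), "\n";
print spamProof('<a href="http://twiki.org/">TWiki</a>'), "\n";
```

The \W alternative in the Clean expression means that relative links (those starting with / or #) count as clean, so only absolute links to other hosts pick up the extra options.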

May all spammers burn in the fires of eternal damnation! (that goes for email too, folks)

-- CrawfordCurrie - 02 Apr 2005

We now have two methods to combat WikiSpam, this one and BlackListPlugin, and they work in different ways. This one adds rel="nofollow" to all links except those specifically excluded, whilst BlackListPlugin adds it only to new links (I assume it won't get added twice, not that it would cause much harm).

It's possible you might want to use the different methods on different areas of the site. Therefore it would be nice if DakarAntiSpamMeasures could be turned off on a per-web basis.
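
One way a per-web switch might look, purely as an illustration: {AntiSpam}{SkipWebs} and antiSpamOptionsFor are hypothetical names invented for this sketch and do not exist in DakarRelease.

```perl
use strict;
use warnings;

my %cfg;
$cfg{AntiSpam}{Options} = 'rel="nofollow"';

# Hypothetical option: webs whose links should be left untouched.
$cfg{AntiSpam}{SkipWebs} = { Codev => 1, Plugins => 1 };

# Hypothetical helper: return the options to add to links in a given web,
# or an empty string if anti-spam measures are turned off for that web.
sub antiSpamOptionsFor {
    my ($web) = @_;
    return '' if $cfg{AntiSpam}{SkipWebs}{$web};
    return $cfg{AntiSpam}{Options};
}

print "Sandbox: '", antiSpamOptionsFor('Sandbox'), "'\n";
print "Codev:   '", antiSpamOptionsFor('Codev'),   "'\n";
```

The rendering code would then use the returned string instead of reading {AntiSpam}{Options} directly.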

Webs like Main/Sandbox don't get many people viewing their WebChanges regularly, as they are mostly user pages and therefore somewhat off-topic to the rest of the site. This means that links in them probably don't deserve to be followed, as they are more than likely unrelated to the site. They are also less likely to be policed by the general populace, which leaves the job of removing wiki spam to admins. These webs would benefit from DakarAntiSpamMeasures, as it would mean that WikiSpam in them could be ignored as harmless. It could be cleaned up periodically, but there would be no need to check it every day.

However, other webs, e.g. Codev & Plugins on this site, probably should allow their links to be followed, because they are more than likely related to the content and focus of the site and therefore deserve a boost in PageRank / Search Fu. (How many links to other wikis/sites that deserve to be indexed and gain rank are there in the Codev & Plugins webs?) Changes to these areas of the site are followed closely by many people, so they will mostly be self-policing. BlackListPlugin would only be needed to stop any WikiSpam that does appear from being indexed before someone can clean it up.

On the other hand, closing down some areas of a site whilst opening up others may move the spam from Main/Sandbox, where it is currently, to the other webs, which could end up annoying legitimate users. Still, since the net effect should be that no spam gets indexed, you'd hope that the spammers would move elsewhere before you had to lock down the site totally. I'd certainly like to try it.

-- SamHasler - 03 Apr 2005

Topic revision: r4 - 2008-09-17 - TWikiJanitor