
WikiSpam

NOTE: All administrators of public TWiki sites are encouraged to upgrade to the latest BlackListPlugin, version 2013-03-22. It prevents known wiki-spam from getting saved in TWiki topics and uploaded as HTML attachments, makes scripted registrations harder, and protects the site from excessive use by an IP address. It also protects TWikis from topic text and attachments containing malicious script calls such as eval() and escape().
NOTE: TWiki administrators are encouraged to subscribe to the TWikiAnnounceMailingList to get alerted of security issues, new TWiki releases and important spam related announcements.
NOTE: A new type of spam has emerged: HTML attachment spam. This spam is nasty because it might identify TWiki sites as spam sites.

Wiki spam is a growing problem on public wiki sites. Actually, it is not isolated to wikis; any website that can be updated by users is a potential target for spam, such as blogs and bulletin boards. What is considered spam on a website? In the broadest sense, it is any content that is off-topic and unwanted by the website's users. The most common spam on writable websites is Wikipedia:Link_spam: Spammers add links to their websites in many wikis, blogs, and bulletin boards, with the hope that search engines will raise the ranking of their page. In other words, spam is added not for human consumption, but for search engine spiders. Unfortunately, this strategy works: spam sites are listed on the first page of search results, as can be seen in this Google search for Alprazolam. You will understand why if you search for Alprazolam and wiki.

What can you do as an administrator of a public wiki site?

  • Rule 1: Enable spam protection
  • Rule 2: Remove spam as quickly as possible when it happens. Reason: Spammers identify easy targets by searching sites for known spam keywords. It pays off to spam sites where spam survives long enough for search engines to spider the content.

Most of the wiki spam happens on wiki pages. Anonymous users or newly registered users add link spam to their wiki homepage and other wiki pages. Spammers are getting more sophisticated. A new type of spam has emerged recently: HTML attachment spam on wikis where users can attach HTML files to a wiki. This spam is nasty because it might identify wiki sites as spam sites. It works like this: A spammer attaches a web page to a wiki with lots of ads for what they sell. Then they add link spam to many wikis, blogs and bulletin boards to raise the page ranking of the HTML page attached to the wiki.

Public TWiki sites have been spam targets for some time already. TWiki has a BlackListPlugin that is quite effective in fighting spam. The Plugin gets updated every time a new spam twist is discovered, such as an HTML redirect obfuscated in a JavaScript eval statement. The BlackListPlugin fights spam on several fronts:

  • Multiple registrations by the same IP address in rapid succession
  • Multiple page saves by the same IP address in rapid succession
  • Saving text with known wiki-spam (spam list is maintained and shared by TWiki, MoinMoin and Mediawiki sites)
  • Attaching files with known wiki-spam
  • Attaching files with JavaScript eval statements
  • Manually maintained BLACKLIST of malicious IP addresses
  • Automatically updated BANLIST of IP addresses with suspicious activities
  • Registration form with magic number in hidden form field to make scripted registrations harder
  • Add a rel="nofollow" parameter to external URLs to defeat the purpose of spamming TWiki sites
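The last item in the list above can be illustrated with a small sketch. This is not the BlackListPlugin's code (the Plugin is written in Perl); it is a hypothetical Python illustration of the idea, assuming that links are rendered as plain `<a href="...">` tags and that `INTERNAL_HOSTS` (an invented name) lists the hosts that should stay untouched.

```python
import re

# Hypothetical list of hosts considered "internal"; all other hosts are external.
INTERNAL_HOSTS = {"twiki.org"}

# Matches an opening <a ...> tag with an absolute http(s) URL; group 1 is the host.
LINK_RE = re.compile(r'<a\s+[^>]*?href="https?://([^/"]+)[^"]*"[^>]*>', re.IGNORECASE)


def add_nofollow(html: str) -> str:
    """Append rel="nofollow" to external links that do not carry a rel attribute."""

    def fix(match: re.Match) -> str:
        tag = match.group(0)
        host = match.group(1).lower()
        if host in INTERNAL_HOSTS or "rel=" in tag.lower():
            return tag  # internal link, or rel already set: leave as is
        return tag[:-1] + ' rel="nofollow">'

    return LINK_RE.sub(fix, html)


print(add_nofollow('<a href="http://spam.example/pills">pills</a>'))
```

Search engines that honor the attribute do not count such links toward the ranking of the target page, which removes the incentive described above.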

Administrators of public TWikis are strongly recommended to install the latest BlackListPlugin. The reality, however, is that there are still many public TWiki sites that do not even have this Plugin installed. To address the issue, we sent out several spam related alerts to the twiki-announce mailing list, and we sent personal e-mails to some site owners not on the list. Still, the awareness of wiki spam needs to be raised so that more site owners take action.

(Above content largely taken from Peter Thoeny's Wiki Corner on Wiki Spam on Public Wikis)

Related links on wiki spam:

-- Contributors: PeterThoeny

Discussions

At TWiki.org we occasionally have an issue with people deleting or altering content, either on purpose or because they do not understand how a wiki works. This happens mainly in the Main web, and sometimes in the TWiki web. It is a minor annoyance that can be fixed quickly.

Yesterday we had the first case of Wiki:WikiSpam where HasitRuparel with IP address 219.65.75.99 was spamming over a dozen user home pages by adding these URLs: (edit page to see the URL)

The log suggests that this person edited the pages manually:

| 10 Feb 2004 - 03:20 | Main.HasitRuparel | save | Main.HasitRuparel | repRev 1.1 Main.HasitRuparel 2004/02/10 10:52:00 | 219.65.75.99 |
| 10 Feb 2004 - 03:27 | Main.HasitRuparel | save | Main.AndreaMarchetti |  | 219.65.75.99 |
| 10 Feb 2004 - 03:28 | Main.HasitRuparel | save | Main.AndreasKapp |  | 219.65.75.99 |
| 10 Feb 2004 - 03:30 | Main.HasitRuparel | save | Main.BillLeeney |  | 219.65.75.99 |
| 10 Feb 2004 - 03:31 | Main.HasitRuparel | save | Main.BillKelly |  | 219.65.75.99 |
| 10 Feb 2004 - 03:32 | Main.HasitRuparel | save | Main.AldenWilner |  | 219.65.75.99 |
| 10 Feb 2004 - 03:32 | Main.HasitRuparel | save | Main.BenjaminDrieu |  | 219.65.75.99 |
| 10 Feb 2004 - 03:33 | Main.HasitRuparel | save | Main.AllenBierbaum |  | 219.65.75.99 |
| 10 Feb 2004 - 03:33 | Main.HasitRuparel | save | Main.BalthasarSieber |  | 219.65.75.99 |
| 10 Feb 2004 - 03:34 | Main.HasitRuparel | save | Main.AndreaSterbini |  | 219.65.75.99 |
| 10 Feb 2004 - 03:58 | Main.HasitRuparel | save | Main.ChristopheVermeulen |  | 219.65.75.99 |
| 10 Feb 2004 - 03:58 | Main.HasitRuparel | save | Main.BrillPappin |  | 219.65.75.99 |
| 10 Feb 2004 - 03:59 | Main.HasitRuparel | save | Main.ChristopheVermeulen | repRev 1.4 Main.HasitRuparel 2004/02/10 11:59:00 | 219.65.75.99 |
| 10 Feb 2004 - 03:59 | Main.HasitRuparel | save | Main.BjornStadil |  | 219.65.75.99 |
| 10 Feb 2004 - 03:59 | Main.HasitRuparel | save | Main.BrillPappin | repRev 1.2 Main.HasitRuparel 2004/02/10 11:59:00 | 219.65.75.99 |
| 10 Feb 2004 - 04:00 | Main.HasitRuparel | save | Main.ChrisMcLennan |  | 219.65.75.99 |
| 10 Feb 2004 - 04:05 | Main.HasitRuparel | save | Main.CarrieCoy |  | 219.65.75.99 |
| 10 Feb 2004 - 04:10 | Main.HasitRuparel | save | Main.DacreWroe |  | 219.65.75.99 |

Now, a spammer could raise the Google ranking of his web site by spamming Wiki pages in an automated way, which is a scary thing for public Wikis.

Here is an interesting related post on the WikiForum by Arno Hollosi:

John Abbe wrote:
> Well, the NCDD wiki was found, and spammed by a robot before we even
> have gone public. I'm trying to nudge the team off a sudden interest
> in HardSecurity. At the same time, on Wiki:ReverseLinkDisabled
> there's mention of turning away IPs with a high request rate.
> Can anyone offer good starting settings for such a protection - how
> many requests in how short a time to trigger it?

On Sensei's Library (http://senseis.xmp.net/), which is one of the largest non-Wikipedia wikis, I have a 3-step measure:

  • limit requests/minute: anything beyond 30 requests within 60 seconds and the IP address is disabled for 5 minutes. If after that the maximum gets exceeded again within an hour then the IP address is disabled for 24 hours.

  • shield resource intensive requests (or edit links etc.) by checking for an HTTP referer header that originates from your site. Effective as well. Some browsers (privacy proxies, ...) suppress the referer header. Those people have to set a (preference) cookie in order to access those functions.

  • one of the most effective measures is adding a "trap link", i.e. if the link is followed the IP address is immediately added to the block list (at Sensei's for 48 hours). Mark this link as "Disallow" in your robots.txt file so compliant robots don't follow the link. At Sensei's look at source and search for "Blockme" to find the trap link - users are not able to activate it, as it contains no link-text.

In my experience of running this high traffic site, the trap link in combination with the referer header is most effective. The requests/minute is only there so that people don't mirror the wiki with wget or some other tool.
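The requests/minute rule described above can be sketched as follows. This is a minimal Python illustration, not the code running at Sensei's Library; the class name `RateLimiter` and its attributes are invented for the example. One assumption: "exceeded again within an hour" is measured here from the time of the previous offense, which the post does not specify.

```python
import time
from collections import defaultdict, deque

WINDOW = 60             # seconds over which requests are counted
MAX_REQUESTS = 30       # requests allowed per window
SHORT_BLOCK = 5 * 60    # first offense: blocked for 5 minutes
LONG_BLOCK = 24 * 3600  # repeat offense: blocked for 24 hours
REPEAT_WINDOW = 3600    # assumption: measured from the previous offense


class RateLimiter:
    def __init__(self):
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.blocked_until = {}         # ip -> time at which the block ends
        self.last_offense = {}          # ip -> time of the last offense

    def allow(self, ip, now=None):
        """Return True if the request may proceed, False if the IP is blocked."""
        now = time.time() if now is None else now
        if self.blocked_until.get(ip, 0) > now:
            return False
        hits = self.hits[ip]
        hits.append(now)
        while hits and hits[0] <= now - WINDOW:
            hits.popleft()  # forget requests older than the window
        if len(hits) <= MAX_REQUESTS:
            return True
        previous = self.last_offense.get(ip)
        repeat = previous is not None and now - previous <= REPEAT_WINDOW
        self.blocked_until[ip] = now + (LONG_BLOCK if repeat else SHORT_BLOCK)
        self.last_offense[ip] = now
        hits.clear()
        return False
```

Passing `now` explicitly makes the limiter easy to exercise without waiting; in a real request handler it would be called as `limiter.allow(remote_addr)`.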

See also: http://senseis.xmp.net/?AccessBlocked

Is it time to write a WikiSpamPlugin?

-- PeterThoeny - 12 Feb 2004

Looks like the 'apocalypse' mentioned in SpamProofingOfComments (first comment) has finally arrived :-(

There are two quite different sorts of WikiSpam, I believe:

  • SpamProofingOfComments - what just happened at TWiki.org, could be manual or automated. SlashDot style 'you can't edit another page for N minutes' tests could help here, particularly if people are creating unique userids and not using TWikiGuest. IP-based checking is useful for the TWikiGuest account. Another approach would be to use SpamAssassin to check the content of comments/edits to TWiki pages, which is probably the only defence against patient manual or automated comment spam (i.e. the user or robot does 1 comment every 20 minutes, say).
  • Excessive page views by robots (search engines or mirroring) - this is the target of the Sensei's Library feature mentioned above. It is a component in SpamProofingOfComments.

-- RichardDonkin - 12 Feb 2004

We had Wiki Spam in the Main web again: spam similar to the above has been added to almost 100 pages by IP addresses 203.88.152.253 and 203.88.152.17. The BlackListPlugin is the answer to this spam; the Plugin is installed at TWiki.org. Please provide feedback on the Plugin at BlackListPluginDev.

Edit this topic in case you want to see the topic save logs of the spammer and the content of spammed user home pages.

-- PeterThoeny - 21 Mar 2004

Last night we had the same issue again (same spam content), this time registering 46 dummy users and spamming 36 existing user topics, using IP addresses 203.88.155.244 and 203.88.155.135. I updated the BLACKLIST in BlackListPlugin accordingly. JohnTalintyre and I scrubbed the walls clean. Quickly removing graffiti is key; there are plenty of walls around elsewhere for the artist to use if the artwork disappears quickly on TWiki.org. The real solution of course is to enhance the Plugin as described in BlackListPluginDev.

Edit this topic in case you want to see the topic save logs of the spammer and the content of spammed pages.

-- PeterThoeny - 22 Mar 2004

ApprovingRegistrations via email verification would at least make us a less easy target.

-- MartinCleaver - 23 Mar 2004

Just as with normal (email) spam, IP address blocking is a fairly blunt instrument that causes some 'collateral damage', particularly since WikiSpam is so new and not in the AUPs of ISPs yet. Content-based filtering may be the better way to go, as with email spam.

WebLogs have had this issue somewhat longer than TWiki, I think. Movable Type (Perl-based blog software) has an mt-blacklist plugin that addresses this sort of issue by blocking certain keywords and URLs - if it's GPLed and suitable, we could just adapt this.

Alternatively, we could just use SpamAssassin, though I'm not sure how well that would work on web pages rather than emails. It includes a very neat Bayesian filtering approach that gets better over time through learning what is spam and non-spam, but it also works 'out of the box' with many keyword and URL filters that stop spammers straight away.

-- RichardDonkin - 23 Mar 2004

I think it is premature to go for content-based filtering since Wiki spam content is not the same as e-mail spam; we would need to maintain a Wiki spam content database.

I released and installed a new Plugin version at TWiki.org that adds a BANLIST and a WHITELIST. See details in TWiki.BlackListPlugin and Plugins.BlackListPluginDev. It protects against abuse by robots (excessive page access rate) and Wiki spam as seen here on TWiki.org.

With the current configuration anyone is allowed to view topics once every 2 seconds, and to save topics once every 10 seconds. This is an average, measured over a period of 5 minutes. We can fine tune the numbers over time.
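To make the numbers concrete: an average of one view every 2 seconds over 5 minutes is 300 / 2 = 150 views, and one save every 10 seconds is 300 / 10 = 30 saves. The Python sketch below shows one way such an averaged limit could be checked; how the Plugin actually measures the rate is not described here, so the function names and the sliding-window approach are assumptions.

```python
PERIOD = 300                             # measuring period: 5 minutes, in seconds
MIN_INTERVAL = {"view": 2, "save": 10}   # average seconds required between actions


def max_actions(action: str) -> int:
    """Number of actions allowed within one measuring period."""
    return PERIOD // MIN_INTERVAL[action]


def over_limit(action: str, timestamps: list, now: float) -> bool:
    """True if the actions recorded in the last PERIOD seconds exceed the limit."""
    recent = [t for t in timestamps if now - PERIOD < t <= now]
    return len(recent) > max_actions(action)


print(max_actions("view"), max_actions("save"))
```

Because the limit applies to the average, a short burst (for example, ten views in ten seconds) is tolerated as long as the total over the period stays within bounds.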

Let me know if you want to have your IP address added to the WHITELIST. We don't want to ban very active contributors!

-- PeterThoeny - 04 Apr 2004

This morning we had the first case of an IP address being automatically added to the BANLIST. Someone with IP address 132.185.132.12 was trying to pull content from TWiki.org in a rapid manner (although the URL was incorrect, using Codev/RcsLiteBugsraw=debug instead of Codev/RcsLiteBugs?raw=debug). A few lines of the access log:

| 06 Apr 2004 - 08:20 | Main.guest | view | Codev.RcsLiteBugsraw=debug |  (not exist) | 132.185.132.12 |
| 06 Apr 2004 - 08:20 | Main.guest | view | Codev.DistributedMembershipAndPreferencesraw=debug |  (not exist) | 132.185.132.12 |
| 06 Apr 2004 - 08:20 | Main.guest | view | Codev.AddLeftMenuraw=debug |  (not exist) | 132.185.132.12 |
| 06 Apr 2004 - 08:20 | Main.guest | view | Codev.WebRssraw=debug |  (not exist) | 132.185.132.12 |
| 06 Apr 2004 - 08:20 | Main.guest | view | Codev.LastDiffFeatureraw=debug |  (not exist) | 132.185.132.12 |

IP range 132.185.0.0 - 132.185.255.255 is used by employees of British Broadcasting Corporation. Someone with this IP address registered at TWiki.org a few days ago using a screen name and e-mail address owiki@owikiPLEASENOSPAM.org.

-- PeterThoeny - 06 Apr 2004

Automated or rapid downloading of content is not a spam issue. It's a "please take it easy on the server" issue. Consequently I don't think banning is the right response; better to just throttle the speed of responses to that IP, for a limited period of time (no need to permanently throttle an IP when the downloading "session" may only last for a day or two).

-- MattWilkie - 06 Apr 2004

Matt is right, it is not a spam issue. Folks on the BANLIST get a friendly note to contact the site admin if they got on the list in error.

The log files on 06 Apr showed that 132.185.132.12 tried to grab as much content as quickly as possible, stressing the server. I assumed a deliberate attack since I as the site admin did not get informed of a stress test in advance. In the meantime, MS sent an e-mail to the core team members, indicating that he did it and that it was "intended as a good faith test". Next time I would appreciate it if the site admin gets a heads up in advance. His IP address is now removed from the BANLIST.

-- PeterThoeny - 10 Apr 2004

Wouldn't it be a solution to implement a CAPTCHA (Wikipedia:captcha) into the edit window? A user must solve the CAPTCHA to save the page. So no automatic spamming by a computer is possible anymore.

-- BeatDoebeli - 15 Jul 2004

There is a proposal on http://openwiki.com/ow.asp?WikiSpam to gradually introduce captcha to users others opt for. Interesting.

-- MattisManzel - 16 Jul 2004

http://wordpress.org/development/2004/12/fight-spam/ has some plugins whose methods could be used to combat wikispam.

-- MartinCleaver - 02 Jan 2005

Google proposes a method: SpamDefeatingViaNofollowAttribute

-- ColasNahaboo - 19 Jan 2005

The BlackListPlugin now supports the rel="nofollow" attribute.

-- PeterThoeny - 22 Jan 2005

Wouldn't it be enough to put CAPTCHA in the TWikiRegistration form? I'd hate to put this on each and every topic save, that would be punishing the users for the spam...

-- TorbenGB - 18 Feb 2005

My site (which uses the RegisterCgiScriptRewrite) just got a bogus registration - I suspect that the individual is in the process of automating registration confirmation.

CAPTCHA would be a good idea: I'd welcome and would give support to someone volunteering to do it. I do not have the time needed to implement this myself.

We also need to streamline deleting registration and blocking certain sites/email addresses from registering.

Registration is a workflow that would benefit if workflow handling was in the core. Thomas - is your code ready for primetime? Thanks.

I note that TikiWiki already implements CAPTCHA and that there exists a Cpan module to do it.

http://www.ogre.com/tiki-view_blog_post.php?find=&blogId=3&offset=0&sort_mode=created_desc&postId=96#comments

The CPAN module depends on GD.

-- MartinCleaver - 01 Mar 2005

Wanna find public TWikiInstallations :-( Then follow this link to a Google search on wiki-spam. They all install email addresses to some account@126.com, an Asian website.

Is there any centralized collection of wiki-spammers, i.e. on TWikiSites?

-- MichaelDaum - 31 Mar 2005

At this time we do not collect the Spammer info from the public TWikis.

TWiki.org keeps track of spammers at BlackListLog. I regularly remove accounts with ...@126.com addresses that do WikiSpam to Asian web sites.

-- PeterThoeny - 01 Apr 2005

Did anyone think of mailing abuse@126PLEASENOSPAM.com? They may be willing to have a sharp word or two with their resident spam merchants; most ISPs don't like them, either.

-- CrawfordCurrie - 01 Apr 2005

Additional gear against WikiSpam:

  1. reject registrations with email addresses matching @126.com (and others) in bin/register
  2. remove the "Comments" field from the registration form in TWikiRegistration
  3. add %REMOTE_ADDR% to the templates/registration.*.tmpl
These are 30sec fixes to your TWikiInstallations (each?).
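The first fix in the list above boils down to a domain check on the submitted e-mail address. bin/register is a Perl script, so the Python below only illustrates the logic; `BLOCKED_DOMAINS` and `registration_allowed` are hypothetical names, and the blocked list contains only the domain mentioned in this discussion.

```python
# Hypothetical deny list; 126.com is the domain named in this discussion.
BLOCKED_DOMAINS = {"126.com"}


def registration_allowed(email: str) -> bool:
    """Reject malformed addresses and addresses at blocked domains."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return False  # no "@", or an empty local part or domain
    return domain.lower() not in BLOCKED_DOMAINS


print(registration_allowed("account@126.com"))   # blocked domain
print(registration_allowed("user@example.org"))  # accepted
```

A check like this is a blunt instrument: it also turns away legitimate users of the blocked mail provider.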

-- MichaelDaum - 04 Apr 2005

I would like to add that someone has started sending spam email to the disposable http://spamgourmet.org address I use for my TWiki registration. Shortly this address will reach its limit and I shan't receive any more email via that address, so it's not so much a problem for me, but your other users are probably having their contact addresses harvested in the same way.

Apologies if this is not the correct place to address this issue; I did look around and this seemed the most appropriate I could find.

The abridged headers of the last two follow.

[headers with my address snipped]

Received: from smtp13.eresmas.com (smtp13.eresmas.com [62.81.235.113])
   by gourmet.spamgourmet.com (8.12.11/8.12.11) with ESMTP id j4N96OmJ027392
   for <twiki.5.wu-lee@spamgourmet.org>; Mon, 23 May 2005 02:06:25 -0700
Received: from [192.168.105.171] (helo=ma23.eresmas.com)
   by smtp13.eresmas.com with esmtp (Exim 4.10)
   id 1Da8Q6-0001pd-00; Mon, 23 May 2005 10:36:18 +0200
From: "MR BRUCE SMITH - BsmithRBS@eresmas.com"<+twiki+wu-lee+37876ab41d.BsmithRBS#eresmas.com@spamgourmet.com>
To: BsmithRBS@eresmas.com
Reply-To: "brucesmithrbs@box.az"<+twiki+wu-lee+6a202969b8.brucesmithrbs#box.az@spamgourmet.com>
Message-ID: <33667332ffcf.32ffcf336673@ma23.eresmas.com>
Date: Mon, 23 May 2005 08:36:18 GMT
X-Mailer: Netscape Webmail
MIME-Version: 1.0
Content-Language: en
Subject: LETTER OF INTENT! (twiki: message 4 of 5)
X-Accept-Language: en
Content-Type: text/html; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

<table border=0 width="100%" cellpadding="8"  cellpadding="8"><tr><td bgcolor="#ffffff"><SPAN style="FONT-SIZE: 11px; COLOR: #000000; FONT-FAMILY: monospace">
<P><BR><BR>From the office of:Mr. Bruce Smith<BR>ROYAL BANK OF SCOTLAND<BR>United Kingdom.<BR>+447040116575<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <BR>LETTER OF INTENT!<BR>Dear Sir/Madam,</P>

<P>I am Bruce Smith. I am 50 years old,a native of<BR>Scotlland .I got your information while browsing<BR>through internet in search of a reliable person<BR>who will assist me.On 31th of July 2000, an Oil<BR>consultant/contractor, Mr. Kurt Kahle with&nbsp; ROYAL<BR>DUTCH PETROLEUM COMPANY, he made a numbered time<BR>(fixed) deposit for twelve calendar months,<BR>valued at&nbsp; US$20,700,000.00(Twenty Million Seven<BR>Hundred Thousand Dollars ) with ROYAL BANK OF<BR>SCOTLAND branch upon&nbsp; maturity. I sent a rountine<BR>notification to his forwarding address but got no<BR>&nbsp;reply.</P>

[rest of email snipped]

[headers with my address snipped]

Received: from mk-smarthost-3.mail.uk.tiscali.com (mk-smarthost-3.mail.uk.tiscali.com [212.74.114.39])
   by gourmet.spamgourmet.com (8.12.11/8.12.11) with ESMTP id j4KDKubO008640
   for <twiki.5.wu-lee@spamgourmet.org>; Fri, 20 May 2005 06:20:57 -0700
Received: from mk-webmail-1.b2b.uk.tiscali.com ([212.74.112.91]:3206)
   by mk-smarthost-3.mail.uk.tiscali.com with esmtp (Exim 4.30)
   id 1DZ7Oe-0003Au-C8; Fri, 20 May 2005 13:18:36 +0000
Received: from exim by mk-webmail-1.b2b.uk.tiscali.com with local (Exim 4.24)
   id 1DZ7Od-0002dS-Ik; Fri, 20 May 2005 14:18:35 +0100
From: "joykamara1000@i12.com"<+twiki+wu-lee+41c598c79e.joykamara1000#i12.com@spamgourmet.com>
Reply-To: "joykamara1000@yahoo.fr"<+twiki+wu-lee+0158ddd6f5.joykamara1000#yahoo.fr@spamgourmet.com>
Subject: Dear Sir,/Madam  (twiki: message 3 of 5)
Date: Fri, 20 May 2005 14:18:35 +0100
Mime-Version: 1.0
Content-Type: text/plain; format=flowed; charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <E1DZ7Od-0002dS-Ik@mk-webmail-1.b2b.uk.tiscali.com>

Dear Sir,/Madam
Email   (joykamara1000@yahoo.fr) 

My name is Joy Kamara, 21 years from Sierra Leone. My father and I escaped 
from our country at the heat of the civil war after loosing my mother and
two of my senior brothers in the war. 

[rest of email snipped]

-- NickWoolley - 23 May 2005

TWiki Release 4.0 (DakarRelease) has better protection against e-mail harvesting.

-- PeterThoeny - 23 Jan 2006

See update on WikiSpammersOnTWikiSites, specifically on TWiki.org spamming.

-- PeterThoeny - 23 Jan 2006

A couple of months ago I wrote a little plugin (VisualConfirmPlugin) that asks for visual confirmation when a user registers. It is still a bit unpolished, and untested with DakarRelease (although that will come soon, as I am in the process of upgrading all my TWiki installations to DakarRelease).

-- KoenMartens - 09 Feb 2006

Virtually all TWiki sites I have installed have been attacked with spam that is added to the page during rendering. I'm not familiar enough with the TWiki code to know where this is being done. I've looked in the obvious places; perhaps someone can help.

First, let me say that this is NOT editing the text data of the twiki page. It is added just after the BODY tag and some other tags. Here is the section, edited to eliminate the SPAM (and domain names are replaced with redacted to avoid being blacklisted myself!)

<body class="twikiViewPage" onload="initPage()"><a name="PageTop"></a>
<div class="twikiHidden"><a href="#Content">Skip to topic</a> | <a 
href="#PageBottom">Skip to bottom</a><hr /></div><div class="twikiTopBar"><div 
class="twikiTopBarContents"><form name="top" 
action="/twiki/bin/view/Gcca/WebHome"> <u style="display: none;">... no changes ... 
no changes ... no changes ... no changes ... no changes ... no changes ... no 
changes ... no changes ... no changes ... no changes ... no changes ... Have a nice 
day! More links: <a href="http://redacted" target="_top">cheap adipex</a> - <a 
href="http:/redacted" target="_top">generic valtrex</a>: generic valtrex, order 
valtrex, discount valtrex, See also  <a 

... Blah Blah Blah This goes on for about 50 pages.

href="http:/redacted/buy-cheap-valtrex-online/" target="_top">order valtrex</a> here; 

</form></div></div><div class="twikiMiddleContainer"><div class="twikiLeftBar"><div 
class="twikiWebIndicator"><b>Gcca</b></div>

Does anyone know what was modified to allow this to be hacked in? The site is nonfunctional with it in, although it might be invisible if it were added to many typical web pages.

See http://www.gccaweb.org to try the site. The hacked portion makes it so the page will not display but if you inspect the source, you can see what is happening. The text of the page was not altered by this attack, but it apparently affects all pages on the site.

NOTE: My other sites are also hacked but I didn't know it. They were not broken the way this one was. For example, http://www.pack362.org is also hacked, but you can't tell.

-- RaymondLutz - 13 Mar 2006

Thanks for reporting. It looks like they edited your TWiki.WebTopBar and TWiki.WebBottomBar. That will show the spam on every topic.

Spam signatures for BlackListPlugin: eteamz\.active\.com, itsmysite\.com, div-item\.com, zorpia\.com
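The signatures above are regular expressions (note the escaped dots). As a rough illustration of how such a list can be applied to topic text before a save, here is a Python sketch; it is not the BlackListPlugin's implementation, and the function name `find_spam` is made up for the example.

```python
import re

# The four signatures quoted above, written as raw strings.
SIGNATURES = [
    r"eteamz\.active\.com",
    r"itsmysite\.com",
    r"div-item\.com",
    r"zorpia\.com",
]

# Combine all signatures into one case-insensitive alternation.
SPAM_RE = re.compile("|".join(f"(?:{s})" for s in SIGNATURES), re.IGNORECASE)


def find_spam(text: str):
    """Return the first matching signature text, or None if the text is clean."""
    match = SPAM_RE.search(text)
    return match.group(0) if match else None


print(find_spam("See http://www.zorpia.com/cheap-pills for details"))
```

A save handler would refuse the save whenever `find_spam` returns something other than `None`.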

-- PeterThoeny - 13 Mar 2006

Yes, thanks for your help. Also, they got TWiki.WebLeftBar. You've saved me a lot of time. BlackListPlugin here we go!

-- RaymondLutz - 14 Mar 2006

As of today, TWiki.org's local list is merged into the shared spam list at http://arch.thinkmo.de/cgi-bin/spam-merge. Thanks Thomas Waldman for enabling this! The local spam list is exported to http://twiki.org/feeds/_spam_list.txt

That means, spam signatures I add on TWiki.org will appear automagically on your TWiki site if you are using the BlackListPlugin. There is some delay: 10 min cache on twiki.org, 10 min cache on arch.thinkmo.de, plus 60 min cache by BlackListPlugin (by default.)

-- PeterThoeny - 30 Apr 2006

There is a notorious spammer on TWiki.org: A spammer is trying to add insurance-top\.com to various topics; 16 topics this month. The spammer is using a new IP address each time. Here is the log of the latest attack:

| 29 Jun 2006 - 02:57 | TWikiGuest | view | Main.DwayneCox |  Mozilla | 220.72.196.67 |
| 29 Jun 2006 - 02:57 | TWikiGuest | edit | Main.DwayneCox |  Mozilla | 203.113.20.70 |
| 29 Jun 2006 - 02:58 | TWikiGuest | blacklist | Main.DwayneCox | SPAMLIST add: 82.233.26.21, topic spam 'insurance-top.com' Mozilla | 82.233.26.21 |
| 29 Jun 2006 - 02:59 | TWikiGuest | save | Main.DwayneCox |  Mozilla | 211.219.155.41 |
| 29 Jun 2006 - 02:59 | TWikiGuest | view | Main.DwayneCox |  Mozilla | 59.10.50.60 |

IP addresses used: 194.165.130.93, 200.142.202.140, 201.66.64.10, 207.29.224.155, 211.162.62.161, 213.113.123.15, 217.10.190.36, 218.144.149.8, 220.72.196.71, 220.72.196.76, 221.187.80.175, 58.147.0.35, 68.230.22.107, 70.159.21.50, 80.58.205.40, 82.233.26.21.

The spammer saves the topic twice: Once with spam (which fails), once by adding a "display: none" section, which needs to be cleaned up: <u style="display: none;">... no changes ... no changes ... no changes ... no changes ... no changes ... no changes ... no changes ... no changes ... no changes ... no changes ... no changes ...  </u>
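Cleaning up the leftover hidden section can be automated. The following Python sketch assumes the junk always has the exact shape shown above, a `<u style="display: none;">` element with no nested `<u>` tags; it is an illustration, not a tool that exists on TWiki.org.

```python
import re

# Matches the hidden <u style="display: none;"> ... </u> block left by the spammer.
HIDDEN_RE = re.compile(
    r'<u\s+style="display:\s*none;?"\s*>.*?</u>',
    re.IGNORECASE | re.DOTALL,
)


def strip_hidden(text: str) -> str:
    """Remove every hidden <u> section from the topic text."""
    return HIDDEN_RE.sub("", text)


sample = 'Hello <u style="display: none;">... no changes ... no changes ... </u>world'
print(strip_hidden(sample))
```

The non-greedy `.*?` stops at the first closing `</u>`, so legitimate text between two separate hidden blocks is preserved.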

-- PeterThoeny - 29 Jun 2006

Hi... Someone created an account and about 50 porn pages on my inactive Twiki site. Does anyone have a script to roll back all changes made in the last 6 hours? Thanks!

-- CharlesHowes - 04 Jul 2006

Hah. The 50 porn pages were only noticed because someone created an account for themselves, which sent me an email. The rest of the site had been defaced a month before as 'TwikiGuest'. I decided to just restore from backup instead of rolling back the changes, as I use rdiff-backup hourly.

-- CharlesHowes - 04 Jul 2006

Oh yeah, and I changed the site's url, so that the search engines will go away, and the spammers searching for: "TWiki" TWikiGuest backlinks will not find it. Plus robots.txt now says:

User-agent: *
Disallow: /

So we'll see how long it lasts this time.

-- CharlesHowes - 04 Jul 2006

Spammers are getting more active. Here are some stats on attempted spam on TWiki.org, based on number of blacklist log entries with REGEXPIRE or SPAMLIST:

| Month | Attempts |
| 2006-01 | 62 |
| 2006-02 | 67 |
| 2006-03 | 94 |
| 2006-04 | 59 |
| 2006-05 | 99 |
| 2006-06 | 226 |
| 2006-07 | 569 |
| 2006-08 | 96 |

-- PeterThoeny - 21 Jul 2006

Unfortunately, this is typical spammer behaviour. The more you fight back, the harder they try. This is not to say that we should give in, but just a sad observation based on experience.

I don't know how many of you run mail servers, but those that do will notice that much or most of their spam comes from home users. This is not because these users are spamming intentionally, mind you, but one way that spammers have adapted is to write viruses or hire virus writers to infect machines. This way they get a highly distributed network of spamming machines, none of which spew much, and none of it comes from them directly. A big, sad sigh. Damn you, Microsoft!

-- MeredithLesly - 21 Jul 2006

I just saw another example of wiki spam or vandalism: the user HanBaochuan has defaced the DownloadTWiki page and replaced it with a lot of Chinese text, all of it linking to the same site. I overwrote the changes with the previous version from the topic history, but it would be much nicer if the "More topic actions" link included an option to revert the last changes. This would help to fight against spam by hand when the automatic tools do not detect it.

-- RaphaelQuinet - 07 Aug 2006

Thanks for cleaning up the DownloadTWiki page. I additionally removed the last revisions so his junk is wiped also from the history.

As an admin you can remove the last revision by adding ?cmd=delRev to the edit URL - reload this and then save. You do not have to write anything. But since this action is irreversible it is not open to the general public.

I actually added this to my TWiki

In templates/oopsmore.pattern.tmpl I have added

%IF{"$ USERNAME = 'KennethLavrsen'" then='%TMPL:P{"oopsmoredelrev"}%'}%

And oopsmoredelrev contains

%TMPL:DEF{"oopsmoredelrev"}%
---+++++ %ICON{"trash"}% *%MAKETEXT{"Delete last revision"}%*
   * <b><a href="%SCRIPTURLPATH{"edit"}%/%WEB%/%TOPIC%?cmd=delRev" rel="nofollow">%MAKETEXT{"Delete last revision"}%</a></b>.
     %MAKETEXT{"Only use this feature to remove spam"}%%TMPL:END%

This is a terrible hack, hardcoded to my name. To make this a clean enhancement, I am missing a context that indicates that the user is an admin.

-- KennethLavrsen - 07 Aug 2006

Nice teamwork, from the logs I have seen that the spam got removed after 30 minutes. I also removed the HanBaochuan account.

Actually, it is pretty shortsighted to spam a highly visible topic like DownloadTWiki.

-- PeterThoeny - 07 Aug 2006

I have documented the technique I use for keeping out spam. It's pretty simple: If you want to edit my wiki you have to prove that you aren't a spammer, and I add you to Main.NonSpammerGroup. It's a manual process, and I'm sure some people will "get in" by mistake. However, it's clean and simple. The problem is that it involves very... specific... steps. Therefore, I have documented in great detail the specific steps to get it working.

http://wiki.everythingsysadmin.com/twiki/bin/view/Main/TomAntiSpamTechnique

Feel free to copy my instructions into the twiki.org website or link to my page.

-- TomLimoncelli - 07 Aug 2006

You have "When the webmaster sees that someone has registered, if we know the person we add them right away. Otherwise, we wait for them to complain". On a usability scale from 0-10 I give that a -1.

It will for sure limit spam but I can assure you it also keeps 90% of potential good users away. When someone registers it is because they want to submit a support question or contribute with something or make a bug report. If they register and then cannot edit right away a major part of them leave and never come back. And that is a high price to pay to avoid spam. Spammers are already annoying as it is.

On some "family" or small community type TWikis this may fly. But on a wiki that supports a large community that sometimes grows by 10-100 users each day, such a manual process is a pain, both for new users who sometimes have to wait 1-2 days and for the admin who has to spend 5-30 minutes daily approving new users.

-- KennethLavrsen - 07 Aug 2006

I agree, this is a case by case decision. On TWiki.org we need to have a low barrier to contribute ideas, bugs, support questions, doc enhancements. Therefore I prefer to keep twiki.org open to edit to any newly registered user, and to keep the Support and Sandbox webs open to TWikiGuests. Collaboration is fostered, although this requires more spam monitoring.

-- PeterThoeny - 07 Aug 2006

Traditional CAPTCHAs, based on recognition of characters displayed as a picture, have well-known accessibility problems: they prevent blind people from registering, and thus are poor Turing tests. Another type of CAPTCHA solves this issue: asking for the solution to a simple arithmetic operation both prevents (current) bots from answering and allows blind users to answer. The Dotclear blog engine has a plugin (in French) implementing this. It should be quite easy to implement this in CaptchaPlugin as an alternative way of screening out spammers.

-- BenVoui - 29 Sep 2006
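
A minimal sketch of the arithmetic-question idea above, in Python rather than TWiki's Perl. The function names `make_challenge` and `check_answer` are hypothetical and not part of CaptchaPlugin or the Dotclear plugin. It assumes the server keeps the expected answer in its own session state and never sends it to the browser.

```python
import random


def make_challenge(rng=random):
    """Return (question_text, expected_answer) for a simple addition.

    The question is plain text, so a screen reader can read it aloud.
    The expected answer is meant to be stored server-side (an assumption
    of this sketch), not embedded in the registration form.
    """
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {a} plus {b}?", a + b


def check_answer(submitted, expected):
    """Return True only if the submitted form value is the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except (ValueError, AttributeError):
        # Non-numeric or missing input counts as a failed challenge.
        return False
```

A registration script would call `make_challenge()` when rendering the form and `check_answer()` when the form is posted. A question this simple stops only bots that do not parse the text, which matches the "(current) bots" caveat above.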

This week I got two e-mails about guestbook spam linking to TWiki sites. Example:

I am the owner of <site>. This is a member only website designed for the use and enjoyment of the members of my alumni association. On that site I maintain a guestbook, which is also for our members for the purposes of staying in touch. Recently, our guestbook has been inundated with spam from several of your members. Below is a list of those we have received unsolicited spam from to date.

aids@bcartfj.com
mzkr@syiprfs.com
igde@torxsxc.com
dsiw@mkmddge.com
eyom@niyriwu.com
fepi@ptmeaxd.com
cjyd@lddpglz.com

I would very much appreciate your help in getting the spam from your users discontinued. I do not believe that it is your intent to provide a server for your members to harass others in this manner.

This puts TWiki in a bad light. I have been sending out over 100 e-mails to TWiki site owners with known wiki spam. What else can we do to combat spam and keep TWiki from getting associated with spam?

-- PeterThoeny - 10 Oct 2006

Hmm, is this very TWiki site not protected against the "no changes ..." spammer mentioned by Raymond? I just googled ["no changes" site:twiki.org]. The top link was Main / WebChanges. The version of that page cached by Google on October 23, 13:21 GMT, showed that the pages MattWilkie and PeterMarshall had been hit by the spammer under the guise of TWikiGuest...

-- JoseI - 01 Nov 2006

TWiki.org topics are currently free of spam from the 'no changes' spammer. There may be a time lag of a few hours between getting hit by the spammer and cleaning up the trash, so Google might pick up some 'no changes' text. (No harm done besides the nuisance, since only plain text is able to pass the BlackListPlugin, not links.)

-- PeterThoeny - 01 Nov 2006

One idea for a different and improved version of BlackListPlugin: use the Akismet web service, which is designed for use by blog software and apparently works very well at large blog sites such as GigaOM. They provide a PHP client for WordPress blogs, and there is already a Perl client, CPAN:Net::Akismet, that has been used for a Movable Type plugin (written in Perl).

One drawback is that it is free only for non-commercial use, but that is defined as "making less than $500/month from your blog", so even TWiki.org would qualify. Intranet sites wouldn't want to submit their comments to Akismet.com, but then they probably don't get WikiSpam anyway.

-- RichardDonkin - 24 Nov 2006
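
A rough sketch of what a call to such a service might look like, written in Python for illustration (a real TWiki plugin would more likely use the Perl client mentioned above). The endpoint URL and field names are assumptions based on Akismet's public comment-check API and should be verified against its current documentation. The helper names `build_check_request`, `is_spam` and `topic_text_is_spam` are hypothetical.

```python
from urllib import parse, request

# Assumed endpoint; check the Akismet API documentation before relying on it.
AKISMET_URL = "https://rest.akismet.com/1.1/comment-check"


def build_check_request(api_key, site_url, user_ip, user_agent, author, text):
    """Build (but do not send) a POST request asking whether `text` is spam."""
    fields = {
        "api_key": api_key,
        "blog": site_url,            # front page URL of the wiki
        "user_ip": user_ip,          # IP address of the submitter
        "user_agent": user_agent,    # browser User-Agent of the submitter
        "comment_type": "comment",
        "comment_author": author,    # e.g. the wiki user name
        "comment_content": text,     # the topic text being saved
    }
    data = parse.urlencode(fields).encode("utf-8")
    return request.Request(AKISMET_URL, data=data, method="POST")


def is_spam(response_body):
    """Interpret the response body, assumed to be the text 'true' or 'false'."""
    return response_body.strip().lower() == "true"


def topic_text_is_spam(req, timeout=10):
    """Send the request over the network and return the service's verdict."""
    with request.urlopen(req, timeout=timeout) as resp:
        return is_spam(resp.read().decode("utf-8"))
```

A save handler could call `topic_text_is_spam(build_check_request(...))` before writing the topic and reject the save on a `True` result. What to do when the service is unreachable is a policy decision left open here.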

I am regularly sending out WikiSpamAlertMessage to TWiki site owners whose TWiki has been spammed. Please help scan the internet and alert people of spam.

-- PeterThoeny - 11 Feb 2007

Realised my site was spammed with hidden links. Updated the pages, but the fact that it still came up in the changes log bugged me (in fact I got a spam email based on the contents of the recentchanges page).

Decided to clear this out. There might be better ways, but this worked for me. First, had to find pages that had references to 'ringtones'. Ran this from a unix prompt, at the top of my twiki dir:

find . -type f -exec grep -il "ringtone" {} \;

Then, with that list, I'd open each file in vim and find the offending line. Not wanting to blithely delete things, I noticed the spam was wrapped in a 'hidden' style tag. vim has a handy way of removing a whole html/xml tag: the keyboard sequence is 'dat' (Delete A Tag). This removed the whole offending (u) spam (/u) block quickly; save, and the history logs are nice and clean now. Hope this helps others...

-- MattEstela - 08 Mar 2007
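
The two manual steps above (list the files that mention a keyword, then delete the hidden block) could be scripted roughly as follows. This is a sketch under assumptions: the spam is taken to be wrapped in a single non-nested element whose inline `style` sets `display:none` or `visibility:hidden`, and the function names are hypothetical. Back up the data directory before rewriting any files.

```python
import os
import re


def find_files_containing(root, keyword):
    """Case-insensitive search, similar in spirit to the find/grep -il command."""
    needle = keyword.lower()
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="replace") as fh:
                    if needle in fh.read().lower():
                        hits.append(path)
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
    return sorted(hits)


# Matches <tag ... style="...display:none..."> ... </tag> for one element.
# Nested elements of the same tag name are NOT handled (assumption: the
# spam block is a simple wrapper around a list of links).
HIDDEN_BLOCK = re.compile(
    r"<(\w+)\b[^>]*style\s*=\s*[\"'][^\"']*"
    r"(?:display\s*:\s*none|visibility\s*:\s*hidden)"
    r"[^\"']*[\"'][^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_hidden_blocks(text):
    """Return text with hidden-styled elements removed."""
    return HIDDEN_BLOCK.sub("", text)
```

Reviewing each match before saving, as in the vim workflow, remains the safer approach. The regular expression is only a convenience for the simple case.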

When I used the BlackListPlugin on a public TWiki site, the hosting company asked me to remove it because it violated their policy and harmed their systems.

I want to ask if there is anything in the code that could harm the system resources (e.g. overuse of CPU, memory, etc.).

-- OmarMukhtar - 29 May 2007

I do not know why they are saying that it violates their policy. Please ask them for details. This plugin does not eat additional resources. The only thing I can think of is the 60 sec delay for blacklisted users. You can reduce the sleep time in initPlugin of BlackListPlugin.pm if needed.

-- PeterThoeny - 29 May 2007

Topic revision: r69 - 2007-05-29 - PeterThoeny
 
Copyright © 1999-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.