NoSpamWebPreference < Codev

Instead of putting $noSpamPadding in TWiki.cfg, it is handy to make it a normal TWiki preference, so that it can be overridden per web. I maintain a system that includes both public and private webs. In the private web (which requires authorization to run view), it's handy to be able to turn off the NOSPAM padding while still leaving it on in Main.

The change is trivial, but does somewhat break backward compatibility, so it's worth thinking about how to roll it out.

diff -cr lib/TWiki.cfg /users2/rnapier/WWW/twiki/lib/TWiki.cfg
*** lib/TWiki.cfg       Wed Dec  5 12:01:22 2001
--- /users2/rnapier/WWW/twiki/lib/TWiki.cfg     Mon May 20 13:55:26 2002
***************
*** 109,123 ****
  #                   Mail program used in case Net::SMTP is not installed.
  #                   See also SMTPMAILHOST in TWikiPreferences.
  $mailProgram      = "/usr/sbin/sendmail -t -oi -oeq";
- #                   Prevent spambots from grabing addresses, default "".
- #                   I.e. set to "NOSPAM" to get "user@somewhereNOSPAM.com"
- $noSpamPadding    = "";
  #                   Pathname of mime types file that maps file suffixes to MIME types :
  #                   For Apache server set this to Apache's mime.types file pathname.
  #                   Default "$dataDir/mime.types"
diff -cr lib/TWiki.pm /users2/rnapier/WWW/twiki/lib/TWiki.pm
*** lib/TWiki.pm        Mon Dec  3 22:59:42 2001
--- /users2/rnapier/WWW/twiki/lib/TWiki.pm      Mon May 20 13:57:18 2002
***************
*** 1837,1843 ****
  {
      my( $theAccount, $theSubDomain, $theTopDomain ) = @_;

!     my $addr = "$theAccount\@$theSubDomain$TWiki::noSpamPadding\.$theTopDomain";
      return "<a href=\"mailto\:$addr\">$addr</a>";
  }

--- 1842,1850 ----
  {
      my( $theAccount, $theSubDomain, $theTopDomain ) = @_;

!     my $noSpamPadding= &TWiki::Prefs::getPreferencesValue("NOSPAMPADDING", $webName );
!
!     my $addr = "$theAccount\@$theSubDomain$noSpamPadding\.$theTopDomain";
      return "<a href=\"mailto\:$addr\">$addr</a>";
  }

Then add the following in TWikiPreferences:

Prevent spambots from grabbing addresses. (Set to "NOSPAM" to get "user@somewhereNOSPAM.com")
- Set NOSPAMPADDING = NOSPAM
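
To make the per-web override concrete, here is a minimal sketch in Python (not TWiki's Perl, and not the actual TWiki::Prefs code). The two preference tables, the "Private" web name and the function names are assumptions for illustration only; the sketch also ignores FINALPREFERENCES.

```python
# Hypothetical preference tables. TWiki itself reads these values from
# TWikiPreferences and from each web's WebPreferences topic.
SITE_PREFS = {"NOSPAMPADDING": "NOSPAM"}
WEB_PREFS = {
    "Main": {},                        # inherits the site-wide value
    "Private": {"NOSPAMPADDING": ""},  # authenticated web: padding off
}


def get_preference(key, web):
    """Return the web-level value if set, else the site-level value, else ''."""
    web_prefs = WEB_PREFS.get(web, {})
    if key in web_prefs:
        return web_prefs[key]
    return SITE_PREFS.get(key, "")


def render_mailto(account, subdomain, topdomain, web):
    """Build the mailto link in the same shape as the patched function."""
    padding = get_preference("NOSPAMPADDING", web)
    addr = f"{account}@{subdomain}{padding}.{topdomain}"
    return f'<a href="mailto:{addr}">{addr}</a>'
```

With these tables, a link rendered in Main carries the padding and the same link rendered in the private web does not.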

-- RobNapier - 20 May 2002

This looks like a good idea - see also SpamProofing for areas where the administrator's email is not spam-proofed (the robots.txt attached to RobotsBlackList should help, by disallowing all scripts other than view and viewfile, but it is probably optimistic to expect spambots to respect this file...)

One other issue with this and the current spam padding - using user@somewhereNOSPAMPLEASENOSPAM.com puts an increased load on the DNS root servers, whereas user@NOSPAMPLEASENOSPAM.somewhere.com puts a load only on the somewhere.com DNS servers. It's hard to tell how big an issue this is, but putting an additional load on the root servers seems like a bad idea, as there are only 13 of them.

-- RichardDonkin - 21 May 2002

The reason why I put the anti-spam setting in TWiki.cfg is security. With preferences it is easy to forget to put them into the FINALPREFERENCES of TWikiPreferences or the WebPreferences, and if you forget, the setting is easy to circumvent. Also, TWiki.cfg settings are more secure because you need shell access to change them; this holds provided that you allow basic authentication for TWiki topic editing but require SSH login for the shell.

Interesting question, load on DNS root servers vs load on DNS servers of regular domains. Would user@NOSPAM.somewhere.com work? If I understand correctly, there are DNS servers that map user@subdomain.somewhere.com automatically to user@somewhere.com. That means they would get the spam anyway.

-- PeterThoeny - 21 May 2002

I don't understand why security is an issue here. Anyone who could change the preferences already knows that they're on a twiki (i.e. it's not your standard harvester script). Once you know you're trying to harvest from a twiki, removing the NOSPAM part is trivial (more discussion below), so there's no reason to bother messing with the configuration. Is the fear a malicious user who wants to open up the system to harvesters, but not harvest the addresses himself? A user actively trying to cause trouble would have such an easy time of it with a twiki that the idea of turning off spam protection seems pretty minor. It sounds like a good idea to add this to the default TWikiPreferences and WebPreferences FINALPREFERENCES, though, to reduce the issue. Admins like myself can always remove it and maintain our own FINALPREFERENCES, but the normal users would be protected.

If the concern is security of the addresses from determined harvesters, then you need to mangle the address a lot more than this. If a harvester wants to get purposely obscured addresses, it's nothing to add some logic that will slice through this mangle and most other common ones:

s/[-+_.]?[A-Z]{2,}[-+_.]?//;    # Pulls out all-caps stuff like NOSPAM
s/[-+_.]?\w*spam\w*[-+_.]?//i;  # Pulls out _nospam or _spambad

I would run each address through these filters and, if either one changed it, add both the original and demangled versions to my list. Another quick piece of code would ditch everything but the final two parts after the @ (so that @nospam.employees.org would be employees.org). Another quick test would check the final component and make sure it was legal (.org, .net, .com, and the like), and if not, look through the string to find where the "real" end of the string is (so that employees.org.dontmailme would be fixed up). Pretty much anything that's easy for humans to figure out is going to be pretty easy to demangle.
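
The steps just described fit in a few lines. The following is a rough Python sketch, not code from any real harvester; the TLD whitelist, the padding pattern and the function name are assumptions made for illustration. It works label by label on the domain, so the dots between labels survive the padding removal.

```python
import re

# Assumed whitelist for illustration; a real tool would carry a full list.
KNOWN_TLDS = {"com", "net", "org", "edu", "gov"}

# All-caps runs such as NOSPAM, or tokens built around "spam" in any case,
# each with an optional -, + or _ separator on either side.
PADDING = re.compile(r"[-+_]?(?:[A-Z]{2,}|(?i:(?:no)?spam[a-z]*))[-+_]?")


def demangle(address):
    """Return a demangled guess for address, or None if nothing usable remains."""
    local, sep, domain = address.partition("@")
    if not sep:
        return None

    # Step 1: strip padding from every domain label, dropping emptied labels.
    labels = []
    for label in domain.split("."):
        cleaned = PADDING.sub("", label)
        if cleaned:
            labels.append(cleaned)

    # Step 2: drop trailing junk until the last label is a legal TLD.
    while labels and labels[-1].lower() not in KNOWN_TLDS:
        labels.pop()

    # Step 3: keep only the final two parts of the domain.
    labels = labels[-2:]
    if len(labels) < 2:
        return None
    return f"{local}@{'.'.join(labels)}"
```

Note that a mangle hidden in the local part, with no tell-tale token, passes through unchanged: the sketch has nothing to key on.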

My point of all this is that I'm not sure what attacker this security feature is trying to protect against, and whether it's worth the loss of functionality to protect against that attacker. I think any attacker that is thwarted by putting this in the .cfg file will easily bypass the whole system anyway.

BTW, the best mangle I've seen was the following, but you can see how hard it is to use for humans, too:

rnapiershorts@employees.org -- To mail me, remove the pants

-- RobNapier - 21 May 2002

I believe that spam harvesters avoid email addresses with the word 'spam' included, because they are likely to belong to people who are very anti-spam and will cause more hassle for the spammers. So the current SpamProofing is probably OK.

I'm not sure about foo@SPAM.somewhere.com being mapped to foo@somewhere.com - this would imply there's an MX record for the SPAM.somewhere.com address, and since each MX record has to be set up explicitly AFAIK, this seems unlikely.

-- RichardDonkin - 22 May 2002

Most SPAM harvesters are smart enough to handle the very simplistic email address name mangling done by TWiki or any of the proposed variations here. It's better than nothing, but why not solve the problem using the best known method available today? To protect an email address from SPAMmers, the best way is to not have it show up anywhere in the HTML code in the first place!

This can be done with a server side script looking up email addresses keyed from a TWikiUserName. The page would show, in place of an address, a link something like "Click here to Email SoAndSo". An HTML form would appear. This form would allow the user to enter their email address/subject/text and click submit. The server script would then take over. WebNotify would have to be reworked. (NOTE: don't use formmail as the basis for this server side script - its security sucks and it requires the address within the HTML anyway.)

Of course, this doesn't protect you from someone manually sending you SPAM or even writing a hack to use the form. But, so far, SPAMmers aren't really doing this. Besides, if you find someone did write a hack, changing the form layout would most likely break it easily and the SPAMmer still wouldn't have your email address. Additionally, you could set it up so the only email type you could get from the form would be text/plain with no attachments - a big security plus.
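
As a rough illustration of what the server side of such a form might do, here is a Python sketch using the standard library's email package. The user directory, the example addresses and the function name are hypothetical, and actually sending the message (for instance with smtplib) is left out.

```python
from email.message import EmailMessage

# Hypothetical server-side directory keyed by WikiName. The addresses stay
# on the server and are never written into any HTML page.
USER_EMAILS = {"SoAndSo": "soandso@example.com"}


def build_contact_message(wiki_name, sender, subject, body):
    """Turn a submitted contact form into a plain-text message."""
    recipient = USER_EMAILS.get(wiki_name)
    if recipient is None:
        raise KeyError(f"unknown user: {wiki_name}")

    # Refuse line breaks in single-line fields so the form cannot be used
    # to smuggle in extra headers.
    for field in (sender, subject):
        if "\r" in field or "\n" in field:
            raise ValueError("line breaks are not allowed in header fields")

    msg = EmailMessage()
    msg["To"] = recipient
    msg["Reply-To"] = sender
    msg["Subject"] = subject
    msg.set_content(body)  # a single text/plain part, no attachments
    return msg
```

Because the form only supplies a WikiName, the recipient address is resolved entirely on the server.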

A nice byproduct of doing it this way would be that TWiki's core functionality of presenting pages would no longer be dependent on the client's email program to send email in these circumstances. This might simplify its porting to allow presentation on other types of devices such as a WAP enabled cell phone.

[ TomKagan 30 May 2002 ]

I'm not sure that all spam harvesters are that good - while they could easily remove the anti-spam munging, I heard that they just automatically deleted such addresses to avoid getting even more complaints, since people with spamproofed addresses are often quite active spam-reporters. Do you have any references to how spam harvesters work in this respect?

The server side script is an interesting idea, and might be a good option for people who prefer this approach. It would be less convenient, though, since mailto: pops up the user's normal email client. WebNotify should not require any re-work, since the mailnotify script works off the raw page before it is rendered - the only rework would be when normally rendering a page with mailto: links, to turn them into links to the server-side script.
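
The render-time rework mentioned here could look roughly like the following Python sketch. It is not TWiki's rendering code; the reverse lookup table, the /bin/contact path and the link text are all hypothetical.

```python
import re
from urllib.parse import quote

# Hypothetical reverse lookup and script path, for illustration only.
EMAIL_TO_WIKINAME = {"soandso@example.com": "SoAndSo"}
CONTACT_SCRIPT = "/bin/contact"

# Matches mailto links of the shape produced by the function in the patch above.
MAILTO_LINK = re.compile(r'<a href="mailto:([^"]+)">[^<]*</a>')


def replace_mailto_links(html):
    """Swap rendered mailto links for links to the contact script."""

    def swap(match):
        wiki_name = EMAIL_TO_WIKINAME.get(match.group(1).lower())
        if wiki_name is None:
            # Unknown address: show nothing harvestable.
            return "(email address withheld)"
        return f'<a href="{CONTACT_SCRIPT}?user={quote(wiki_name)}">Email {wiki_name}</a>'

    return MAILTO_LINK.sub(swap, html)
```

Since the raw topic text is untouched, anything that reads the raw page, such as mailnotify, would keep working as before.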

-- RichardDonkin - 31 May 2002

For Intranet usage you often (usually?) do want to invoke the user's email program. Additionally, SPAM harvesters are not a problem there. Given that TWikiMission puts Intranets first, I don't think avoiding harvesters should prevent this. However, it wouldn't be hard to alter the behaviour; initially a plugin could be used.

-- JohnTalintyre - 31 May 2002

While I understand TWiki's mission is geared toward the Intranets, just remember that companies are already making their Intranets accessible to the Internet. It would be passworded, sure. But, sooner or later, the "boss" will recognize the value in opening up more and more of the site to larger and larger audiences.

SPAM harvesters are getting smarter all the time. The SPAM sent is also getting more clever. This causes more people to, at least, open the mail. Since some viruses are now turning up using regular HTML email exploits, just reading the email can infect your machine without opening an attachment.

Don't forget about malicious people. Someone with a grudge against you could easily interpret a munged address and sign you up for tons of junk mail. (Actually, a grudge isn't necessary. Anyone who has seen the "frat-boy" mentality at many financial brokerage firms knows the depths of stupidity to which people will sink just to pull off a practical joke. :-) )

Additionally, a trojan horse payload attempting to deliver a worm or virus couldn't care less if you tried to report the sender. Considering the nature of the email worms, it doesn't do much good to report some poor schlep who happens to have an infected machine, anyway.

A small list of best practices for web masters/web designers: http://www.bestprac.org/principles/wmd.htm

A short article on how SPAM harvesting works: http://www.private.org.il/harvest.html

If you still just want to munge the email addresses, at least get it right. Please note that some of the methods of munging the email address proposed above in this topic are considered no-nos. But, here is an article on how to munge an email address the right way: http://members.aol.com/emailfaq/mungfaq.html

To mention two things from the article in relation to what is being proposed on this topic: 1) SoAndSo@nospam.twiki.org does not stop delivery. The worst case is that an email is sent to the domain's mail server, which then attempts delivery for a couple of days (I tested this). This tactic could be used for a DoS attack which would clog the mail server with undeliverable messages. 2) If you worry about causing overload on root servers (which generally shouldn't be an issue), the only legitimate way of munging an address is to set the top level domain to ".invalid". This is a standard TLD for sending DNS queries to the bit bucket.
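
For comparison with the NOSPAM padding, the ".invalid" approach amounts to something like this Python sketch. The function names are made up for illustration; ".invalid" itself is a reserved top level domain (RFC 2606).

```python
INVALID_SUFFIX = ".invalid"


def munge_invalid(address):
    """Append the reserved .invalid TLD to the domain part of an address."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local}@{domain}{INVALID_SUFFIX}"


def demunge_invalid(address):
    """Undo munge_invalid; addresses without the suffix are returned as-is."""
    if address.lower().endswith(INVALID_SUFFIX):
        return address[: -len(INVALID_SUFFIX)]
    return address
```

The reverse step is as trivial as the forward one, so this addresses the DNS and mail-server load concern only; it does nothing against a harvester that strips the suffix.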

You might be right about WebNotify not needing rework. Denying view access might be all that is needed to protect those email addresses contained there. However, personally, I never liked the idea of duplicating your email address on those pages anyway. If you already put the email address on your personal page, why does it need repeating anywhere else in any of the webs? Information on a TWikiUser doesn't belong anywhere else except on their personal page with their Signature as the key to look it up.

I still say that not sending the email address to the client browser in any form (munged or not) is the best way available to protect a user's privacy. And, if that's not enough for you to at least consider moving away from munging, you may want to consider that many ISPs actually have rules against munging a user's email address. It might be hard to munge in a way which does not violate any of the policies of your service provider. And what would be a good munge under one ISP's rules might not be okay with another ISP. Even with your email address safely tucked away and munged beyond recognition on a private Intranet, what will your ISP do to your or the sender's account when, inevitably, someone mistakenly hits SEND without demunging the address? Just something to think about.

[ TomKagan 31 May 2002 ]

Interesting points, and thanks for the links. Looks like the existing munging is OK, as far as it goes, but a server-side mechanism implemented as a TWiki plugin would be useful. Your point about intranets is important - quite a few intranet servers may be opened up as extranet servers as part of communicating better with customers and partners, so it makes sense to provide better spam-proofing as an option.

As for WebNotify, there is already a patch that removes the email addresses from this page, so you just put your WikiName on the page to subscribe to updates.

By the way, I have just implemented wpoison on my site to poison spam email harvesting robots - works pretty well, see http://donkin.org/bin/view/Main/SpamAssassin to try it out.

-- RichardDonkin - 31 May 2002

WebForm
TopicClassification	FeatureEnhancementRequest

Topic revision: r10 - 2002-05-31 - RichardDonkin
