create new tag
, view all tags

To do: Single Entry Point for System Calls

This is part of the TWikiCodebaseSecurityAudit: Enhance TWiki to have one secure global function to be used by core code and Plugin code to issue system calls.

Spec: TBD

-- PeterThoeny - 14 Nov 2004


There's a countermeasure that's been left out here, a good compromise between the "slow" option at the end and the 'filter out metacharacters' approach.

For some reason, when issues of this kind arise, the first thing somebody seems to think of is "filter out more characters." A moment to think about this. The original bug itself is not only a security issue, but also a usability issue: nobody wants a search to fail just because he's looking for "O'Meara" or "Johnson & Johnson". The "filter out more characters" approach addresses the security issue, maybe completely maybe if you are sure what shell is being used and got the right set of characters, but it leaves the usability issue unfixed; all it does is change the way that failing legitimate searches fail, or maybe even makes more legitimate searches fail.

What you really want to do is quote the search strings so they are passed uninterpreted by the shell. This requires being sure what shell is being used and a clear understanding of its quoting conventions, but is a very good solution to the problem.

In the Bourne shell--and all shells in its family--there are several styles of quoting, and one is especially well suited to this use: the 'single-quote' form. In Bourne syntax between single quotes, absolutely no character is special except the ending single quote. There isn't even a backslash or similar convention to let you include a single quote. The first single quote encountered always ends the quoting form. Nothing else is special. No $ variable substitution. Nothing.

So what do you do if the search string has a single quote in it? Here you need to remember that in Bourne shell syntax, quoted and unquoted runs of characters can be mixed and still be a single argument string as long as there is no unquoted white space. Outside of single quotes, you can use \' to represent a literal single quote. So the trick to quoting a string that may contain single quotes is to replace each one with these four characters: '\''

After doing that replacement, just put single quotes around the whole string. The magic four characters are: one ' to end the single-quoted form, one \' outside of single quotes, to be the literal single quote, and finally another ' to resume single-quoting for the rest of the string.

Nothing else in the string needs to be changed. You just global-substitute every ' to '\'' and put single-quotes around the whole thing, and you have a properly-quoted Bourne shell string, no matter what characters are in it.

Now to use this technique, you need to be sure the shell Perl is using to execute your command is a Bourne-family shell, and you had better understand exactly what quoting Perl itself is doing to turn your backticked Perl expression into a shell command. It is probably safer to use Perl OS calls to fork and exec the shell directly, and know exactly what you are passing to it. But this should not be hard (I don't write much Perl myself so I'd have to look up the details), and any system that has shells at all ought to have at least one Bourne-family shell probably called /bin/sh.

And then you have to be sure that what you're running in the shell isn't, say, a script that might eval its arguments and require further quoting; but in this case you're probably running a command like

fgrep 'pattern' *

and that will pose no problem. So this should not be a change much more complicated than adding more metacharacter filtering, but it is more reliable and also supports perfectly legitimate searches that contain metacharacters and would have failed before.

-- ChapmanFlack - 13 Nov 2004

I quite agree - and was surprised to discover that we were not quoting out search strings. Though, on thinking about it, I think that code change was in some work on spaced wikiwords that was defered from Cairo.... I'll look later - for now, i'm using this simplistic fix, and for the future, I'd like to gave all shell calls go through one function, rather than the adhoc way we do it now..

-- SvenDowideit - 13 Nov 2004

Why is a shell involved at all? Why not just invoke fgrep directly and avoid any shell parsing issues?

-- KennethPorter - 13 Nov 2004

Kenneth has a good point. There is no reason at all why the shell should be involved. We have the absolute path of the grep command, we are expanding the list of files to grep for within TWiki::Search. There is no need for the shell.

There in TWiki::Search::_searchTopicsInWeb() (line 223 of the unpatched Sept2004) the grep command is executed as a back-tick.

Camel-3 pp 566-568 warns about the risks of this and illustrates the use of the exec() function so as to avoid the use of shell. Look at pp 568 in particular.

The shell is not used to to expand the set of topics to search. That is done withn Search.pm since it needs to deal with, for example, excludetopic as well. So Kenneth is right, there is no need for the shell. The absolute path of the grep is known so exec() can be used. A fork/exec is a reasonable solution.

But here's the kicker. You might still need to check for metacharecters. At the moment, see line 223 again, TWiki::Search::_searchTopicsInWeb() build a single string for the back-tick. If we move this straight into the exec() we will still get the shell interpreting it!

What we need is somethng like ...

  $result = .... do the fork for the pipe 
       exec  $TWiki::egrepCmd or $TWikifgrepCmd \
             ... cmd arguments ... \
             ... @aset

... very roughtly.

At the moment, as you can see from the code immediate before line 223, all these componenets are squashed up into one scalar -- $acmd.

I see, again in Camel-3, that there is shorthand for this (pp 568):

   defined ($pid = open(SAFE_GREP, "-|", "TWiki::egrepCmd", "$grepParameters", "@aset") or die "...." ;

   if ($pid) {  # parent, do the read
       $result = read from SAFE_GREP ;
   } else {
      # paranoid error handling !
      # should never come here

All pretty much "by the book". It avoids using the shell, but is more than simply inserting the 'filter out shell metacharecters' fix. it allows "Proctor & Gamble" and the like. Tes, it will require some code re-arangement, but comapred to some patches it is quite minor.

Can we revisit this simplistic patch, treat it as a temporary band-aid, and not let it go into future releases.

And as Sven says, do it systematically and localise the functionality.

-- AntonAylward - 13 Nov 2004

I see the backtick operator used in two other places: Invocation of rcs (Store/RcsWrap.pm, UI/Manage.pm), and invocation of dirname in the Comment plugin (Plugins/CommentPlugin/build.pl). Not only should switching to exec make the system more secure, but it also improves performance and reduces memory demand because the system doesn't have to load another process to parse the command.

-- KennethPorter - 13 Nov 2004

please use TWikiCodebaseSecurityAudit to facilitate and track the effort to work on this. I would like to ask anyone with perl / CGI security experience (or desires to have such experience) to add their ideas, experiences, and code - it would be good for soneone to take on this work to make it happen in DEVELOP in parallel to the performance work, as it is incredibly important that we get this done ASAP.

-- SvenDowideit - 14 Nov 2004

Can someone explain what _searchTopicsInWeb is supposed to accomplish? A good first step is to flesh out the pod documentation at the top of the routine. (I'm new to TWiki. I'm not a professional Perl coder but have a fair academic understanding of the language.) Knowing the intent of the routine, a better implementation might be more obvious to contributors who aren't so "close to the code".

I don't see why fgrep is needed at all, given that Perl is a superset of grep. (Yeah, I know that's a vast understatement.)

-- KennethPorter - 14 Nov 2004

Reason for calling fgrep: Performance, performance, performance. One external fgrep that scans hundreds of files if much faster then scanning files in perl one by one.

-- PeterThoeny - 14 Nov 2004

I have been using for some years now:


=head2  start_job




    Starts the job without invoking the shell. Each option has to be a
    single element of the list passed.


sub start_job {
    my (@command_args)=@_;
    if (open(JOB, "-|")) {
   while (<JOB>) {
       push(@result, $_);
   close JOB;
    } else {
   exec @command_args;

I cannot remember, from where I got this code. I think I got the idea from the man-pages. Currently I am reading Perl Cookbook 19.6 and trying to get into fork

The difference I see compared to the example from camel-3 is the die on unsuccessful open. I guess combining both is what we need. Is there a need for the return value of the program which gets called?

-- FrankHartmann - 14 Nov 2004

Why not just use the --file=FILE argument to fgrep and just use a temporary file, named pipe, or /dev/fd/FD_NUMBER and keep the thing from ever appearing on the command line AT ALL? You wouldn't even need to care about how long it is. NEVER let the user control the command line if there is another way.

-- DavidAndersen - 15 Nov 2004

look into perldoc -f system; using something along the lines of system { $TWiki::fgrepCmd } 'fgrep', @fgrepParameters. to avoid the shell, call system with a list of arguments; then you're safe from the shell expanding wildcards or splitting up words with whitespace in them.

-- WillNorris - 15 Nov 2004

The die("..."); is just one of many ways of error handling. I also illustrate and if .. then .. else ... above. If you see so inclined you could also user a try { ...}


-- AntonAylward - 15 Nov 2004

perldoc -f open shows this idiom, a variation on start_job above:

open(FOO, '-|') || exec 'cat', '-n', $file;

-- KennethPorter - 15 Nov 2004

I have looked at RcsWrapDotPm (DEVELOP) a bit and tried to convert all backticks into calls to start_job Basically:

   $rcsout = `$cmd`


   $rcsout = start_job(split ' ',$cmd);

I found that:

  • the return value of the shell command is expected in
  • and IO redirection
    is used.

I am not convinced that both work using start_job but found no clear indications in the docs.

-- FrankHartmann - 15 Nov 2004

This patch is against TWiki20041030beta.zip:

--- Search.pm.000       2004-10-21 11:41:14.000000000 -0700
+++ Search.pm   2004-11-15 12:38:38.180506940 -0800
@@ -193,14 +193,13 @@ sub _searchTopicsInWeb
         unless( $theScope eq "topic" ) {
             # Construct command line with 'grep'.  I18N: 'grep' must use locales if needed,
             # for case-insensitive searching.  See TWiki::setupLocale.
-            my $cmd = "";
-            if( $theType eq "regex" ) {
-                $cmd .= $TWiki::egrepCmd;
-            } else {
-                $cmd .= $TWiki::fgrepCmd;
-            }
-            $cmd .= " -i" unless( $caseSensitive );
-            $cmd .= " -l -- $TWiki::cmdQuote%TOKEN%$TWiki::cmdQuote %FILES%";
+            my @cmd;
+            my $grepCmd = ( $theType eq "regex" )
+                          ? $TWiki::egrepCmd
+                          : $TWiki::fgrepCmd;
+            push( @cmd, $grepCmd );
+            push( @cmd, "-i") unless( $caseSensitive );
+            push( @cmd, ("-l", "--") );

             my $result = "";
             if( $sDir ) {
@@ -215,14 +214,25 @@ sub _searchTopicsInWeb
             my @set = splice( @take, 0, $maxTopicsInSet );
             while( @set ) {
                 @set = map { "$_.txt" } @set;                      # add ".txt" extension to topic names
-                my $acmd = $cmd;
-                $acmd =~ s/%TOKEN%/$token/o;
-                $acmd =~ s/%FILES%/@set/o;
-                $acmd =~ /(.*)/;
-                $acmd = "$1";                                      # untaint variable (FIXME: Needs a better check!)
-                $result = `$acmd`;
-                _traceExec( $acmd, $result );
-                @set = split( /\n/, $result );
+                my @acmd = @cmd;
+               # untaint token
+                $token =~ /(.*)/;
+                $token = "$1";
+                push( @acmd, $token );
+               # untaint filename list
+                foreach (@set) {
+                    /(.*)/;
+                    $_ = "$1";
+                }
+                push( @acmd, @set );
+                open( RESULT, '-|' ) || exec @acmd;
+                @set = ();                                         # reset list for output
+                while (<RESULT>) {
+                    chomp;
+                    push( @set, $_ );
+                }
+                close( RESULT );
+                _traceExec( join( ' ', @acmd ), join( ' ', @set ) );
                 @set = map { /(.*)\.txt$/; $_ = $1; } @set;        # cut ".txt" extension
                 my %seen = ();
                 foreach my $topic ( @set ) {

-- KennethPorter - 15 Nov 2004

Note that this assumes that $TWiki::egrepCmd is the path to an executable, and does not contain both a program name and initial options. So examples in WindowsInstallCookbook will need to be rewritten.

-- KennethPorter - 15 Nov 2004

split ' ',$cmd is a hack for testing, right? I don't think you can safely count on using space as a token delimiter. Instead, replace the code that assembles a command line with code that assembles an array.

Also see the security notes for perldoc -f exec. The exec should use the indirect object syntax:

exec { $command_args[0] } @command_args;

-- KennethPorter - 15 Nov 2004

split: Yes there has to be a cleaner solution.

Indirect object syntax is only needed when list has only one element, isn't it? Yes this should be done to be on the safe side.

I am currently perlsick(5.8.4) as it does not behave as documented: I am trying to find the exit value of the called command. My man perlfunc states that $? will be set by exec, but it isn't.

-- FrankHartmann - 15 Nov 2004

In all the cases used by TWiki to date, the list has multiple elements. But if start_job is to be the general API for launching processes, it should use the indirect object syntax to avoid a gotcha for a future developer.

-- KennethPorter - 16 Nov 2004

I'd also recommend a different name, like run_process. start_job suggests an asynchronous process that is joined to later.

-- KennethPorter - 16 Nov 2004

Related: UseASubroutineForAllSystemCalls

-- MartinCleaver - 16 Nov 2004

Check PurePerlBackendForSearch for a comparison between the current (non-patched, fgrep-based) SEARCH and a pure Perl implementation.

-- RafaelAlvarez - 19 Nov 2004

I have tried to get a prove of concept implementation using RcsWrapDotPm as victim using above discussion as start point.

Unfortunately things are as usual not easy and I need help to get the thing debugged and running. I have a attached a patch RcsWrap.patch against DEVELOP. It has the following properties

  1. It will make your DEVELOP install non-functional.
  2. it only addresses the usage of backticks in RcsWrapDotPm
  3. it will modify your TWiki.cfg. Please backup before and merge your settings after applying the patch.

-- FrankHartmann - 19 Nov 2004

When issues of this kind arise, another thing that always gets mentioned is "and we need to be sure which shell gets called".

system(3) quite clearly states that /bin/sh -c will be called. Just look up the man page!

Of course, perl might use execve instead and pass a different shell, but that's theory. Some perldoc page should have the answer; I am just not sure which, being a C programmer, not a perl hacker.

-- EberhardSchulz - 29 Nov 2004

"system(3) quite clearly states that /bin/sh -c will be called. Just look up the man page!"

Sure. Now, what is /bin/sh? I have worked on systems where /bin/sh has been (a) the Bourne shell, (b) the Korn shell, (c) bash, (d) csh flavours.

-- AndyGlew - 27 Dec 2004

The best approach is to do a safe pipe open, in some cases emulated using fork+exec since not all older Perl versions have this. I have a version working based on code from Florian Weimer, but I would really help from anyone who has got ActivePerl working for fork/exec, reading from child process.

-- RichardDonkin - 31 Dec 2004

The TWiki Sandbox (DEVELOP branch) contributed by FlorianWeimer implements a single entry point for system calls. Ready for merge, subject to RichardDonkin being satisfied that it works on all platforms..

-- CrawfordCurrie - 13 Feb 2005

Richard: I think I finally have my setup spinning along happily. It is as follows:

So I think I'm in a pretty good position to test anything you might want to. Unfortunately, I'm right in the middle of getting TWiki ready to pitch to my boss to see if I'll be able to really dedicate much time to the project. So I'm afraid I can't exhaustively test the DEVELOP branch right now (maybe in a week or two). But if you have any short examples you want me to try with my current set up, I'd be game.

-- JasonPierce - 14 Apr 2005

Jason stated on TWikiIRC that he did not make any code changes to get this working.

-- CrawfordCurrie - 28 Apr 2005

Only one code change:



  my $cmdQuote = "'";
  my $cmdQuote=$TWiki::cmdQuote;

-- JasonPierce - 02 May 2005

Hmmm - since that isn't in the DevelopBranch, I can only assume you haven't done any testing there. Oh well; I guess that since ThomasWeigert is using a windows box and hasn't reported any sandbox related problems (unless the RcsWrap problem is sandbox related).....

  • The RcsWrap problem was related to some code that was not updated when moving to Dakar style. I have fixed that. I now have no problems with RcsWrap (but remember, I am using cygwin... I have not tested other Windows installations) --TW

-- CrawfordCurrie - 03 May 2005

Haven't heard any reports to the contrary, so have to assume this works. Needs to be covered by Bugs:Item44

-- CrawfordCurrie - 19 Jun 2005

Topic attachments
I Attachment History Action Size Date Who Comment
Unknown file formatpatch RcsWrap.patch r1 manage 18.0 K 2004-11-19 - 20:33 FrankHartmann WARNING read topic before applying
Unknown file formatgz RcsWrap.patch.gz r1 manage 4.6 K 2004-11-22 - 22:03 FrankHartmann WARNING read topic before applying
Edit | Attach | Watch | Print version | History: r30 < r29 < r28 < r27 < r26 | Backlinks | Raw View | Raw edit | More topic actions
Topic revision: r30 - 2005-06-19 - CrawfordCurrie
  • Learn about TWiki  
  • Download TWiki
This site is powered by the TWiki collaboration platform Powered by Perl Hosted by OICcam.com Ideas, requests, problems regarding TWiki? Send feedback. Ask community in the support forum.
Copyright © 1999-2018 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.