How to Run a Long Background Process in a Web App

2011-09-07 - 05:48:08 by PeterThoeny in Development


Web applications need to respond quickly to user actions. From a usability point of view, anything that takes longer than a second or two will distract the user from the task at hand. What can be done if a process takes longer, or much longer? When I implemented the BackupRestorePlugin I realized that taking a backup of a midsize TWiki site can take many minutes, which is too long for a web page to load - the browser might even time out.

Here is what should happen as seen by the administrator who takes a backup:

User sees a list of existing backups
User presses a [ Create backup ] button to start a new backup
User sees a "creating backup now" message with a [ Cancel ] button and a visual cue that work is in progress.
Once the backup is done, the newly created backup is shown in the list of existing backups

Screenshot of Backup & Restore Console, showing the "creating backup now" message:

Technically, the following needs to happen:

User sees a list of existing backups:
- done with a dynamic query that shows all backups
User presses a [ Create backup ] button to start a new backup:
- a daemon (background process) is started, and...
User sees a "creating backup now" message:
- ... page reloads with the "creating backup now" message
- check progress with a timed Ajax call.
- restart timer to check progress again in case backup is still going on, else...
Once the backup is done, the newly created backup is shown in the list of backups:
- ... page reloads with the dynamic query that shows all backups,
- ... plus any errors that might have occurred during backup

The following part has hands-on Perl and JavaScript code to explain how to run a long background process in a web application. Although it uses TWiki as an example, it is generic enough to be used in any web application programming environment (CGI, mod_perl, etc.). Since TWiki is written in Perl, I looked for suitable technologies, including CPAN modules:

Run daemon (background process):
- I found CPAN:Proc::Daemon to be suitable for running a background process. It has one disadvantage: it does not run on Windows.
Capture output of daemon process:
- The STDOUT and STDERR of the daemon script (and of any other scripts it calls in turn) need to be captured so that error messages can be shown properly by the CGI script. I decided to use CPAN:IO::CaptureOutput.
Ajax call to check backup status:
- I could have used jQuery or another JavaScript library. I opted for some simple homegrown JavaScript code to do the Ajax calls because I designed the plugin to run on old TWiki installations and I did not want to increase the dependencies.

Run daemon (background process)

The CPAN:Proc::Daemon module is well suited to the task of running a background process in a web application. Here is the basic Perl code:

    use Proc::Daemon;
    # build backup daemon command
    my $cmd = $this->{Location}{BinDir} . "/backuprestore create_backup $fileName";
    my $daemon = Proc::Daemon->new(
        work_dir     => $this->{Location}{BinDir},
        child_STDOUT => $this->{DaemonDir} . '/stdout.txt',
        child_STDERR => $this->{DaemonDir} . '/stderr.txt',
        pid_file     => $this->{DaemonDir} . '/pid.txt',
        exec_command => $cmd,
    );
    # fork background daemon process
    my $pid = $daemon->Init();

We initialize the daemon process, specifying the working directory, the STDOUT and STDERR file redirects, the PID (process ID) file, and the command to execute. The daemon command is backuprestore with parameters create_backup $fileName. That is basically it!
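
For readers working outside Perl, the same idea can be sketched in JavaScript on Node.js. This is a rough analogue under stated assumptions, not the plugin's code and not Proc::Daemon: the startDaemon name and its parameters are made up for illustration, and it relies on the detached option of Node's built-in child_process module.

```javascript
const { spawn } = require('child_process');
const fs = require('fs');
const path = require('path');

// Hypothetical helper: start `command` as a detached background process,
// redirect its output to files in daemonDir, and record its PID.
function startDaemon(daemonDir, command, args) {
  const out = fs.openSync(path.join(daemonDir, 'stdout.txt'), 'a');
  const err = fs.openSync(path.join(daemonDir, 'stderr.txt'), 'a');
  const child = spawn(command, args, {
    detached: true,               // child can keep running after the parent exits
    stdio: ['ignore', out, err],  // no stdin, output goes to the two files
  });
  // A failed launch is reported through the 'error' event; log it
  child.on('error', function (e) {
    fs.appendFileSync(path.join(daemonDir, 'stderr.txt'), String(e) + '\n');
  });
  // The child has its own copies of the descriptors now
  fs.closeSync(out);
  fs.closeSync(err);
  child.unref();                  // the parent does not wait for the child
  if (child.pid !== undefined) {
    fs.writeFileSync(path.join(daemonDir, 'pid.txt'), String(child.pid));
  }
  return child.pid;
}
```

As in the Perl version, the caller gets the PID back immediately, and the pid.txt file is what a later status request would read.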

There was one complication, though. The backuprestore script can be called both as a CGI script and as a command line script, and it determines the mode from environment variables. Even though the daemon is started as a command line script, it wrongly concluded that it was running in CGI mode, because the forked process inherits the environment variables of the CGI script. To work around this, we explicitly delete the environment variables that indicate CGI mode before spawning the daemon process, and restore them afterwards. Revised code:

    my $SaveGATEWAY_INTERFACE;
    if( $ENV{GATEWAY_INTERFACE} ) {
        $SaveGATEWAY_INTERFACE = $ENV{GATEWAY_INTERFACE};
        delete $ENV{GATEWAY_INTERFACE};
    }
    my $SaveMOD_PERL;  
    if( $ENV{MOD_PERL} ) {
        $SaveMOD_PERL = $ENV{MOD_PERL};
        delete $ENV{MOD_PERL};
    }
    use Proc::Daemon;
    # build backup daemon command
    my $cmd = $this->{Location}{BinDir} . "/backuprestore create_backup $fileName";
    my $daemon = Proc::Daemon->new(
        work_dir     => $this->{Location}{BinDir},
        child_STDOUT => $this->{DaemonDir} . '/stdout.txt',
        child_STDERR => $this->{DaemonDir} . '/stderr.txt',
        pid_file     => $this->{DaemonDir} . '/pid.txt',
        exec_command => $cmd,
    );
    # fork background daemon process
    my $pid = $daemon->Init();
    # restore environment variables
    $ENV{GATEWAY_INTERFACE} = $SaveGATEWAY_INTERFACE if( $SaveGATEWAY_INTERFACE );
    $ENV{MOD_PERL}          = $SaveMOD_PERL if( $SaveMOD_PERL );
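
In environments that let you hand the child an explicit environment, the save, delete and restore steps can be replaced by passing a cleaned copy. Here is a minimal sketch in JavaScript; envWithoutCgiMarkers is a hypothetical name, and the list of variables is simply the two that the code above deletes.

```javascript
// The two variables the code above removes before starting the daemon
const CGI_MARKERS = ['GATEWAY_INTERFACE', 'MOD_PERL'];

// Hypothetical helper: return a copy of `env` without the CGI markers.
// The original object is left untouched, so nothing needs restoring.
function envWithoutCgiMarkers(env) {
  const copy = Object.assign({}, env);
  for (const name of CGI_MARKERS) {
    delete copy[name];
  }
  return copy;
}

// Example: the parent's own environment stays as it was
const parentEnv = { GATEWAY_INTERFACE: 'CGI/1.1', PATH: '/usr/bin' };
const childEnv = envWithoutCgiMarkers(parentEnv);
console.log(Object.keys(childEnv));   // prints: [ 'PATH' ]
```

On Node.js, for example, such a copy could be passed as the env option of child_process.spawn.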

Capture output of daemon process

The Proc::Daemon module takes care of redirecting the daemon's STDOUT and STDERR to files. However, CPAN:IO::CaptureOutput is needed if the daemon in turn calls external programs (such as the zip command) and we want to capture their output. Sample _createZip Perl method:

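    use IO::CaptureOutput qw( capture_exec capture_exec_combined );
    sub _createZip {
        my( $this, $name, $baseDir, @dirs ) = @_;
        # zip the directories relative to the base directory
        unless( chdir( $baseDir ) ) {
            $this->_setError( "ERROR: Can't change to directory $baseDir. $!" );
            return;
        }
        my $zipFile = "$this->{BackupDir}/$name";
        # split the configured zip command into program and options
        my @cmd = split( /\s+/, $this->{createZipCmd} );
        my ( $stdOut, $stdErr, $success, $exitCode ) = capture_exec( @cmd, $zipFile, @dirs );
        if( $exitCode ) {
            $this->_setError( "ERROR: Can't create backup $name. $stdErr" );
        }
    }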

The capture_exec() function executes a zip command and returns the exit code alongside STDOUT and STDERR. An error is set in case the exit code is not zero.
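
The same pattern, running an external command and keeping its exit code and output, can be sketched in JavaScript on Node.js. This is an analogue for illustration, not IO::CaptureOutput: runCommand is a hypothetical helper built on Node's child_process.spawnSync, and the failing command in the example is made up.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: run an external command and collect its exit code
// together with everything it wrote to STDOUT and STDERR.
function runCommand(command, args) {
  const result = spawnSync(command, args, { encoding: 'utf8' });
  if (result.error) {
    // The command could not be started at all (for example, not found)
    return { exitCode: -1, stdout: '', stderr: String(result.error) };
  }
  return { exitCode: result.status, stdout: result.stdout, stderr: result.stderr };
}

// Example: a command that writes to STDERR and ends with exit code 3
const failed = runCommand(process.execPath, [
  '-e', 'console.error("disk full"); process.exitCode = 3',
]);
if (failed.exitCode !== 0) {
  console.log("ERROR: Can't create backup. " + failed.stderr.trim());
}
```

Passing the arguments as an array means no shell is involved, which plays the same role as handing capture_exec() a list.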

Ajax call to check backup status

As mentioned before, I use some homegrown code to avoid dependency on other libraries. The following JavaScript code to poll for backup status is placed into the TWiki page that shows the "creating backup now" message:

    <script type="text/javascript">
    // Poll the backuprestore script; reload the page once the daemon is done.
    function ajaxStatusCheck( urlStr, queryStr ) {
      var request;
      if (window.XMLHttpRequest) {
        request = new XMLHttpRequest();
      } else if (window.ActiveXObject) {
        // fallback for old versions of Internet Explorer
        request = new ActiveXObject("Microsoft.XMLHTTP");
      } else {
        return; // no Ajax support, give up polling
      }
      request.open( "POST", urlStr, true );
      request.setRequestHeader( "Content-Type", "application/x-www-form-urlencoded" );
      request.onreadystatechange = function() {
        // readyState 4: the request has completed
        if (request.readyState == 4) {
          if( request.responseText.search( "backup_status: 0" ) >= 0 ) {
            // daemon is no longer running: reload to show the list of backups
            window.location = '%SCRIPTURL{view}%/%WEB%/%TOPIC%';
          } else {
            checkStatusWithDelay();
          }
        }
      };
      request.send( queryStr );
    }
    function checkStatusWithDelay() {
      // pass a function instead of a string to avoid an implicit eval
      setTimeout( function() {
        ajaxStatusCheck( '%SCRIPTURLPATH{backuprestore}%', 'action=status' );
      }, 2000 );
    }
    checkStatusWithDelay();
    </script>

Reading from the bottom up, the checkStatusWithDelay() function is called when the page is displayed. This function starts a timer that calls ajaxStatusCheck() after a 2-second delay. The ajaxStatusCheck() function makes an Ajax call to the backuprestore script, passing along the parameter action=status. On readyState == 4, i.e. once the request has completed (the code does not check the HTTP status), the response text is checked for the status. The script returns backup_status: 0 if the daemon is no longer running. In this case, window.location is set to reload the page, which now shows the list of backups. Otherwise, checkStatusWithDelay() is called to start the timer again.
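
The control flow described above can be separated from the browser plumbing. Below is a minimal sketch with the status request and the timer passed in as functions, so the loop can be tried anywhere; pollUntilDone and isBackupDone are hypothetical names. Only the "backup_status: 0" marker comes from the plugin; the "backup_status: 1" replies in the example are an assumption.

```javascript
// True when the status response says the daemon is no longer running
function isBackupDone(responseText) {
  return responseText.indexOf('backup_status: 0') >= 0;
}

// Hypothetical helper: after each delay, call requestStatus(callback);
// repeat until the response reports completion, then call onDone once.
// `schedule` defaults to setTimeout and can be swapped out for testing.
function pollUntilDone(requestStatus, delayMs, onDone, schedule) {
  schedule = schedule || setTimeout;
  schedule(function () {
    requestStatus(function (responseText) {
      if (isBackupDone(responseText)) {
        onDone();
      } else {
        pollUntilDone(requestStatus, delayMs, onDone, schedule);
      }
    });
  }, delayMs);
}

// Example with a fake status source that finishes on the third check
// ("backup_status: 1" is an assumed reply for a running daemon)
const replies = ['backup_status: 1', 'backup_status: 1', 'backup_status: 0'];
let checks = 0;
let finished = false;
pollUntilDone(
  function (callback) { callback(replies[checks++]); },
  2000,
  function () { finished = true; },
  function (fn) { fn(); }          // run immediately instead of waiting
);
console.log(checks, finished);     // prints: 3 true
```

Restarting the timer only after a response arrives, as the plugin does, means that slow responses never pile up overlapping requests.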

The backuprestore script calls the following Perl method when action=status parameter is specified:

    sub _daemonRunning {
        my( $this ) = @_;
        my $pid = _untaintChecked( _readFile( $this->{DaemonDir} . '/pid.txt' ) );
        if( $pid && (kill 0, $pid) ) {
            return 1;
        }
        return 0;
    }

The _daemonRunning method reads the pid.txt file containing the process ID. A kill 0, $pid is issued to test whether the process is alive: signal 0 delivers no signal, it only checks that the process exists and can be signaled.
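
The signal 0 check is not specific to Perl. As an illustration, here is a sketch of the same test in JavaScript on Node.js, where process.kill(pid, 0) throws an error if the process cannot be signaled. The isProcessAlive name is hypothetical. Unlike the Perl method above, this version also counts a process owned by another user (error code EPERM) as alive.

```javascript
// Hypothetical helper: report whether a process with this PID exists.
// Signal 0 performs the existence and permission check without
// delivering a signal to the process.
function isProcessAlive(pid) {
  if (!Number.isInteger(pid) || pid <= 0) {
    return false;                 // reject empty or malformed PID values
  }
  try {
    process.kill(pid, 0);
    return true;
  } catch (e) {
    // EPERM: the process exists but we may not signal it
    return e.code === 'EPERM';
  }
}

console.log(isProcessAlive(process.pid));   // prints: true
```

Either way, the check only says that some process currently has that PID. Since the operating system can reuse PIDs, a stale pid.txt could in principle point at an unrelated process.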

I hope this advanced web application topic is useful for your own projects. To learn more I invite you to download the BackupRestorePlugin and to examine the code.

Comments

I will have a session on this topic tomorrow, Saturday, 09:45am at the Silicon Valley Code Camp (http://www.siliconvalley-codecamp.com/Sessions.aspx), room 4203. The SVCC is a free event with 3000 attendees, come and join us!

-- Peter Thoeny - 2011-10-08

Topic revision: r4 - 2011-09-22 - PeterThoeny

Copyright © 1999-2026 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.