How to Run a Long Background Process in a Web App
Web applications need to respond quickly to user actions. From a usability point of view, anything that takes longer than a second or two will distract the user from the task at hand. What can be done if a process takes longer, or much longer? When I implemented the BackupRestorePlugin, I realized that taking a backup of a midsize TWiki site can take many minutes, which is too long for a web page to load; the browser might even time out.
Here is what should happen as seen by the administrator who takes a backup:
- User sees a list of existing backups
- User presses a [ Create backup ] button to start a new backup
- User sees a "creating backup now" message with a [ Cancel ] button and some visual cue that work is in progress
- Once the backup is done, the newly created backup is shown in the list of existing backups
Screenshot of Backup & Restore Console, showing the "creating backup now" message:
Technically, the following needs to happen:
- User sees a list of existing backups:
  - done with a dynamic query that shows all backups
- User presses a [ Create backup ] button to start a new backup:
  - a daemon (background process) is started, and...
- User sees a "creating backup now" message:
  - ... page reloads with the "creating backup now" message
  - check progress with a timed Ajax call
  - restart timer to check progress again in case backup is still going on, else...
- Once the backup is done, the newly created backup is shown in the list of backups:
  - ... page reloads with the dynamic query that shows all backups,
  - ... plus any errors that might have occurred during backup
The rest of this post walks through hands-on Perl and JavaScript code that shows how to run a long background process in a web application. Although it uses TWiki as an example, it is generic enough to be used in any web application programming environment (CGI, mod_perl, etc.). Since TWiki is written in Perl, I looked for suitable technologies, including CPAN modules:
- Run daemon (background process):
  - I found CPAN:Proc::Daemon to be suitable for running a background process. It has one disadvantage, though: it does not run on Windows.
- Capture output of daemon process:
  - The STDOUT and STDERR of the daemon script (and of other scripts that it calls in turn) need to be captured so that error messages can be shown properly by the CGI script. I decided to use CPAN:IO::CaptureOutput.
- Ajax call to check backup status:
  - I use some homegrown JavaScript code to avoid a dependency on other libraries.
Run daemon (background process)
The CPAN:Proc::Daemon module is well suited to running a background process in a web application. Here is the basic Perl code:
use Proc::Daemon;

# build backup daemon command
my $cmd = $this->{Location}{BinDir} . "/backuprestore create_backup $fileName";
my $daemon = Proc::Daemon->new(
    work_dir     => $this->{Location}{BinDir},
    child_STDOUT => $this->{DaemonDir} . '/stdout.txt',
    child_STDERR => $this->{DaemonDir} . '/stderr.txt',
    pid_file     => $this->{DaemonDir} . '/pid.txt',
    exec_command => $cmd,
);

# fork background daemon process
my $pid = $daemon->Init();
We create the daemon object, specifying the working directory, the STDOUT and STDERR file redirects, the PID (process ID) file, and the command to execute, and then call Init() to fork the background process. The daemon command is backuprestore with parameters create_backup $fileName. That is basically it!
There was one complication though. The backuprestore script can be called as a CGI script and as a command line script. The script determines the mode based on environment variables. Even though the daemon is called as a command line script, it wrongly concluded that it was running in CGI mode. This was because the forked process inherits the environment variables of the CGI script. To work around this issue, we explicitly delete the environment variables that indicate CGI mode before spawning the daemon process, and restore them afterwards. Revised code:
# temporarily remove the environment variables that indicate CGI mode
my $SaveGATEWAY_INTERFACE;
if( $ENV{GATEWAY_INTERFACE} ) {
    $SaveGATEWAY_INTERFACE = $ENV{GATEWAY_INTERFACE};
    delete $ENV{GATEWAY_INTERFACE};
}
my $SaveMOD_PERL;
if( $ENV{MOD_PERL} ) {
    $SaveMOD_PERL = $ENV{MOD_PERL};
    delete $ENV{MOD_PERL};
}

use Proc::Daemon;

# build backup daemon command
my $cmd = $this->{Location}{BinDir} . "/backuprestore create_backup $fileName";
my $daemon = Proc::Daemon->new(
    work_dir     => $this->{Location}{BinDir},
    child_STDOUT => $this->{DaemonDir} . '/stdout.txt',
    child_STDERR => $this->{DaemonDir} . '/stderr.txt',
    pid_file     => $this->{DaemonDir} . '/pid.txt',
    exec_command => $cmd,
);

# fork background daemon process
my $pid = $daemon->Init();

# restore environment variables
$ENV{GATEWAY_INTERFACE} = $SaveGATEWAY_INTERFACE if( $SaveGATEWAY_INTERFACE );
$ENV{MOD_PERL} = $SaveMOD_PERL if( $SaveMOD_PERL );
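Since the approach is meant to carry over to other environments, here is a minimal sketch of the same idea in JavaScript (Node.js). It assumes the child process can be handed its own environment object, so the parent's environment is copied and cleaned instead of being modified and restored. The function name envWithoutCgiMarkers and the sample values are made up for illustration; only the two variable names come from the Perl code above.

```javascript
// Environment variables that mark a process as running under CGI or
// mod_perl (the two variables the Perl code deletes before forking).
const CGI_MARKERS = ["GATEWAY_INTERFACE", "MOD_PERL"];

// Hypothetical helper: return a copy of `env` without the CGI markers.
// The caller's environment is left untouched, so nothing has to be
// restored after the child process has been started.
function envWithoutCgiMarkers(env) {
  const clean = Object.assign({}, env);
  for (const name of CGI_MARKERS) {
    delete clean[name];
  }
  return clean;
}

// Example with made-up values:
const exampleEnv = { GATEWAY_INTERFACE: "CGI/1.1", PATH: "/usr/bin" };
const exampleChildEnv = envWithoutCgiMarkers(exampleEnv);
console.log(Object.keys(exampleChildEnv)); // only PATH is left
```

In Node.js, an object like this could be passed as the env option of child_process.spawn; whether a copy or the save-and-restore pattern is more convenient depends on how the child process is started.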
Capture output of daemon process
The Proc::Daemon module takes care of capturing STDOUT and STDERR and redirecting them to files. However, CPAN:IO::CaptureOutput is needed if the daemon in turn calls external commands (such as the zip command) and we want to capture their output. Sample _createZip Perl method:
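use IO::CaptureOutput qw( capture_exec capture_exec_combined );

sub _createZip {
    my( $this, $name, $baseDir, @dirs ) = @_;

    # the directories to back up are given relative to the base directory
    chdir( $baseDir )
      or return $this->_setError( "ERROR: Can't change to directory $baseDir. $!" );

    my $zipFile = "$this->{BackupDir}/$name";
    my @cmd = split( /\s+/, $this->{createZipCmd} );

    # run the zip command, capturing STDOUT, STDERR and the exit code
    my ( $stdOut, $stdErr, $success, $exitCode ) = capture_exec( @cmd, $zipFile, @dirs );
    if( $exitCode ) {
        $this->_setError( "ERROR: Can't create backup $name. $stdErr" );
    }
}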
The capture_exec() function executes the zip command and returns its STDOUT and STDERR, a success flag, and the exit code. An error is set if the exit code is not zero.
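For readers working outside Perl, the same capture-and-check pattern can be sketched in JavaScript (Node.js) with the built-in child_process.spawnSync. This is only an illustration under that assumption; runAndCapture and failureMessage are hypothetical helper names, not part of the plugin.

```javascript
const { spawnSync } = require("child_process");

// Hypothetical helper: run an external command and collect its output
// and exit code. The arguments are passed as an array, so no shell
// quoting is involved.
function runAndCapture(command, args) {
  const result = spawnSync(command, args, { encoding: "utf8" });
  return {
    stdout: result.stdout,
    stderr: result.stderr,
    exitCode: result.status,  // null if the command did not exit normally
    startError: result.error, // set if the command could not be started
  };
}

// Build an error message in the style of _createZip, or return null
// if the command ran and exited with code 0.
function failureMessage(name, captured) {
  if (captured.startError) {
    return "ERROR: Can't create backup " + name + ". " + captured.startError.message;
  }
  if (captured.exitCode !== 0) {
    return "ERROR: Can't create backup " + name + ". " + captured.stderr;
  }
  return null;
}
```

A caller would run the archive command through runAndCapture and store the result of failureMessage wherever the web page later picks up errors.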
Ajax call to check backup status
As mentioned before, I use some homegrown code to avoid a dependency on other libraries. The following JavaScript code, which polls for the backup status, is placed into the TWiki page that shows the "creating backup now" message:
<script type="text/javascript">
// Ask the server for the backup status; reload the page once the backup is done.
function ajaxStatusCheck( urlStr, queryStr ) {
  var request;
  if (window.XMLHttpRequest) {
    request = new XMLHttpRequest();
  } else if (window.ActiveXObject) {
    // fallback for old versions of Internet Explorer
    request = new ActiveXObject("Microsoft.XMLHTTP");
  } else {
    return; // no Ajax support, nothing to poll with
  }
  request.open( "POST", urlStr, true );
  request.setRequestHeader( "Content-Type", "application/x-www-form-urlencoded" );
  request.onreadystatechange = function() {
    if (request.readyState == 4) {
      if( request.responseText.indexOf( "backup_status: 0" ) >= 0 ) {
        var url = '%SCRIPTURL{view}%/%WEB%/%TOPIC%';
        window.location = url;
      } else {
        checkStatusWithDelay();
      }
    }
  };
  request.send( queryStr );
}
// Start a timer that checks the backup status after 2 seconds.
function checkStatusWithDelay() {
  setTimeout(
    function() {
      ajaxStatusCheck( '%SCRIPTURLPATH{backuprestore}%', 'action=status' );
    },
    2000
  );
}
checkStatusWithDelay();
</script>
Reading from the bottom up, the checkStatusWithDelay() function is called when the page is displayed. This function starts a timer that calls ajaxStatusCheck() after a 2-second delay. The ajaxStatusCheck() function makes an Ajax call to the backuprestore script, passing along the parameter action=status. On readyState == 4, i.e. once the request has completed, the response text is checked for the status. The script returns backup_status: 0 if the daemon is no longer running. In this case, window.location is set to reload the page. Otherwise, checkStatusWithDelay() is called to start the timer again.
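The polling loop can also be written so that it is easy to try outside a browser. The following is a minimal sketch, not the plugin's code: it separates the check of the response text from the timer, and takes the status request and the scheduler as parameters. The names isBackupDone, pollBackupStatus, fetchStatus and schedule are hypothetical; only the backup_status: 0 marker comes from the script described above.

```javascript
// True when the status response says the daemon is no longer running.
function isBackupDone(responseText) {
  return responseText.indexOf("backup_status: 0") >= 0;
}

// Wait `delayMs`, ask for the status, and either finish or start over.
//   fetchStatus(callback): hypothetical function that requests the status
//                          and passes the response text to `callback`
//   onDone():              called once the backup has finished
//   schedule(fn, ms):      timer function, e.g. setTimeout in a browser
function pollBackupStatus(fetchStatus, onDone, schedule, delayMs) {
  schedule(function () {
    fetchStatus(function (responseText) {
      if (isBackupDone(responseText)) {
        onDone();
      } else {
        pollBackupStatus(fetchStatus, onDone, schedule, delayMs);
      }
    });
  }, delayMs);
}
```

In a browser, fetchStatus would wrap the XMLHttpRequest call, onDone would set window.location, and schedule would simply be setTimeout with a delay of 2000. Passing them in as parameters is a design choice that lets the loop be exercised with canned responses.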
The backuprestore script calls the following Perl method when the action=status parameter is specified:
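sub _daemonRunning {
    my( $this ) = @_;

    # read the process ID that Proc::Daemon wrote to the PID file
    my $pid = _untaintChecked( _readFile( $this->{DaemonDir} . '/pid.txt' ) );

    # signal 0 only tests if the process can be signaled
    if( $pid && (kill 0, $pid) ) {
        return 1;
    }
    return 0;
}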
The _daemonRunning method reads the pid.txt file containing the process ID. A kill 0, $pid is issued to test if the process is alive; signal 0 does not actually send a signal, it only checks whether the process can be signaled.
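The signal 0 test is not specific to Perl. As a hedged illustration, here is how the same liveness check might look in JavaScript (Node.js), where process.kill(pid, 0) performs the existence test. The function name isProcessAlive is hypothetical. Unlike the Perl method above, this sketch treats a permission error as "alive", which is an assumption that may or may not suit a given setup. Neither version can tell whether the PID has since been reused by an unrelated process.

```javascript
// Hypothetical helper: true if a process with this PID currently exists.
function isProcessAlive(pid) {
  // Reject 0 and negative values: they address process groups, not a
  // single process. Also reject anything that is not an integer.
  if (!Number.isInteger(pid) || pid <= 0) {
    return false;
  }
  try {
    process.kill(pid, 0); // signal 0: existence test, no signal is sent
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    // Any other error (typically ESRCH) means there is no such process.
    return err.code === "EPERM";
  }
}

console.log(isProcessAlive(process.pid)); // the current process is alive
```

A status handler would read the PID from the PID file, convert it with parseInt, and report that the backup has finished as soon as this check returns false.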
I hope this advanced web application topic is useful for your own projects. To learn more, I invite you to download the BackupRestorePlugin and examine the code.
Comments
PeterThoeny - 2011-10-08:
I will have a session on this topic tomorrow, Saturday, at 09:45am at the Silicon Valley Code Camp, http://www.siliconvalley-codecamp.com/Sessions.aspx, room 4203. The SVCC is a free event with 3000 attendees, come and join us!