
ModuleLoadingPerformanceEnhancements

Migrated from CommonFrontEndCgiScript:

Performance related to loading of modules: Someone commented that a CommonFrontEndCgiScript would actually be fatter (even though there are only a few lines of code in the main:: package), because it must load all the support libraries and services. This is true. However, it is possible to delay compilation of unneeded code until the program requests it, by applying the SelfLoader.pm module to each of TWiki's Perl modules. Although all the code is still read in, the deferred code is not compiled until it is required, which can significantly reduce overall startup time.

As promised, here are a few more details about how SelfLoader.pm could help out in that area (I'm paraphrasing heavily here from Programming Perl, chapter 7, The Standard Perl Library):

SelfLoader.pm is used for delayed loading of perl functions that are packaged within each of TWiki's module files. This gives the appearance of faster loading.

Example:

    package TWiki::Access;
    use SelfLoader;
    
    use strict;
    
    use vars qw(
        %allGroups @processedGroups
    );
    
    # =========================
    sub initializeAccess
    {
        %allGroups = ();
        @processedGroups = ();
    }
    
    
    __DATA__
    
    #all the rest of the subroutines go here
    

In the example above, use SelfLoader tells the TWiki::Access package that any of its functions defined after the __DATA__ token are to be autoloaded on demand.

The __DATA__ token tells Perl that the code to compile ends there; everything after the token is left uncompiled and is available for reading via the TWiki::Access::DATA file handle.

The SelfLoader reads from the TWiki::Access::DATA file handle to get the definitions of the functions placed after __DATA__, and evals a requested subroutine the first time it is called. The costs are a one-time parse of the text after __DATA__ and a load delay on the first call to each autoloaded function. The benefits are a faster compilation phase and no need to compile functions that are never used.

Subroutines that are always called gain nothing from being placed after the __DATA__ token.
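To make the mechanism above concrete, here is a minimal sketch rather than TWiki code. It writes a hypothetical module, Demo::Lazy, to a temporary directory, loads it, and checks which subroutines exist before and after the first call. The module name and the sub names (eager, lazy_a, lazy_b) are invented for illustration; SelfLoader, File::Temp and File::Spec are the only real modules involved.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Write a small hypothetical module into a temporary directory.
my $dir    = tempdir(CLEANUP => 1);
my $subdir = File::Spec->catdir($dir, 'Demo');
mkdir $subdir or die "mkdir failed: $!";
my $file = File::Spec->catfile($subdir, 'Lazy.pm');

open my $out, '>', $file or die "cannot write $file: $!";
print {$out} <<'END_MODULE';
package Demo::Lazy;
use strict;
use SelfLoader;

# Compiled as soon as the module is loaded.
sub eager { return 'eager' }

1;
__DATA__

# Kept as text; each sub is compiled only on its first call.
sub lazy_a { return 'a' }

sub lazy_b { return 'b' }
END_MODULE
close $out or die "close failed: $!";

# Load the module from the temporary directory.
unshift @INC, $dir;
require Demo::Lazy;

my $eager_defined  = defined(&Demo::Lazy::eager)  ? 1 : 0;
my $lazy_a_before  = defined(&Demo::Lazy::lazy_a) ? 1 : 0;
my $result         = Demo::Lazy::lazy_a();   # first call triggers SelfLoader
my $lazy_a_after   = defined(&Demo::Lazy::lazy_a) ? 1 : 0;
my $lazy_b_defined = defined(&Demo::Lazy::lazy_b) ? 1 : 0;

print "eager defined after load:       $eager_defined\n";
print "lazy_a defined before its call: $lazy_a_before\n";
print "lazy_a defined after its call:  $lazy_a_after\n";
print "lazy_b defined (never called):  $lazy_b_defined\n";
```

In this sketch only the sub that was actually called ends up compiled; the other one stays as cached text, which is the behaviour the cost/benefit description above relies on.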

I'd like to adopt this as a general practice as we make headway on our CommonFrontEndCgiScript, to prevent unrequested services from being loaded for the ModPerl and SpeedyCGI challenged.

-- PeterNixon - 12 Jul 2002

This sounds like a good idea, as long as it doesn't complicate debugging (e.g. how do CGI::Carp and the Perl/Tk debugger mentioned in PerlPtkdb work with this?) I would like to see some stats on before and after load times - the Apache benchmark program is a good way of testing this.

-- RichardDonkin - 18 Jul 2002

I added SelfLoader statements to all modules in lib/TWiki and measured viewing times before and after the change. This was on a very slow old HP-UX box with serious performance problems, hence my interest in tweaking it. The Apache Benchmark results for twiki/bin/view/Main/WebHome on our server are as follows:

| SelfLoader | min total access (ms) | avg total access (ms) | max total access (ms) |
| not used | 3875 | 4490 | 7203 |
| used | 3421 | 3488 | 3593 |

This is certainly a nice improvement for such a small change. Unfortunately, it isn't enough to solve our local performance issue, so I'll have to fiddle with the likes of ModPerl and SpeedyCGI. (Anybody ever managed to build SpeedyCGI or PPerl on HP-UX?)

-- ClausBrod - 30 Aug 2002

Interesting - see also ProperIncludeUrls, where I managed to avoid some external modules being loaded unnecessarily by doing the require/import at run-time. Your work shows a big improvement since there are so many TWiki modules that get loaded - worth including in the core IMO. Can you supply a patch?
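As an illustration of doing the require at run-time, the sketch below loads a module only inside the sub that needs it. This is not the code from ProperIncludeUrls; the checksum sub is hypothetical and Digest::MD5 merely stands in for any external module that most requests never use.

```perl
use strict;
use warnings;

# Hypothetical helper: Digest::MD5 is loaded only if this sub is ever called.
sub checksum {
    my ($text) = @_;
    require Digest::MD5;    # run-time load; repeated calls are cheap via %INC
    return Digest::MD5::md5_hex($text);
}

# %INC records which module files have been loaded so far.
my $loaded_before = exists $INC{'Digest/MD5.pm'} ? 1 : 0;
my $sum           = checksum('TWiki');
my $loaded_after  = exists $INC{'Digest/MD5.pm'} ? 1 : 0;

print "Digest::MD5 loaded before call: $loaded_before\n";
print "Digest::MD5 loaded after call:  $loaded_after\n";
print "checksum: $sum\n";
```

A script that never calls checksum never pays for loading the module, at the price of calling its functions by their fully qualified names (or importing them by hand after the require).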

I had a go with SpeedyCGI on Linux but couldn't get it to work reliably. You might also like to investigate the CPAN PPerl module, which is similar, though I'm not sure how stable it is - probably requires Perl 5.6.1 or higher.

-- RichardDonkin - 30 Aug 2002

I attached a cpio archive of the changed .pm files in lib/TWiki for the more adventurous of you to try. The changes were made on the files as distributed in TWiki20011201.zip, and I have only tested viewing so far.

Today, I re-ran the Apache benchmark against a TWiki installation on a faster server:

| SelfLoader | min total access (ms) | avg total access (ms) | max total access (ms) |
| not used | 843 | 853 | 890 |
| used | 781 | 800 | 843 |

As you can see, the fast server still benefits somewhat, but the improvements are much less significant there. Thanks for the suggestion on PPerl, Richard. Unfortunately, PPerl, SpeedyCGI and ModPerl builds on HP-UX all failed for me. Must be me 8-(

-- ClausBrod - 30 Aug 2002

One thing about your first set of stats is that the 'min' figures were quite close, about 10% or so higher for the non-SelfLoader version; but the 'max' figures were almost 100% worse, about 3.6 seconds apart. This suggests that your original box is OK when unloaded, but some web transactions hit when the CPU is already heavily loaded.

So... this should be a win for any heavily loaded website, which probably describes virtual-hosted Internet TWikis (see TWikiOn for links). Since I have one of these on a system whose load average is sometimes 4 or 5, I will have a go :-)

Thanks for the diffs, these made it much easier to apply to alpha code. Just tried this on TWikiOnCygwin and it worked OK, though I haven't done any performance tests.

It would be good to avoid creating a requirement for SelfLoader.pm, since this has been known to have some side effects and needs quite a bit of testing. Also, the __DATA__ breaks syntax highlighting in the VimEditor... Perhaps the easiest thing to do is include the relevant code all in one place, commented out, making it easy to uncomment it for more performance. Since use SelfLoader is done at compile-time, it is possible to just write the following in all modules, after any code that should run at compile-time:

# Uncomment these lines to use SelfLoader - delays compilation of all
# code after this point, which may improve performance on some systems.
##use SelfLoader;
##1;
##__DATA__

We should also look at autouse and AutoLoader - these both have some advantages. The former delays the 'use' of the whole module, somewhat like SelfLoader, until one of the functions is called; the latter only compiles the subroutines actually called, but it requires modules to be AutoSplit into a file per subroutine. Both these options require changes in the way that use Module is done.
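For comparison, here is a minimal sketch of the autouse approach, which changes how use Module is written. Data::Dumper is only a stand-in for a module whose functions are called rarely; it is not something TWiki is known to load this way. AutoLoader is not shown because it needs the AutoSplit step first.

```perl
use strict;
use warnings;

# Install a stub for Dumper(); Data::Dumper itself is not loaded yet.
use autouse 'Data::Dumper' => qw(Dumper);

# %INC records which module files have been loaded so far.
my $loaded_before = exists $INC{'Data/Dumper.pm'} ? 1 : 0;

# The first call runs the stub, which requires the module and jumps
# to the real Dumper().
my $text = Dumper({ topic => 'WebHome' });

my $loaded_after = exists $INC{'Data/Dumper.pm'} ? 1 : 0;

print "Data::Dumper loaded before first call: $loaded_before\n";
print "Data::Dumper loaded after first call:  $loaded_after\n";
print $text;
```

Unlike SelfLoader, which is applied inside the module being deferred, autouse is applied by the caller, so every place that uses the module has to list the functions it wants deferred.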

I think SelfLoader is a good compromise really - it provides some benefit and requires minimal code changes. Definitely much easier than going through TWiki.pm to try to do conditional 'require' statements. The real answer is of course SpeedyCGI and co...

UPDATE: I did a benchmark on my laptop TWiki (Cygwin Apache 1.3.24, Cygwin Perl 5.6.1, TWikiAlphaRelease code) using ab -v1 -n 20 http://localhost/bin/view/Main/RichardDonkin, and I found that SelfLoader actually slowed things down slightly... No idea why this is happening, other than perhaps that one subroutine in every module is called, plus of course the extra overhead of SelfLoader itself. The same benchmark using ModPerl showed a big speedup, so I don't think it is a benchmark artefact.

Without SelfLoader:

    Connnection Times (ms)
                  min  mean[+/-sd] median   max
    ...
    Total:       2344  2367   19.2   2367  2416

With SelfLoader:

    Connnection Times (ms)
                  min  mean[+/-sd] median   max
    ...
    Total:       2363  2699  132.1   2701  3123

Q: Does SelfLoader compile the whole __DATA__ section when any subroutine in that section is required? If so, these results could be explained if the view script ends up calling just one subroutine from every module (a bit unlikely though...).

-- RichardDonkin - 01 Sep 2002

Hmmm... so maybe my first test system is a bit special in how it reacts to the SelfLoader changes. After all, the system is a 99 MHz HP-UX workstation running Perl 5.005 and Apache 1.3x under HP-UX 10.20, so it's a bit dated. I haven't looked at autouse and AutoLoader yet, but will do so at lower work tide - thanks for the pointers!

-- ClausBrod - 02 Sep 2002

SelfLoader should only read in the __DATA__ section, without actually compiling it until it's needed, though that may be different on Microsoft/Cygwin machines. It's been my experience that things on Cygwin don't always work quite as well as when run in a more normal Unix environment.

-- PeterNixon - 26 Sep 2002

Topic attachments

| Attachment | History | Size | Date | Who | Comment |
| selfloader.cpio | r1 | 153.5 K | 2002-08-30 - 11:03 | UnknownUser | lib/TWiki files with SelfLoader changes |
| selfloader.diffs | r1 | 4.4 K | 2002-08-30 - 11:04 | UnknownUser | lib/TWiki files with SelfLoader changes |
Topic revision: r10 - 2002-09-26 - PeterNixon
 
Copyright © 1999-2026 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.