Tags: performance

Refactoring Proposal: Don't use require for mandatory Perl modules

Motivation

For TWiki 4.2, many use statements were replaced by require, mainly for performance reasons. This change, however, makes code inspection, debugging and benchmarking more difficult, and it even fails to meet its performance goal in persistent environments.

Description

All modules which are needed anyway for a particular TWiki request should be compiled with use and not with require.

-- HaraldJoerg - 16 Dec 2007

Impact and Available Solutions

WhatDoesItAffect: Performance, Refactoring
AffectedExtensions:  
HaveQuickFixFor:  

Documentation

For TWiki 4.2, many occurrences of use Module; have been replaced by require Module; throughout the code. This has been motivated by performance, but it has some drawbacks - even in the area of performance.

The benefits are:

  • According to CC's measurements, require is faster than use. This can be rationalized by the fact that require does not even attempt to load any symbols into the caller's namespace.

The drawbacks are:

  • In a persistent environment, require is actually slower than use. And again, there's a simple rationale: use is done at compile time, i.e. once for a persistent process, and leaves nothing behind in the compiled code that would have to be executed for every request. On the other hand, require is a runtime operation, so even if require finds out that the module in question has already been compiled, it has to perform that check over and over again, once for each invocation of the script.
  • Module dependencies are no longer visible by looking at the module's head area. The require statements are spread throughout the files.
  • Performance measurements are skewed if require takes place in different routines, because the time needed for require is added to the runtime of routines where it occurs.
  • Debugging is more difficult because it is no longer possible to set breakpoints immediately after starting the program. Yes, there is b postpone, but this doesn't offer any checking of possible typos, whereas a plain b will complain if the breakpoint symbol is not defined.
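
The compile-time versus run-time distinction behind the first drawback can be made visible by inspecting %INC. The following is only a minimal sketch: the helper state_of and the choice of File::Basename and Text::ParseWords are assumptions made for illustration and are not taken from TWiki code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our @log;    # package variable, so BEGIN blocks can append to it at compile time

# Hypothetical helper: report whether a module file is recorded in %INC.
sub state_of {
    my ($file) = @_;
    return exists $INC{$file} ? 'loaded' : 'not loaded';
}

# use is executed while the script is still being compiled.
BEGIN { push @log, 'compile time, before use:             ' . state_of('File/Basename.pm') }
use File::Basename ();
BEGIN { push @log, 'compile time, after use:              ' . state_of('File/Basename.pm') }

# require is an ordinary statement that is executed when control reaches it.
push @log, 'run time, before require:             ' . state_of('Text/ParseWords.pm');
require Text::ParseWords;
push @log, 'run time, after require:              ' . state_of('Text/ParseWords.pm');

# This block sits below the require line, yet it runs during compilation,
# i.e. before any of the run-time statements above.
BEGIN { push @log, 'compile time, after the require line: ' . state_of('Text/ParseWords.pm') }

print "$_\n" for @log;
```

On a stock perl with no preloaded modules, the lines logged at compile time should show File::Basename changing from "not loaded" to "loaded" across the use statement, while Text::ParseWords stays "not loaded" until the require statement is actually executed at run time.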

The remaining reasonable use cases of require are:

  • If the module is needed only when certain conditions are met (see the practice for require locale in TWiki code)
  • If the compilation is wrapped in a string eval, as an additional hint to the reader that this is a runtime compilation
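
The two remaining use cases could look roughly like the sketch below. Everything in it is an assumption made for illustration: the %cfg hash and its UseLocale key are hypothetical stand-ins for a site setting, POSIX stands in for whatever module is loaded conditionally, and try_load is an invented helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical configuration hash, standing in for a real site setting.
my %cfg = ( UseLocale => 0 );

# Case 1: conditional load. POSIX is compiled only if the feature is switched on.
if ( $cfg{UseLocale} ) {
    require POSIX;
    POSIX::setlocale( POSIX::LC_CTYPE(), '' );
}

# Case 2: string eval. The module name is only known at run time, and the
# eval makes it obvious to the reader that compilation happens at run time.
# The name must come from a trusted source, because it is compiled as code.
sub try_load {
    my ($module) = @_;
    return eval("require $module; 1") ? 1 : 0;
}

my $have_md5   = try_load('Digest::MD5');
my $have_bogus = try_load('No::Such::Module::Here');    # hypothetical name

print 'Digest::MD5: ',            ( $have_md5   ? 'available' : 'missing' ), "\n";
print 'No::Such::Module::Here: ', ( $have_bogus ? 'available' : 'missing' ), "\n";
```

In both cases the module is not needed by every request, so a compile-time use would either load it needlessly or fail outright when it is not installed.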

Examples

For the performance aspect in persistent environments, compare the runtime of the following two snippets:

#!perl -wT
use strict;
for my $i (1..1000000) {
    use Benchmark;
}

#!perl -wT
use strict;
for my $i (1..1000000) {
    require Benchmark;
}

On my machine (3GHz Pentium 4) use takes about 0.12 seconds, and require about 0.32 seconds.

For the debugger annoyance, start perl -dT with either of the scripts and then try any of the following commands:

  • b Benchmark::new
  • f Benchmark

Implementation

Just use use, as in previous versions of TWiki, unless there is a real chance that, in "plain CGI" environments, a module might escape compilation altogether by being required only in certain paths.


Discussion

I am tempted to agree with your findings. However, you are measuring 1000000 loops and still only get a difference of 200 ms? That's not significant. Are there any better benchmarks to base your proposal on (legacy hardware ;-) )?

-- MichaelDaum - 17 Dec 2007

Not significant indeed. You can easily improve the benchmark by using ten modules instead of one. This will slow down the hash lookup in %INC for each iteration of require, but not for use. TWiki has more than ten modules.
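
A rough sketch of the ten-module variant follows. The module list, the iteration count and the helper names are assumptions made for illustration; this is not the benchmark that was actually run, and the figures it prints will vary by machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes ();

my $iterations = 100_000;    # assumption: kept small; the original loop ran 1000000 times

# Loop body made of use statements: all loading happens once, at compile time,
# so nothing related to the modules is executed inside the loop.
sub loop_with_use {
    for my $i ( 1 .. $iterations ) {
        use Carp ();
        use Cwd ();
        use Exporter ();
        use File::Basename ();
        use File::Spec ();
        use File::Temp ();
        use Getopt::Long ();
        use List::Util ();
        use POSIX ();
        use Text::ParseWords ();
    }
    return;
}

# Loop body made of require statements: each one is executed on every pass.
# The modules are already compiled (by the use lines above), which mimics a
# persistent interpreter that has served earlier requests.
sub loop_with_require {
    for my $i ( 1 .. $iterations ) {
        require Carp;
        require Cwd;
        require Exporter;
        require File::Basename;
        require File::Spec;
        require File::Temp;
        require Getopt::Long;
        require List::Util;
        require POSIX;
        require Text::ParseWords;
    }
    return;
}

# Hypothetical helper: wall-clock seconds taken by a code reference.
sub timed {
    my ($code) = @_;
    my $start = Time::HiRes::time();
    $code->();
    return Time::HiRes::time() - $start;
}

my $t_use     = timed( \&loop_with_use );
my $t_require = timed( \&loop_with_require );
printf "use:     %.3f s\nrequire: %.3f s\n", $t_use, $t_require;
```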

But anyway: The replacement of use by require has been introduced as a performance improvement, which it obviously fails to be with persistent interpreters. By using use we could make the code more readable, easier to debug and profile, and faster by whatever tiny amount with mod_perl, at the expense of some (I don't recall the actual figures and can't find the Codev topic) small slowdown for plain CGI.

-- HaraldJoerg - 18 Dec 2007

I do agree with your conclusions, as I use persistent interpreters all the time and don't want to see them slower than they must be. CDot has made his own statistics, on the basis of which he reworked the code from use to require. Could you please compare?

-- MichaelDaum - 18 Dec 2007

If you review the code, you may see that I adopted the following strategy:

  1. Where a module is always required, then require it at the top level of the package. I understand that a top-level require behaves the same as a use, and is evaluated at compile time (leaves no opcodes). (Later: see below)
  2. Where a module is often, but not always, required, make a judgement (just that, no benchmarks) as to whether it should be used or required.
  3. Where a module is obviously conditional, then embed a require as early as possible in the module code.

I'm sure I didn't execute on this strategy perfectly, and there are cases where I have used an embedded require where a top level require would be more appropriate.

As far as I know, the only argument for using a use rather than a require is where the import of symbols is essential. There are very few modules like this - the only example I am aware of is Storable.

-- CrawfordCurrie - 19 Dec 2007

Later: I just did this:

package Module1;

require Benchmark;
#use Benchmark;

sub wibble {
   return shift() + 1;
}

1;

and a caller, Module2.pm:

package Module2;

require Module1;
#use Module1;

my $j = 0;
for my $i (1..1000000) {
   $j = Module1::wibble($j);
}

then ran it using time perl -I . Module2.pm

Within the limits of measurement available to me, there is no performance difference in this example whether you use use or require at the top level (Module1 exports no symbols). Moving the require in Module2 into the inner loop obviously affects performance, but it's a case-by-case tradeoff whether the cost of the require check is higher or lower than the cost of unconditional compilation. For obvious reasons you should always avoid embedding require statements in tight, CPU intensive loops.

The bottom line is that there is no single "best way" that applies to every situation. Because of the cost of importing name tables, I can be fairly certain that using use rather than require in the same place in code is almost always a bad idea, however.

I really don't think this is worth worrying about too much. The performance gains to be found here are orders of magnitude less than the performance gains from algorithmic improvements (such as template precompilation).

-- CrawfordCurrie - 19 Dec 2007

Agreed: when require is at the top level, then there's no measurable performance effect. But on the top level, there's no conditional compilation either.

I admit that I haven't reviewed every occurrence of require in the code, but I started investigating with TWiki::new and found four unconditional require statements within that routine. I've found four require TWiki::Attrs in TWiki.pm as well, and though all of them can be considered sort of conditional, the occurrence in sub _expandTagOnTopicRendering is telling a lot: this routine is called 373 times for a simple view of TWiki.WebHome in SVN.

You write: "Because of the cost of importing name tables, I can be fairly certain that using use rather than require in the same place in code is almost always a bad idea, however." Importing can be suppressed by explicitly specifying an empty list of symbols to be imported:

   use Module ();

use Module (); is exactly equivalent to BEGIN { require Module; }: the module is loaded at compile time, but its import method is never called.
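
The effect of the empty import list can be checked directly. The sketch below uses File::Basename and File::Spec::Functions purely as stand-ins (an assumption made for illustration; neither is named in this discussion).

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Basename ();                      # compile-time load, nothing imported
use File::Spec::Functions qw(catfile);      # compile-time load, one symbol imported

# Check which subroutines ended up in the caller's namespace.
my $imported_basename = defined &main::basename ? 1 : 0;
my $imported_catfile  = defined &main::catfile  ? 1 : 0;

# A fully qualified call works regardless of what was imported.
my $name = File::Basename::basename('/tmp/demo/page.txt');

print "basename imported:    $imported_basename\n";
print "catfile imported:     $imported_catfile\n";
print "fully qualified call: $name\n";
```

With the empty list, the caller's symbol table stays untouched, so the cost of importing is avoided while the module is still visible at the top of the file and compiled before the program starts running.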

Writing use in most of these cases would bring TWiki closer towards what the rest of the Perl world is using. That alone should, in my opinion, make it worth considering.

-- HaraldJoerg - 19 Dec 2007

Sure, I would support and encourage any refactoring that makes code easier to read, as long as there is no (or only small) performance penalty. The require calls in TWiki::new are there because there is an ordering dependency in the BEGIN process that I never fully worked out (you have to make sure @INC is fully set up before using or requiring certain modules). I just never moved some requires that are known to be called once-only during the new process - it just didn't seem worth worrying about. But if use is easier for most people to read - go for it!

-- CrawfordCurrie - 21 Dec 2007

I guess we'll know more soon - I just wrote a blog post on the observation (using the DTrace probes I'm adding to perl) that there are definite benefits to using the my $var = shift method of getting function parameters - and when I work out where to put the probes for use and require...

-- SvenDowideit - 29 Dec 2007

Topic revision: r9 - 2007-12-29 - SvenDowideit
 
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.