It would be nice to be able to analyse TWiki performance using tools like DProf; indeed, DProf can be pretty useful. However, it soon becomes very frustrating as it exhibits wild variance in results. This is probably due to the fact that it uses a sampling technique to measure time spent in functions. Because the functions run fast, the sampling often misses.
An alternative, and much more accurate, approach is to use code instrumentation to perform analysis.
TWikiBench is a more generic approach to the existing
writeDebugTimes calls seen scattered around the code. These calls are limited in their usefulness, because they require a use of TWiki.pm and an edit to TWiki.pm to enable. Usually when you are examining performance you want to move calls around, and look at delta between widely separated code sections, and run isolated sections without TWiki.pm even being present. The resolution, which is seconds, is also far too coarse to be useful.
TWikiBench is a trivial little code module that you put in
lib/TWiki that provides a really simple methodology for examining performance of code sections. Anyone who can improve on it should do so!
How do you use it? Well, in its current form, the module is a very simple package that goes in
lib/TWiki and provides three simple methods:
| Method |
Description |
TWiki::Bench::start() |
Marks the start of a measurement sequence |
| =TWiki::Bench::mark($message) |
Puts a timestamp into the history that shows 2 times; absolute time since the last start and delta time since the last mark |
| -TWiki::Bench::report($message) |
Calls mark and returns a string containing the current sequence, suitabe for printing (to the browser, or to STDERR, or wherever. |
You instrument the code manually, with at least one call to
TWiki::Bench::start(), which you can then follow with as many calls to
TWiki::Bench::mark until you are ready to
TWiki::Bench::report. Here's an example of where =bin/view has been instrumented to generate a global report:
use TWiki::Bench;
TWiki::Bench::start();
my $query = new CGI;
my $thePathInfo = $query->path_info();
my $theRemoteUser = $query->remote_user();
my $theTopic = $query->param( 'topic' );
my $theUrl = $query->url;
my( $topic, $webName, $scriptUrlPath, $userName ) =
TWiki::initialize( $thePathInfo, $theRemoteUser,
$theTopic, $theUrl, $query );
TWiki::UI::View::view( $webName, $topic, $userName, $query );
print TWiki::Bench::report("Finish");
You then manually instrument the code with mark points you want to monitor. For example, in the middle of
TWiki/UI/View.pm,
$tmpl =~ s/%REVINFO%/%REVINFO%$mirrorNote/go;
use TWiki::Bench; TWiki::Bench::mark("PointA");
$tmpl = &TWiki::handleCommonTags( $tmpl, $topic );
TWiki::Bench::mark("PointB");
if( $viewRaw ) {
$tmpl =~ s///go;
Here's an example report:
Before hct + 0.536802 wallclock secs ( 0.13 usr 0.03 sys + 0.00 cusr 0.01 csys = 0.17 CPU) (0.536802 wallclock secs ( 0.13 usr 0.03 sys + 0.00 cusr 0.01 csys = 0.17 CPU))
SOFQ + 0.51689 wallclock secs ( 0.14 usr + 0.01 sys = 0.15 CPU) (1.05369 wallclock secs ( 0.27 usr 0.04 sys + 0.00 cusr 0.01 csys = 0.32 CPU))
EOFQ + 0.567191 wallclock secs ( 0.26 usr + 0.03 sys = 0.29 CPU) (1.62088 wallclock secs ( 0.53 usr 0.07 sys + 0.00 cusr 0.01 csys = 0.61 CPU))
SOSQ + 0.000882864 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (1.62177 wallclock secs ( 0.53 usr 0.07 sys + 0.00 cusr 0.01 csys = 0.61 CPU))
EOSQ + 0.584362 wallclock secs ( 0.26 usr + 0.00 sys = 0.26 CPU) (2.20613 wallclock secs ( 0.79 usr 0.07 sys + 0.00 cusr 0.01 csys = 0.87 CPU))
After hct + 0.19883 wallclock secs ( 0.07 usr + 0.01 sys = 0.08 CPU) (2.40496 wallclock secs ( 0.86 usr 0.08 sys + 0.00 cusr 0.01 csys = 0.95 CPU))
After grv + 1.34725 wallclock secs ( 0.63 usr + 0.02 sys = 0.65 CPU) (3.75221 wallclock secs ( 1.49 usr 0.10 sys + 0.00 cusr 0.01 csys = 1.60 CPU))
PointA + 0.030447 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (3.78266 wallclock secs ( 1.49 usr 0.10 sys + 0.00 cusr 0.01 csys = 1.60 CPU))
PointB + 0.265723 wallclock secs ( 0.07 usr 0.02 sys + 0.01 cusr 0.00 csys = 0.10 CPU) (4.04838 wallclock secs ( 1.56 usr 0.12 sys + 0.01 cusr 0.01 csys = 1.70 CPU))
PointC + 0.132708 wallclock secs ( 0.06 usr + 0.01 sys = 0.07 CPU) (4.18109 wallclock secs ( 1.62 usr 0.13 sys + 0.01 cusr 0.01 csys = 1.77 CPU))
PointD + 0.222073 wallclock secs ( 0.11 usr + 0.00 sys = 0.11 CPU) (4.40316 wallclock secs ( 1.73 usr 0.13 sys + 0.01 cusr 0.01 csys = 1.88 CPU))
PointE + 0.214375 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (4.61753 wallclock secs ( 1.73 usr 0.13 sys + 0.01 cusr 0.01 csys = 1.88 CPU))
Finish + 0.00060606 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) (4.61814 wallclock secs ( 1.73 usr 0.13 sys + 0.01 cusr 0.01 csys = 1.88 CPU))
Each mark has an associated message, which is output first. Then the delta from the last mark, and finally the delta from the
start.
Bench.pm is attached to this topic.
--
CrawfordCurrie - 18 Sep 2004
Ideas for improvement
- Automatically instrument code to count entry/exit and time spent in function. Someone must have done this for Perl, but if they have, I can't find it!