r18 - 09 Aug 2007 - 07:40:50 - KoenMartens

A Framework for Working on TWiki Performance

Motivation

All this talk about benchmarks is fine; but when is someone going to deliver a test data set and scripts? Is analysis paralysis setting in?

BTW note that "simple" benchmarks are no longer enough; the configuration has to be recorded. There are many optional switches that can affect performance. I would suggest a fragment of LocalSite.cfg that can be 'do'ed into LocalSite.cfg for running the benchmarks. Or alternatively, and probably better, a unit testcase that initialises all that stuff.
-- CrawfordCurrie, in MeasuringTWikiPerformance

Sorry for the delay, Crawford. I appreciate your impatience, which is known to be one of the virtues of a Perl programmer. And sorry, this is still far from finished. It is a first step.

A journey of a thousand miles begins with a single step. -- Lao Tzu (unregistered at twiki.org)

If he is taking his ease, give him no rest. -- Sun Tzu, The Art of War (also unregistered)

Wishlist

Refactored to include the current status and priority - please add/modify requirements, and vote. The list is so long that I don't see a chance to complete it in a 4.1 timeframe, but perhaps we can agree on the most important bits. -- HaraldJoerg - 05 Sep 2006

| Description | Current Status | Must Have | Nice To Have | Irrelevant | Harmful |
| Pluggable: Should be installable on existing TWiki installations, not just SVN | Implemented by delivering as a Contrib with own bin/profile driver | HaraldJoerg, CrawfordCurrie | | | |
| Implement as TWiki Fn: Allow %BENCHMARK{topic="TWiki.WebHome" method="view"...}% | Not implemented (yet) | | | HaraldJoerg, CrawfordCurrie | |
| Web Interface: Run from a TWiki form | Implemented as proof of concept, but needs some fixup before publication as Contrib | HaraldJoerg | CrawfordCurrie | | |
| Choice of Topic: Obvious. Maybe one needs to create topics exploiting "expensive" variables to get reasonable data. | Implemented | HaraldJoerg, CrawfordCurrie | | | |
| Choice of Revision: Related to the choice of topic, but maybe required if benchmarking a topic in a live TWiki. | Not implemented yet. | HaraldJoerg | | | CrawfordCurrie |
| Store results: as TWiki topics | Implemented as proof of concept: the benchmarking result will be stored in a topic in the Benchmarks web, with a topic name like Web_Topic_XXXXXXXXXXX. Still just proof of concept because it abuses bin/save to do the X replacement, so it cannot attach raw material | HaraldJoerg | CrawfordCurrie | | |
| Choice of script: pre-compiling techniques will be good for view, but how do they affect save? | Present in the form, but only view is supported. Needs thinking about how to provide params to other scripts (e.g. the text for save) | HaraldJoerg, CrawfordCurrie | | | |
| Choice of skin: How expensive is Pattern skin? | Implemented | HaraldJoerg, CrawfordCurrie | | | |
| Choice of viewing user: Admins perhaps are allowed to "see" more | Implemented, available to members of TWikiAdminGroup only | HaraldJoerg, CrawfordCurrie | | | |
| Choice of profiling method: Devel::DProf ? Devel::SmallProf ? Others? | Present in the form, but only Devel::DProf is available. Parsing the results is a pain. | HaraldJoerg | | CrawfordCurrie (one is fine) | |
| Drilldown: to subroutine level | Implemented, using Devel::DProf | KennethLavrsen, HaraldJoerg, CrawfordCurrie | | | |
| Choice of configuration: Compare with plugin enabled/disabled, with/without localization or hierarchical webs, ... (see #ChangeParamsOnTheFly) | Not implemented yet. Necessary provisions in TWiki core have been committed (Bugs:Item2826) but perhaps ConfiguringConfigure will get in the way | HaraldJoerg, CrawfordCurrie | | | |
| Choice of modules to exchange: Needed to compare different revisions, or alternate implementations (maybe even before committing to SVN) | Not implemented yet. Does not need contributions in the core, but a good interaction design for the web interface | HaraldJoerg | | | CrawfordCurrie (if you mean interactive) |
| Allow to batch-process lists of Topics: Define a "standard topic list for benchmarks" | Not implemented yet | CrawfordCurrie | PeterThoeny, HaraldJoerg | | |
| Average over several runs: Single measurements are too varying | Not implemented yet | HaraldJoerg, CrawfordCurrie | | | |
| Create detailed reports: Maybe you need to reproduce it much later? | Only the raw table material and the CGI params provided are recorded. Plugin settings or other relevant material is missing | | HaraldJoerg | | |
| Report performance differences: against a predefined snapshot | Not implemented yet. Would need adding a "reference topic" from the results web | | PeterThoeny, HaraldJoerg | CrawfordCurrie | |
| Report performance of a predefined, known configuration | This rolls up several of the above lines, but is more; it allows the capturing of a complete configuration, so you can compare apples with apples | CrawfordCurrie | | | |

Random Technical Quirks

Here is a collection of items which came into my way when trying to get it working. Maybe I've overlooked something, if so, please correct me.

Profiling With Devel::DProf and Others

The Perl modules Devel::DProf (part of the Perl distribution) and Devel::SmallProf (available from CPAN) write their results to a file with a fixed name (tmon.out in the case of Devel::DProf). In the case of TWiki, the file ends up in the bin directory. Unless it is picked up quickly after creation, the next profiling run will overwrite it mercilessly (and you don't want someone to point his browser to bin/tmon.out, do you?).

Anyway: Whoever is doing it needs write access to the bin directory, which is not what one would usually allow :-(

So what we need is one step to create the profiling information, and another step to store it in a secure place, where you can do evaluations whenever you want.

For convenience, both steps should be done in one swoop - if only to avoid user errors. This is difficult to do with today's web interface to view, so an extra procedure which combines both steps would be desirable.
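Such a combined procedure could look roughly like the sketch below. The subroutine names, the archive directory and the file naming scheme are assumptions made for illustration; nothing like this exists in TWiki yet. The archive directory is supposed to live outside the web server's document tree.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Path qw(make_path);
use File::Spec;
use POSIX qw(strftime);

# Hypothetical helper: move a freshly written profile (tmon.out for
# Devel::DProf) out of the bin directory into an archive directory.
sub archive_profile {
    my ($bin_dir, $archive_dir, $label) = @_;
    my $source = File::Spec->catfile($bin_dir, 'tmon.out');
    return unless -e $source;           # nothing to archive

    make_path($archive_dir) unless -d $archive_dir;

    # Time stamp plus process id keeps names unique enough for a sketch
    my $stamp  = strftime('%Y%m%d-%H%M%S', localtime);
    my $target = File::Spec->catfile($archive_dir, "$label-$stamp-$$.out");

    move($source, $target) or die "Cannot move $source to $target: $!";
    return $target;
}

# Step 1: create the profile, step 2: store it, both in one swoop.
# Assumed to be called with the bin directory as working directory.
sub profile_and_archive {
    my ($script, $topic, $archive_dir) = @_;
    system($^X, '-T', '-d:DProf', $script, $topic) == 0
        or warn "Profiling run of $script exited with status $?";
    (my $label = "${script}_$topic") =~ s/[^\w.]+/_/g;
    return archive_profile('.', $archive_dir, $label);
}
```

A call like profile_and_archive('view', 'TWiki.WebHome', '/some/private/dir') would then return the name of the stored profile, or nothing if no tmon.out was written.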

TWiki's Unit Tests And Measurements

Within TWiki's unit test framework, test cases have easy access to the TWiki configuration because they use TWiki.pm and create the TWiki object themselves. However, for benchmarking and profiling this is not very sensible. Compiling TWiki.pm and its sub-modules and creating the TWiki object are a major part of the story, and some of the optimisations might need to be done in these parts. So while the unit tests allow easy modification of the settings, they take one of the most important objects out of the "visible area" for measurement.

Profiling with Devel::DProf can't be switched on for a single subroutine call; it is enabled with a command line parameter for the whole program. So running a unit test under the profiler will profile the whole unit test framework as well. It seems somewhere between difficult and impossible (read: I don't want to do it) to separate the data collected during test case setup from those collected during the test case itself.

The unit tests do, of course, provide a framework to compare two competing implementations of a subroutine with the standard Benchmark module.

Therefore, for the current purpose, I'd prefer not to use the unit test framework, but am proposing another solution: run the complete scripts, like view, with -d:DProf given as parameter. There are still two options for that:

  • run the programs from the command line
  • run a copy of the programs, with #!/usr/bin/perl -wT -d:DProf as a shebang line from either the command line or from the web.

Changing Configuration Parameters On-the-fly

If I want to measure (or profile) a complete view procedure, there is no easy interface to change configuration parameters on a per-request basis. The view script itself doesn't allow configuration changes, and normally shouldn't, since we would not want web users to fiddle with the settings, would we?

Even though this could be worked around by restricting access to the config variables to TWikiAdminGroup (or to somebody who knows the configure password), I do not propose that: Many configuration variables are clumsy to handle, both as URL parameters and on the command line. And there are many of them.

So I'd want to look for another way.

I think I've found an acceptable solution to cope with that, though it needs a change in TWiki.pm to be fully effective. The change would be that TWiki.pm, in its BEGIN block, reads TWiki.cfg and LocalSite.cfg only if a certain key in the %cfg hash (say, $cfg{ConfigurationComplete}) is undefined.
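The proposed check can be sketched as follows. It is written as a standalone subroutine so that it can be tried outside TWiki; the subroutine name and the explicit file list are assumptions. In TWiki.pm the same test would simply wrap the code in the BEGIN block that reads the two files.

```perl
use strict;
use warnings;
use File::Spec;

package TWiki;

our %cfg;

# Hypothetical helper illustrating the proposed check: read the
# configuration files only if nobody has declared the configuration
# complete beforehand. Returns 1 if the files were read, 0 if not.
sub read_config_unless_complete {
    my @files = @_;
    return 0 if defined $cfg{ConfigurationComplete};

    for my $file (@files) {
        # An absolute path keeps do() from searching @INC
        my $path = File::Spec->rel2abs($file);
        next unless -e $path;
        # Assumes each file ends with a true value, e.g. "1;"
        defined(do $path)
            or die "Cannot load $path: " . ($@ || $!);
    }
    return 1;
}

package main;
```

With $TWiki::cfg{ConfigurationComplete} set before the call, the files are left alone and whatever is already in %TWiki::cfg stays in effect.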

Now all we need to do is to define the configuration beforehand, including a value for $cfg{ConfigurationComplete}. This can be accomplished with yet another pair of command line parameters to Perl, this time the -m and -I flags. Let's demonstrate this with a simple example:

Contents of lib/Benchmarks/DisableHierarchicalWebs.pm

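package TWiki;
# Load the stock and the site configuration first ...
do '../lib/TWiki.cfg';
do '../lib/LocalSite.cfg';
# ... then override the setting under test
$cfg{EnableHierarchicalWebs} = 0;
# Tell TWiki.pm's BEGIN block not to read the .cfg files again
$cfg{ConfigurationComplete}  = 1;
1;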

Run from the command line

perl -wT -d:DProf -I ../lib/Benchmarks -mDisableHierarchicalWebs view

This will include the module lib/Benchmarks/DisableHierarchicalWebs.pm which will be run even before the BEGIN block in lib/TWiki.pm.

Still, having a "module" for each setting is more cumbersome than simply setting the parameters directly on the command line :-/

But at least, the parameter settings will be recorded in a file and can be passed around, to be used on other installations with other operating systems.

Proposed Solution

Summary

Create a BenchmarkAddon which contains:

  • A Benchmarks web which contains:
    • A WebHome offering a form to start measurements on arbitrary topics
    • A (growing, hopefully) collection of benchmark topics, hand-crafted to exploit border cases
    • Documentation about benchmarking TWiki, maybe including "typical" results
  • Helper programs in bin so that benchmarks can be run from the browser (linked from the appropriate documentation topic, for example)
  • A TWiki/Benchmark module tree used by the helper programs.

The Benchmark Web

This is the starter set I am considering:

  1. WebHome - contains a form for selecting the parameters given above to perform a measurement / profiling run
  2. EmptyTopic - just a topic with zero bytes. May come in handy to compare the performance of different skins/covers, or to profile the general TWiki overhead.
  3. WebList - a topic consisting of nothing but the %WEBLIST tag. Mostly included to support the example below.
  4. Every benchmark should add the results in a new topic within this web, reporting the relevant settings as well.

The timethis Helper

This is a wrapper around one TWiki request with the following parameters:

  • Web and topic, taken from $ENV{PATH_INFO}
  • method (view or save or ...)
  • Configuration module(s) to add
  • Number of iterations
  • Switch for "conventional" or "persistent" processing
  • All other parameters are passed to the TWiki request

The TWiki request itself is issued as a command line program.

As advised by RafaelAlvarez in the discussion, this will be postponed.

The profile Helper

This is a wrapper around a command line program containing a -d: switch. It supports the following parameters:

  • Web and topic, given as parameters or taken from $ENV{PATH_INFO}
  • method (view or save or ...)
  • Configuration module(s) to add
  • Profiler (defaults to Devel::DProf)
  • All other parameters are passed to the TWiki request
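
The parameter list above maps fairly directly onto a Perl command line. The following sketch shows one way the helper could assemble it. The subroutine name, the argument names and the ../lib/Benchmarks directory are assumptions for illustration, and it assumes that the TWiki scripts accept the topic and further parameters as command line arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the parameters of the profile helper into
# an argument list for system(). Nothing is executed here.
sub build_profile_command {
    my %arg = (
        method   => 'view',
        profiler => 'DProf',            # i.e. Devel::DProf
        config   => [],                 # configuration modules to add
        params   => [],                 # passed through to the script
        @_,
    );
    defined $arg{$_} or die "Missing parameter '$_'\n" for qw(web topic);

    # Only allow plain script and module names
    for my $name ($arg{method}, $arg{profiler}, @{ $arg{config} }) {
        die "Suspicious name '$name'\n" unless $name =~ /^\w+(?:::\w+)*$/;
    }

    my @cmd = ($^X, '-wT', "-d:$arg{profiler}");
    if (@{ $arg{config} }) {
        push @cmd, '-I', '../lib/Benchmarks';
        # -m takes the module name without an intervening space
        push @cmd, map { "-m$_" } @{ $arg{config} };
    }
    push @cmd, $arg{method}, "$arg{web}.$arg{topic}", @{ $arg{params} };
    return @cmd;
}
```

Passing the result to system() as a list avoids the shell, so topic names and other parameters need no quoting.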

-- Contributors: HaraldJoerg

Discussion

I know that this is pretty incomplete, but every now and then other things get in my way.

A journey of thousand miles continues with the second step. -- Larry Wall

Maybe. But not today :-)

-- HaraldJoerg - 02 May 2006

At some point in the past weeks, I hacked the AthensMarks scripts to accept "scenarios" for testing. Each scenario is a list of installations (with the url to access them, the physical path and the skins), and a set of topics. I have attached the non-documented version. I'll continue this development after I finish what I'm focusing on now.

-- RafaelAlvarez - 03 May 2006

Thanks for the example. Are you considering a mechanism to change configuration as well as skin and topic? Are you eventually going to add profiling as well? Or does it mean that I should concentrate on the profiling part and just abandon timethis?

-- HaraldJoerg - 03 May 2006

I was thinking about providing a mechanism to change LocalSite.cfg/TWiki.cfg "on-the-fly" to test the same installation with different settings. That's a quick hack, so I think that you should concentrate on profiling :-)

-- RafaelAlvarez - 03 May 2006

Fine. Actually, I don't want to change LocalSite.cfg or TWiki.cfg because I'd prefer to do profiling in "real" installations where other users might be surprised if they hit another configuration. (I've updated some other sections as well).

-- HaraldJoerg - 09 May 2006

Note that you need some way to eliminate variability in profile results. Running with DProf and SmallProf, you get wildly different results between runs, even when you average over long periods. Perl does some weird caching stuff in the background which, when added to variability in disc and CPU performance, makes normalisation very hard. So you need some way to specify a configuration (including the platform) that is the "normal form" for running profiles. In the past, we have had people spending ages optimising code pointlessly, because of local effects that make a particular area look bad on their systems but don't impact anyone else.

-- CrawfordCurrie - 18 Jul 2006

I agree, I have seen variability in the experiments I've made so far, though I would not call the results wildly different. I'd guess that averaging over a small number of iterations will even that out - but well, that is on the agenda, we can check it. As noted on IRC, the first iteration will have to be discarded to eliminate swap/disc caching artefacts.
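
As a sketch of that averaging: the routine below drops the first (warm-up) run and reports mean and standard deviation of the rest, so that the spread between runs becomes visible. The subroutine name is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise a series of timings (in seconds),
# ignoring the first run because of swap/disc caching effects.
sub summarise_runs {
    my @times = @_;
    shift @times;                       # discard the warm-up run
    die "Need at least two runs\n" unless @times;

    my $n    = @times;
    my $mean = 0;
    $mean += $_ / $n for @times;

    # Sample standard deviation; zero if only one run is left
    my $var = 0;
    if ($n > 1) {
        $var += ($_ - $mean) ** 2 / ($n - 1) for @times;
    }
    return { runs => $n, mean => $mean, stddev => sqrt($var) };
}
```

A standard deviation that is large compared to the difference between two configurations is a hint that more iterations are needed before drawing conclusions.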

And I agree that background processes which "at random" consume disc and CPU performance will now and then make the results questionable. I am pretty sure, though, that Perl does not do weird caching stuff between two invocations of the same program.

On the other hand, a "standard configuration" for running benchmarks is exactly what I didn't have in mind. Comparing absolute numbers across installations is, in my opinion, futile. I want people to be able to reproduce benchmark runs by specifying how the figures have been obtained, so that we can see whether a certain change will benefit all installations or, in some situations, be faster for one installation but slower for another. Just as we test on many platforms, let's benchmark on many platforms.

-- HaraldJoerg - 18 Jul 2006

I am mainly interested in a framework all developers can use to see how code changes affect the performance. With this we can raise awareness of performance. That is, we need a simple way to profile the code before and after a code change or configuration change. If this profiling can be done easily (e.g. web-based), developers are more likely to use it.

Harald, at the EdinburghReleaseMeeting2006x07x18 you showed us a very promising solution on your site, http://munich.pm.org/cgi-bin/view/Benchmarks/Tools_MiniAHAH_5. This could be turned into a profiling framework & tool for our TWiki development. It can be used (a) to identify areas where performance can be improved on existing code, (b) to test how much one's code changes / configuration changes affect the performance.

I have a dream, here we go:

  • Point and click what parts of TWiki to profile
  • Do a profile test run and take a snapshot
  • Make code changes and/or configuration changes
  • Do same profile test run again and take a snapshot
  • Compare the two snapshots:
    • Table based report, with color coded numbers (green for better, yellow for same, orange for slightly worse, red for worse)
    • Bar graphs to show change visually
    • Highlight areas where biggest impact can be expected on performance improvements

-- PeterThoeny - 19 Jul 2006

I think it is more important that we can see a detailed profile of how much CPU time each function call in TWiki consumes when you view any TWiki topic with any feature enabled.

We can benchmark already today with ab. What we need is a clear answer to where the time is spent inside TWiki.

There are many theories and opinions. What we need is the numbers to show us how much time each feature or function takes, so we can attack and optimize the time consumers.

-- KennethLavrsen - 04 Sep 2006

Detailed profiles at subroutine level are available with CPAN's Devel::DProf package. There's Devel::SmallProf for line-oriented benchmarking, but this gives incredible masses of data.

Note that the figures change vastly between topics. Especially routines like getRenderedVersion are extremely sensitive to the amount of markup.

-- HaraldJoerg - 05 Sep 2006

There is also Devel::FastProf which is a follow on to Devel::SmallProf and appears better, but that requires a perl version which is not available under cygwin (5.8.8).

-- ThomasWeigert - 03 Dec 2006

I need to make clear my votes above.

Having a benchmarking framework that allows you to fiddle with the configuration is great fun, but what is really needed is a way to compare performance against a known benchmark (hence the name). There needs to be a central definition of what "TWiki performance" actually means, otherwise you are always comparing apples with oranges. For example, TWiki starts up on most platforms with a default configuration (TWiki.spec with no local mods except paths). This can be taken as the benchmark that the performance of changes can be compared against.

-- CrawfordCurrie - 03 Dec 2006

Can you give some guidance on how to effectively use Devel::DProf? The standard run just generates useless information:

Total Elapsed Time = 0.607131 Seconds
         User Time = 0.386131 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 18.1   0.070  0.208     25   0.0028 0.0083  TWiki::BEGIN
 12.9   0.050  0.268      7   0.0071 0.0383  TWiki::UI::View::BEGIN
 5.18   0.020  0.030      2   0.0100 0.0148  TWiki::Merge::BEGIN
 5.18   0.020  0.020     16   0.0012 0.0012  CGI::_compile
 5.18   0.020  0.040      6   0.0033 0.0066  TWiki::Store::RcsWrap::BEGIN
 5.18   0.020  0.089      9   0.0022 0.0099  TWiki::Attach::BEGIN
 5.18   0.020  0.298      6   0.0033 0.0497  main::BEGIN
 5.18   0.020  0.020      7   0.0028 0.0028  TWiki::Plugins::BEGIN
 2.59   0.010  0.010      1   0.0100 0.0100  warnings::BEGIN
 2.59   0.010  0.020      2   0.0050 0.0098  Carp::longmess
 2.59   0.010  0.010      3   0.0033 0.0033  TWiki::Prefs::BEGIN
 2.59   0.010  0.010      3   0.0033 0.0033  warnings::register::import
 2.59   0.010  0.010      6   0.0017 0.0017  AutoLoader::BEGIN
 2.59   0.010  0.010      3   0.0033 0.0033  CGI::BEGIN
 2.59   0.010  0.010     11   0.0009 0.0009  Exporter::as_heavy

There should be a way of going deeper into the actual calls. I played with -f but did not get anywhere...

-- ThomasWeigert - 02 Dec 2006

With the standard settings you don't get much indeed - just some figures which prove that compilation (that's the BEGIN stuff) takes much of the time. As if you hadn't already known that :-)

  • The first option you want to set is -O (Oh, not zero) of dprofpp to 50 or so - the default of 15 does indeed not give very much insight.
  • You can add -G BEGIN to collect (almost) all compilation stuff into one line. Some compilation stuff has explicit names (for example CGI::_compile), and some packages (like TWiki.pm) do real stuff in BEGIN blocks.
  • You need to benchmark several times to average out cache effects of the file system.

The other thing which is apparent is that there is no single culprit for TWiki performance. It will take many small steps until the overall experience improves significantly.
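
The two dprofpp switches can be combined in one call, and a small line parser takes some of the pain out of reading the report. This is a sketch under assumptions: the subroutine names are made up, dprofpp is expected to be on the PATH, and the pattern only targets the seven-column table layout shown in the report above.

```perl
use strict;
use warnings;

# Run dprofpp with the options suggested above and return its report.
# Assumes $tmon names an existing profile. The list form of a piped
# open may not be available on older Windows builds of Perl.
sub dprofpp_report {
    my ($tmon) = @_;
    open(my $pipe, '-|', 'dprofpp', '-O', 50, '-G', 'BEGIN', $tmon)
        or die "Cannot run dprofpp: $!";
    my @lines = <$pipe>;
    close $pipe;
    return @lines;
}

# One table row: %Time ExclSec CumulS #Calls sec/call Csec/c Name
my $ROW = qr{^ \s* ([\d.]+) \s+ (-?[\d.]+) \s+ (-?[\d.]+) \s+ (\d+)
             \s+ ([\d.]+) \s+ ([\d.]+) \s+ (\S+) \s* \z}x;

# Parse the table rows of a dprofpp report into hashes; header and
# summary lines do not match the pattern and are skipped.
sub parse_dprofpp {
    my @rows;
    for my $line (@_) {
        next unless $line =~ $ROW;
        push @rows, {
            percent      => $1, excl_sec     => $2, cum_sec => $3,
            calls        => $4, sec_per_call => $5,
            cum_per_call => $6, name         => $7,
        };
    }
    return @rows;
}
```

The rows returned by parse_dprofpp(dprofpp_report('tmon.out')) can then be sorted, filtered or stored like any other list of hashes.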

-- HaraldJoerg - 02 Dec 2006

Thanks, Harald. If you don't mind, another question... I usually debug TWiki scripts via

perl -dT view "TWiki.TWikiTemplates"
for example.

But when I add the DProf flag, no matter what topic I check it takes the same amount of time, and further, it takes much less time than when I run the topic without the -d switch, which makes no sense.

Could it be that profiling on a Windows machine does not work?

-- ThomasWeigert - 03 Dec 2006

I can't speak for Windows in general, but I have a pure cygwin installation where profiling works fine, as far as I can tell. Check whether each run of perl -d:DProf -T view "TWiki.TWikiTemplates" creates a new tmon.out in the bin directory, and check the output.

-- HaraldJoerg - 03 Dec 2006

Each run generates a new tmon.out but (i) the runs are visibly faster than when running without the -d:DProf flag and (ii) are almost the same when comparing the tmon.out file. Something is wrong at least in my perl, it seems.

-- ThomasWeigert - 03 Dec 2006

I found TWikiBench and adapted that code. That seems to provide the most useful info so far. I'll make a contrib out of it...

-- ThomasWeigert - 03 Dec 2006

Just my 2 cents to this (old) discussion: a benchmark is interesting. You can measure performance over time, or see how code changes affect performance. But there is often an immediate need to pinpoint performance problems. Ideally, there would be a single switch that you could flip to have TWiki write out lots of timings to a log file. So that, for example, you can pinpoint a certain plugin that is slowing down your actual TWiki installation. This is like turning on debugging, or enabling asserts.

-- KoenMartens - 09 Aug 2007

 
Topic attachments
| Attachment | Size | Date | Who | Comment |
| newbench.zip | 2.1 K | 03 May 2006 - 15:37 | RafaelAlvarez | |