Dakar Performance Issues

Performance Numbers

| Skin | Plugins | i18n | Additional | Page load starts (s) | Page fully loaded (s) | Athensmarks |
| Athens Benchmark |
| Classic | None | None | Some rendering errors | 0.31 | 0.54 | 100 |
| Cairo Benchmark |
| Pattern | 1 * | None | - | 0.67 | 0.90 | 46.2686567164179 |
| Pattern | 18 + 1 | None | - | 0.99 | 1.20 | 31.3131313131313 |
| TWiki 4.X |
| Pattern | 18 | All | - | 2.61 | 3 | 11.8773946360153 |
| Pattern | None | All | - | 2.38 | 2.7 | 13.0252100840336 |
| Classic | None | All | - | 1.08 | 1.33 | 28.7037037037037 |
| Pattern | None | No | - | 1.06 | 1.40 | 29.2452830188679 |
| Classic | None | No | - | 0.85 | 1.10 | 36.4705882352941 |
| Pattern | None | All | Removed Language Selector | 1.31 | 1.65 | 23.6641221374046 |
| Pattern | None | All | Disabled %LANGUAGES% in LanguageSelector | 1.31 | 1.55 | 23.6641221374046 |
| Pattern | None | All | I18N::available_languages: commented out foreach my $file ( @all ) block | 1.03 | 1.31 | 30.0970873786408 |
| Pattern | None | All | I18N::available_languages: commented out all except return | 1.03 | 1.28 | 30.0970873786408 |
| Pattern | 18 | All | After SVN 9526 | 1.64 | 1.91 | 18.9024390243902 |
| Pattern | None | All | After SVN 9526 | 1.30 | 1.66 | 23.8461538461538 |
| Classic | None | All | After SVN 9526 | 1.08 | 1.36 | 28.7037037037037 |
| Pattern | 18 | All | After SVN 9883 | 1.90 | 2.16 | 16.3157894736842 |
| Pattern | 18 | All | SVN 9877 | 1.72 | 2.02 | 18.0232558139535 |
| Pattern | 18 | All | SVN 9878 | 1.91 | 2.21 | 16.2303664921466 |
| Pattern | 18 | All | SVN 9526 | 1.72 | 2.05 | 18.0232558139535 |
| Pattern | 18 | All | SVN 9883, 565 users | 1.97 | 2.28 | 15.7360406091371 |
| Pattern | 18 | All | SVN 9877, 565 users | 1.75 | 2.07 | 17.7142857142857 |
| Pattern | 18 | All | SVN 9885, 565 users and ALLOWTOPICVIEW on topic | 1.97 | 2.32 | 15.7360406091371 |
| Pattern | 18 | All | SVN 9877, 565 users and ALLOWTOPICVIEW on topic | 1.75 | 2.04 | 17.7142857142857 |
| Pattern | 18 | All | SVN 10114, 565 users and ALLOWTOPICVIEW on topic | 1.77 | 2.09 | 17.5141242937853 |
| Pattern | 18 | None | SVN 10114, 565 users and ALLOWTOPICVIEW on topic | 1.48 | 1.82 | 20.9459459459459 |
| Pattern | 18 | All | SVN 10257, 565 users | 1.73 | 2.08 | 17.9190751445087 |
| Pattern | 18 | All | SVN 10270, 565 users | 1.73 | 2.08 | 17.9190751445087 |
| Pattern | 18 | All | SVN 10270, modperl, 565 users - note 1 | 1.53 | 2.51 | 20.2614379084967 |
| Pattern | 18 | All | SVN 10270, modperl, 565 users - note 2 | 0.75 | 1.07 | 41.3333333333333 |
| Pattern | 18 | All | SVN 10368, 565 users + Item2327 patch | 1.74 | 2.08 | 17.816091954023 |
| Pattern | 18 | All | SVN 10684, 565 users | 1.72 | 2.01 | 18.0232558139535 |
| Pattern | 18 | None | Cairo - recalibrating July 2006 - DONE, no change | 0.97 | 1.21 | 31.9587628865979 |
| Pattern | 18 | None | SVN 11018, 565 users | 1.38 | 1.71 | 22.463768115942 |
| Pattern | 18 | All | SVN 11018, 565 users | 1.65 | 1.94 | 18.7878787878788 |
| Pattern | 18 | None | TWikiFn SVN 11074, 565 users | 1.38 | 1.72 | 22.463768115942 |
| Pattern | 18 | None | SVN 11140, 565 users | 1.35 | 1.67 | 22.962962962963 |
| Pattern | 18 | None | SVN 11898, 565 users | 1.37 | 1.67 | 22.6277372262774 |
| Pattern | 18 | All-2 | SVN 11898, 565 users | 1.74 | 2.04 | 17.816091954023 |
| Pattern | 18 | None | Patch04x01 SVN 14020, 565 users | 1.40 | 1.91 | 22.1428571428571 |
| Pattern | 18 | All-2 | Patch04x01 SVN 14020, 565 users | 1.77 | 2.24 | 17.5141242937853 |
| Pattern | 18 | None | MAIN SVN 14020, 565 users | 1.42 | 1.90 | 21.830985915493 |
| Pattern | 18 | All-2 | MAIN SVN 14020, 565 users | 1.74 | 2.23 | 17.816091954023 |

Discussion

(for previous discussion, see DakarPerformanceIssuesArchive2005)

I copied this description of how to measure timing with a browser (IE) and Ethereal from Bug item 1572 because it is of general interest beyond the scope of the bug report.

  • Measurements are done from a different computer than the one running the TWiki server so that the browser does not take CPU time from the server.
  • Measurements are done using a text topic that can be seen on the screen without scrolling.
  • Measurements are done with the same topic in Cairo and Dakar.
  • The page has been loaded several times so that images and style sheets are cached, since this is how 99% of all pages are viewed.
  • The browser is set up with "Check for newer versions of the pages" -> "Every time you visit the page", as this is how users must set up their browser to use CMS sites and wikis. Otherwise they keep missing the update they just made.
  • When the browser checks versions it does not download the entire images and style sheets. It starts with an HTTP GET of the page, followed by a sequence of GETs, one for each element, with "If-Modified-Since" in the HTTP request. The Apache server returns a "304 Not Modified" instead of retransmitting the entire element.
  • The measurements are done by capturing the Ethernet traffic on the client machine using Ethereal.
  • The definition of "page loaded" is the moment the browser acknowledges the last received data, as this is the same moment that it displays the resulting page.
  • Internet Explorer is always used because I have noticed that this browser does not send the final ACK until it actually shows the page.
  • The reason for this method is that the user could not care less which part of the delay comes from TWiki code, pattern skin downloading, or browser rendering. He sees the sum. The many speed improvements my investigations have triggered have happened in code compilation as well as in skin-related delays such as unnecessary searches and JavaScript execution. An extra advantage is that the Ethereal traces give you all the timing. You get the time from the first GET to the first TCP segment of the HTML returned, which in practice is the perl compilation/execution time. You get the time it takes to check that all page elements are unchanged. And finally you get the time from when the last element is checked until IE decides to acknowledge the page received, which in practice is the time it takes the browser to execute the JavaScript, calculate image positions, etc.
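The three intervals described in the last bullet can be sketched as simple subtractions over four timestamps read from a trace. This is only an illustration: the field names are hypothetical, the timestamps are made up, and nothing here parses an actual Ethereal capture.

```python
from dataclasses import dataclass


@dataclass
class TraceTimes:
    """Timestamps in seconds, read by hand from a packet capture.

    All field names are hypothetical; they are not Ethereal output.
    """
    first_get: float             # browser sends the GET for the topic
    first_html_segment: float    # first TCP segment of the HTML arrives
    last_element_checked: float  # last "304 Not Modified" reply arrives
    final_ack: float             # browser ACKs the last received data


def split_phases(t: TraceTimes) -> dict:
    """Split the total load time into the three phases described above."""
    return {
        "server": t.first_html_segment - t.first_get,
        "element_checks": t.last_element_checked - t.first_html_segment,
        "browser": t.final_ack - t.last_element_checked,
        "total": t.final_ack - t.first_get,
    }


# Made-up timestamps, purely for illustration.
example = TraceTimes(first_get=0.0, first_html_segment=1.0,
                     last_element_checked=1.25, final_ack=1.5)
print(split_phases(example))
```

In this reading, "server" corresponds to the "Page load starts" figure and "total" to "Page fully loaded".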

-- KennethLavrsen - 03 Feb 2006

Interesting methodology, but I disagree with the cache settings 'check for newer version of pages every time you visit the page'. If used in normal TWiki usage, this might well re-create the BackFromPreviewLosesText bug as discussed in BrowserIssues. It's fine for benchmarking though.

-- RichardDonkin - 03 Feb 2006

IMHO this is a difficult methodology to keep consistent; there are too many variable factors. Just to name a few:

  • Graphic card rendering times (yes, they can be surprisingly slow in some setups, disturbing results)
  • Bogomips available at the client and server
  • Network factors
    • Delay (it can vary, and delays are multiplied by the number of connections made)
    • Bandwidth (some browsers do not respect if-modified-since on reload, skewing timing results on slower connections)
    • HTTP-connections reused or not
  • Browser issues (we all know browser rendering speeds vary tremendously)
  • Don't forget to add local desktop setup variations, (known/not known) firewall/proxy implications, (known/not known) local software / os / cache setting implications, etc etc

What matters to me is that the small but important deviations in the timings of single components (i.e. core performance) are easily lost in the "big view".

But Kenneth, why don't you create a topic describing this methodology (KjlMarks / TWikiRelease4UserExperienceMarks), and try to decide on something that is "index 100" in it (e.g. default 4.0.1 installation, IE version x.x.x, other needed (server/client) versions, network setup etc), allowing others to redo your findings? Then you can easily link to it whenever you need to post results (and we will be able to see and understand exactly what you are trying to do).

AthensMarks is an example of a measurement methodology that has an index 100 strategy and is therefore very useful, in that it can easily be used to re-prove findings in different environments.

-- SteffenPoulsen - 03 Feb 2006

Nothing you say is wrong Steffen. With measurement and analysis it is rare that there is one method which is the right one. Ethereal is one of many methods that you need to analyse performance.

Let me address a few.

  • Graphics cards vary but in 2D you hardly notice the difference. In 3D things change a lot.
  • Bogomips - when I do an AB test I normally use a server that does nothing other than being a web server. When I measure on a production machine results vary by up to 300%, so you have to repeat the measurement 10-20 times and pick the best.
  • Network factors - yes - and this is important. When I measure at home the network is idle and I am the only one really using it. In the real Internet you need to think about network performance and design your applications so that you avoid establishing and closing too many connections.
  • Browser differences. YES. Exactly. When I found the Javascript performance issue it was by using Ethereal and testing with different browsers, and it turned out that by changing Javascript we reduced the time it took to view large topics from 30 seconds to 5 seconds. For those of you hacking Perl code this may seem irrelevant, but for the end user 30 seconds is 30 seconds. It is a performance issue no matter where the delay occurs.

The purpose of my numbers is to give some tools to identify where design can be improved to make the total experience faster. Comparing TWiki 4.0 with Pattern Skin against Athens with the ugly old classic skin does not always make sense. It is useful to do Athensmarks on the perl code, but when measuring the total performance the comparison makes no sense.

An Ethereal trace now and then to check performance is something I encourage all of you to do as a supplement. Mainly compare with your own previous measurements, and especially look at the details, because a trace gives a much more detailed view than an Athensmark score. If you want to analyse the Perl code only, some more advanced profiling tools should probably be used, or maybe even built into special versions of the code, to find where the bottlenecks are, to experiment with better ways to compile only the code that is really needed for each topic view, or to experiment with copying functions from CPAN libs instead of including the whole lib, etc. I have seen many ideas from different people.

Different tools for different purposes.

-- KennethLavrsen - 03 Feb 2006

Time for new benchmark before 4.0.2 release.

Things are worse than ever. Again the measurements are done in the most realistic way: using a browser and tracing the network with Ethereal. The server is unloaded. The network connection is 100 Mbit/s. Tests were repeated many times. BEST results uploaded.

URLs:

| Case | Time to first response (perl execution) (s) | Time to completion (s) |
| Cairo | 0.93 | 1.19 |
| Dakar classic skin | 1.37 | 1.59 |
| Dakar Pattern skin locale OFF | 1.33 | 1.72 |
| Dakar Pattern skin locale ON | 2.64 | 2.94 |

Conclusion: Code needs to be optimized again. Without localization Dakar is much slower than Cairo. With the localization feature Dakar 4.0.2 is in my view useless. I would recommend that nobody activate it.

-- KennethLavrsen - 16 Mar 2006

One of the later changes that ArthurClemens made was to split the Pattern templates into more pieces (more files included). I wanted to see what impact this has on performance. So I "handcompiled" view.pattern.tmpl into one big file.

Conclusion. I may see a few milliseconds difference. Nothing that matters at all. So this is not the root cause of the slowdown.

-- KennethLavrsen - 16 Mar 2006

Can you instrument the view script to leave a log of the total time spent in the server? It would be useful to actually know how much time is spent processing the topic and how much is spent in communication and rendering.

-- RafaelAlvarez - 16 Mar 2006

It's good to know that the pattern skin refactoring is not a problem, because that makes it much easier to customise things. It's not good that things are so slow.

Can you identify any plugins that might be a problem? They're getting in whether or not they're being used, after all.

-- MeredithLesly - 16 Mar 2006

Kenneth, thanks for bringing this to our attention. It would be helpful to compare 4.0.1 and 4.0.2 to see the performance difference.

-- PeterThoeny - 16 Mar 2006

Hmm, quick finding: r9192 (Bugs:Item1828) adds ~30% overhead to 4.0.2 in my setup.

-- SteffenPoulsen - 16 Mar 2006


I was shocked to see yet another report about Dakar being faster than Cairo when you can in fact "feel" that it is slower just by browsing.

In connection with Bugs:Item1961 I did a lot of benchmarking, measuring on the client side using a semi-long topic with no searches and no fancy features. Just normal TML and a TOC.

Here are all the numbers.

Table updated and moved down -- KennethLavrsen - 22 Apr 2006

-- KennethLavrsen - 26 Mar 2006

Rafael later clarified that the performance numbers for Cairo were WITH many plugins and the numbers for Dakar were without. That is why the comparison looked like Dakar was much faster than Cairo. Thanks for clarifying this. It gives more hope for the validity of the CLI benchmark numbers.

-- KennethLavrsen - 26 Mar 2006

I tested my theory that ACL was slowing things down. I did it in the crudest possible way: bypassing the checks altogether. Alas, it seemed to make no difference. (This was a pretty uninteresting page, so the numbers weren't swamped by searches or other expensive operations.)

It's going to be a lot harder to figure out how to speed this up. One person said that it's all the regexes. I don't know enough about perl to know, but something is seriously wrong.

-- MeredithLesly - 30 Mar 2006

ML, I'm curious, in what way did you bypass the ACLs - by bypassing the loading of the preference hierarchy altogether or just the code working out the effective ACLs?

Did you by any chance save the patch you created somewhere? I'd like to take a look.

-- SteffenPoulsen - 30 Mar 2006

Updated my numbers and again moved them to the end to keep this topic from becoming too repetitive.

-- KennethLavrsen - 22 Apr 2006

AthensMarks measure only the server side time. What they measure is "raw core" performance. It could be the case that between versions the core is faster but on the client side the rendering is slower. But what couldn't happen is that the core is slower but the rendering is faster (not if measuring using the same skin).

I have to say that in every benchmark I performed with exactly the same conditions (same number of plugins installed and enabled, no sessions enabled in Dakar, using classic skin), Dakar has consistently been slower since July 2005. Before that, I can testify that Dakar was at least 20% faster, and that is consistent with what my users reported (and that's why I'm still running that old revision pre-July 2005).

As an example, my last benchmark (the one you pointed out before to show a flaw in the tool and later retracted) shows that Dakar without any plugin installed is just barely faster than Cairo with all the default plugins installed and enabled. Given that each plugin will add a considerable overhead, I would say that it shows that Dakar is slower than Cairo. Yours also show that the new PatternSkin is twice as slow as the ClassicSkin.

I'll try to get some time this weekend to run the benchmark against TWiki4, and post the results here. I bet that they will be consistent (but not the same) with yours.

-- RafaelAlvarez - 22 Apr 2006

I'm currently running a slightly modified develop. Here are some results on the same page:

| Configuration | Minimum time | % of no plugins |
| No plugins | 1350 | 100% |
| 1 plugin | 1376 | 101% |
| 6 plugins | 1478 | 109% |
| 14 plugins | 1575 | 116% |

So, at first glance, it's between 1 and 2% cost per plugin.
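As a rough check of that estimate, the per-plugin cost can be worked out from the table by dividing each configuration's overhead over the no-plugin baseline by its plugin count. The function name is made up for illustration, and the timings are taken in whatever unit the table uses.

```python
BASELINE = 1350  # "No plugins" minimum time from the table

# plugin count -> minimum time, copied from the table
MEASUREMENTS = {1: 1376, 6: 1478, 14: 1575}


def per_plugin_overhead_percent(baseline: float, timing: float, plugins: int) -> float:
    """Average overhead per plugin, as a percentage of the baseline time."""
    return (timing - baseline) / baseline * 100 / plugins


for plugins, timing in MEASUREMENTS.items():
    cost = per_plugin_overhead_percent(BASELINE, timing, plugins)
    print(f"{plugins:2d} plugins: {cost:.2f}% per plugin")
```

The average falls as more plugins are added, from just under 2% for a single plugin to a little over 1% with fourteen, which fits the "between 1 and 2%" reading.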

-- MeredithLesly - 22 Apr 2006

Kenneth's Speed Measurements on TWiki

  • Server
    • Pentium 4 1.8 GHz
    • RAM 512 MB
    • IDE harddisk 7200 rpm 160 GB.
    • Network is 100 Mbit standard ethernet
    • Distribution: Fedora Core 4.
    • Cron disabled during measurements

  • Client
    • Pentium 4 2.4 GHz
    • RAM 512 MB
    • IDE harddisk 7200 rpm 160 GB
    • Network is 100 Mbit standard ethernet
    • Client browser is Internet Explorer
    • OS is Windows XP
    • Ethereal 0.10.13 with filter for the server IP address and tcp only

Similar results have been measured on my production machine, which is a 2.8 GHz machine with 1 GB RAM.

And at work on a DUAL XEON 2.8 GHz with 2 GB RAM and two SCSI disks running at 10000 rpm I measure the same relative difference between Cairo and Dakar (measured late at night with no load). Load time per page during daytime is currently around 1.5 seconds for pages with no searches. At times with load the time increases to 2-5 seconds. But the average is around 1.5 seconds in the daytime.

Results

I have renamed the last column to Athens Index, just to ensure that no one thinks I am using the benchmark tool to generate those numbers. That column is a mathematical calculation of the "page load starts" numbers relative to Athens. It is there so that the maintainer of the perl benchmark tool can compare the calculated athensmarks with my index. The numbers should not be the same but they should be similar.

Table moved down the topic -- Kenneth Lavrsen

  • In Cairo SessionPlugin is enabled to compare with TWiki4 which also runs with sessions (to compare apples with apples)
  • Athensmarks are calculated based on 'Page load starts' to measure mainly the code execution. But the true benchmark would be to look at the page fully loaded numbers.
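The index in the last column of the results table can be reproduced from the "Page load starts" column: it is the Athens time (0.31 s) divided by the measured time, scaled to 100. A minimal sketch follows, with a function name chosen here for illustration.

```python
ATHENS_PAGE_LOAD_STARTS = 0.31  # seconds, from the Athens row of the table


def athens_index(page_load_starts: float,
                 reference: float = ATHENS_PAGE_LOAD_STARTS) -> float:
    """Index relative to Athens: 100 means as fast as Athens, 50 half as fast."""
    return 100.0 * reference / page_load_starts


# "Page load starts" values copied from three rows of the table
for label, seconds in [("Cairo, 1 plugin", 0.67),
                       ("Cairo, 18 + 1 plugins", 0.99),
                       ("TWiki 4.X, 18 plugins, i18n All", 2.61)]:
    print(f"{label}: {athens_index(seconds):.4f}")
```

Because the index uses only the time to the first response, two rows with the same "Page load starts" value get the same index even when their "Page fully loaded" times differ.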

Conclusions.

  • SVN 9526 really works. Great job.
  • The benchmark tool and its athensmarks are dead wrong. There is no connection between what it says and what you measure on the client side.
  • SVN 9883 is very bad. We are again going in the wrong direction.
  • SVN 9878 is the one that added 200 ms of the delay
  • Increased number of users from 5 to 565.
    • Before SVN 9878. 1.75 -> 1.75 s = No real change
    • After SVN 9878. 1.91 -> 1.97 = Small increase
    • Conclusion. Users cost more with SVN 9878
  • ALLOWTOPICVIEW does not add time
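The step at SVN 9878 shows up when the "Page load starts" values for the same configuration (Pattern, 18 plugins, i18n All) are compared revision by revision. The sketch below does that subtraction; the helper name is invented for illustration and the values are copied from the results table.

```python
# (SVN revision, "Page load starts" in seconds), Pattern skin, 18 plugins, i18n All
PAGE_LOAD_STARTS = [(9526, 1.72), (9877, 1.72), (9878, 1.91), (9883, 1.90)]


def deltas(series):
    """Change in seconds from each measured revision to the next one."""
    return [(rev, round(t - prev_t, 2))
            for (_, prev_t), (rev, t) in zip(series, series[1:])]


for rev, change in deltas(PAGE_LOAD_STARTS):
    print(f"SVN {rev}: {change:+.2f} s")
```

Only the 9877 to 9878 step changes by more than a hundredth of a second, at 0.19 s, which is the roughly 200 ms mentioned above.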

We really need a better tool than the one I use because it is too slow for practical use. But it is probably the best tool we have to measure real performance as seen by the customer, and I will be happy to perform the tests on a regular basis. But we need to find out how a perl-based benchmark tool can measure something completely different from what is measured from the client. When we find the reason, we will have found a major bottleneck.

I have been heavily attacked on IRC about my latest numbers and accused of posting numbers that are not to be believed, with hardware probably to blame. I fail to see the scientific argument for how a relative measurement on the same machine, with the same setup and the same method, and with only one SVN checkin as the variable, can trigger a major hardware difference. The absolute numbers certainly depend heavily on hardware. Look at the relative difference. And focus on the time from GET to first byte returned because this is what varies with the code changes. The time between first byte and page fully loaded changes when people change the skin and add plugins that add more CSS and JS.

I hope more people will verify my numbers so I do not stand alone against the unscientific arguments.

-- KennethLavrsen - 22 Apr 2006

Sadly I don't have the complete output here, but I ran a benchmark (AthensMarks) this weekend against Cairo, TWiki 4.0.2 (release package) and an old revision (rev 4664, IIRC). The numbers were:

  • Cairo (Classic): Around 50
  • Cairo (Pattern): Around 48
  • Dakar (Classic): Around 42
  • Dakar (Pattern): Around 38
  • 4662 (Classic): Around 15
  • 4662 (Pattern): Around 13

All those measurements were performed using only InterWikiPlugin and DefaultPlugin.

As I predicted, the numbers are quite consistent with yours. There is something very significant in the fact that PatternSkin doesn't add that much overhead to the core, but according to your benchmark it really adds a lot in the "transference" part.

I don't believe that your numbers are "wrong". But also I don't believe that the AthensMarks benchmark is wrong either.

-- RafaelAlvarez - 24 Apr 2006

Yes, there are different aspects to performance. An important aspect is measuring the server end only, since total time to full client view has to be larger than that (ignoring any client-side caching, of course).

-- MeredithLesly - 24 Apr 2006

CategoryPerformance

-- CrawfordCurrie - 25 Apr 2006

OK, now I have the numbers. These tests were run using AthensMarks. The benchmark was run under Cygwin, on a PII machine with 256 MB of RAM. Session support and internationalization were turned off. The subwebs setting has the default value. All languages were disabled (just in case, I don't know that code).

TWiki core code benchmarks (WhatIsWikiWiki)

| Release | Skin | Plugins | Time per page | AthensMarks |
| athens | | DefaultPlugin, InterwikiPlugin | 1.625 | 100 |
| DEVELOP r9906 | pattern | DefaultPlugin, InterwikiPlugin | 5.34 | 30.4307116104869 |
| DEVELOP r9906 | default | DefaultPlugin, InterwikiPlugin | 5.335 | 30.4592314901593 |
| twiki4 | pattern | DefaultPlugin, InterwikiPlugin | 4.83 | 33.6438923395445 |
| twiki4 | default | DefaultPlugin, InterwikiPlugin | 4.825 | 33.6787564766839 |
| cairo | classic | DefaultPlugin, InterwikiPlugin | 2.85 | 57.0175438596491 |
| cairo | pattern | DefaultPlugin, InterwikiPlugin | 3.175 | 51.1811023622047 |

Having a sudden rush of inspiration, I commented out the template handling (ie, make readTemplateFile always return '%TEXT'). Afterwards, I optimized the common case where the skin is totally defined in the .tmpl files. Here are the numbers (only TWiki4 numbers shown):

| Release | Skin | Plugins | Time per page | AthensMarks |
| TWiki4 (w/o template loading) | classic | DefaultPlugin, InterwikiPlugin | 3.575 | 45.7342657342657 |
| TWiki4 (w/patch) | default | DefaultPlugin, InterwikiPlugin | 4.68 | 34.6628272251309 |
| TWiki4 (w/patch) | pattern | DefaultPlugin, InterwikiPlugin | 4.63 | 35.0450980392157 |

In my excitement, I forgot to copy the patch... I'll upload it later today.

-- RafaelAlvarez - 25 Apr 2006

With or without sessions? With session cleaning in core enabled, or backed off to cron job? With hierarchical subwebs enabled?

-- CrawfordCurrie - 25 Apr 2006

I added those bits of information. Also, attached a zip with the LocalSite.cfg for DEVELOP and TWiki4.

-- RafaelAlvarez - 25 Apr 2006

I would like to see your Athensmark numbers for configurations that are more like a real live installation. Very few will run a TWiki with only those two plugins, and session support is so essential and smart that it should be enabled as well.

In Cairo I install the Session Plugin. And in Dakar I simply enable sessions.

Also try to measure with and without i18n.

When I have said I did not believe in the Athensmarks, it is because a few developers have reported that Dakar was faster than Cairo, which is nowhere near what you experience when you use it. But maybe they simply did not compare apples to apples. You did and that is good. I am just very curious to see if you get Athensmarks similar to mine when you have sessions enabled (use a negative expiry value in Dakar) and enable all the default plugins plus a few extra so you have 18 like I used. The choice of 18 is based on the assumption that a typical TWiki has all the default plugins plus a few extras.

-- KennethLavrsen - 06 May 2006

When I get the chance, I'll do a benchmark with those conditions.

-- RafaelAlvarez - 06 May 2006

I did some performance tests with a large number of users. I created 60K users and one large group containing all 60K users. I also created a test web, which is view access restricted to that group. TWiki version is SVN 10760 (TWiki 4 branch), which is almost 4.0.3. This is on an entry level server (Celeron 2.0 GHz, 1 GB RAM, RedHat Enterprise), with a very light load.

Here are the benchmark numbers for accessing WebHome in the view protected web as a non-admin user:

2 sec: Before creating 60K users, accessing protected web with two users in group, total 3 registered users, total 3 TWiki groups

3 sec: After creating 60K users, accessing protected web with two users in group, total 60K registered users, total 4 TWiki groups, one with 60K members

18 sec: Accessing protected web with 60K users in group, total 60K registered users, total 4 TWiki groups, one with 60K members

As you can see, there is a big performance impact if there are 60K users in a group.

-- PeterThoeny - 30 Jun 2006

Time to see where 4.0.1, 4.0.2, 4.0.3, 4.0.4 + up till date has brought the benchmark numbers.

Again I post the entire table here. And remove it from above. The test method is still Ethereal, to separate code time from skin/browser time.

numbers moved to topic start

So we have gotten nowhere in practice. Looking at what people have added to WhatIsIn04x01 it is clear that performance has no priority with developers. That is very, very sad because this is the biggest challenge TWiki has if it wants to survive. I have added it now. And I will insist that it stays. The "feature" that needs to be implemented is profiling. It is clear that those who created Dakar have no clue where the performance disappeared since Cairo. We need to know where the time is spent to fix things. We need to be able to build measuring points into the code so we can isolate the time consumers.

Also, as you can see, I18N as a feature is still far too expensive in performance to be something I would ever consider enabling. There is a proposal to do more i18n, including in plugins. You must have patient users. Not with the current I18N implementation.

-- KennethLavrsen - 15 Jul 2006

But this test would give a clue where to spend our efforts. %LANGUAGES% seems to be responsible for 6 points for example.

-- ArthurClemens - 15 Jul 2006

My users are getting 19 seconds-45 seconds or more to log in. I'm hoping this won't kill the project but it might. Why would logging in take so long?! I'm even using fcgi, for whatever that's worth.

And, yeah, I'm with Kenneth. Have been for a while. (Surprise!) Features are useless unless they're at best performance-neutral and ideally performance improvements. I don't know how to figure out timing on a finer grain though.

-- MeredithLesly - 15 Jul 2006

Time to get involved in BenchmarkFramework, isn't it? It is on my agenda, but as I have written occasionally, I am making progress very slowly.... I'm as far as http://munich.pm.org/cgi-bin/view/Benchmarks and http://munich.pm.org/cgi-bin/view/Benchmarks/Tools_MiniAHAH_6 (Sorry, you can't run benchmarks on that TWiki - it is not yet public) but have postponed it....

-- HaraldJoerg - 15 Jul 2006

Re I18N - I'm not sure if you mean the basic WikiWord charset support, which is virtually performance neutral and has been around since Feb 2003, but you probably mean the UserInterfaceInternationalisation features added in Dakar. The latter seems to be responsible for about 0.20 to 0.30 seconds of 2.09 seconds or so (comparing a few pairs of tests with and without UserInterfaceInternationalisation).

Given that you can turn off I18N if you have an English-only site, paying an 'I18N tax' of about 10-15% of performance doesn't seem too bad, compared to the alternative of much more expensive or less well supported translations of TWiki. I18N gives access to a large number of new users, particularly in Asia-Pacific countries. Of course, it would be great to improve I18N performance through profiling, which I'm sure can be done.
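The "10-15%" figure follows from the numbers quoted above: 0.20 to 0.30 seconds out of a page load of about 2.09 seconds. A small sketch of that arithmetic, with a helper name made up for illustration:

```python
TOTAL_SECONDS = 2.09          # approximate full page load with I18N on, as quoted above
I18N_COST_RANGE = (0.20, 0.30)  # seconds attributed to UserInterfaceInternationalisation


def share_percent(part: float, total: float) -> float:
    """Part of the total expressed as a percentage."""
    return part / total * 100


low, high = (share_percent(cost, TOTAL_SECONDS) for cost in I18N_COST_RANGE)
print(f"I18N share of page load: {low:.1f}% to {high:.1f}%")
```

That gives roughly 9.6% to 14.4% of the total load time.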

-- RichardDonkin - 19 Jul 2006

This may already be the case but I am not sure....

I would suggest that

  • in the out-of-the-box installation all features that eat substantial performance are turned off, or at least
  • the installation documentation mentions which features eat substantial performance and how to turn them off.

-- ThomasWeigert - 01 Aug 2006

Like HierarchicalWebs, which are now considerably improved thanks to Peter, but still slow things down.

-- MeredithLesly - 03 Aug 2006

I18N is already turned off in Dakar (and all other releases) as shipped - you have to explicitly turn it on. This applies to both the 'WikiWord I18N' added in 2003 and 'Message I18N' added in Dakar.

-- RichardDonkin - 06 Aug 2006

Even without I18N, TWiki4 (Dakar) is around 30% slower than Cairo.

What went wrong? When you turn off I18N and subwebs there are hardly any new features in TWiki4. And yet it runs 30% slower than Cairo.

Why? This is a question that must be answered. And fixed.

-- KennethLavrsen - 06 Aug 2006

I will check back into the code, but as far as I remember, hierarchical webs doesn't add much of anything besides a few extra terms to some regular expressions, and if you're actually in a hierarchical web, it loads the preferences for all the webs above it in the hierarchy (top-to-bottom) overriding as it goes. One of the biggest problems with TWiki is that it doesn't (or can't) use one-time compiles for some of the bigger regexes, which forces the interpreter to re-compile an expression every time it's hit. I think this has always been a problem, so it wouldn't explain the sudden slow-down.

How about using -d:DProf on a non-mod_perl installation? It'd tell you where the majority of the time is being spent. I have a TWiki4 installation I could try it on.

-- PeterNixon - 07 Aug 2006

Kenneth, I really, really wish you would stop making sweeping statements like "yet it runs 30% slower than Cairo". Where is your evidence? It doesn't run 30% slower on my system, when I normalise the configurations so that I am comparing apples with apples (AthensMarks). Show me benchmarks that I can reproduce!

To compare Dakar with Cairo, you have to switch all the new features off, otherwise you are just screaming in space.

-- CrawfordCurrie - 16 Aug 2006

I tried turning everything off (in the hope that the monster can't breathe vacuum) and what do the AthensMarks show? A 30% slowdown w.r.t Cairo, exactly as reported by Kenneth. Sorry Kenneth. Now, I wonder what has changed since I last ran the AthensMarks?

-- CrawfordCurrie - 16 Aug 2006

Actually, from the user's point of view, the performance of a standard "out of the box" TWiki installation should be measured, because this is what an admin and the users experience.

-- PeterThoeny - 16 Aug 2006

I have taken a standard new TWiki. Added my Motion web from a certain date and my users, approx 550. And added a few extra plugins adding up to a total of 18. I chose 18 because that is approx what I run with at Motorola and at home.

And I use different test topics to test different features. But for my published benchmark I have chosen a semi-long topic where the only feature used besides headers and text is the TOC. I have also tried without TOC.

And when I measured Cairo - I used an installation with more or less the same plugins and I installed SessionPlugin so that Cairo would be slowed down by handling sessions the same way that TWiki4 must be doing. Again to compare apples with apples as much as possible.

I have tried to add more and less plugins. And this has never really made any significant difference between the two. I have tried totally without plugins also. The difference is always very similar, but the plugins do add about 2% of the naked core time for each plugin.

All my numbers are above or in attachments. And I have confirmed the approx 30-40% slowdown on 3 machines with 3 different generations of Fedora/Redhat. Rafael also published the same kind of numbers (measured with Athensmarks). And others have followed with either numbers or confirmations of the experienced slowdown. I must admit that I got very upset when I read Crawford's first posting from 16th Aug above. I have provided so much evidence, with others confirming the numbers, that I could not understand the claim that I had no evidence. I have documented my setup and attached many, many files to this topic. And I have measured so many times over a long period with consistent results.

But let us forget that now. I am happy that Crawford has now also seen the same 30%, and this even when using Athensmarks.

Some experience I have gained: it is very important that the test machine is kept tightly under control.

  • Cron must be killed.
  • Web access from users must be blocked.
  • GUI "stuff" must not run. A screensaver kicking in on a GUI that is normally only accessed using VNC can change results dramatically. So always make sure Xvnc does not run. Turn off services that are not needed. You never know when they run.
  • Running ab from remote machine vs locally should in principle make a difference but my experience is that the difference drowns in noise. ab seems to use very little CPU time to do its job.
  • Make sure that the URL is in your localhost file or use raw IP address. You do not want DNS lookup to vary results.

It is valid to run fractions of TWiki to try and benchmark changes to a certain feature. But it is always important to come back to a normal use case.

I found that running ab many times and throwing away the numbers that are worse than average is a good approach. And I still use Ethereal to sanity check how long the skin takes to load. This is important because some code can suddenly take 30 seconds to execute in the browser because of browser bugs. But Ethereal is a slow way to measure, and if people work in the Perl code the ab results seem very consistent and are very easy to produce.

But ab must be run with at least -n 10, and this must be done at least 10 times. Check that the numbers do not consistently rise. Make sure the sessions are not expired by TWiki (negative value in configure). Throw away abnormally long durations. In fact the fastest of the measurements is what I normally consider the most correct, because this means the code ran undisturbed. If the numbers vary too much you need to find the process that is still running and disturbing the result.
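The procedure above could be scripted roughly as follows. This is a minimal sketch, not a finished tool: it assumes `ab` is on the PATH, it parses the "Time per request ... (mean)" line from ab's report, and it treats "consistently rising" as every run being slower than the one before, which is a crude interpretation. The function names and the example URL are made up for illustration.

```python
import re
import subprocess

# First "Time per request" line of ab's report; the second one ends in
# "(mean, across all concurrent requests)" and is deliberately not matched.
_MEAN_RE = re.compile(r"^Time per request:\s+([\d.]+) \[ms\] \(mean\)", re.MULTILINE)


def parse_mean_ms(ab_output):
    """Extract the mean time per request (in ms) from ab's text report."""
    match = _MEAN_RE.search(ab_output)
    if match is None:
        raise ValueError("no 'Time per request ... (mean)' line found")
    return float(match.group(1))


def run_ab(url, requests=10):
    """Run ab once with -n <requests> and return the mean time per request."""
    result = subprocess.run(
        ["ab", "-n", str(requests), url],
        capture_output=True, text=True, check=True,
    )
    return parse_mean_ms(result.stdout)


def summarise(means_ms):
    """Keep runs no worse than average, report the fastest, flag steady drift."""
    average = sum(means_ms) / len(means_ms)
    kept = [m for m in means_ms if m <= average]
    rising = len(means_ms) > 1 and all(
        later > earlier for earlier, later in zip(means_ms, means_ms[1:])
    )
    return {"best": min(means_ms), "kept": kept, "rising": rising}


def benchmark(url, runs=10, requests=10):
    """Run ab `runs` times against `url` and summarise the results."""
    return summarise([run_ab(url, requests) for _ in range(runs)])
```

A call might look like `benchmark("http://127.0.0.1/twiki/bin/view/Main/WebHome")`, where the path is only a placeholder for whatever test topic is used. If `rising` comes back true, something on the machine is probably still interfering and the run should be repeated.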

For a comparison between Cairo and TWiki4 to be valid the plugin environment, and the number of webs/topics and users must be the same or very near the same. Cairo must run with SessionPlugin because otherwise the Cairo install gets an unfair advantage.

The Apache server must be the same. The authentication method must be the same, with the same number of users in .htpasswd.

All this is good for benchmarking towards Cairo, and for before and after checking of code changes. But to see where the time is spent we need the profiling and here I have my hopes that Harald will help with guidance in getting a good profiling framework working.

-- KennethLavrsen - 16 Aug 2006

Updated the numbers and moved the table to the top of this topic. And it is going the wrong way again. Without i18n the numbers are almost unchanged since the last measurement. With i18n the performance has gone from bad to worse. It is still my impression that performance gets near zero attention. WhatIsIn04x01 shows this. All proposals except mine are about adding more features. And no one is trying to use the new profiling framework for anything.

When I test i18n I now select (All - 2) languages, because more languages have been added and for each language added TWiki becomes slower. Which is actually very strange. Does TWiki read in all the language files and not just the current one? The two I do not select are the Chinese ones.

-- KennethLavrsen - 06 Nov 2006

Ken, agree 100%.

On the table above, can you explain what was done to get some of the results. E.g., "Commented out foreach my $file ( @all ) block"... how do I get to that point on my current release, as I like the performance numbers in that version?

Finally, can you add some pointers to the Performance benchmark and how to use it? I was not even aware that it existed. So sorry.

-- ThomasWeigert - 06 Nov 2006

The actual setup I use is explained in the text above (Ethereal). But I have later found that a simple ab test gives pretty much the same numbers as "Page load starts". "Page fully loaded" is a measure of how long it takes to load all the "skin stuff", including the time it takes IE to render it. It is normally dominated by the time to check all the extra files for updates. This is normally a fairly constant extra, but sometimes some unfortunate JavaScript triggers a bug in IE, which is one of the reasons I measure this also.

Benchmarking is easy. Turn off anything that can vary the results, i.e. turn off crond. Stop all services that you do not need. Make sure nothing else loads your test server. And start using ab.

What Harald has provided is a profiling framework. See BenchmarkFramework. But it is not giving any answers easily. It is hard work to find out where the time is consumed. I personally believe the key to cutting down the execution time is to find out exactly where TWiki is spending the time. Not just guess, but actually measure it and then work on coding things smarter. But nothing is going to come easy, because then it would have been found a long time ago. And it is going to be 100+ small improvements that make the numbers. But Harald's good initiative deserves more attention - even if cutting off 10 ms is less sexy than making a new feature.

-- KennethLavrsen - 06 Nov 2006

I believe we can win some performance by changing po files to compressed mo files as described in MAKETEXT.

-- ArthurClemens - 06 Nov 2006

Sorry Ken, what I meant was "how did you create the version that was described as 'Commented out foreach my $file ( @all ) block'", as that version seemed to have much better performance than similar releases.

-- ThomasWeigert - 07 Nov 2006

Thomas. The "foreach" was related to the language list, where I turned off the feature this way. It is not relevant any more as the slow code was later eliminated.

I have done a fresh benchmark here the 03 Jun 2007 on SVN 14020. I benchmarked both the Patch04x01 branch and the MAIN branch.

I recalibrated with Cairo and got the same numbers as earlier.

The last benchmark was run shortly after release 4.0.5 and before we released 4.1.0.

Without languages enabled the slowdown is 1.37 -> 1.42 seconds for the Perl code to execute. A very small increase which can be noise. With languages the performance was the same as in the last benchmark. But when additionally measuring how long the skin takes to load, things are going in the wrong direction. Without languages the time has gone from 1.67 -> 1.90. And with languages the increase is 2.04 -> 2.23.

So a simple FAQ page with languages now takes 2.23 seconds to show on a server with no load at all. It is going the wrong way.

And compared with Cairo - comparing the numbers only without languages since Cairo was English only.

  • Perl code: 0.99 -> 1.74.
  • Perl code + Pattern Skin load: 1.20 -> 1.90

So for the customer TWiki 4.2.0 takes 58% longer to show simple pages than Cairo.
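The 58% figure follows from the "Perl code + Pattern Skin load" numbers listed above. A small worked example (the function name is made up for illustration):

```python
def percent_longer(before_s, after_s):
    """How much longer `after_s` is than `before_s`, in percent."""
    return (after_s / before_s - 1.0) * 100.0


# Cairo -> current, using the two pairs of numbers listed above
print(f"Perl code: {percent_longer(0.99, 1.74):.0f}% longer")
print(f"Perl code + Pattern skin load: {percent_longer(1.20, 1.90):.0f}% longer")
```

This prints 76% for the Perl code alone and 58% for Perl code plus skin load; the latter is the figure the customer sees.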

The numbers also say that Patch04x01 and MAIN run at the same speed. This means that since 4.1.2 the speed is unchanged. The slowdown happened around the development of 4.1.0.

I am a bit surprised that we do not see even a small improvement, because I had gotten the impression from various discussions that the new user code would be more efficient. There were some optimisations discussed related to creating user objects. On a mid-size TWiki with around 600 registered users I cannot see a positive trend. Maybe if someone runs with 10000 users you will see a difference?

We know that the speed of the new query type search and GROUPS if you have a large group definition is a problem. But this benchmark does not take this into consideration at all. This benchmark runs a very plain text topic with a TOC and no searching.

I hope the next weeks of bug fixing and optimizations will give us at least a small improvement. If we are going to look at the latest numbers with positive eyes then at least the last large code refactorings have not added huge delays except for the tql search and GROUPS issues which need to be addressed.

-- KennethLavrsen - 03 Jun 2007

For anyone interested in delving deeper into performance issues, I worked out a way to get reproducible performance information for the startup (compile) and runtime phases. See InvestigatingTWikiPerformance.

-- CrawfordCurrie - 04 Jun 2007

Topic attachments
| Attachment | History | Size | Date | Who | Comment |
| TWiki4vsCairovsDEVELOP.zip | r1 | 1.1 K | 2006-04-25 - 16:17 | RafaelAlvarez | Configuration of my the benchmarks |
| cairo-motion-faq-15mar2006.txt | r1 | 5.6 K | 2006-03-16 - 05:06 | KennethLavrsen | Cairo - Motion FAQ - Ethereal trace |
| cairo_motion_timb.txt | r1 | 2.9 K | 2005-11-26 - 00:48 | KennethLavrsen | |
| cairo_twiki_timb.txt | r1 | 2.9 K | 2005-11-26 - 01:23 | KennethLavrsen | |
| cairo_twikivars.txt | r1 | 21.8 K | 2005-11-26 - 01:23 | KennethLavrsen | |
| cairo_twikivars_firefox.txt | r1 | 12.7 K | 2005-11-27 - 20:55 | KennethLavrsen | |
| cairo_twikivars_from_dakar_ie.txt | r1 | 22.0 K | 2005-11-29 - 23:32 | KennethLavrsen | |
| dakar9306-classic-motion-faq-15mar2006.txt | r1 | 6.2 K | 2006-03-16 - 05:07 | KennethLavrsen | Dakar SVN 9306 classic skin Ethereal trace |
| dakar9306-locale-off-motion-faq-15mar2006.txt | r1 | 9.7 K | 2006-03-16 - 05:08 | KennethLavrsen | Dakar SVN 9306 Pattern skin - locale off - Ethereal trace |
| dakar9306-locale-on-motion-faq-15mar2006.txt | r1 | 10.5 K | 2006-03-16 - 05:09 | KennethLavrsen | Dakar SVN 9306 Pattern skin - locale on - Ethereal trace |
| dakar_motion_timb.txt | r2 r1 | 6.0 K | 2005-11-26 - 01:26 | KennethLavrsen | |
| dakar_twiki_timb.txt | r1 | 10.8 K | 2005-11-26 - 01:24 | KennethLavrsen | |
| dakar_twikivars.txt | r1 | 42.8 K | 2005-11-26 - 00:50 | KennethLavrsen | |
| dakar_twikivars_firefox.txt | r1 | 17.9 K | 2005-11-27 - 20:56 | KennethLavrsen | |
| dakar_twikivars_postfix_firefox.txt | r1 | 31.3 K | 2005-11-29 - 23:30 | KennethLavrsen | |
| dakar_twikivars_postfix_ie.txt | r1 | 31.0 K | 2005-11-29 - 23:31 | KennethLavrsen | |
Topic revision: r90 - 2007-06-04 - CrawfordCurrie
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.