Bug: Cairo Performance Issues
I am evaluating TWiki for use at work. And so far it is thumbs down because TWiki is too slow. And it is getting worse and worse.
Here are some benchmark numbers.
My environment is Fedora Core 2, Apache 2.0.50, Perl 5.8.3, mod_perl 1.99_12,
SpeedyCGI 2.22.
Hardware is Intel Pentium 2.8 GHz, 1 GB RAM. A quite snappy machine with no other traffic on it. Only one user for now (me).
The TWikis tested are fresh with no modules installed.
Test run is ab -n 10
http://localhost/twiki/bin/view/TWiki/WebHome
- result given is requests per second.
I have run with
SpeedyCGI and with normal
CGI. I have also run mod_perl which gave results between
SpeedyCGI and normal.
TWiki official release 20030201: With
SpeedyCGI 5.45, with normal
CGI 3.1.
TWiki beta release 20040507: With
SpeedyCGI 1.6, normal 1.2.
Twiki alpha release from 20040725 (close to being Cairo): With
SpeedyCGI 1.0 and 0.9.
So speed has dropped from 5.45 to 1.0 / 3.1 to 0.9.
I have tried TWikis installed in our US offices and I found them too slow. This speed decrease with modules added makes TWiki useless as a business tool - even with mod_perl or
SpeedyCGI. I would not release Cairo until you have the speed back at least to where it was in February 2003.
Test case
Environment
--
KennethLavrsen - 25 Jul 2004
Follow up
Looking at the benchmarks in
CairoPerformanceExperiments, the performance did not suffer that much, especially if you consider how much system performance increased in the last three years. Summary:
Percentage indicates performance relative to
AthensRelease.
--
PeterThoeny - 22 Aug 2004
I am very much for performance. Your numbers worry me. Could you try to tun the test on a topic other than a WebHome topic, like on TWiki.WikiWord?
The TWiki.WebHome page includes the SiteMap which has a lot of dynamic stuff. Did you upgrade the
SiteMap as well? The latest version has much improved speed. See
SiteMapIsSlow.
One reason for slow speed could be the style sheet includes as topics.
Profiling is needed.
--
PeterThoeny - 26 Jul 2004
I did some profiling of
view yesterday after talking with Kenneth.
| Step |
Percentage of total user time |
What is it |
| Require |
25 |
BEGIN block in view up to just before TWiki::initialize |
| Before handleCommonTags |
0.78 |
TWiki::UI::View::view up to just before the second call to =handleCommonTags |
| handleCommonTags |
66.41 |
|
| Rest of view |
7.81 |
From handleCommonTags return to end of view |
Notes:
- Run on 1612 with standard plugins only
- I removed the CSS imports from the templates. Otherwise I was getting 5 script invocations for every 1 call of view.
- That first 25% (Require) is the bit that should be eliminated by an accelerator like ModPerl or SpeedyCGI.
- Profiling was done using Benchmark
- Page profiled consists of about 10 search tags
I dove into =handleCommonTags as well, by commenting out sections and rerunning the benchmarks. By commenting out the invocation of plugin hooks in hCT (3 calls) the profile changed to:
| Require |
31 |
| Before handleCommonTags |
1.57 |
| handleCommonTags |
66.14 |
|
| Rest of view |
7.87 |
So, plugins are not the problem. Putting plugins back and commenting out the calls to
handlePreferencesTags (2 calls):
| Require |
65.96 |
| Before handleCommonTags |
21.28 |
| handleCommonTags |
2.13 |
|
| Rest of view |
10.64 |
i.e. preferences handling accounts for ~60% of the view time.
It gets a lot harder to analyse at this point because there are 130 calls to handleCommonTags in total to render a simple page. But focusing on the execution time of
handleCommonTags and tweaking the Prefs implementation is revealing. The base execution time for
handleCommonTags on my machine is 0.61 seconds. By recoding
replacePreferencesTags very slightly (and counter-intuitively), this reduces to 0.46 seconds i.e. a 6% of the total view time! This suggests that small performance gains are to be found through recoding critical sections of
handleCommonTags such that the perl compiler is able to achieve better optimisations.
Here's the new impl of
replacePreferencesTags:
my $hash = $self->{prefsHash};
foreach my $key (@{$self->{prefsKeys}}) {
$_[0] =~ s/\Q%$key%\E/${$hash->{$key}}/g;
}
The old impl executing in 0.61 seconds, this one in 0.46. Interestingly enough, pre-testing if $key occurs in the text raises execution time to 0.57 seconds, and precompiling the RE using
qr// raises it further to 0.63.
When I comment out the substitution completely, execution time drops to 0.01 seconds, reducing overall page rendering time by 60%.
Again interestingly, changing the
replacePreferencesTags implementation to:
my $text = $_[0];
my $hash = $self->{prefsHash};
foreach my $key (@{$self->{prefsKeys}}) {
$text =~ s/\Q%$key%\E/${$hash->{$key}}/g;
}
$_[0] = $text;
makes
absolutely no difference to the execution time! This suggests that all those "use $_[0] for performance" comments may be misleading in many cases.
OK, that was interesting, but what about the rest of handleCommonTags? The next thing to do was disable the searches, by commenting out the 2 calls to
handleInternalTags. Zowie; execution time drops to 0.01s! How can that be? Well, it can be if the internal tags contain all the references to preference variables that would otherwise be expanded by the replacePreferenceVariables function. And looking at the page would tend to confirm that. So, how about selectively targeting SEARCH? By re-enabling handleInternalTags, but commenting out SEARCH, execution time wavers up to 0.02 - so the performance of SEARCH (for this page at least) is accounting for 60% of the page render time. No great surprises there, I guess. But it does suggest that even 1 search on a page is going to dominate the execution time.
Rather than delve into SEARCH (the most %TERRIFYING% module in TWiki) I switched over to another page with no search on it to examine the performance there.
Just to confirm the prefs change above, I ran
ab -n 100 -c 3 http://localhost/svn/bin/view/Crawford/PageWithForms
first with the old code and then with the change above. Performance went from 1.60 requests per second (rps) to 1.71 rps with the change in; an improvement of almost 7%.
See also
CairoPerformanceExperiments
--
CrawfordCurrie - 26 Jul 2004
Interesting research, thanks Crawford
On preferences handling, I am wondering how much impact the recent
UsingFormsForSettings addition has.
--
PeterThoeny - 27 Jul 2004
I ran the experiment and updated the results.
--
CrawfordCurrie - 27 Jul 2004
For whatever it is worth, I have installed
TWikiBetaRelease2004x07x29 on my PC (Dell D800, 2Mhz) an compared it to the Beijing release. Note that this was not a thorough study; rather I just ran
ab -t 50 -c 3 http://host/path-to-twiki/bin/view
on both "fresh" installations. The results are not good:
| |
TWikiBeta20040729 |
Beijing |
| Document Length: |
26457 bytes |
11158 bytes |
| Concurrency Level: |
1 |
1 |
| Time taken for tests: |
58.227 seconds |
53.858 seconds |
| Complete requests: |
6 |
9 |
| Failed requests: |
0 |
0 |
| Broken pipe errors: |
0 |
0 |
| Total transferred: |
163974 bytes |
106110 bytes |
| HTML transferred: |
162910 bytes |
104590 bytes |
| Requests per second: |
0.10 [#/sec] (mean) |
0.17 [#/sec] (mean) |
| Time per request: |
9704.50 [ms] (mean) |
5984.22 [ms] (mean) |
| Transfer rate: |
2.82 [Kbytes/sec] received |
1.97 [Kbytes/sec] received |
| Connection time (min): |
7986 |
4775 |
| Connection time (mean): |
8232 |
5432 |
| Connection time (meadian): |
8251 |
5223 |
| Connection time (max): |
8541 |
7593 |
As you can see, there is a significant performance degredation in the Cairo release. Raising the concurrency level does not change the picture.
--
ThomasWeigert - 01 Aug 2004
Thanks Crawford and Thomas for the numbers.
FYI, the
WebHome pages can be somewhat missleading due to the
SiteMap. In Cairo we made some performance improvement on all web search, but this requires the latest SiteMap version containing a
topic="WebPreferences" parameter. Better to measure topics other then
WebHome.
We should attack the low handling fruits for
CairoRelease, and work on better performance in
DakarRelease.
One is to disable the LINKTOOLTIPINFO setting in the
TWikiPreferences. Which I just changed in the internal Beta.
--
PeterThoeny - 05 Aug 2004
Peter, could you please identify or create one (or more) page(s) which you think would give us representative insight. We should then all use these same pages for testing, so that results are comparable easier. Once you identify these pages, I shall rerun my tests.
--
ThomasWeigert - 05 Aug 2004
I wouldn't bother; the differences due to server architecture, server load, and network bandwidth swamp out all the other factors. You can only do meaningful comparisons between runs on a single server, and as long as you don't use a different page each time, the stats should be meaningful. After all, we are primarily interested in comparing the results of code changes, and you have established a baseline for doing that.
FYI in all my test runs I use two pages, a page called "PerfectAndAbsoluteBlank" and a copy-paste of the content of
TWikiVariables.
He had bought a large map representing the sea,
Without the least vestige of land:
And the crew were much pleased when they found it to be
A map they could all understand.
"Other maps are such shapes, with their islands and capes!
But we've got our brave Captain to thank:
(So the crew would protest) "that he's brought us the best--
A perfect and absolute blank!"
--
CrawfordCurrie - 05 Aug 2004
We are releasing Cairo as it is. The only thing done is turning off the LINKTOOLTIPINFO setting in the
TWikiPreferences.
Performance tuning will be done in Dakar.
--
PeterThoeny - 14 Aug 2004
I mark this as
BugDuplicate since the
PerformanceImprovementsInDakar are summarized there.
--
PeterThoeny - 28 Sep 2004
Fix record