Question
Since the upgrade to 4.0.3 I have monitored and compared the response times of view with those of the Cairo version. The Cairo version averaged 2.5 seconds to return a given topic. The Dakar version averages 4 seconds, and during the day I receive alarms reporting response times of 30s and more.
An strace -p of the view process shows that when view is executed by the browser it does a stat64 of every single file on the system (all topics in all webs), example:
...
stat64("/www_doc/twiki/data/Sandbox/T0Tasks.txt", {st_mode=S_IFREG|0644, st_size=217, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/T0Tasks.txt,v", {st_mode=S_IFREG|0444, st_size=402, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/T0UsefulLinks.txt", {st_mode=S_IFREG|0644, st_size=611, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/TestIt.txt", {st_mode=S_IFREG|0644, st_size=181, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/T0UsefulLinks.txt,v", {st_mode=S_IFREG|0444, st_size=1532, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/TestIt.txt,v", {st_mode=S_IFREG|0444, st_size=367, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/TwikiWebPage.txt", {st_mode=S_IFREG|0644, st_size=239, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/TwikiWebPage.txt,v", {st_mode=S_IFREG|0444, st_size=425, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/NewsPluginExampleInclusion.lease", {st_mode=S_IFREG|0644, st_size=53, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/LaurenceFieldSandbox.lease", {st_mode=S_IFREG|0644, st_size=49, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/HelenHaywardSandbox.lease", {st_mode=S_IFREG|0644, st_size=53, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/ForwardSCTMonitorsandbox.lease", {st_mode=S_IFREG|0644, st_size=53, ...}) = 0
stat64("/www_doc/twiki/data/TWiki/TWikiSandboxDotPm.txt", {st_mode=S_IFREG|0644, st_size=1321, ...}) = 0
stat64("/www_doc/twiki/data/TWiki/TWikiSandboxDotPm.txt,v", {st_mode=S_IFREG|0644, st_size=1518, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/WebPreferences.txt", {st_mode=S_IFREG|0666, st_size=4839, ...}) = 0
stat64("/www_doc/twiki/data/Sandbox/WebHome/WebPreferences.txt", 0x8e060c8) = -1 ENOENT (No such file or directory)
stat64("/www_doc/twiki/data/Sandbox/WebHome.txt", {st_mode=S_IFREG|0644, st_size=2796, ...}) = 0
...
Note also that it tries to look at files in directories that do not exist:
stat64("/www_doc/twiki/data/Sandbox/WebHome/WebPreferences.txt", 0x8e060c8) = -1 ENOENT (No such file or directory)
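To quantify how bad the storm is, you can count the stat64 calls in a captured trace. A minimal shell sketch; strace.out and the three sample lines below are fabricated stand-ins for a real capture (which you would take with something like strace -o strace.out -p attached to the view process):

# Illustrative only: write a tiny stand-in for a real strace capture,
# then count how many stat64 calls it contains.
printf 'stat64("/www_doc/twiki/data/Sandbox/A.txt", ...) = 0\nstat64("/www_doc/twiki/data/Sandbox/A.txt,v", ...) = 0\nopen("/www_doc/twiki/bin/view") = 3\n' > strace.out
grep -c 'stat64(' strace.out

On a real trace the same grep gives you the number of files touched per request, which you can compare before and after any fix.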
I read WhyIsTwikiSlowAndReadingALLTopics and disabled every single plugin to check, but nothing changed.
- Is it usual behaviour of view to look at every file on the system?
- Why does view try to look at files in non-existent directories?
We now have well over 10,000 topics on our TWiki site, and it is growing rapidly, so with this behaviour of view the performance will surely get worse as more and more topics are created. Response time is a major concern to the users at the moment, and I would be very grateful for any help.
Environment
--
PeterJones - 05 Jul 2006
Answer
It certainly shouldn't be looking at every file and, if it is, as your report suggests, there's something very wrong. (Searches are quite another matter, of course.) The fact that it's looking at files that don't exist, however, suggests something may be odd with your configuration.
Hopefully someone will have something more constructive to respond with than my yikes!
--
MeredithLesly - 05 Jul 2006
It's highly likely that you still have session scrubbing switched on. Check /tmp for cgi_sessid files, and read the help on sessions in configure.
--
CrawfordCurrie - 05 Jul 2006
How does that explain all the stat64 on every page?
--
KennethLavrsen - 05 Jul 2006
Just a guess - I made a similar observation when $cfg{EnableHierarchicalWebs} was set to a true value and a WEBLIST variable was being expanded in the left bar. TWiki would have to stat every file, in every web, just to find out whether it is a directory leading to another web hierarchy.
--
HaraldJoerg - 05 Jul 2006
That certainly appears to be the case, judging by the code:
sub getWebNames {
    my $this = shift;
    my $dir = $TWiki::cfg{DataDir}.'/'.$this->{web};
    if( opendir( DIR, $dir ) ) {
        my @tmpList =
          sort
            map { TWiki::Sandbox::untaintUnchecked( $_ ) }
              grep { !/$TWiki::cfg{NameFilter}/ &&
                     !/^\./ &&
                     -d $dir.'/'.$_ }
                readdir( DIR );
        closedir( DIR );
        return @tmpList;
    }
    return ();
}
--
MeredithLesly - 05 Jul 2006
This is not tested, but try to add the extra grep line (marked with a comment below; it was highlighted in red in the original post) in TWiki/Store/RcsFile.pm:

sub getWebNames {
    my $this = shift;
    my $dir = $TWiki::cfg{DataDir}.'/'.$this->{web};
    if( opendir( DIR, $dir ) ) {
        my @tmpList =
          sort
            map { TWiki::Sandbox::untaintUnchecked( $_ ) }
              grep { !/$TWiki::cfg{NameFilter}/ &&
                     !/^\./ &&
                     -d $dir.'/'.$_ }
              grep { !/\.(txt|txt,v|lease)$/ }    # <-- added line
                readdir( DIR );
        closedir( DIR );
        return @tmpList;
    }
    return ();
}
This will exclude all topics and lease files before the slow directory test.
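The effect of pre-filtering by name before the directory test can be sketched in shell. Illustrative only: /tmp/demo_web is a scratch directory standing in for a data/ web, and the case patterns mirror the .txt / .txt,v / .lease suffixes that the extra grep rejects:

# Simulate a web directory holding topic files plus one subweb, then
# list only the entries that survive the name filter AND are
# directories; the skipped names never reach the costly -d test.
mkdir -p /tmp/demo_web/Subweb
touch /tmp/demo_web/TopicA.txt /tmp/demo_web/TopicA.txt,v /tmp/demo_web/TopicA.lease
for f in /tmp/demo_web/*; do
    case "$f" in
        *.txt|*.txt,v|*.lease) continue ;;   # rejected by name, no stat needed
    esac
    [ -d "$f" ] && basename "$f"
done

With thousands of topics per web, almost every entry is rejected by the cheap string match, so only the handful of genuine subweb candidates get a stat.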
--
PeterThoeny - 06 Jul 2006
I doubt that this helps, Peter, because all directories still are opened and read.
So why must TWiki discover all of its (sub)webs again and again? Creating a web is so infrequent compared to the number of pure view accesses. How about storing the list of all known webs in a plain text file, updating it when we create/delete/rename a web. Maybe this could be generalized for similar time-space tradeoffs in the code.
- This would be a lot more sensible except that one can create webs through the shell. I do this a lot, since it's a lot faster than going through TWiki. So I really don't know what the solution is, other than not having hierarchical webs. (This one was biting me as well, but fortunately I just moved my last subweb into a top-level one.) -- MeredithLesly - 06 Jul 2006
--
MichaelDaum - 06 Jul 2006
I first implemented what Harald suggested.
$cfg{EnableHierarchicalWebs} = 0
The result of this is very good. The average response time is under 2s and I no longer receive alarms from our monitor. However, I can't use the subweb feature.
BTW, can someone supply a command-line debugging command? I have tried to use something like
strace ./view DefaultWeb/WebHome
from the Linux shell, but this fails with
oopsaccessdenied;def=no_such_web;param1=view
Perhaps one of you has a better method?
--
PeterJones - 06 Jul 2006
PeterJones, could you test my suggested workaround I posted above with hierarchical webs enabled? It should speed up your large web with 10,000 topics quite a bit.
Agreed with Micha, caching the weblist is a sensible thing to do.
--
PeterThoeny - 06 Jul 2006
I added a routine "_WEBLIST_CACHED" (plus WEBLIST_CACHED => \&_WEBLIST_CACHED around line 219) to my TWiki.pm, and a parameter "WebListCache" to refresh every 12 hours, and it greatly improved performance. You can just replace "WEBLIST" with "WEBLIST_CACHED" in TWiki/WebLeftBarWebsList or wherever you are using it.
Probably not the prettiest or most efficient code, but it works. It should probably be a plugin, but I just needed something quick.
sub _WEBLIST_CACHED {
    my $_wl_cachedir = TWiki::Func::getPubDir() . "/UserWebListCache";
    my $wikiusername = TWiki::Func::getWikiName();
    my $_wl_cachefile = "$_wl_cachedir/$wikiusername";
    if ( -e $_wl_cachefile ) {
        my $cacheTime = (stat $_wl_cachefile)[9] || 100000000000;
        my $refresh = $TWiki::cfg{"WebListCache"};
        my $now = time();
        if ( $cacheTime < ($now - $refresh) ) {
            print STDERR "Refreshing Cache\n";
            TWiki::Func::saveFile( $_wl_cachefile, _WEBLIST(@_) );
        }
    } else {
        TWiki::Func::saveFile( $_wl_cachefile, _WEBLIST(@_) );
    }
    my $text = TWiki::Func::readFile( $_wl_cachefile );
    return $text;
}
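The refresh logic above boils down to an mtime comparison: regenerate the cache file when it is missing or older than the refresh window, otherwise reuse it. A minimal shell analogue (paths, the cached content, and the 12-hour window mirroring WebListCache are all illustrative):

# Regenerate /tmp/weblist.cache when missing or stale, else reuse it.
cache=/tmp/weblist.cache
refresh=43200   # 12 hours in seconds
now=$(date +%s)
if [ ! -e "$cache" ] || [ $(( now - $(stat -c %Y "$cache") )) -gt "$refresh" ]; then
    echo "Main Sandbox TWiki" > "$cache"   # stand-in for expanding WEBLIST
fi
cat "$cache"

(stat -c %Y is GNU stat, printing the file's mtime in epoch seconds.)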
--
BrianFreeman - 06 Jul 2006
Peter, I added the line that you suggested. I can't catch the view process with strace as the process finishes before I enter the strace command, so I attach a graph below that shows the response times (in hundredths of a second). Yesterday I disabled sub-webs at 10:00, and today at 12:00 I re-enabled sub-webs and added the line that was suggested. As you can see, performance without sub-webs is far better, and sub-webs with the added line are a little better than before.
--
PeterJones - 07 Jul 2006
Thanks Peter, this shows that the one-line fix makes TWiki significantly faster (as I expected).
I am closing this question now.
--
PeterThoeny - 07 Jul 2006
Peter, I think you misread what he said. He said that disabling sub-webs improved performance a lot; the one line improved performance a little.
--
MeredithLesly - 07 Jul 2006
I think the graph speaks for itself regarding performance. What tool did you use to feed mrtg/rrt for that one, Peter?
--
SteffenPoulsen - 08 Jul 2006
Yes, the graph speaks for itself. Actual benchmarking would give concrete numbers.
--
PeterThoeny - 08 Jul 2006
The graph is extremely distorted by the huge spikes in the beginning. Scaled properly, removing sub-webs appears to cut the view time in half, while sub-webs plus the extra grep line cut it by about a third. I've attached a rescaled graph, removing the huge spikes, to better illustrate the respective savings. The grep line does save considerably more than CDot and Micha expected, but it's still about 33% slower than no sub-webs.
--
MeredithLesly - 08 Jul 2006
Here is an improved version for better speed; the changed grep (shown in red in the original) is marked with a comment:

sub getWebNames {
    my $this = shift;
    my $dir = $TWiki::cfg{DataDir}.'/'.$this->{web};
    if( opendir( DIR, $dir ) ) {
        my @tmpList =
          sort
            map { TWiki::Sandbox::untaintUnchecked( $_ ) }
              grep { !/\./ &&    # <-- changed: reject any name containing a dot before the -d test
                     !/$TWiki::cfg{NameFilter}/ &&
                     -d $dir.'/'.$_ }
                readdir( DIR );
        closedir( DIR );
        return @tmpList;
    }
    return ();
}
See benchmarks at
Bugs:Item2594
for hard numbers.
--
PeterThoeny - 08 Jul 2006
Thanks BrianFreeman for the caching fix. This needs to be done in the Store module, since the renderer cannot assume a file-based storage backend.
--
PeterThoeny - 08 Jul 2006