Speed and Compatibility
Note: I did not create this to argue about the "100% compatibility" criterion. I understand why 100% compatibility is considered vital by many. Nonetheless, the implications of this constraint need to be discussed, since pretending they don't exist will not get us anywhere.
Lazy compilation is being experimented with as a way to speed up TWiki, in large part because it is a way to try to improve performance while retaining 100% compatibility. So far, it appears that things like LazyCompilation will not give much of a performance gain: an artificial example gave a 1% improvement. Each plugin appears to add 1%-3% to run time; TWikiTags will eliminate that for those plugins that can be recoded to be TWikiTags but, while helpful, that isn't going to make much of a difference either.
Unless there are some really good ideas out there that I don't know about, I don't see how TWiki can become more than, say, 10-15% faster while retaining 100% compatibility. Is that enough of a speed gain (assuming that that performance gain scales to thousands or tens of thousands of documents) to make large existing sites tolerable to use?
CDot has some ideas to dramatically speed up TWiki:
1. Recode selected components (such as permissions, store and preferences) in a faster language
2. Finish coding the store abstraction, implement store in a more efficient medium (e.g. RDB)
3. Take permissions out of band
4. Caching
   - Cache global & per-web preferences for fast load
   - Cache "mostly pre-rendered" topics, with scripted inserts for dynamic variables
   - Pre-compile templates
5. Reduce the number/scope of configuration options
But he then notes:
IMHO maintaining 100% compatibility is impossible for any of the above, except maybe 1 which I think has limited benefit.
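To make the "mostly pre-rendered" topics idea in Crawford's list a little more concrete, here is a minimal Python sketch. It is not TWiki code: the function names `prerender` and `serve`, the `DYNAMIC_VARS` set, and the choice of which variables count as dynamic are all assumptions made for illustration. The idea is to expand the static variables once, cache the result, and fill in only the per-request variables at serve time.

```python
import re

# Assumption for illustration: these variables change per request,
# so they must not be baked into the cached text.
DYNAMIC_VARS = {"WIKIUSERNAME", "SERVERTIME"}

VAR_RE = re.compile(r"%([A-Z]+)%")


def prerender(raw_text, static_values):
    """Expand static variables once; leave dynamic ones as %NAME% placeholders."""
    def expand(match):
        name = match.group(1)
        if name in DYNAMIC_VARS or name not in static_values:
            return match.group(0)  # keep the placeholder for request time
        return static_values[name]
    return VAR_RE.sub(expand, raw_text)


def serve(cached_text, dynamic_values):
    """Fill in the remaining placeholders at request time."""
    def expand(match):
        # Unknown names are left untouched rather than guessed at.
        return dynamic_values.get(match.group(1), match.group(0))
    return VAR_RE.sub(expand, cached_text)


raw = "Welcome %WIKIUSERNAME% to %WIKITOOLNAME% (%SERVERTIME%)"
cached = prerender(raw, {"WIKITOOLNAME": "TWiki"})  # done once, then stored
page = serve(cached, {"WIKIUSERNAME": "GuestUser", "SERVERTIME": "12:00"})
```

The compatibility problem shows up in deciding what is safe to treat as static: anything a plugin or a SEARCH can change between requests would have to stay a placeholder.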
There's nothing wrong with people looking for small ways to speed things up, of course. But we need to be discussing how to speed TWiki up dramatically, and we need to be discussing it now. (I seem to have come around to Kenneth's point of view: that the number one goal right now should be speeding up TWiki. A lot.)
So if people have ideas on how to speed things up more than a trifle with 100% compatibility, we need to hear about them ASAP and then see if they work.
-- Contributors: CrawfordCurrie, MeredithLesly
Discussion
I have to note that there's a certain irony in all this, as speeding up Cairo was a major goal for Dakar.
-- MeredithLesly - 24 Apr 2006
And it accomplished that goal, at the very least in the sense that DakarRelease runs under mod_perl, whereas mod_perl was flaky, at best, with CairoRelease.
-- WillNorris - 24 Apr 2006
While, alas, apparently slowing it down if you don't use mod_perl.
-- MeredithLesly - 24 Apr 2006
Just some thoughts on Crawford's list, from my experience using TemplateToolkit:
- Recode selected components: I would put this only as a last resort. Mixing languages creates more difficulties in building for various platforms, in testing, and in installing. At the moment I doubt that TWiki's spec is sufficiently settled by docs and test cases :-/.
- Finish store abstraction: Good for architecture, but personally I doubt that it would be a big performance gain. As long as most accesses are "by key" (filename in the current implementation), the file system isn't that slow. RDBs have their merits for mass operations, but these aren't what TWiki usually needs.
- Taking permissions out of band: Agreed. I still hope that it can be done in a 100% compatible way though I can't prove it right now.
- Caching: Caching global and web preferences is rather obvious, and I'm rather surprised that nobody seems to have tested this. I'll put this on my immediate agenda. Caching pre-rendered topics and pre-compilation of templates looks much more difficult (both these techniques are used with TemplateToolkit, for example).
- Reduce the number of configuration options: Is that really a problem? I would be surprised, but I have been surprised by TWiki several times in the past.
-- HaraldJoerg - 24 Apr 2006
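As a rough sketch of what caching global and web preferences could look like, the Python below keeps parsed preferences in memory and re-parses a file only when its modification time changes. It is an illustration under assumptions, not TWiki code: `load_preferences`, the `_cache` layout, and the simplified `* Set NAME = value` pattern are all hypothetical. An in-process cache like this only pays off under a persistent interpreter (mod_perl, in TWiki's case); plain CGI would have to keep the parsed form on disk instead.

```python
import os
import re
import tempfile

# Simplified pattern for "   * Set NAME = value" preference lines (an assumption;
# real preference parsing has more cases than this).
SET_RE = re.compile(r"^[ \t]+\* Set (\w+) = (.*)$", re.MULTILINE)

_cache = {}      # path -> (mtime_ns, parsed preferences)
parse_count = 0  # counts real parses, only to make the cache effect visible


def load_preferences(path):
    """Return parsed preferences, re-parsing only if the file has changed."""
    global parse_count
    mtime = os.stat(path).st_mtime_ns
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]
    with open(path, encoding="utf-8") as fh:
        prefs = dict(SET_RE.findall(fh.read()))
    parse_count += 1
    _cache[path] = (mtime, prefs)
    return prefs


# Demo with a throwaway file standing in for a preferences topic.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "WebPreferences.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("   * Set WEBBGCOLOR = #FFD8AA\n   * Set SKIN = pattern\n")
    first = load_preferences(path)   # parses the file
    second = load_preferences(path)  # served from the cache
```

The stat call is still paid on every lookup, but it replaces reading and parsing the whole topic.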
"As long as most accesses are "by key" (filename in the current implementation), the file system isn't that slow."
That argument fails for SEARCH in particular, which will scale very badly as a site grows. And given that SEARCH is the standard construct for many pages, it's a real problem. For example, I counted roughly 180 out of 373 topics in TWiki that use SEARCH.
As to 3, the first step to taking ACL out of band is to get it out of the topic text and into META.
-- MeredithLesly - 24 Apr 2006
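The scaling concern about SEARCH can be illustrated with a small Python sketch, assuming that a search has to read every topic on every query. The names `scan_search`, `build_index` and `indexed_search` and the toy `topics` data are hypothetical. The scan does work proportional to the size of the site per query, while the index does that work once and then answers by key. A plain word index like this would not cover regular-expression searches, so it is a partial answer at best.

```python
from collections import defaultdict

# Toy data standing in for topic files (made up for this sketch).
topics = {
    "WebHome": "welcome to the web",
    "SpeedIdeas": "caching and search performance",
    "AccessControl": "permissions and access control",
}


def scan_search(term):
    """Read every topic on every query: cost grows with the number of topics."""
    return sorted(name for name, text in topics.items() if term in text.split())


def build_index():
    """Build a word -> topic names map once, e.g. whenever a topic is saved."""
    index = defaultdict(set)
    for name, text in topics.items():
        for word in text.split():
            index[word].add(name)
    return index


INDEX = build_index()


def indexed_search(term):
    """Answer a query by key lookup instead of rereading all topics."""
    return sorted(INDEX.get(term, ()))
```

Both functions return the same topic names for a whole-word query; only the amount of work per query differs.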
Taking ACL out of band is quite easy, actually... as soon as I get the chance to integrate the PluggableAccessControlImplementation patch into DEVELOP. Migrating from the current "inlined" access control to a new "out-of-band" one is another matter altogether.
-- RafaelAlvarez - 24 Apr 2006
Harald said
Finish store abstraction: Good for architecture, but personally I doubt that it would be a big performance gain. As long as most accesses are "by key" (filename in the current implementation), the file system isn't that slow. RDBs have their merits for mass operations, but these aren't what TWiki usually needs.
This is pretty wrong. The key point is that adding store abstraction will allow us to use and control caching. In a way that's what the DBCacheContrib / DBCachePlugin already provide.
Alas, this approach is suboptimal, since no CGI process should need to load the cache from disk on every request. Instead, CGI processes should contact a Memcache process that delivers net data either from memory or after first populating the cache by fetching the data from some store.
LiveJournal works like that. They use Perl + MySQL and serve 10,091,608 user accounts by now.
-- MichaelDaum - 25 Apr 2006
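The pattern Michael describes, where a cache either answers from memory or populates itself from the store first, can be sketched in Python as follows. This is an illustration only: the class `ReadThroughCache` and its methods are hypothetical, a plain dict stands in for both the backing store and the cache memory, and no real memcached client API is used. In the real setup the cache would live in a separate long-running process shared by all CGI requests.

```python
class ReadThroughCache:
    """Serve values from memory, fetching from the store only on a miss."""

    def __init__(self, fetch_from_store):
        self._fetch = fetch_from_store  # callable: key -> value
        self._data = {}
        self.store_reads = 0            # counts misses, for demonstration

    def get(self, key):
        if key not in self._data:
            # Miss: populate the cache from the store first.
            self._data[key] = self._fetch(key)
            self.store_reads += 1
        return self._data[key]

    def invalidate(self, key):
        """Drop a cached entry, e.g. after a topic has been saved."""
        self._data.pop(key, None)


# Demo: a dict stands in for the topic store.
store = {"Main.WebHome": "topic text v1"}
cache = ReadThroughCache(lambda key: store[key])

a = cache.get("Main.WebHome")  # miss: read from the store
b = cache.get("Main.WebHome")  # hit: served from memory

store["Main.WebHome"] = "topic text v2"  # topic saved
cache.invalidate("Main.WebHome")
c = cache.get("Main.WebHome")  # miss again, picks up the new text
```

Invalidation on save is the part that needs the store abstraction: only code that sees every write can keep the cache honest.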
Michael, ok - I'll take that back. I had in mind that there's a long way from store abstraction to a store implementation that really improves performance - maybe too long, compared to using a ramdisk, or one of those nifty (but expensive) disks with their own (and pretty large) cache. But you're absolutely right in that only a TWiki store abstraction will allow us to control caching intelligently.
-- HaraldJoerg - 25 Apr 2006
CategoryPerformance