Improving the plugins start-up times
I am timing the init times of a lot of plugins to speed-up TWiki.
On my machine it takes 7 seconds to do a click! (it's a dual AMD 2000+ with RedHAt 8.0, Apache 2,
ModPerl).
Installed Plugins:
I have installed all the plugin that I have found:
- Installed Plugins: ActionTrackerPlugin, AgentPlugin, BeautifierPlugin, BugzillaLinkPlugin, CalendarPlugin, ChartPlugin, CommentPlugin, ConditionalPlugin, DatabasePlugin, DefaultPlugin, DoxygenPlugin, EditTablePlugin, EmbedPDFPlugin, EmbedQTPlugin, EmptyPlugin, EncryptedPagesPlugin, ExplicitNumberingPlugin, FeedbackPlugin, FindElsewherePlugin, FormFieldsPlugin, FormPivotPlugin, FormQueryPlugin, GaugePlugin, GnuSkinPlugin, HeadlinesPlugin, ImageGalleryPlugin, IncludeIndexPlugin, IncludeRevisionPlugin, InterwikiPlugin, JavaDocPlugin, LaTeXToMathMLPlugin, LdapPlugin, ListOfSkinsPlugin, LocalCityTimePlugin, LocalTimePlugin, MathModePlugin, NavbarPlugin, NewsPlugin, NotifyOnChildModificationPlugin, PdfPlugin, PerlDocPlugin, PerlSamplePlugin, PhantomPlugin, PollPlugin, PowerEditPlugin, ProgramsPlugin, PrologSamplePlugin, QuickCalendarPlugin, QuickSearchPlugin, RandomQuotePlugin, RandomTopicPlugin, RecursiveRenderPlugin, RedirectPlugin, RevRecoverPlugin, RevisionLinkPlugin, RicherSyntaxPlugin, RollupPlugin, SearchToTablePlugin, SectionalEditPlugin, SingletonWikiWordPlugin, SlideShowPlugin, SmiliesPlugin, SpacedWikiWordPlugin, SpreadSheetPlugin, StylePlugin, SyntaxHighlightingPlugin, TWikiDrawPlugin, TablePlugin, TigerSkinPlugin, TocPlugin, TodaysVisitorsPlugin, TopicVarsPlugin, TranslateTagPlugin, TreePlugin, TypographyPlugin, UpdateInfoPlugin, UserCookiePlugin, VersionLinkPlugin, XpTrackerPlugin
NOTE: I have not checked the set-up of all plugins ... someone cold be slow just because I didn't do it properly.
Instrumenting Plugins.pm
I have instrumented (just a little) Plugins.pm to collect times with
CPAN:Time::HiRes
(see the attached Plugins.pm file)
Initial timings
The init time can be splitted into (ms):
- discoverPlugins 2.6
- eval "use $plugin" 999.7
- readPrefsFromTopic 270
The cumulative handlers' execution on a simple page without plugin tags is (ms):
- startRenderingHandler 92.2
- commonTagsHandler 1638.6
- getSessionValueHandler 13.1
- endRenderingHandler 7.5
- outsidePREHandler 4617.3 <<== HUGE!
- insidePREHandler 0.2
I have modified Prefs.pm to cache the preferences read (see Prefs.pm attached)
By caching the Prefs we go:
Then I have tweaked a couple of plugins that uses outsidePREhandler
Now the time is lower:
- outsidePREHandler 36.3 ms <<== SMALL!!
The next biggest time is:
- use "$plugin" 1057.5 ms <<== HUGE!!
Proposal
In the current implementation the Preferences are always loaded for each initted plugin.
This time is low (a total of 130ms if cached).
Thus I propose to use the plugin topic to store a preference that is used to decide if the plugin should be initialized.
I have added a SYNTAX variable to all the plugins to initialize a Plugin only:
- if the SYNTAX declaration is NOT present (old plugins and special ones)
- or if the SYNTAX (a regular expression) matches the topic text
Problems:
- patterns produced by other plugins are not processed
- included topics are not processed
- templates are not processed
Solutions:
- we could:
- define a second preference PLUGINDEPS that lists the plugins that a plugin uses:
- this means that the plugin developer must check for new plugins often
- else define a SAMPLE preference that show a complete sample of the plugin output:
- this is useful also to automatically build documentation
- if a new plugin matches the SAMPLE output it will be properly initialized
- NOTE: in both cases the ordering of the initializations implicitly defines an efficient RenderingPipeline (the handlers are applied in the proper order)
- we could move the initialization to after the include processing
- we could pass to the initialization both the topic and template text
Resulting times
The resulting times are (in ms, on a page without particular patterns):
- init 378.2
- discoverPlugins 2.6
- getPrefsFromTopic 143.1
- use 113.4
- initPlugin 39.0
- registerHandler 0.7
- getSessionValueHandler 2.4
- startRenderingHandler 13.0
- commonTagsHandler 20.0
- outsidePREHandler 81.0
- endRenderingHandler 4.8
- writeHeaderHandler 0
They are encouraging, no? (
very encouraging

)
Comparison between plugin starting times
I table here the starting times that I have collected.
|
selective plugin activation OFF
|
selective plugin activation ON
|
| Phase |
Total time (ms) |
| discoverPlugins |
2.63 |
| getPrefsFromTopic |
176.24 |
| usePlugin |
1443.78 |
| initPlugin |
203.63 |
| registerHandler |
3.66 |
| initTotal |
1986.57 |
| Handler |
Total time (ms) |
| startRenderingHandler |
62.77 |
| outsidePREHandler |
29.77 |
| getSessionValueHandler |
5.76 |
| commonTagsHandler |
788.93 |
| endRenderingHandler |
5.51 |
| writeHeaderHandler |
0.04 |
|
| Phase |
Total time (ms) |
| discoverPlugins |
2.68 |
| getPrefsFromTopic |
147.95 |
| usePlugin |
123.55 |
| initPlugin |
41.18 |
| registerHandler |
0.76 |
| initTotal |
418.62 |
| Handler |
Total time (ms) |
| getSessionValueHandler |
2.83 |
| startRenderingHandler |
13.14 |
| commonTagsHandler |
17.93 |
| outsidePREHandler |
11.92 |
| endRenderingHandler |
4.46 |
| writeHeaderHandler |
0.03 |
|
Comments
Andrea, I really like the fact that you profiled the code base, always a sign of diligent engineer. I had wanted to do it myself, but never got the time. One suggestion I would like to make, is that you (or someone who has time) put code into twiki that uses Timer::HiRes or whatever that will log times (if requested) for all rendered pages and separate out by plugin. If we have more global visibility to what plugins are pigs or what code areas are pigs we will get more people looking into fixing them.
As for other suggestions. If I understand your issue regarding 'patterns produced by other plugins are not processed' to be that a plugin that produces twiki output might not have that output rendered, I would suggest that you push that responsibility of declaring the output of the plugin to be either clean (needing no further rendering by other plugins) or dirty (needing further rendering by other plugins) to the plugin developer.
Not sure if it helps, but you get what you pay for.
--
JohnCavanaugh - 16 Sep 2003
Thanks for the "diligent engineer"

... I was simply annoyed of waiting 7 seconds for each click.
My version of Plugins.pm is already almost completely "instrumented" with Time::HiRes calls ... I'll immediately complete it as for your suggestion to show in (the logs? in an automatically generated page?) a nice table of the time spent in each init phase and in each handler.
Your idea for the clean/dirty declaration suggests me that if a plugin would declare its pre-requisites I could init all the prerequisites also. Thus a MytagPugin that contains in its topic page the lines
* Set SYNTAX = %MYTAG{.*?}%
* Set PREREQUISITES = TablePlugin
would init also the TablePlugin if the
%MYTAG{.*?}% is found (and so on and so on recursively)
It remains to handle templates ... this should'nt be too difficult.
--
AndreaSterbini - 17 Sep 2003
Im not sure at run time or even code time pne could know all the prereqs needed.
What I was attempting to get at is a plugin is deemed a "clean" plugin iff when processing tags it
never inserts text that has other tags. Ie. If you did something like s/%TAG%/xxxx/, the xxx would never have any %MYTAG{.*?}% in it. [A potential other name instead of clean could be "terminal node"]
A plugin is considered dirty if does or might inject %MYTAG{.*?}% in its processing. [A potential other name instead of dirty could be "recursive" or "branch node"]
Im thinking along the lines of two categories of plugins. For lack of a better term I would group plugins that
only have %MYTAG{}% style syntax as
SyntaxPlugins. These
SyntaxPlugins could be broken down into 2 further categories, namely clean and dirty, following the definition above.
Another category grouping of plugins would be (again for lack of a better term)
FreeformPlugins. These would be plugins that have unusual freeform syntax, something like the
SmiliesPlugin would be a classic example.
Then a rendering pipeline could be established.
SyntaxPlugins(Dirty) [Loop] ->
SyntaxPlugins(Clean) ->
FreeformPlugins ->
DefaultTwikiRendering
Granted an assumption here is that
FreeformPlugins are always clean. I couldnt think of any exceptions to this off the top of my head, but I guess it is possible.
I think the benefit here is the recursive looping only occurs on dirty
SyntaxPlugins, which I believe would help the rendering time. But as with most things, having the real data (as part of your profiling) is what will really answer the questions.
Separately as well, I think we might want to strive for
NOT processing the plugin prefs for every rendering. Thinking further out, focusing on how mod_perl could optimize things I believe would be prudent. Specifically, I believe a strategy that loads the prefs and keeps them in memory for all plugins and only "reloads" them at specific times (perhaps on demand) would also make things much faster.
I state (if possible) because Im not sure exactly how shared memory would work across multiple httpd instances. We would definitely need a mod_perl expert to comment.
Just another $0.02 that I hope makes sense.
--
JohnCavanaugh - 18 Sep 2003
I see ... a plugin developer shouldn't know about the other plugins ...
The ideal situation would be if a plugin declares both the syntax he recognizes and the syntax of
the produced output ... and then deduce if a plugin needs another by looking at the two declaration ... a not simple problem if you want to match a regexp
over another regexp. I'll give a thought to this.
By ordering appropriately the initialization we get the proper sequence of application of handlers ... then (implicitly) your pipeline ...
About Prefs ... my new implementation caches each set of variables on disk and loads/join the needed ones when the prefs-init is called ... this halves the time (that's small for almost 100 files).
A better idea could be start using
CPAN:Cache::Cache
to share Preferences (but also other general data structures) across calls on different apache processes, even without mod_perl.
I will do some experiments on that .... it could make TWiki blazingly fast!
--
AndreaSterbini - 18 Sep 2003
I have found the
Grail
project to manipulate finite languages, regular expressions and finite machines ... perhaps the problem is solved by:
- generating the language of the output regexp (with retofl)
- and filtering the stream with the input regexp (with retofm and flfilter)
- if anything comes out then we have a match
... but this will take almost forever (I fear)
... but it has to be recomputed only when a plugin page changes ....
... else we could use a minimal approach:
- the plugin declares an example of its output (a text, not a regexp)
- all the other plugins that match on such example are initialized after it
- the nice thing about this approach is that we already are showing output examples! (as documentation)
Now I should find a way to declare the OUTPUT_EXAMPLE:
- as an OUTPUT_EXAMPLE variable (but I have problems with the \n
- or as the text enclosed between %STARTINCLUDE% and %ENDINCLUDE% (or %OUTPUT_EXAMPLE% and %END_OUTPUT_EXAMPLE%)
--
AndreaSterbini - 18 Sep 2003
I have slightly refactored the topic and added a table of times above.
--
AndreaSterbini - 22 Sep 2003
copied from HighQualityPlugins:
I think that bad performance is inherent to the current Plugin mecanism. In my
opinion, this should be changed first. I am not a perl programmer, but points
which strikes me as inefficient are:
- Nothing should be done at runtime to registrer a new plugin. Some script should be called to install/deinstall a plugin, modifying perl code. Now each time a page is viewed, the list of all possible plugins are scanned, initialized, etc..
- Now the text body is scanned wholly N times, for each possible %-var. An obvious optimization would be to just scan the body ONCE for an %-vars, such as: %[A-Z0-9_]+({.*?})?%, and for each match, call the plugin-provided function/method registred for each var, stocked in an hashtable, that could be statically defined in a perl module, so that even its instanciation could be done at server start when using perl_mod, not on each request. This plugin install script would thus just add/remove lines in this .pm file...
Ideas?
--
ColasNahaboo - 30 Dec 2003
_ Now the text body is scanned wholly N times, for each possible %-var.An obvious optimization would be to just scan the body ONCE for an %-vars_
This can be done only if no plugin produces an output with an %-var in it, and there is at least one plugin that I know (
XpTrackerPlugin) that put in it's output.
Extending the Idea, perhaps several scanning can be done until there's nothing to transform. I bet that at most it'll be 2 or 3 passes.
--
RafaelAlvarez - 05 Jan 2004
Bumping Feature topic marked as Scheduled for
CairoRelease that hasn't been modified recently.
--
SamHasler - 20 Apr 2004
Another idea (borrowed from the way Java handle the JDBC Driver selection based on the provided URL):
The rendering process could scan for tags in the form %.*?% and ask each plugin if it recognize it. The content of the tag is passed to the first plugin that recognized it.
This way the plugin don't need to parse the page to find it's tags (
XpTrackerPlugin parses a page a lot of times because it can recognize several tags). This process could be repeated until there are not more tags present in the page (to allow plugins to emit text that uses other plugins) or if all the tags in the pass failed to be served by a plugin.
If you go further and move all the TWiki vars processing into a Plugin then you can skip a whole step in the rendering process.
hmmm. I'll give it a little more thought.
--
RafaelAlvarez - 20 Apr 2004
I think that's a very good idea. One concern you'll need to deal with is loops in the generation, though. Have a limit on the number of passes through the text before you give up on trying to get rid of all 's even if some are still being recognized.
--
WalterMundt - 20 Apr 2004
What if you recursivly parse the replacement text to see if there are new tags in it, and pass the list of plugins that have already processed it and don't allow a plugin to process the text more than once.
--
SamHasler - 21 Apr 2004
Finally, I'm uploading the proof-of-concept implementation for the alternate plugin tag handling in
AlternatePluginTagHandling
--
RafaelAlvarez - 28 Apr 2004
Feature bumped to Dakar due to lack of progress. I already did a lot in Cairo to improve this; but there is more that could be done.
--
CrawfordCurrie - 30 Jun 2004
This is a fantastic analysis. I hope that addressing this can be a principle contribution towards
DakarRelease's performance goals.
--
MartinCleaver - 21 Oct 2004
I've looked hard at accelerating plugins, and it basically requires some heavy refactoring of the core rendering loop to achieve performance improvmenets. Because the plugins API is so tightly tied to this architecture, that implies a huge number of changes to plugins. In the short term, plugins can take advantage of the
registeragHandler method in TWiki.pm for considerable improvements in tag handling performance, but until there is a commitment to upgrade existing plugins there is no point in scheduling anything else. --
CrawfordCurrie - 15 Feb 2005