
TWiki has been extensively re-coded for DakarRelease. There are many reasons for this, but these are the main ones:

  1. The codebase had grown organically over several years, and had reached a point where it was not maintainable.
  2. There were a huge number of side-effecting functions, which made the codebase extremely fragile. This was hampering (stopping dead) any major improvements.
  3. There was almost no documentation, and the code had to be reverse engineered to extract it.
  4. There was almost no encapsulation; modules would cross-call and write each other's data freely.
  5. Only a largely inactive core team had the background knowledge to make changes, blocking out new contributors.

Why this big change?

The goal was to get TWiki ready for TWikiWhatWillYouBeWhenYouGrowUp. We (those of us who worked on this release) felt that TWiki was not moving with the times, and that younger, more aggressive wiki implementations were stealing a march on it. We wanted a basis on which to experiment, and CairoRelease was just far too fragile to provide it.

The subgoals of the coders involved in the change were:

  1. Maximise reverse compatibility with previous TWiki versions
  2. Establish clear and simple APIs to each of the core blocks of functionality
  3. Extract documentation
  4. Build testcases, both unit tests and test topics
  5. 100% compatibility with old plugins that conformed to the published interfaces
  6. Recode for clarity first, performance second

The strategy used was to target an OO implementation, by making small incremental changes to the codebase, testing constantly and adding to the tests to ensure that nothing got broken.

Note that no individual names are mentioned in the description below. DakarRelease has had many contributors, both core team members and contributors at large, and it would be divisive to single any of them out. All TWiki contributors are listed in the DakarRelease package in the AUTHORS file. If you can't work out how something works, then visit TWikiIRC for help or create a TWikiDevQuestion in Codev.

So how does TWiki work now?

Much the same way as it always has, with a few differences. No description of how TWiki used to work exists to refer to, so the following is hard to argue with.

Overview

TWiki is implemented as a collection of perl programs, which share a common code library (in the lib directory). Each URL used to access a TWiki function invokes a corresponding perl program - called a CGI script - from the bin directory. Some scripts, such as manage, perform multiple functions, but the key scripts, view, edit and save, each have their own front-end script. Other command line scripts, such as mailnotifier, also use the code library.

The CGI scripts are written to be as small as possible, so their functionality is limited. All they do is to create a new TWiki Session Object, which is an object of class TWiki, and then invoke the TWiki::UI::run method with the name of a TWiki function to call. TWiki::UI::run is a simple dispatcher that handles common exceptions for error reporting.

The real functionality of the scripts is to be found in a corresponding module in the lib/TWiki/UI directory. For example, the view bin script has a corresponding lib/TWiki/UI/View.pm module that contains most of its functionality. Each of these UI modules contains one or more functions that act as targets for the dispatcher - for example, TWiki::UI::View::view. These functions are all invoked by the dispatcher with just the session object as parameter (see footnote 1). By the time the functions are invoked, all the hard work of setting up the TWiki environment has already been done, so their role is usually the composition of an HTML page. The composition is done in memory, and the page is written to STDOUT. If errors occur, they are signalled using exceptions that are caught in the TWiki::UI::run function. This function then redirects the browser to an "oops" page (see exceptions below) for user errors, or prints a plain-text error report to the browser for fatal errors.
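The flow from bin script to UI function can be sketched roughly as follows. This is a self-contained illustration, not the TWiki source: the Sketch:: packages, the exception class and the exact shape of run are all assumptions made for the example, and plain eval/die stands in for the real exception handling.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the session class (class TWiki in the real code).
package Sketch::Session;
sub new {
    my ( $class, %args ) = @_;
    return bless { topic => $args{topic} || 'WebHome' }, $class;
}

# Hypothetical stand-in for a user-level ("oops") exception.
package Sketch::OopsException;
sub new {
    my ( $class, $template ) = @_;
    return bless { template => $template }, $class;
}

# Hypothetical stand-in for a module in lib/TWiki/UI.
package Sketch::UI::View;
sub view {
    my ($session) = @_;
    die Sketch::OopsException->new('accessdenied')
      if $session->{topic} eq 'Secret';

    # The page is composed in memory. The real functions write it to
    # STDOUT; returning it keeps this sketch easy to test.
    return "<html><body>$session->{topic}</body></html>";
}

# Hypothetical stand-in for the dispatcher.
package Sketch::UI;
use Scalar::Util qw(blessed);
sub run {
    my ( $session, $function ) = @_;
    my $page = eval { $function->($session) };
    if ( my $e = $@ ) {
        # User errors become a redirect to an oops page...
        if ( blessed($e) && $e->isa('Sketch::OopsException') ) {
            return "REDIRECT oops?template=$e->{template}";
        }
        # ...anything else is reported as plain text.
        return "FATAL: $e";
    }
    return $page;
}

package main;

# What a bin script boils down to: build a session, call the dispatcher.
my $session = Sketch::Session->new( topic => 'WebHome' );
print Sketch::UI::run( $session, \&Sketch::UI::View::view ), "\n";
```

The point of the shape is that the UI function only ever sees a fully set up session, and never has to deal with error reporting itself.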

Terminology

A note on terminology used in the pod documentation.

  • A ClassMethod is a function required by the OO implementation. Usually the only class methods are new and sometimes destroy. The pod documentation shows all except the required $class parameter. Calls to class methods look like this: $fred = new TWiki::User(...).
  • An ObjectMethod is a function that requires an object to act on. The pod documentation shows all except the required $this parameter, which is a reference to an object of the class type. Calls to object methods look like this: $this->objectMethod( ... );
  • A StaticMethod is a "traditional" perl function. Generally, where a function in "old" TWiki operated without reference to global variables it has been left as a static function, though many names have changed to make them more intention revealing.
  • A private function is one which is never called outside the package. Private function names follow the perl "standard" of a leading underscore. Never ever call a private function outside the package it is defined in.
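The four kinds of function described above might look like this in a single package. Sketch::User and all of its functions are hypothetical, invented purely to show the calling conventions; the wiki name pattern in particular is a simplification.

```perl
use strict;
use warnings;

package Sketch::User;

# ClassMethod: called on the class; $class is the implicit first parameter.
sub new {
    my ( $class, $wikiName ) = @_;
    return bless { wikiName => $wikiName }, $class;
}

# ObjectMethod: called on an object; $this is the implicit first parameter.
sub wikiName {
    my ($this) = @_;
    return $this->{wikiName};
}

# StaticMethod: a "traditional" perl function, no $class or $this.
sub isValidWikiName {
    my ($name) = @_;
    return $name =~ /^[A-Z][a-z]+[A-Z]\w*$/ ? 1 : 0;
}

# Private function: leading underscore, only ever called inside this package.
sub _stripSpaces {
    my ($name) = @_;
    $name =~ s/\s+//g;
    return $name;
}

# ObjectMethod that uses the private helper internally.
sub setWikiName {
    my ( $this, $newName ) = @_;
    $this->{wikiName} = _stripSpaces($newName);
    return $this->{wikiName};
}

package main;

my $fred = new Sketch::User('FredBloggs');    # class method call
print $fred->wikiName(), "\n";                # object method call
print Sketch::User::isValidWikiName('FredBloggs'), "\n";    # static call
```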

Documentation is automatically extracted for the ntwiki installation. This documentation is generated from the pod, so it always matches the code running on develop.twiki.org (which is the latest code to the nearest 5 minutes).

The Session Object

TWiki was originally coded using a traditional imperative coding approach, but has been converted to a more object-oriented approach. Bear this in mind when trying to understand some of the more obscure practices in the code! This is a work in progress; there is still a long way to go (like breaking down the singleton objects, but we're getting ahead of ourselves now).

The key object in TWiki is the session object. The session object is a singleton instance of class TWiki. The role of this object is to store all the data associated with the currently running CGI session (hence the name). Like all objects in TWiki, it follows the recommended approach to perl OO coding and is implemented as a blessed hash.

When the session object is created it is populated with a number of other singleton objects, one for each of the principal modules (Sandbox, Plugins, Net, Store, Search, Templates, Attach, Form, Users, Access, Prefs and Render). The role of these singletons is to encapsulate what were originally global variables in the imperative implementation. Each of the singletons has a reference back to the session, accessible through $this->{session}. Thus it is possible to get to any of the singletons from any other. The objects exist only for the duration of the current CGI run, ensuring that there is no leakage of global data from one run to another when using accelerators such as mod_perl.

Each of the singletons will in turn create other objects. For example, the Users singleton manages a set of User objects, one each for the registered TWiki users. The Store singleton in turn creates a singleton for the store implementation (currently RcsWrap or RcsLite).
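The relationship between the session and its singletons can be pictured with a small sketch. The Sketch:: classes, the hash keys (prefs, store) and the PREFIX value are assumptions for illustration only; the real classes hold far more state.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Prefs singleton.
package Sketch::Prefs;
sub new {
    my ( $class, $session ) = @_;
    return bless {
        session => $session,                  # reference back to the session
        values  => { PREFIX => 'topic:' },
    }, $class;
}
sub get {
    my ( $this, $key ) = @_;
    return $this->{values}{$key};
}

# Hypothetical stand-in for the Store singleton.
package Sketch::Store;
sub new {
    my ( $class, $session ) = @_;
    return bless { session => $session }, $class;
}
sub describeTopic {
    my ( $this, $web, $topic ) = @_;

    # Reach a sibling singleton through the session back-reference.
    my $prefix = $this->{session}{prefs}->get('PREFIX');
    return "$prefix$web.$topic";
}

# Hypothetical stand-in for the session class: a blessed hash that
# creates one object per module and hands each a reference to itself.
package Sketch::Session;
sub new {
    my ($class) = @_;
    my $this = bless {}, $class;
    $this->{prefs} = Sketch::Prefs->new($this);
    $this->{store} = Sketch::Store->new($this);
    return $this;
}

package main;

my $session = Sketch::Session->new();
print $session->{store}->describeTopic( 'Main', 'WebHome' ), "\n";
```

Because everything hangs off the session object, a second session in the same process (for example under mod_perl) starts with fresh singletons rather than inheriting leftover globals.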

So creating a fully initialised TWiki environment (all preferences read, all users initialised etc) is just a case of instantiating a new TWiki object. You don't need a CGI environment to instantiate a TWiki object, so it can be instantiated in command-line scripts as well. Examples of this in practice can be seen throughout the unit tests - for example, see the set_up function in the test/unit/CommentPluginTests, which creates a TWiki instance to support subsequent tests.

For those of you familiar with the Cairo implementation, you should find most of the old TWiki functions still in place in the TWiki class, but many of them now have to be invoked via the session object.

Global variables

Yes, there are still global variables. However their use is restricted to static information i.e. things that do not change on a session-by-session basis. The most important global variables are the configuration information, the regular expressions, and the tag registers.

Configuration Information

Previously configuration information was stored as a flat pack of global variables in TWiki.cfg. This had a couple of problems:
  1. The configuration file was very complex and hard to edit
  2. It was difficult in code to tell if a variable came from the configuration or not.
To simplify this we have moved to a single configuration hash, called TWiki::cfg. This hash currently reflects the old set of global variables quite closely, though any variables added to the hash should follow the scheme of using a second level hash to separate the namespaces. This allows arbitrary modules to add to the configuration without risk of damaging existing configs. For example, the PublishAddOn uses configuration constants as follows:
$TWiki::cfg{PublishAddOn}{PublishDir} = '/home/me/publish';
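As a rough illustration of the second-level namespacing, the sketch below declares the hash itself so that it runs standalone (in TWiki the hash is provided by the core). The DataDir value and the cfgFor helper are hypothetical, used here only to show the layout.

```perl
use strict;
use warnings;

package TWiki;
our %cfg;    # declared here only so that the sketch is self-contained

package main;

# Top-level keys mirror the old flat globals (illustrative value).
$TWiki::cfg{DataDir} = '/home/me/twiki/data';

# Extensions keep their settings under their own second-level hash,
# so they cannot collide with core keys or with each other.
$TWiki::cfg{PublishAddOn}{PublishDir} = '/home/me/publish';

# Hypothetical helper: fetch one module's settings without
# autovivifying an entry for modules that have none.
sub cfgFor {
    my ($module) = @_;
    return $TWiki::cfg{$module} || {};
}

print cfgFor('PublishAddOn')->{PublishDir}, "\n";
```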

Regular expressions

The set of precompiled regular expressions is very much the same as before, with a few minor tweaks.

Tag registers

The tag registers are hashes that maintain pointers to tag handling functions and constant values. See below.

Tag processing

The old TWiki would perform tag substitution in text using a perl s/// statement. This caused all sorts of issues with expansion ordering, so the whole approach has been thrown away. In its place there is now a single tag parser that analyses tag syntax, handling embedded tags correctly. When a tag is recognised, it is looked up in a sequence of hashes, one for predefined functions, one for constants, and one for session variables. All of the tag handling functions are declared in TWiki.pm as private functions using the tag name (e.g. sub _INCLUDE). Note that one of the new features of the plugins interface is that plugins now have the ability to define their own tag handling functions. This is significantly different to the old approach, because these new tags will be treated by TWiki as if they were proper TWiki tags i.e. they will be parsed and dispatched like any other tag.
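To give a feel for the idea, here is a much simplified stack-based tag expander. It is not the parser in TWiki.pm: the function names (expandTags, lookup, registerTag), the three hashes and the tag syntax accepted are all assumptions for the sketch, and expanded values are not re-parsed.

```perl
use strict;
use warnings;

# Three lookup tables, consulted in order: functions, constants,
# then per-session variables (held in the hypothetical $session->{vars}).
my %functionTags = (
    UPPER => sub {
        my ( $session, $args ) = @_;
        return uc( defined $args ? $args : '' );
    },
);
my %constantTags = ( BR => '<br />' );

# Hypothetical hook showing how an extension could add its own tag.
sub registerTag {
    my ( $tag, $fn ) = @_;
    $functionTags{$tag} = $fn;
    return;
}

sub lookup {
    my ( $tag, $args, $session ) = @_;
    return $functionTags{$tag}->( $session, $args ) if $functionTags{$tag};
    return $constantTags{$tag} if exists $constantTags{$tag};
    return $session->{vars}{$tag};    # undef when the tag is unknown
}

sub expandTags {
    my ( $text, $session ) = @_;

    # Each stack entry holds the text collected at one nesting level;
    # every level above the first starts with the '%' that opened it.
    my @stack = ('');
    for my $token ( split /(%)/, $text ) {
        if ( $token ne '%' ) {
            $stack[-1] .= $token;
            next;
        }

        # A '%' closes the tag on top of the stack if it is complete...
        if ( @stack > 1
            && $stack[-1] =~ /^%([A-Z][A-Z0-9_]*)(?:\{(.*)\})?$/s )
        {
            my $value = lookup( $1, $2, $session );
            if ( defined $value ) {
                pop @stack;
                $stack[-1] .= $value;
                next;
            }
        }

        # ...otherwise it opens a new nesting level.
        push @stack, '%';
    }
    return join '', @stack;
}

my $session = { vars => { WIKINAME => 'FredBloggs' } };
print expandTags( 'Hello %UPPER{%WIKINAME%}%!', $session ), "\n";
```

Inner tags are complete before the outer tag's closing % is seen, so they are expanded first without any dependence on the order of a chain of substitutions. Unknown tags are left in the text untouched.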

TML processing

As well as the tag handling change described above, we have tried to centralise the knowledge about the syntax of TML in the Render module. If you need to iterate over every line in a topic, there is now a function called forEachLine that does this, and tracks pre, verbatim and noautolink blocks. Ultimately the core rendering loop in getRenderedVersion will be implemented in terms of this function (but we're not quite there yet).
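The general shape of such a line iterator might be something like the sketch below. The signature and the state hash passed to the callback are assumptions made for illustration; check the pod of the Render module for the real interface.

```perl
use strict;
use warnings;

# Hypothetical line iterator: calls $fn for every line, passing a copy
# of the current block state so the callback knows where it is.
sub forEachLine {
    my ( $text, $fn ) = @_;
    my %in = ( pre => 0, verbatim => 0, noautolink => 0 );
    my @out;
    for my $line ( split /\n/, $text ) {

        # An opening tag takes effect on the line it appears on.
        for my $block ( keys %in ) {
            $in{$block}++ if $line =~ /<$block\b[^>]*>/i;
        }
        push @out, $fn->( $line, {%in} );

        # A closing tag takes effect from the following line.
        for my $block ( keys %in ) {
            $in{$block}-- if $in{$block} && $line =~ m{</$block>}i;
        }
    }
    return join "\n", @out;
}

# Example: upper-case everything except verbatim blocks.
my $text = "one\n<verbatim>\ntwo\n</verbatim>\nthree";
print forEachLine(
    $text,
    sub {
        my ( $line, $state ) = @_;
        return $state->{verbatim} ? $line : uc $line;
    }
  ),
  "\n";
```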

Why functions have moved

You will find that a lot of functions have moved between modules. Sometimes it will not be immediately obvious why a function has moved (see footnote 2).

Store Specifics

The Store interface is probably the most significant area of change. The goal here was to make it possible to swap in an alternative implementation of Store - for example, we wanted to be able to switch in a store based on a database like MySQL, or another CM system such as Subversion.

This has involved moving any functionality that needs to know about the specific implementation of the Store into the store module, but moving only that functionality. So some functions have been split in two, to leave the Store-specific parts outside the Store. A prime example of this is Search; it is now possible to perform a search on the store without knowing any details of the implementation of topics on disc.
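The separation can be illustrated with two toy store implementations that expose the same two methods, and a search function that only ever talks to those methods. Every name in the sketch (the Sketch:: packages, topicNames, readTopic, search) is hypothetical; the real Store interface is considerably richer.

```perl
use strict;
use warnings;

# Toy store that keeps topics in a hash.
package Sketch::Store::Hash;
sub new {
    my ( $class, %topics ) = @_;
    return bless { topics => {%topics} }, $class;
}
sub topicNames {
    my ($this) = @_;
    my @names = keys %{ $this->{topics} };
    return sort @names;
}
sub readTopic {
    my ( $this, $name ) = @_;
    return $this->{topics}{$name};
}

# Toy store that keeps topics as a list of [ name, text ] pairs.
package Sketch::Store::List;
sub new {
    my ( $class, @pairs ) = @_;
    return bless { pairs => [@pairs] }, $class;
}
sub topicNames {
    my ($this) = @_;
    my @names = map { $_->[0] } @{ $this->{pairs} };
    return sort @names;
}
sub readTopic {
    my ( $this, $name ) = @_;
    for my $pair ( @{ $this->{pairs} } ) {
        return $pair->[1] if $pair->[0] eq $name;
    }
    return;
}

# Search knows nothing about how either store keeps its topics.
package Sketch::Search;
sub search {
    my ( $store, $pattern ) = @_;
    return grep { $store->readTopic($_) =~ $pattern } $store->topicNames();
}

package main;

my $hashStore = Sketch::Store::Hash->new(
    WebHome  => 'welcome',
    WebIndex => 'index of topics',
);
my $listStore = Sketch::Store::List->new(
    [ WebHome  => 'welcome' ],
    [ WebIndex => 'index of topics' ],
);
print join( ',', Sketch::Search::search( $_, qr/index/ ) ), "\n"
  for $hashStore, $listStore;
```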

The TopicObjectModel

Or "why there isn't one". Well, it's coming. However, it's not there yet, for the simple reason that getting this far has been a huge amount of work. The next step, moving to a topic object model, is a significant architectural shift for the system which requires careful design. If you look through the system at the methods that currently take (meta, text) parameters - the obvious candidates for the topic object - it is not that obvious what the delegation of responsibilities should be, in all cases.

The next step is obviously to move in this direction, and to eliminate most of the singleton classes.

The Plugins Interface

The old imperative plugins interface posed a particular problem, because it allowed the calling of functions throughout TWiki without any context information, instead relying on global variables to provide context.

To support the old interface it has been necessary to add a single global variable to the Plugins module, viz. SESSION. This is set using local in each of the plugin handlers, so when a plugin invokes a Func method it can recover the correct session object. This does mean that old code that would call the Func interface directly (and not as a result of being invoked via a plugin handler) will not work unless you set this variable. Fortunately this is easy to do; just $TWiki::Plugins::SESSION = $session (where $session is a reference to your TWiki session object). It is best to use local to restrict your change to the current call stack.
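A minimal sketch of the local idiom follows. Sketch::Plugins::SESSION plays the role of $TWiki::Plugins::SESSION, and Sketch::Func::getWikiName is a hypothetical Func-style function; neither is the real TWiki code.

```perl
use strict;
use warnings;

package Sketch::Plugins;
our $SESSION;    # stands in for $TWiki::Plugins::SESSION

package Sketch::Func;

# Func-style function: it takes no session parameter, so it has to
# recover the session from the package variable.
sub getWikiName {
    die "no session set\n" unless defined $Sketch::Plugins::SESSION;
    return $Sketch::Plugins::SESSION->{user};
}

package main;

# Set the variable with local so that the change is confined to the
# current call stack and undone automatically on the way out.
sub callWithSession {
    my ( $session, $code ) = @_;
    local $Sketch::Plugins::SESSION = $session;
    return $code->();
}

print callWithSession( { user => 'FredBloggs' }, \&Sketch::Func::getWikiName ),
  "\n";
```

Using local rather than a plain assignment means a nested call with a different session restores the outer one when it returns, even if it exits via an exception.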

The plugins interface has been extended, though not far enough yet. TWiki::Meta is implicitly published alongside TWiki::Func, though the rational thing to do would be to dispense with TWiki::Func altogether and allow plugins to directly access a subset of the interfaces published by the singletons in the core code. Many plugins already violate the Func interface, most of them because they needed functionality from the core that was not published via the interface. Rather than extending the imperative Func interface ad infinitum, it makes far more sense to encourage plugins to follow the OO strategy and subclass objects in the core system.

The plugins handling has itself changed. There is now a separate object (of type TWiki::Plugin) for each plugin. The only real impact of this is separation of functionality out of the TWiki::Plugins module.

Exception Handling

To significantly improve error detection and reporting, we have changed the code to use perl exceptions (also known as try..catch blocks). If you don't know what an exception is, go google. We use the exceptions methodology from the CPAN:Error module.

There are three exception types used in TWiki:

  • AccessControlException
  • OopsException
  • Error::Simple

The first two are subclasses of the third.

AccessControlExceptions are used to trap and report access control violations, and may be thrown anywhere in the code. They are always trapped in the TWiki::UI::run dispatcher method, where they can be reported using a redirect to an oops script. They are also sometimes trapped in the code. Note that raising exceptions is not the only way that access control violations are detected; several scripts explicitly check access rights as part of their normal flow of control. This is simply because the logic flows more clearly when this is done explicitly.

OopsExceptions are used to flag a desire for TWiki to redirect - either because an error has occurred, or because we need to go to another URL.

Other errors may be thrown by die - these will be trapped as type Error::Simple and redirected to the browser. This makes it much easier to work out what is going on in the code - just stick in a die, and you will get a stack trace.

Users

The users support has been totally reworked. Users are no longer identified by various uses of the login name and the wiki name. Instead there is now an integrated TWiki::User object that encapsulates all the information about a user. The password management has also been restructured to be a lot clearer. If you don't recognise the code any more, don't worry, it still works the same way, it is just a lot clearer how it does it. The registration process has also been completely rewritten to support email notification of registration codes.

Test Strategy

New to DakarRelease is an integrated test strategy. TWiki now employs three levels of testing:
  • In-code assert tests
  • Unit tests
  • Integration tests

In-code self test

In-code assert tests are implemented using a bastardised version of the CPAN:Assert module. It was bastardised to invert the sense of the environment variable that enables ASSERT; apart from that it is identical to CPAN:Assert. Asserts in the code look like this: ASSERT(something) if DEBUG. The DEBUG is a necessary part of the assert; it allows perl to optimise the function out of existence. Note that the environment variable TWIKI_ASSERTS must be set to 1 to enable asserts.

Add the following to your apache httpd.conf (or the TWiki specific conf file):

SetEnv TWIKI_ASSERTS 1
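The reason the trailing if DEBUG matters can be seen in a small stand-alone sketch. The ASSERT and DEBUG below are re-created by hand for illustration and are not the module TWiki ships; the half function is hypothetical.

```perl
use strict;
use warnings;

# DEBUG is a compile-time constant derived from the environment.
use constant DEBUG => ( $ENV{TWIKI_ASSERTS} ? 1 : 0 );

# Hand-rolled stand-in for the assert function.
sub ASSERT {
    my ( $condition, $message ) = @_;
    die 'Assertion failed' . ( defined $message ? ": $message" : '' ) . "\n"
      unless $condition;
    return 1;
}

sub half {
    my ($n) = @_;

    # When DEBUG is the constant 0, perl discards this whole statement
    # at compile time, so neither the call nor its argument is evaluated.
    ASSERT( $n % 2 == 0, 'n must be even' ) if DEBUG;
    return $n / 2;
}

print half(10), "\n";
```

Without the if DEBUG, the condition expression would still be evaluated on every call even when asserts are switched off.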

Unit tests

Unit tests are implemented using CPAN:Test::Unit, a testing framework derived from the well-known JUnit. The tests are partitioned in roughly the same way as the responsibilities are partitioned in the codebase.

The unit tests are used when functionality can be easily isolated for testing, and when an integration testcase would be difficult to automate, e.g. for testing an edit-preview-save cycle.

To run a unit test, make sure that valid LocalLib.cfg and LocalSite.cfg files are configured with paths pointing to your checkout area (there is no need for a full Apache setup, though you can easily run one on a checkout area). The user running the test scripts must be able to write to the data and pub directories, where it will create and delete test webs. The tests should not try to overwrite any checked-in files, though if you are nervous you can always simply remove write access for the user running the tests from the default webs (Sandbox, TWiki, Main etc).

Then cd to the test/unit directory and run perl ../bin/TestRunner.pl followed by the name of your chosen test. There is a test suite called TWikiUnitTestSuite that runs all the tests in that directory.

Integration tests

Integration tests are topics in the TestCases web. The WebHome of that web has information about running the tests, and details on how to write new tests. The integration tests are supported by a plugin called the TestFixturePlugin which supports the comparison of predefined HTML against what is rendered by TWiki.

There is also a range of purely manual testcases recorded there. The original intention was to make these part of the "what is TWiki" definition, though they are very incomplete.

Bug reporters are encouraged to submit reports in the form of an integration testcase.

Build

Before Dakar, TWiki relied on a mostly manual process to build the release package. A lot of work has gone into automating this process so that anyone can (in theory) build a TWiki release package from the source tree.

See BuildingDakar for a full run-down of the process.

So, is this the end of the big changes?

Sorry, no, it isn't. TWiki was not designed to be object-oriented, and a lot of compromises have had to be made in the process of refactoring it this far. The most obvious of these is the use of heavy singleton objects to encapsulate blocks of functionality and data. These singletons are code objects, not data objects, and need to be reworked. The main targets here are Access, Form, Attach and Search.

There is also a glaring need for a proper TopicObjectModel, which should be based on a common data abstraction shared by webs, topics and forms.

At some point we will have to bite the bullet and transform the plugins interface, which will require the recoding of a lot of plugins. This can be seen as an opportunity to purge the plugins repository, which has an awful lot of rubbish in it.

Footnotes

Footnote 1: A note on why the session object is passed to TWiki::UI methods, rather than those methods being in the TWiki class. Well, actually there is no really good reason. TWiki::UI existed historically, and already encapsulated the bin scripts' functionality adequately, so it was felt that moving these methods into the TWiki object was unnecessary. It is clearly better to keep this code in separate files (for load efficiency) and making them all part of the same class is arguably a bad idea.

Footnote 2: To help you understand these changes, especially if you are unfamiliar with object-oriented design, here are some of the driving principles that were applied in deciding where functions should reside (these are a subset of http://c2.com/cgi/wiki?PrinciplesOfObjectOrientedDesign).

  • Liskov Substitution Principle: If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T.
    • i.e. it should be possible to drop in alternate implementations of any object without requiring changes anywhere else in the system.
  • Information Hiding: If code chunk A doesn't really need to know something about how code chunk B (which it calls) does its job, don't make it know it.
    • Then, when that part of B changes, you don't have to go back and change A.
  • Dependency Inversion:
    1. High level modules should not depend upon low level modules. Both should depend upon abstractions.
    2. Abstractions should not depend upon details. Details should depend upon abstractions.
  • Interface Segregation: The dependency of one class to another one should depend on the smallest possible interface.
  • Acyclic Dependencies: The dependency structure between packages must not contain cyclic dependencies.

Background

See here

Questions

If you have any questions, add them here (or ask on TWikiIRC for a faster answer), and I'll try to factor answers in to the text above.

  • Is it possible to defer the creation of some singleton class til they are needed? -- MichaelDaum - 21 Apr 2005
    • I think I found all the places that could be done; the trouble is there is still a legacy of interdependencies that I haven't fully unravelled (it's an incremental process). If you can suggest any improvements, I'm listening. -- CrawfordCurrie - 21 Apr 2005
  • Suggestion: Typing $this->{session}->writeDebug seems very onerous for inserting a debugging function. Couldn't writeDebug be defined so that it is always callable as a subroutine? As routines do not have print functions associated, there seems little advantage of making this a method? -- ThomasWeigert - 24 Apr 2005
    • It needs to be a method because the debug log file is configurable. -- CrawfordCurrie - 25 Apr 2005
  • Pardon the newbie question but could the plugins interface rewrite be simplified by bringing the consensus non-rubbish plugins into the core (see AddSessionPluginToKernel) and letting the rest adapt or wither? -- RichardFreytag - 09 Aug 2005
    • Sort of. We don't want to move plugins into the core, unless there are strong technical reasons for doing so. But we would like to have some way of "ratifying" a plugin - giving it a "seal of approval". Various ways have been tried to make this a community thing, but the community has not responded, so it may have to be the core developers who make this judgement. -- CrawfordCurrie - 11 Aug 2005

Discussion on tests moved to TWikiUnitTests

Is it possible to create an 'adaptor' plugin so that plugins could be written in other languages? I'm fluent in PHP and familiar with Ruby, but pretty much ignorant in Perl.

-- MeredithLesly - 14 Nov 2005

You might want to look at CPAN:PHP, which should enable PHP code to be called from Perl - no idea how well this works. I doubt if this would make it into the core, but it might be possible to use this to write a layer over the Plugin API, i.e. do a PhpAdapterPlugin that itself loads PHP code. Might be easier to just learn Perl, which is not that different to PHP when getting into it.

-- RichardDonkin - 15 Nov 2005

there is a whole range of languages supported using the CPAN:Inline modules. sounds like a very interesting idea; let us know how it goes

-- WillNorris - 15 Nov 2005

I had some questions regarding the points above:

2. Establish clear and simple APIs to each of the core blocks of functionality

Where are the apis located? Or is this a reference to the perl doc in the code?

3. Extract documentation

Is this available on the web? I noticed that the link above is for the most current code, which is not helpful if you are trying to work between stable released versions. Also, the links do not resolve (they take you to the 'topic not found' page).

5. 100% compatibility with old plugins that conformed to the published interfaces

Just curious if this is still a goal? Seems like there is a standard post on all the plugins asking them to upgrade to the T4 specs or at least make it conform to a new interface (which would not be 100% backward compatible).

Mostly I am looking for a developers dictionary of old to new functions? Seems like this would be essential to a group of people upgrading the code. I am trying to find such doco so that I can upgrade our own T3 customizations to T4.

Thanks for any information.

-- EricHanson - 03 Aug 2006

The request for all plugins to be upgraded to T4 is mainly because there are plugins that bypassed the published API (ie, methods in the Func module) and reached inside the core to methods that disappeared during the internal refactoring.

So, it's still the goal to be 100% compatible with old plugins as long as they stick to what is considered the TWiki Public API. We're in the process of defining this API in a clearer way, so you're invited to check out PluginsApiPolicies.

As for the dictionary, Crawford started to create one, but the changes are so massive that it's pointless. I would suggest hanging out in TWikiIRC and asking people there for help with your customization upgrade.

-- RafaelAlvarez - 04 Aug 2006

Topic revision: r46 - 2006-08-04 - RafaelAlvarez
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.