TWiki was designed as an enterprise wiki from its inception, and it includes features specifically built to support large deployments. Other wiki engines have a different focus and may lack some of these features. Wikis typically flourish at the grassroots level; once they appear on the radar screen of the CTO/CIO, grassroots wikis often get consolidated into a central TWiki. That is when scalability comes into play. Key scaling features of TWiki:
Multiple webs (workspaces):
You can create as many webs as you need. Some large TWiki deployments have over 1000 webs. Think of a web as a wiki within TWiki: each team can get their own wiki. People need to register only once, then they can create content in their own space. If needed, you can link across webs, such as to reference a registered user or an entry in the Glossary web.
Fine-grained access control:
You can create TWikiGroups and restrict access to content for view and edit based on those groups. Although it is possible to restrict access on a topic (page) level, it is typically done on a web level for ease of administration.
In a large deployment it is advisable to authenticate users against your directory server, such as Active Directory or LDAP. This reduces the support workload from registration and login questions.
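For example, view and edit rights for a web can be restricted in that web's WebPreferences topic using TWiki's standard access control variables (the group names below are made up):

```
   * Set ALLOWWEBVIEW = Main.EngineeringGroup
   * Set ALLOWWEBCHANGE = Main.EngineeringGroup
   * Set DENYWEBVIEW = Main.ContractorsGroup
```

The same variables exist at topic level (ALLOWTOPICVIEW, ALLOWTOPICCHANGE) for the cases where per-page restrictions are really needed.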
Per-topic attachment namespace:
TWiki has a per-topic namespace for file attachments. That means if one team uploads a file called inventory.xls to their team page, and another team uploads a file of the same name to a different page, the two will not collide. Try that with MediaWiki or other wikis.
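This works because TWiki stores each attachment under a directory named after its web and topic. A sketch of the resulting pub/ layout (web and topic names are illustrative):

```
pub/Sales/TeamPage/inventory.xls         <- Sales team's upload
pub/Engineering/TeamPage/inventory.xls   <- same filename, no collision
```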
Attachment size limit:
You can limit the maximum size of attachments that can be uploaded. This can be done for the whole site and also on a web level.
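For instance, a cap can be set with the ATTACHFILESIZELIMIT preference, site-wide in TWikiPreferences or per web in a WebPreferences topic (value in KB; the 10 MB figure below is just an example):

```
   * Set ATTACHFILESIZELIMIT = 10000
```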
Install ready-made wiki applications:
Web 2.0 is all about user-generated content. TWiki is also a web application platform where you can install ready-made applications. For example, check out the BlogAddOn, the TWikiDotNetForumAppAddOn and other TWiki extensions in the Plugins web.
Create your own applications:
The IT department is in charge of the wiki "dial tone" and wants some control over the wiki deployment. With TWiki, users can build their own wiki applications and experiment in a controlled environment. That is, IT can get the dreaded "shadow IT" under control.
Connect to external databases:
TWiki has a plugin API and ready-made plugins to connect to external databases. That way you can run a query against MySQL or another RDBMS and display the result in TWiki pages. This is useful for showing CRM data to sales teams and bug trends to engineering teams. See Extensions:database.
Server Selection, Caching, Load Balancing
Plan for adequate server hardware when you deploy TWiki. The following are ballpark figures for a sample TWiki deployment serving 1000 employees and 50,000 pages:
Enterprise-class Linux
Dual-core CPU, 2.6 GHz
2 GB RAM
RAID 1 or RAID 5 for redundancy
Dual power supplies for redundancy
Plan disk space:
Page content: 15 MB per 1000 pages (yes, MB, not GB)
File attachments: 1 GB per 1000 pages
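Applying the ballpark figures above to the sample deployment of 50,000 pages gives a rough sizing estimate:

```python
PAGES = 50_000

# Ballpark figures from the text: 15 MB of page content and
# 1 GB of file attachments per 1000 pages.
content_mb = PAGES / 1000 * 15      # page content in MB
attachments_gb = PAGES / 1000 * 1   # attachments in GB

print(f"content: {content_mb:.0f} MB, attachments: {attachments_gb:.0f} GB")
# prints "content: 750 MB, attachments: 50 GB"
```

In other words, even a large deployment's page content fits comfortably in memory; it is the attachments that dominate disk planning.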
If you have a high read-to-write ratio (such as a TWiki on the public Internet), consider a caching solution and/or a load-balanced setup.
For high volume traffic sites it is possible to put TWiki on a load balanced setup. Here is an example:
Cisco ACE load balancer.
NAS storage back-end.
Web servers share data on the NAS for pages, file attachments and log files.
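As an illustration only, the same topology could be built with any HTTP load balancer. A minimal nginx front-end standing in for the Cisco ACE (host names are invented) might look like:

```nginx
# Round-robin across web servers that mount the shared NAS.
upstream twiki_backend {
    server web1.example.com;
    server web2.example.com;
}
server {
    listen 80;
    location / {
        proxy_pass http://twiki_backend;
    }
}
```

Because the web servers share pages and attachments on the NAS, any backend can serve any request and no session affinity is required for reads.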
In the early days, TWiki.org ran on a load-balanced server setup while hosted at SourceForge. Now it runs on a single, aging server. The TWiki community plans to move TWiki.org back to a load-balanced setup, which will improve performance considerably.
Some people express concerns that TWiki's flat file back-end does not scale well. We know of a number of large TWiki deployments that have over 300,000 pages (such as at Yahoo), over 1000 webs (such as at a major telco company), and over 10,000 users (such as at a major financial institution in the USA).
A flat file based storage back-end has several advantages:
Simple backup and restore
Simple migration of content between TWiki installations (think of grassroots wiki consolidations, spin-offs and acquisitions of companies)
Well understood caching and replication technologies available
Resilient to data corruption
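The backup point in particular is simple to demonstrate: archiving the data/ and pub/ trees captures pages, their revision history and attachments. A minimal sketch, with a mocked-up tree so it runs anywhere (in production you would point twiki at your install root, e.g. /var/www/twiki, which is an assumption about your setup):

```python
import pathlib
import tarfile
import tempfile

# Mock TWiki tree standing in for a real install root.
twiki = pathlib.Path(tempfile.mkdtemp())
(twiki / "data" / "Main").mkdir(parents=True)
(twiki / "pub" / "Main").mkdir(parents=True)
(twiki / "data" / "Main" / "WebHome.txt").write_text("demo topic")

# Pages (with their ,v RCS history files) live in data/, attachments
# in pub/ -- archiving those two trees captures the whole wiki.
backup = pathlib.Path(tempfile.mkdtemp()) / "twiki-backup.tar.gz"
with tarfile.open(backup, "w:gz") as tar:
    tar.add(twiki / "data", arcname="data")
    tar.add(twiki / "pub", arcname="pub")

with tarfile.open(backup) as tar:
    names = tar.getnames()
print(names)
```

Restore is the mirror image: unpack the archive into a fresh TWiki installation's root. No database dump, schema migration or import tool is involved.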
Some scaling factors:
TWiki scales well on the number of webs, e.g. it does not matter much whether you have 3 webs or 3000 webs.
TWiki does have a practical limit on the number of pages per web: you will see a performance impact with more than about 20,000 pages in a single web. The exact threshold depends on the file system and its configuration, on the bandwidth of your server I/O, and on the memory installed.
TWiki scales well on the number of registered users. We have not done tests on the upper limit. It is also feasible not to register users in TWiki at all, e.g. by relying solely on LDAP login.
As stated above, performance can be addressed with caching and/or load balancing.
The TWiki community is working on a pluggable storage back-end, see TWikiRoadMap.
(This post is based on Peter Thoeny's blog post on Scalability of TWiki, also posted as a supplemental document at TWikiScalability.)
MichaelDaum - 26 Mar 2008:
See also the discussions on the upcoming TWikiCache, a built-in caching infrastructure with dependency tracking. Compared to other caching solutions, this one aims at (1) correctness: never deliver outdated wiki content, and (2) transparency: no extra provision is needed by the wiki author to get content cached. TWikiCache will be part of TWiki 5.0. Backports are available for 4.1.x and 4.2.x as well.
MartinSeibert - 26 Mar 2008:
Very valuable information, Peter. Thank you.
ColasNahaboo - 02 Apr 2008:
On already available performance enhancements, there is also: