This is a SupplementalDocument topic which is not included with the official TWiki distribution. Please help maintain high quality documentation by fixing any errors or incomplete content. Put questions and suggestions concerning the documentation of this topic in the comments section below! Use the Support web for problems you are having using TWiki.
We are sometimes asked how well TWiki scales. This supplemental document compiles scalability-related information so that you can plan your TWiki deployment effectively.
Scaling Across Teams and Departments
TWiki was designed as an enterprise wiki from its inception, and it has features specifically designed to support large deployments. Other wiki engines have a different focus and may lack some of these features. Wikis typically start and flourish as grassroots efforts. Once they are on the radar screen of the CTO/CIO, grassroots wikis often get consolidated into a central TWiki. That is when scalability comes into play. Key scaling features of TWiki:
- Multiple webs (workspaces):
- You can create as many webs as you need. Some large TWiki deployments have over 1000 webs. Think of a web like a wiki within TWiki. Each team can get their own wiki. People need to register only once, then they can create content in their own space. If needed you can link across webs, such as to reference a registered user or an entry in the Glossary web.
- Fine grained access control:
- You can create TWikiGroups and restrict access to content for view and edit based on those groups. Although it is possible to restrict access on a topic (page) level, it is typically done on a web level for ease of administration.
- In a large deployment it is advisable to authenticate users against your directory server, such as Active Directory or LDAP. That reduces the support workload caused by registration and login questions.
- File attachments:
- TWiki has a per-topic namespace for file attachments. That means, if one team uploads a file called inventory.xls to their team page, and another team uploads a file of the same name to a different page, they will not collide. Try that with MediaWiki or other wikis.
- You can limit the maximum size of attachments that can be uploaded. This can be done for the whole site and also on a web level.
- Organize content:
- Web application platform:
- Web 2.0 is all about user generated content. TWiki is a web application platform where you can install ready made applications. For example, check out the BlogAddOn, the TWikiDotNetForumAppAddOn and other TWiki extensions in the Plugins web.
- Create your own applications:
- The IT department is in charge of the wiki dial tone and wants to have some control over the wiki deployment. With TWiki you allow users to experiment in a controlled environment. That is, IT can get the dreaded "shadow IT" under control.
- TWiki has a plugin API and ready made plugins to connect to external databases. That way you can run a query in MySQL and other RDBMS and display the result in TWiki pages. Useful to show CRM data to sales teams and bug trends to engineering teams. See Extensions:database.
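The per-topic attachment namespace mentioned in the list above can be illustrated with a minimal sketch. It assumes attachments are stored under `pub/<Web>/<Topic>/<filename>`; the `attachment_path` helper and the web and topic names are hypothetical and only serve the illustration.

```python
from pathlib import PurePosixPath

def attachment_path(pub_root: str, web: str, topic: str, filename: str) -> PurePosixPath:
    """Build a storage path that is namespaced by web and topic.

    Assumption: attachments live under <pub_root>/<Web>/<Topic>/<filename>.
    """
    return PurePosixPath(pub_root) / web / topic / filename

# Two teams upload a file with the same name to different topics.
sales = attachment_path("pub", "Sales", "TeamPage", "inventory.xls")
ops = attachment_path("pub", "Operations", "TeamPage", "inventory.xls")

print(sales)
print(ops)
```

Because the web and topic are part of the path, the two uploads end up in different directories and neither overwrites the other.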
Server Selection, Caching, Load Balancing
Plan for adequate server hardware when you deploy TWiki. The following are ballpark figures for a sample TWiki deployment serving 1000 employees and 50,000 pages:
- Enterprise class Linux
- Dual core CPU 2.6 GHz
- 2 GB RAM
- RAID 1 or RAID 5 for redundancy
- Dual power supply for redundancy
- Plan disk space:
- Page content: 15 MB per 1000 pages (yes, MB, not GB)
- File attachments: 1 GB per 1000 pages
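As a worked example, the ballpark disk figures above can be applied to the sample deployment of 50,000 pages. This is a rough sketch only; the `estimate_disk` function is a hypothetical helper, and real usage depends on your content.

```python
PAGE_MB_PER_1000 = 15    # page content, ballpark figure from the list above
ATTACH_GB_PER_1000 = 1   # file attachments, ballpark figure from the list above

def estimate_disk(pages: int) -> tuple:
    """Return (page_content_mb, attachments_gb) for a given number of pages."""
    thousands = pages / 1000
    return thousands * PAGE_MB_PER_1000, thousands * ATTACH_GB_PER_1000

content_mb, attach_gb = estimate_disk(50_000)
print(f"Page content: {content_mb:.0f} MB")
print(f"Attachments:  {attach_gb:.0f} GB")
```

For 50,000 pages this works out to about 750 MB of page content and about 50 GB of attachments, so attachments dominate the disk budget.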
If you have a high read to write ratio (such as a TWiki on the public internet) consider a caching solution and/or a load balanced setup.
- Load balancing:
- For high volume traffic sites it is possible to put TWiki on a load balanced setup. Here is an example:
- Cisco Ace load balancer.
- 3 webservers.
- NAS storage back-end.
- Webservers share data on NAS for pages, file attachments and log files.
Scalability of Search
TWiki uses the Unix grep command to search content in real time. This enables flexible and powerful searches, which is important for TWiki applications. Search is covered in SearchHelp.
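To see why a real-time search costs more as content grows, here is a rough Python analogue of a grep-style scan. It is not TWiki's implementation. It assumes topics are plain text files laid out as `<data_root>/<Web>/<Topic>.txt`, and `scan_topics` is a hypothetical name.

```python
import re
import tempfile
from pathlib import Path

def scan_topics(data_root: Path, pattern: str) -> list:
    """Return sorted 'Web.Topic' names whose text matches the regex pattern.

    Every call reads every topic file, so the cost grows with the page count.
    """
    regex = re.compile(pattern)
    hits = []
    for topic_file in data_root.glob("*/*.txt"):
        text = topic_file.read_text(encoding="utf-8", errors="replace")
        if regex.search(text):
            hits.append(f"{topic_file.parent.name}.{topic_file.stem}")
    return sorted(hits)

# Small demo on a temporary directory with two webs and one topic each.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Sales").mkdir()
    (root / "Eng").mkdir()
    (root / "Sales" / "Forecast.txt").write_text("Q3 pipeline review", encoding="utf-8")
    (root / "Eng" / "BugTrends.txt").write_text("open bugs per release", encoding="utf-8")
    demo_hits = scan_topics(root, r"bugs?")
    print(demo_hits)
```

The upside of scanning is that results always reflect the current content, with no index to keep up to date.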
The real time search has a performance impact. Searching all webs in a TWiki site with more than 50,000 pages can be slow. If you have a large TWiki deployment of more than 50,000 pages it is advisable to index TWiki content with a search engine. This can be done with a commercial search engine such as the Google Search Appliance or an open source search engine. TWiki currently has three open source search engine integrations, including SearchEnginePluceneAddOn. See more at Extensions:search.
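The idea behind indexing can be sketched as follows: the content is processed once into a word-to-topics map, and each query becomes a lookup instead of a scan of all pages. This is a toy illustration under stated assumptions, not how any particular search engine integration works; `build_index`, `lookup` and the sample topics are hypothetical.

```python
import re
from collections import defaultdict

def build_index(topics: dict) -> dict:
    """Map each lowercase word to the set of topic names that contain it."""
    index = defaultdict(set)
    for name, text in topics.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index

def lookup(index: dict, word: str) -> list:
    """Return the sorted topic names containing the word (case-insensitive)."""
    return sorted(index.get(word.lower(), set()))

# Hypothetical sample content, keyed by 'Web.Topic'.
topics = {
    "Sales.Forecast": "Q3 pipeline review",
    "Eng.BugTrends": "Open bugs per release",
    "Eng.ReleasePlan": "Release schedule and review dates",
}
index = build_index(topics)
print(lookup(index, "review"))
print(lookup(index, "release"))
```

The trade-off is that the index must be refreshed when pages change, so results may lag slightly behind the live content.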
To scale the queries of TWikiForms based TWiki applications, look into DBCacheContrib.
Flat File Back-end
Some people express concerns that TWiki's flat file back-end does not scale well. However, we know of a number of large TWiki deployments that have over 300,000 pages (such as at Yahoo), over 1000 webs (such as at a major telco company), and over 10,000 users (such as at a major financial institution in the USA).
A flat file based storage back-end has several advantages:
- Simple installation
- Simple backup and restore
- Simple migration of content between TWiki installations (think of grassroots wiki consolidations, spin-offs and acquisitions of companies)
- Well understood caching and replication technologies available
- Resilient to data corruption
Some scaling factors:
- TWiki scales well on number of webs, e.g. it does not matter much if you have 3 webs or 3000 webs.
- TWiki has a limit on the number of pages in a web. You will see a performance impact if you have more than 20,000 pages in a single web. This depends on the file system/configuration used, on the bandwidth of your server I/O and on the memory installed.
- TWiki scales well on the number of registered users. We have not done tests on the upper limit. It is also feasible to not register users in TWiki, e.g. to rely solely on LDAP login.
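To check an existing installation against the per-web page figure mentioned above, a small script can count the topics in each web. This is a sketch that assumes one `<Topic>.txt` file per page under `<data_root>/<Web>/`; `topics_per_web` and `webs_over_limit` are hypothetical helpers, and the demo uses a tiny limit so that it runs on sample data.

```python
import tempfile
from pathlib import Path

PAGE_LIMIT = 20_000  # per-web figure quoted in the list above

def topics_per_web(data_root: Path) -> dict:
    """Count the topic files (*.txt) in each web directory."""
    counts = {}
    for web_dir in data_root.iterdir():
        if web_dir.is_dir():
            counts[web_dir.name] = sum(1 for _ in web_dir.glob("*.txt"))
    return counts

def webs_over_limit(counts: dict, limit: int = PAGE_LIMIT) -> list:
    """Return the sorted names of webs with more topics than the limit."""
    return sorted(web for web, n in counts.items() if n > limit)

# Demo with a temporary directory: 'Main' has 3 topics, 'Sandbox' has 1.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for web, n in (("Main", 3), ("Sandbox", 1)):
        (root / web).mkdir()
        for i in range(n):
            (root / web / f"Topic{i}.txt").write_text("sample", encoding="utf-8")
    counts = topics_per_web(root)
    flagged = webs_over_limit(counts, limit=2)
    print(counts, flagged)
```

A web that comes close to the limit is a candidate for splitting into several webs, since the number of webs itself has little impact.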
As stated above, performance can be addressed with caching and/or load balancing.
The TWiki community is working on a pluggable storage back-end, see TWikiRoadMap.
(This supplemental documentation was originally largely based on Peter Thoeny's blog post on Scalability of TWiki, also posted at Blog.2008-03-26-scalability-of-twiki.)
-- Contributors: PeterThoeny
Comments & Questions about this Supplemental Document Topic
Good document. I like that it focuses on scalability across teams and departments as well as server scalability.
With regard to the first: When trying to run a single TWiki server for multiple departments (each having their own TWiki web) you'll want to decentralize as much of the administration as possible to 'web managers', but you cannot decentralize management of groups. This happens at the Main level and therefore needs Main privileges. This is an obstacle for which I wish there was a solution.
- 25 Jun 2008
Lars, you can decentralize the management of groups, just not with the default TWikiUserMapper. It probably isn't too difficult to write an extended mapper that adds topic based group definitions in other webs, but it will add more slowdowns to the code. Personally I prefer to use a database backed user and groups system, like HTTPDUserAdminContrib (mmm, I'll have to make sure that the released version supports groups..). It would be nice to have one of the UI gurus develop a group management UI for us, but that hasn't happened yet. (When you see this, I'd suggest moving "decentralize management of groups" to a feature request in Codev.)
- 26 Jun 2008