
Scalability of TWiki

2008-03-26 - 06:24:31 by PeterThoeny in General
We are sometimes asked how well TWiki scales. This blog post compiles scalability-related information so that you can plan your TWiki deployment effectively.

Scaling Across Teams and Departments

TWiki was designed as an enterprise wiki from its inception, and it has features specifically designed to support large deployments. Other wiki engines have a different focus and may lack some of these features. Wikis typically start as grassroots efforts. Once they are on the radar screen of the CTO/CIO, grassroots wikis often get consolidated into a central TWiki. That is when scalability comes into play. Key scaling features of TWiki:

  • Multiple webs (workspaces):
    • You can create as many webs as you need. Some large TWiki deployments have over 1000 webs. Think of a web like a wiki within TWiki. Each team can get their own wiki. People need to register only once, then they can create content in their own space. If needed you can link across webs, such as to reference a registered user or an entry in the Glossary web.
  • Fine grained access control:
    • You can create TWikiGroups and restrict access to content for view and edit based on those groups. Although it is possible to restrict access on a topic (page) level, it is typically done on a web level for ease of administration.
  • Authentication:
    • In a large deployment it is advisable to authenticate users against your directory server, such as Active Directory or LDAP. That reduces the support workload caused by registration and login questions.
  • File attachments:
    • TWiki has a per-topic namespace for file attachments. That means if one team uploads a file called inventory.xls to their team page, and another team uploads a file of the same name to a different page, the two files will not collide. Try that with MediaWiki or other wikis.
    • You can limit the maximum size of attachments that can be uploaded. This can be done for the whole site and also on a web level.
  • Organize content:
  • Web application platform:
    • Web 2.0 is all about user generated content. TWiki is a web application platform where you can install ready made applications. For example, check out the BlogAddOn, the TWikiDotNetForumAppAddOn and other TWiki extensions in the Plugins web.
  • Create your own applications:
    • TWiki goes beyond Web 2.0: The TWiki platform is about user generated application logic. Your users can create situational applications that solve specific business needs, such as a bug tracker, an employee news portal, TWiki's Support web and more. You do not need to be a programmer; all application logic is done in TML (TWiki Markup Language) using TWiki forms, reports and optionally some HTML and JavaScript.
    • The IT department is in charge of the wiki dial tone and wants to have some control over the wiki deployment. With TWiki you allow users to experiment in a controlled environment. That is, IT can get the dreaded "shadow IT" under control.
  • Integrate:
    • TWiki has a plugin API and ready made plugins to connect to external databases. That way you can run a query in MySQL or another RDBMS and display the result in TWiki pages. This is useful for showing CRM data to sales teams and bug trends to engineering teams. See Extensions:database.
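
The database integration described in the last bullet comes down to running a query and rendering the rows in a page. The following Python sketch illustrates that idea only; it is not the code of any TWiki plugin. It assumes an in-memory SQLite database as a stand-in for MySQL, and the `bugs` table, its columns, its data and the helper `rows_to_tml_table` are hypothetical. The output uses the TML table syntax, where cells are separated by `|` and header cells are wrapped in `*`.

```python
import sqlite3


def rows_to_tml_table(headers, rows):
    """Format query results as a TML table: one header row, then one row per record."""
    lines = ["| " + " | ".join(f"*{h}*" for h in headers) + " |"]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


# Assumption: in-memory SQLite stands in for MySQL; table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bugs (week TEXT, opened INTEGER, closed INTEGER)")
conn.executemany(
    "INSERT INTO bugs VALUES (?, ?, ?)",
    [("2008-W12", 14, 9), ("2008-W13", 11, 15)],
)

cur = conn.execute("SELECT week, opened, closed FROM bugs ORDER BY week")
headers = [column[0] for column in cur.description]
table = rows_to_tml_table(headers, cur.fetchall())
print(table)
```

A real integration would also need to escape any `|` characters inside cell values, which this sketch does not do.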

Server Selection, Caching, Load Balancing

Plan for adequate server hardware when you deploy TWiki. The following are ballpark figures for a sample TWiki deployment serving 1000 employees and 50,000 pages:

  • Enterprise class Linux
  • Dual core CPU 2.6 GHz
  • 2 GB RAM
  • RAID 1 or RAID 5 for redundancy
  • Dual power supply for redundancy
  • Plan disk space:
    • Page content: 15 MB per 1000 pages (yes, MB, not GB)
    • File attachments: 1 GB per 1000 pages
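
Applied to the sample deployment of 50,000 pages, the per-1000-page figures above can be turned into a rough estimate. The sketch below is only an illustration: the function name `estimate_disk_gb` is hypothetical, decimal units (1 GB = 1000 MB) are assumed, and the result inherits the ballpark nature of the input figures.

```python
def estimate_disk_gb(pages, content_mb_per_1000=15, attachments_gb_per_1000=1):
    """Return (page content, attachments, total) in GB for a given page count."""
    thousands = pages / 1000
    content_gb = thousands * content_mb_per_1000 / 1000  # assumption: 1 GB = 1000 MB
    attachments_gb = thousands * attachments_gb_per_1000
    return content_gb, attachments_gb, content_gb + attachments_gb


content, attachments, total = estimate_disk_gb(50_000)
print(f"page content: {content:.2f} GB")
print(f"attachments:  {attachments:.2f} GB")
print(f"total:        {total:.2f} GB")
```

For 50,000 pages this gives about 0.75 GB of page content and 50 GB of attachments, so attachments dominate the disk budget.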

If you have a high read-to-write ratio (such as a TWiki on the public internet), consider a caching solution and/or a load-balanced setup.

  • Load balancing:
    • For high volume traffic sites it is possible to put TWiki on a load balanced setup. Here is an example:
      • Cisco Ace load balancer.
      • 3 webservers.
      • NAS storage back-end.
      • Webservers share data on NAS for pages, file attachments and log files.
    • In the early days, TWiki.org ran on a load-balanced server setup while hosted at SourceForge. Now it runs on a single, aging server. The TWiki community plans to move TWiki.org back to a load-balanced server setup, which will improve performance considerably.
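
Why a high read-to-write ratio favors caching can be shown with a small read-through cache. This Python sketch is a generic illustration under stated assumptions, not TWiki's implementation: the class `PageCache`, the `render` function and the topic name are hypothetical. It also ignores dependencies between pages, such as one page including another, which a real wiki cache has to track.

```python
class PageCache:
    """Read-through cache: render on a miss, serve the stored copy on a hit."""

    def __init__(self, render):
        self._render = render
        self._store = {}
        self.hits = 0
        self.misses = 0

    def view(self, topic):
        if topic in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[topic] = self._render(topic)
        return self._store[topic]

    def save(self, topic):
        # A write invalidates the cached copy so that readers never see stale content.
        self._store.pop(topic, None)


render_calls = []


def render(topic):
    # Stand-in for the expensive step of rendering a wiki page to HTML.
    render_calls.append(topic)
    return f"<html>{topic}</html>"


cache = PageCache(render)
for _ in range(99):
    cache.view("Main.WebHome")   # 1 miss, then 98 hits
cache.save("Main.WebHome")       # one edit invalidates the entry
cache.view("Main.WebHome")       # miss, page is rendered again
print(cache.hits, cache.misses, len(render_calls))
```

With 100 views and a single edit, the page is rendered only twice. The more reads there are per write, the larger the share of requests that never reach the rendering step.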

Scalability of Search

TWiki uses the Unix grep command to search content in real time. This enables flexible and powerful searches over the current content, which is important for TWiki applications. Search is covered in SearchHelp, VarSEARCH, QuerySearch, FormattedSearch and SearchSupplement.

The real time search has a performance impact. Searching all webs on a TWiki site with more than 50,000 pages can be slow. If you have a deployment of that size, it is advisable to index TWiki content with a search engine. This can be done with a commercial search engine such as the Google Search Appliance or with an open source search engine. TWiki currently has three open source search engine integrations: SearchEnginePluceneAddOn, SearchEngineSwishEAddOn and SearchEngineKinoSearchAddOn. See more at Extensions:search.
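
The difference between a real time scan and an indexed search can be sketched in a few lines of Python. This is a simplified illustration, not how grep or any of the add-ons named above works internally. The page names and contents are made up, and the functions `scan_search`, `build_index` and `index_search` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a tiny in-memory stand-in for the topic files of a wiki.
pages = {
    "Main.WebHome": "Welcome to the engineering wiki",
    "Sales.Forecast": "Quarterly sales forecast and inventory",
    "Eng.BugTrends": "Bug trends for the engineering teams",
}


def scan_search(pages, word):
    """grep-style search: read every page on every query."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return sorted(name for name, text in pages.items() if pattern.search(text))


def build_index(pages):
    """Build an inverted index once: word -> set of pages containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(name)
    return index


def index_search(index, word):
    """Indexed search: a single dictionary lookup per query."""
    return sorted(index.get(word.lower(), set()))


index = build_index(pages)
print(scan_search(pages, "engineering"))
print(index_search(index, "engineering"))
```

Both calls return the same pages. The scan does work proportional to the total amount of content on every query but always reflects the current content. The index answers with a lookup but has to be rebuilt or updated when pages change.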

To scale the queries of TWikiForms based TWiki applications look into DBCacheContrib and DBCachePlugin.

Flat File Back-end

Some people express concerns that TWiki's flat file back-end does not scale well. However, we know of a number of large TWiki deployments that have over 300,000 pages (such as at Yahoo), over 1000 webs (such as at a major telco company), and over 10,000 users (such as at a major financial institution in the USA).

A flat file based storage back-end has several advantages:

  • Simple installation
  • Simple backup and restore
  • Simple migration of content between TWiki installations (think of grassroots wiki consolidations, spin-offs and acquisitions of companies)
  • Well understood caching and replication technologies available
  • Resilient to data corruption

Some scaling factors:

  • TWiki scales well on number of webs, e.g. it does not matter much if you have 3 webs or 3000 webs.
  • TWiki has a limit on the number of pages in a web. You will see a performance impact if you have more than 20,000 pages in a single web. This depends on the file system/configuration used, on the bandwidth of your server I/O and on the memory installed.
  • TWiki scales well on the number of registered users. We have not done tests on the upper limit. It is also feasible to not register users in TWiki, e.g. to rely solely on LDAP login.

As stated above, performance can be addressed with caching and/or load balancing.

The TWiki community is working on a pluggable storage back-end, see TWikiRoadMap.

(This post is based on Peter Thoeny's blog post on Scalability of TWiki, also posted as a supplemental document at TWikiScalability.)

Comments

MichaelDaum - 26 Mar 2008:

See also the discussions on the upcoming TWikiCache, a built-in caching infrastructure with dependency tracking. Compared to all other caching solutions this one aims at (1) correctness: never deliver outdated wiki content and (2) transparency: no extra provision is needed by the wiki author to get content cached. The TWikiCache will be part of TWiki-5.0. There are backports available for 4.1.x and 4.2.x as well.


MartinSeibert - 26 Mar 2008:

Very valuable information, Peter. Thank you.


ColasNahaboo - 02 Apr 2008:

Among the performance enhancements that are already available, there are also:

  • Perl accelerators: SpeedyCGI, PersistentPerl, ModPerl
  • using / writing faster skins:
    • less CPU-consuming (few or no %TMPL: and %INCLUDE tags). Do more computation client-side (JavaScript, Ajax, ...)
    • smaller amount of CSS & JavaScript
    • serving (pre-)compressed CSS & JS
    • minimizing the number of HTTP requests to get files (see FirefoxBoosterPlugin)
  • using long expires on images, CSS & JavaScript files
  • offloading these files to servers specialized in serving static files
And there are other solutions in the works, such as Michael's, or Gilmar's work as part of TWikiStandAlone

PeterThoeny - 03 Apr 2008:

Colas, good additional info. Possibly update the supplemental document TWiki.TWikiScalability?


Topic revision: r3 - 2013-12-18 - PeterThoeny
 

Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.