
Dealing with Groups that have a very large number of members


I'm setting up a site that has the intent of migrating people from a mailing list to a wiki.

The way this adds value is that it makes the "archives" of the discussion threads searchable. There is a noticeable amount of "churn" in the list membership, and the same questions keep coming up over and over again. If you're familiar with mailing lists, you'll also know that simply archiving the messages results in bloat for a few reasons:

  • All the e-mail headers, the "key: value" lines that you don't see
  • Trailer lines, long signature blocks, and trailers added by ISPs
  • Poor etiquette: previous messages in the thread remain, nested with lots of "> > >" prefixes.

The last point can also upset any simple search on an archive of the messages. BTDT!
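The cleanup implied by the list above can be sketched in code. The following is a minimal, hypothetical example (not part of any existing archiver) of stripping headers, quoted "> " reply text, and a conventional "-- " signature block from a raw message before it is archived, so that searches only hit the new text:

```python
import email
from email import policy

def clean_message(raw: str) -> str:
    """Strip headers, quoted replies, and the signature block from a raw
    RFC 2822 message, keeping only the new text worth archiving."""
    msg = email.message_from_string(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    kept = []
    for line in text.splitlines():
        # Drop quoted previous messages (lines with "> > >" prefixes).
        if line.lstrip().startswith(">"):
            continue
        # Stop at the conventional signature delimiter, "-- ".
        if line == "-- ":
            break
        kept.append(line)
    return "\n".join(kept).strip()
```

A real archiver would also need to handle multipart MIME and ISP trailers (which have no standard delimiter), but even this much removes the bulk of the bloat.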

The Problem

The basic division is one web per list. The lists are related, and some members are on more than one list. The intent is that members can only contribute to the area corresponding to the list they were on. This impacts the registration process, which I'd like to automate.

The problem is the large number of users on the lists.

One list has over 3,500 members; another over 9,000. One organization which is interested says they have 17,000 members with e-mail addresses on their books who could be potential users of the wiki!

Looking at the code in lib/Access.pm, I'm concerned, very concerned, about the performance of dealing with large groups.

Before I start populating the groups files, can anyone tell me if they have met this kind of situation before? How bad is it?

-- AntonAylward - 05 Nov 2004

I do not have a good answer, but here is something to start with. At my day job we have over 1,000 users and do not see a noticeable performance impact on restricted vs. unrestricted webs.

I guess you will first hit a limit on the number of topics in the Main web. At TWiki.org we have 12K users; there is a noticeable impact on searching the Main web. You can address that by configuring TWiki to use RCS subdirectories (and moving the ,v files into those subdirectories), and possibly by using a filesystem that is fast with large numbers of files (such as ReiserFS).
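The move itself is straightforward, since the standard RCS tools look for ,v history files in an RCS/ subdirectory before the working directory. A hedged sketch (the data path is illustrative; adjust it to your installation, and stop the web server first):

```shell
# Move RCS ,v history files out of a web's data directory into the
# conventional RCS/ subdirectory, which the RCS tools check first.
# /var/twiki/data/Main is an assumed path; substitute your own.
cd /var/twiki/data/Main
mkdir -p RCS
mv -- *,v RCS/
ls RCS | head
```

This halves the number of entries in the web's directory without losing any topic history.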

-- PeterThoeny - 06 Nov 2004

I've long been aware of the O(n²) directory-lookup performance of the original UNIX file system from the V7/SYSIII/SYSV days. The Berkeley FFS of 1983 heritage began to address that problem. HierarchicallyNestedTwikiWebs would be a more sensible approach, if we can ever figure it out!

But my question was really about the code, not about the file system. UNIX file system performance has been improving over the years.

-- AntonAylward - 06 Nov 2004

Topic revision: r4 - 2008-08-24 - TWikiJanitor