Tags:
database_store1Add my vote for this tag preferences1Add my vote for this tag create new tag
, view all tags

Preferences in a Database

It occurs to me that "Preferences" is one of the things that can be migrated out of the Topic pages and into a database. A simple ndexed file is all that's required, not need for the large footprint of a relational database.

By addressing preferences as a generic, the impact on the code base is simplified. See below in "implementation".

Benefits

While preferences as test in a topic is a conventient format, it is not robust; in fact it is very error prone.

Apart from offering a mechanism that can elimnate many end user typos and making the mechanism more robust by moving access control out of the topic, a datbase consolidates the preferences into one repository, one interface method.

One DB or Many?

In the limiting case there could be a single DB, providing lcoking issues could be addressed. Failng that, a DB for the TWiki, for users, for plugins, for groups, for webs, and one in each web for the topics.

In this latter case, a simple key-value arrangement will suffice.

Classification in a single DB

If the consolidated DB is used, another indexed field will be needed to indicate which class of preferences:

  • User.Usename (e.g.User.AntonAylward)
  • Plugin.Pluginname (e.g. LocalCityTime)
  • Group (e.g. TWikiAdmin)
  • Web (e.g. Web.Codev)

Examples

So we can have:

  • Class = Web.Codev
  • Key = WEBBGCOLOR
  • Value = #FFD7D7

And the access control isues can be addressed wth the likes of:

Implementation

... of the database

The Perl DBI module allows databases to be implemented in a number of ways

  • Flat Files
    • Delimited text - like the /etc/passwd file or a CSV
    • Perl "Data:Dumper" format - which is essentiallya perl data structure text
    • Perl "Storable" format
  • Indexed files
    • Berkely DBM files, ndbm, db, gdbm, sdnm
      • Simple key-value
      • Encapsulated perl data strcutures
      • "Multilevel" using the MLDBM package

  • SQL and Relational
    • MySQL and mSQL
    • DB2
    • Oracle
    • Sybase
    • Xbase (e.g. dBase)
    • Microsoft's ADO

... of the code

There are going to be THREE items of code that are neeeded.

The first is the replacement for TWiki::Prefs (and hoopefully in die course a simplifiction of TWiki::Access)

The second is the web form based interfces that will allow users to modify the preference values. This will be more focused than the present method of editng text. For example, when editing "Web.Codev, Key = ALLOWWEBRENAME" the values can be constrained to users and the groups in the database.

The third is a display function. Some kind of percent-variable expansion is needed to show in topic the value, for backward compatibility. So, in a page listing the web preferences there might be the line

    * Set ALLOWWEBRENAME = %PREFERENCE{Web.%WEB%,"ALLOWWEBRENAME"}%

Other

The current method of dealing with preferences involves slurping up the various preferences files and stepping through them line by line. No, not reading them in line by line.

Compared to this, keeping prefences in a DBM hash file would be an enourmous boost to an execution path that is critical to all operations within Twiki.

-- AntonAylward - 01 May 2003

How about this.

  • Keep preferences as a topic.
  • when the preferences are saved, they are exported through translation/codegen to a perl package ('webpref.pm'). I just saw you mentioned Dumper as a suggestion, so we're thinking about the same thing.
    • just hook the save/store to check for the pref topic name, and call the code gen/dumper.
    • this is done for each web, so that web prefs can over-ride site prefs.
      • a seperate 'webpref.pm' is generated within each web's data dir.
  • 'webpref.pm' for each web is "included" by engine

This solves:

  1. speed problem. there's no need for a database when the perl can be generated directly. why waste time reading/parsing the prefs each time, when they can be 'include'ed directly.
  2. translation from db to edittable file/topic. the topic is the "master copy" and is always parsed into the pref.pm. No reverse mapping is ever done, so there's no need for code which allows in-browser edits of the prefs.
  3. out of sync problem. if there is any assumed duplication of vars, one must be labelled as the master copy. The master copy is the topic wikitext.

-- JonathanCline - 10 May 2003

That's a great idea Jonathan.

Could you explain how a few points would be dealt with.

  1. If its a topic it can be edited. Most people aren't used to editing perl structures. The only way you can make a specific topic non-editbale is to put access control prefs in it, which messes up the perl structure.
  2. I don't understand your 'out of sync' prolblem. With a DB there is just the one copy and its its not in the body of the topic.
  3. What triggers teh conversion when the topic is saved?
  4. What if I change the topic with VI or EMACS
  5. How is parsing a perl data structure in a file faster than reading the specific elements with a DB?
  6. Some Webs have files where there are settings in many topics. TWiki/WebPreferences and TWiki/TWikiPreferences, as well as the preferences in the Plugin control topics that are in the TWiki web.

Ties

As for speed, a hashed DB and 'tie' is actually going to be fast. The point here is that it doens't all have to be read in, parsed and the structure built up. The structure is on the disk anyway.

Have you actually experimented with DB files, hashes and ties? Settings are not "complex data strcutures", they are just key-value pairs. The value is a simple string, never anything with "structure". If we are going to parse them, then a CSV is perfectly adequate - we dont need Dumper. See Chapter 9 of the camel book (3rd edition) for some examples of complex records.

Whatever the database, once the hash is tie'd it behaves just like a hash that has been built up be reading a topic like by line for "set" statements, puling them apart, and putting them in the hash. Only with a tie'd DB_HASH there is no parsing overhead.

(We might need to justify compex records for metadata becuase of forms being ordered and the possiblity in the future of multiple forms, and of course the ordering of attachments.)

I don't recommend reading "man perltie" because it tells you so much its actually confusing. Look at a textbook example of a tie'd database.

  • With a Hash any setting value can be accessed instantly and individually
  • If the persistent store is a Dumper structure, it has to be parsed.
  • If the persisitent store is a DB any one element can be accessed.

If we think of the situation where TWiki::Prefs::initiaizePrefs() walks through Site-level, Web-Level and User level topics, it is called early in in the CGI scripts on the supposition that all the values will be needed. This is not so. Regardless, though, if we have a complete persistent store of the settings out of band, we can still determine access early on in the CGI scripts

As I'm discussing elsewhere, the acccess control settings, if available, can be used early on to avoid additional processing. The way things are now, the topics have to be rendered. The topic might contain "Set DENYTOPICUPDATE = %MAINWEB%.%NASTYGROUP%". To resolve this involves rendering the whole page.

One of the points I'm trying to make is that the present implementation spends a lot of time walking through the topic files line by line search to see if the have "META:" or "&nbsp&nbsp&nbspSet name = value". Compared to this, any of the forms of the databases is faster because its more focused.

Take a look at the actual code in Prefs.pm; talk a walk-through of the calls involved in outputting a page from bin/view. There's a lot that can be simplified. ThIs shows how that scanning each line of the topic can be elimianted.

   sub checkAccessPermissions {
       
      # @settings is tie'd to the appropriate persisitent store .....
      
      my @allowlist = map { getUsersofList ( $_ ) }
                         prvGetUserList( @settings{ "ALLOWTOPIC$theaccesstype" } );
      
      my @denylist = map { getUsersofList ( $_ ) }
                         prvGetUserList( @settings{ "DENYTOPIC$theaccesstype" } );
      
      .....
   }

In actuality the call from Access::checkAccessPermission() -> Store::readWebTopic() -> Store::readFile() has the potential for a bug -- Prefs::checkAccessPermissions() could find a set statement in a topic if it was called without rendered text as a parameter on a topic that was discussing the setting. Having settings in-band is bad news. Having them out of band avoids such potential errors.

As I've shown elsewhere, to view a topic takes 13 topic acceses on my system. Much of this is scanning that topic, the Web Preferences and the site presferecnes for the access control settings. Later on its the plug-ins doing look-ups. The settings for each of these don't need to be ordered so the database interface is simple.

See also:

  • Tie::TextDir
    This lets you read a file in one go.
The folowing example trivializes rendering if there is no metadata in the topic.
       tie( %hash, 'Tie::TextDir', "$dataDir/$web" );
       print &doRenderItAll( $hash{ "$topic" };

Settings aren't like metadata. Some metadata does need to be ordered. But not all metadata has to go in the same part of the store. If we are dealing with attachements we can have a file $web/.attachments.$topic.$attachdbtype. Hey, we can even have the different types of stores handled differently. The attachdbtype couudl be CSV, HASH or whatever. Personally I think the added complexity overcmes the benefits of sperating out the stores.

And don't forget that even with a DB_HASH tie, the payload can be a list or array.

-- AntonAylward - 10 May 2003

Whew. I guess you've given it serious thought. Here's what I meant, simple idea, really. Of course, I forgot about the user prefs & arbitrary vars in any topic.

  1. admin edits TWikiPreferences topic (like normal, i.e. same as it works now)
    • note: I've felt for a while that all the "helpful" verbiage in TWikiPreferences has got to be cleaned out. Just list the variables! If the admin-person wants help, s/he can pop up a browser window on the_seperate topic_ which details each variable.
  2. user makes changes in text box (like currently)
  3. user hits Preview
  4. twiki shows preview
  5. user hits save
  6. twiki sees the topic being saved is TWIKIWEB.$wikiPreferences and:
    1. saves the topic as normal (to TWikiPreferences.txt)
    2. iterates over each Web
      1. reads/parses the TWikiPreferences.txt settings
      2. reads/parses the Web's WebPrefereces.txt
      3. dumps resultant vars into a webprefs.pm file unique to each web

The same type of procedure would happen when editting WebPreferences (which updates the webprefs.pm), or when a user edits their home topic (which updates a userprefs.pm).

Then, during normal operation (view, etc) in a web, twiki includes:

  • the webprefs.pm file
  • the userprefs.pm file (overrides the former?)

There's only one data conversion performed: topic raw text -> perl for inclusion, and that time is spent right after the topic save (when the user can wait). The data is included via the package. Using any DB to parse/read in values from an arbitrary db file can't be much faster than include?

Note, this has nothing to do with metadata, just pref variables.

And, for variables set in arbitrary topics or in plugin areas, I dunno how that would be handled. In my mind that would be kept the same as present and handled on-the-fly.

I believe the main requirement which needs satisfying is this: (which you've addressed in several of your points)

  • preferences should be easily editable both within twiki (in browser text box) and from without twiki (in vi at the shell)
    • as such, there would have to be a command line method of doing the dumping step, in my method, if any prefs were modified outside twiki.

-- JonathanCline - 11 May 2003

I see what you're doing. My objection is that this means code to recognize "special cases". Such "yes-but" code is, to say the least, inellegant. It also represents a maintianance threat since it "bolts-in" certain things. To give you an example, a project I audited recently had the developers hard code IP addresses into the client applciation code for the server to use rather than use DNS.

I seem to be working out a scheme whereby such special cases are not needed. However I'm limited by time and attention span and the need to experiment. Its all very well experimenting with code fragments - its trying it out in a "live" TWiki that takes time. The present code base is not well structured and could do with more partitioning and use of such things as "inheritence" to simplity the interfaces. Some things, such as the handling of "verbatim", are very obscure. Part of this is that it deosn't build a "tree" of the body of the topic.

I'll refator this later, but what I mean is that this is like the old ROFF processors. It sees a trigger and then "just does it". By analogy with parsing HTML: the stream processor sees a tag and calls a switch statement which sees its <FONT ...> and so calls the fontHandler() which just emits the code to change fonts. By contrast, an XML processor builds the tree in memory. Perhaps we need to revisit the XML ideas abuot TWiki to simplify internals before moving on.

You are right about the tools. That I/We use VIM is a cheat. I could equally well use VIM on a CSV file or a small perl script on a DB_File. I'm a UNIX old-timer (nealy 25 years) and consider VI to be the best sysadmin tool around.

The solution I'm heading towards, when I have time to convert my notes to code, is to display the values in the preferences file. The syntax wuld be something like

    <-- TWikiPreferences -->
    * Set HTTP_EQUIV_ON_VIEW = %SET{ scope="site" key="HTTP_EQUIV_ON_VIEW"   }%
or
    <-- WebPreferences -->
    %SET{ scope="web" key="WEBBGCOLOR" format="   * Set $key = $value"  }%

or even better if you just want the seetings with no comment

    %SET{ scope="web" key="WEB*" regex="on" format="   * Set $key = $value"  }%

If my notes are right, writing a skin and plugin to edit the settings is no big deal. See Peter Nicon's MegaTWiki for an example of that. Actually, given the %SET{}% construct it should be possible to build a form or a series of select/option bits in a topic. This is, of course, assuming you don't want to do it in JavaScript.

Just to re-iterate for any other readers who haven't twigged yet:

  • There is no need for a relational database - this is simple key-value stuff
  • Perl has had the mapping of the a tie-d has to a BD file for a long time. Its a reliable and well used technique. I recall using it in 1989.
  • Settings are functionally orthogonal to metadata and the topic content
    • This means they can be stored and manipulated seperately
    • That they are not at present is a side effect of the current implementation, and can have very -ve consequences
  • Performance analysis of bin/view indicates it takes about 25% of its time (never mind disk access latency) in extracting the setting values from a few files.
  • The start-up time for an application using tie'd DB_HASH for persistent storage is independant of the nummber of values stored.
    • By contrast, the start-up time for the present implementation depends on the number of lines in the files, whether they have a setting on them ore not.

-- AntonAylward - 11 May 2003

Please note that the alpha release (CVS ...) has the ability to edit settings using TWiki forms - see UsingFormsForSettings.

-- JohnTalintyre - 28 May 2003

I've attached an optimized version of Prefs.pm . Please note this file is probably not usable right now for your environment as it needs some further cleaning. When used with mod_perl, it does not save the prefs, there is a little patch required for view.pl/render.pl . Stay tuned ...

-- MichaelRausch - 04 Jun 2003

How can you not save for mod_perl? With mod_perl there will be multiple processes with independent memory spaces - are you moving preferences to shared memory or ...

-- JohnTalintyre - 04 Jun 2003

There is initialize_prefs(), which is called from TWiki::initialize() on startup of every script. The preferences are parsed not only during this initialization, but also later on during initialization for the plugins, and for example when %VAR% variables are expanded. At no time during all these calls to Prefs.pm I can be sure that there is no further call. Right now I'm using the END() destructor for Prefs.pm to store the hashes when the script finishes (if something changed), but this is not true for mod_perl, because the module/package never finishes during normal use. So I need a hook somewhere when there is no further modification to the preferences hash.

-- MichaelRausch - 05 Jun 2003

Did you do any more on this Michael? Thanks.

-- MartinCleaver - 12 Dec 2003

I'm interested in any progress on this as well. Due to performance issues with Cairo I started poking in the code the other day and toyed with some Prefs caching. Basically added a very crude hack in TWiki::Prefs::TopicPrefs->readTopic() to cache pref and metadata key/values to a consolidated text file. Subsequent page loads seemed a fraction faster (unreliable bench via ab), but the "cache" obviously had to be read in sequentially.

-- RobertEdeker - 03 Sep 2004

I did some experimentation on this, using Storable for the cache. Didn't seem worth the effort, even using core benchmarks (more accurate than ab)

I personally think this is a blind avenue, unless you envisage a huge number of prefs. Expanding the prefs takes far more time than reading them; repeated application of s/// for each variable over the whole of the topic text kicks seven kinds of heck out of the performance. Of course, I can be proven wrong.

-- CrawfordCurrie - 03 Sep 2004


Related topics: DatabaseForPreferences PrefsDotPm AddValuesToPreferenceVariable MetadataSucks MetaDataSucksMeasurement StepByStepRenderingOrder DatabaseForAccessControl SimplifyInternalMetaDataHandling MetaDataDefinition DatabaseForMETA MegaTWiki

Topic attachments
I Attachment History Action Size Date Who Comment
Texttxt Prefs.pm.txt r1 manage 10.2 K 2003-06-04 - 14:25 MichaelRausch Optimized Prefs.pm; right now not usable as is!
Edit | Attach | Watch | Print version | History: r18 < r17 < r16 < r15 < r14 | Backlinks | Raw View | Raw edit | More topic actions
Topic revision: r18 - 2008-09-17 - TWikiJanitor
 
  • Learn about TWiki  
  • Download TWiki
This site is powered by the TWiki collaboration platform Powered by Perl Hosted by OICcam.com Ideas, requests, problems regarding TWiki? Send feedback. Ask community in the support forum.
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.