Feature Proposal: Plugins need a working/temp file cleanup mechanism

Motivation

Plugins can (and do) easily create working files, but there is no cleanup mechanism. This is a serious headache for site admins. See the May 2011 discussion under http://twiki.org/cgi-bin/view/Plugins/GaugePluginDev that brought this to a head.

Description and Documentation

Implement a plugin callback from tick_twiki. This puts policy in the hands of the plugin, encourages plugin writers to think about the issue, and relieves the burden on administrators.

Extend tick_twiki to plugins. Something like:

# Run plugin garbage collectors

use warnings;
use strict;

foreach my $plugin ( @{$twiki->{plugins}{plugins}} ) {
    next if( $plugin->{disabled} );

    local $TWiki::Plugins::SESSION = $twiki;

    my $cleanup = $plugin->{module} . '::pluginCleanup';

    if( defined( &$cleanup ) ) {
        no strict 'refs';    # $cleanup holds a sub name, not a code ref
        &$cleanup( $twiki, $now );
    }
}

where plugins can have a handler somewhat like this proposed default:

# Garbage collection function called by the tick_twiki.pl cron script.
# You should clean up any old/orphaned files that your plugin creates
# in your working area (or elsewhere).

sub DISABLED_pluginCleanup
{
    my( $twiki, $now ) = @_;

    my $wa = TWiki::Func::getWorkArea($pluginName);
    my $oldest = $now - (24*60*60);  # One day - adjust to suit or use a config variable (e.g. $TWiki::cfg{$pluginName}{MaxAge})

    # Age-based cleanup

    foreach my $wf ( glob( "$wa/*" ) ) {
      unlink $wf if( (stat( $wf ))[9] < $oldest );
    }

    # Orphan cleanup.  Suppose your working files are "$web_$topic_data"
    # but .conf files are never to be deleted:

    my $webRE = TWiki::Func::getRegularExpression('webNameRegex');
    my $topicRE = TWiki::Func::getRegularExpression('wikiWordRegex');

    foreach my $wf ( glob( "$wa/*" ) ) {

      # glob returns full paths, so anchor the match after the work area.
      # The extension is optional; $type is undef when there is none.
      my( $web, $topic, $type ) = $wf =~ m!^\Q$wa\E/($webRE)_($topicRE)_.*?(?:\.([^./]*))?$!;

      next unless( defined $web && defined $topic );
      next if( defined $type && $type eq 'conf' );

      unlink $wf unless( TWiki::Func::topicExists( $web, $topic ) );
    }
}

Here's a working example for the VarCachePlugin (after my recent patches):

# =========================
sub pluginCleanup
{
    my( $twiki, $now ) = @_;

    my $webRE = TWiki::Func::getRegularExpression('webNameRegex');
    my $topicRE = TWiki::Func::getRegularExpression('wikiWordRegex');

    my $wa = TWiki::Func::getWorkArea($pluginName);

    foreach my $wf ( glob( "$wa/*/*_cache.{head,txt}" ) ) {

      my( $web, $topic, $type ) = $wf =~ m!^$wa/($webRE)/($topicRE)_cache\.(head|txt)$!;

      next unless( defined $web && defined $topic );

      unlink( $wf ) unless( TWiki::Func::topicExists( $web, $topic ) )
    }

}
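
The dispatch step can also be tried outside TWiki. The following is a minimal sketch, not the tick_twiki code itself: the Demo::Plugins packages, the runPluginCleanups helper and the timestamp are made up for illustration, and the session is just an empty hash. It only shows the idea of looking up an optional pluginCleanup function by name and calling it for each enabled plugin; it does not set $TWiki::Plugins::SESSION as the real loop does.

```perl
use strict;
use warnings;

# Two stand-in plugin packages (hypothetical names). Only the first
# defines the optional pluginCleanup callback.
package Demo::Plugins::WithCleanup;
our @calls;
sub pluginCleanup {
    my ( $session, $now ) = @_;
    push @calls, $now;
    return;
}

package Demo::Plugins::WithoutCleanup;
sub initPlugin { return 1 }

package main;

# Call Module::pluginCleanup( $session, $now ) for every enabled plugin
# that defines it. Returns the names of the modules that were called.
sub runPluginCleanups {
    my ( $session, $now, @plugins ) = @_;
    my @called;
    foreach my $plugin (@plugins) {
        next if $plugin->{disabled};

        my $name = $plugin->{module} . '::pluginCleanup';

        # Resolve the sub name to a code ref, or undef if not defined.
        my $cleanup = do {
            no strict 'refs';
            defined &$name ? \&$name : undef;
        };
        next unless $cleanup;

        # One failing plugin should not stop the others.
        eval { $cleanup->( $session, $now ); 1 }
          or warn "$name failed: $@";
        push @called, $plugin->{module};
    }
    return @called;
}

my @plugins = (
    { module => 'Demo::Plugins::WithCleanup' },
    { module => 'Demo::Plugins::WithoutCleanup' },
    { module => 'Demo::Plugins::WithCleanup', disabled => 1 },
);

# 1306154285 is an arbitrary epoch value standing in for "now".
my @called = runPluginCleanups( {}, 1306154285, @plugins );
print "called: @called\n";
```

Wrapping each call in eval is a choice made for this sketch so that one misbehaving plugin cannot abort the whole run; the loop in the proposal does not do this.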

This is the corresponding Foswiki work item.

Examples

Impact

WhatDoesItAffect: API, Documentation, Plugins, Usability

Implementation

I'm not exactly sure how best to iterate over installed plugins, but the essence of the idea is coded in the GaugePluginDev discussion referenced above.

This should be a minor coding effort with a major payback for site administrators.

-- Contributors: TimotheLitt - 2011-05-23

Discussion

I had a few minutes and prototyped this concept using TWiki 4.2.3 and my recent VarCachePlugin patches. It seems quite workable - the loose ends are whether I picked the best way for tick_twiki to iterate over the plugins and whether I have to check for disabled plugins. The code is in the discussion referenced above.

I think this effort demonstrates that the approach is viable.

I hope someone picks it up and integrates it.

-- TimotheLitt - 2011-05-23

Good idea, there are plugins that produce working data that should be cleaned up. And it is logical to delegate this task to the plugin itself.

I copied your example code from GaugePluginDev to the proposal above.

Instead of the callback name pluginCleanup I suggest naming it cleanupWorkareaHandler or periodicCleanupHandler or the like. This brings it in line with other callback handlers, and the name suggests what it is supposed to do.

As for implementation, I recommend studying and using the code of other callbacks, such as beforeEditHandler in TWiki::UI::Edit. That way you do not need to worry about making sure that all enabled plugins that have this function are called.

-- PeterThoeny - 2011-05-24

Thanks for the feedback.

There was a reason for not using "Handler" in the callback name and for not using the core plugin callbacks.

This is not a handler used by webserver functions (rendering, save, attachment, etc.). I did not want to burden every request with the overhead (beyond the unavoidable compilation of an unused function) of another entry in the registered handler mechanism. I also did not want to modify the core to add these entries, as this makes much more work for a code merge.

It turned out to be simple to do the enabled plugin scan from tick_twiki, which already had a session object. And this puts the cost only where it's used.

I am not wedded to the current name, but I wanted a name that made it clear that this is NOT the same family of handlers used by the webserver runtime. initPlugin already existed, so I went for pluginCleanup. I thought 'plugin' was close enough to a reserved word that no accidents would happen with existing plugins...

I kind of like periodicCleanup (no Handler), but that doesn't fit either model.

Also, see the similar discussion on Foswiki - it would be good to have compatibility.

A suggestion there was that there are other periodic functions that this mechanism can be used for. I will update the prototype so that the function is called with "Time since last call" as well as "now".
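
As a rough sketch of what passing "time since last call" could look like (the argument order, the makePeriodicCaller helper and the sample timestamps are assumptions for illustration, not the prototype's actual interface):

```perl
use strict;
use warnings;

# Hypothetical helper: wraps a callback so that each call receives
# ( $session, $now, $elapsed ), where $elapsed is the number of seconds
# since the previous call, or undef on the first call.
sub makePeriodicCaller {
    my ($callback) = @_;
    my $last;
    return sub {
        my ( $session, $now ) = @_;
        my $elapsed = defined $last ? $now - $last : undef;
        $last = $now;
        return $callback->( $session, $now, $elapsed );
    };
}

my @seen;
my $tick = makePeriodicCaller(
    sub {
        my ( $session, $now, $elapsed ) = @_;
        push @seen, defined $elapsed ? $elapsed : 'first';
        return;
    }
);

# Arbitrary epoch values standing in for three successive ticks.
$tick->( {}, 1000 );
$tick->( {}, 1600 );
$tick->( {}, 1660 );
print "@seen\n";    # first 600 60
```

Keeping the previous timestamp in a closure only works in a long-running process. If tick_twiki is started fresh by cron each time, the last-run time would have to be stored somewhere persistent instead.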

It will be a bit awkward to negotiate this in both forums; stay tuned...

-- TimotheLitt - 2011-05-24

I have a working TWiki prototype. See it, the discussion and documentation in http://foswiki.org/Development/PluginGarbageCollection.

I encourage you to give it a whirl and participate. I'm not reposting everything here; it's too much of a hassle to keep in sync. They have to run TWiki to play with it; you have to visit their site to get it. Seems fair to me.

I think it's well on its way to being useful - see what you think. The time to participate is before they start porting to Foswiki and imposing their styles...

I posted a sample log showing startup and multiple schedules/arguments of a task. Code is on Foswiki, but I thought I'd post the log here as well so you have a taste. The error is expected - there is a test that intentionally tries to delete a task belonging to another namespace.

Periodic Task(I)Wed May 25 08:38:05 2011: Schedule::Cron - Starting job 0 with ('initWiki','none',{'p' => '/var/www/servers/twiki/working/tick_daemon.pid','d' => 1},bless( {...}
Periodic Task(I)Wed May 25 08:38:06 2011: AddTask: 0-59/2 * * * * 30 TWiki::Plugins::PeriodicTestPlugin::cronTask1( 1,4,19 )
Periodic Task(I)Wed May 25 08:38:06 2011: AddTask: 15 8-17/2 * * 1-5 TWiki::Plugins::PeriodicTestPlugin::Mail( runmail,Mailer.Log )
Periodic Task(I)Wed May 25 08:38:06 2011: AddTask: 18 20 * Jul-Sep Sun,Sat TWiki::Plugins::PeriodicTestPlugin::News( runnews,News.Log )
Periodic Task(I)Wed May 25 08:38:06 2011: initWiki
Periodic Task(I)Wed May 25 08:38:06 2011: AddTask: 0-59/2 * * * * 30 TWiki::Periodic::TickTock( HASH(0x87ed2dc) )
Periodic Task[24320]: Event queue listing
Periodic Task[24320]: 0-59/2 * * * * 30 Next: Wed May 25 08:38:30 2011  - TWiki::Plugins::PeriodicTestPlugin::cronTask1 (session, 1, 4, 19)
Periodic Task[24320]: 15 8-17/2 * * 1-5 Next: Wed May 25 10:15:00 2011  - TWiki::Plugins::PeriodicTestPlugin::Mail (session, runmail, Mailer.Log)
Periodic Task[24320]: 18 20 * Jul-Sep Sun,Sat Next: Sat Jul  2 20:18:00 2011  - TWiki::Plugins::PeriodicTestPlugin::News (session, runnews, News.Log)
Periodic Task[24320]: 0-59/2 * * * * 30 Next: Wed May 25 08:38:30 2011  - TWiki::Periodic::TickTock (session, HASH(0x87ed2dc))
Periodic Task[24320]: End of event queue
Periodic Task(I)Wed May 25 08:38:06 2011: initWiki finished successfully
Periodic Task(I)Wed May 25 08:38:06 2011: Schedule::Cron - Finished job 0
Periodic Task(I)Wed May 25 08:38:30 2011: Schedule::Cron - Starting job 0 with ('TWiki::Plugins::PeriodicTestPlugin::cronTask1',bless( {...}
Periodic Task(I)Wed May 25 08:38:30 2011: TWiki::Plugins::PeriodicTestPlugin::cronTask1 finished successfully
Periodic Task(I)Wed May 25 08:38:30 2011: Schedule::Cron - Finished job 0
Periodic Task(I)Wed May 25 08:38:30 2011: Schedule::Cron - Starting job 3 with ('TWiki::Periodic::TickTock',bless( {...}
Periodic Task(I)Wed May 25 08:38:30 2011: Expire sessions
Periodic Task(I)Wed May 25 08:38:30 2011: Expire leases
Periodic Task(I)Wed May 25 08:38:30 2011: Cleanup plugins
Periodic Task(I)Wed May 25 08:38:30 2011: ReplaceSchedule: New schedule for TWiki::Plugins::PeriodicTestPlugin::cronTask1: 0-59 * * * * 10
Periodic Task(E)Wed May 25 08:38:30 2011: Schedule::Cron - Error within job 3: delete at /var/www/servers/twiki/lib/TWiki/Plugins/PeriodicTestPlugin.pm line 152.

Periodic Task(W)Wed May 25 08:38:30 2011: TWiki::Periodic::TickTock exited with status 1
Periodic Task(I)Wed May 25 08:38:30 2011: Schedule::Cron - Finished job 3

Enjoy.

-- TimotheLitt - 2011-05-25

Timothe, I invite you to drive this feature proposal, including committing as a developer, coding, creating the docs, testing, and checking the feature into SVN trunk so that we have it in the soon-to-be-released TWiki-5.1.

-- PeterThoeny - 2011-05-25

A 'problem' that needs to be addressed is how to decide which images to remove. The above suggestion is to delete "old" (24+ hours) Plugin:GaugePlugin image files, but I could imagine a situation where that is the wrong thing to do. For example, say I have a topic page that is used to pre-create a bunch of gauges. I have a cron job that runs once a week, tickling this page to pre-create all of my gauges. I then have numerous other topic pages that reference these pre-created gauges.

So a per-plugin preference variable to define this timeout would be needed instead of hard-coding any value. We would probably also need some way for admins to define a filtering preference whose value is a regular expression defining which files to clean up (or which files to ignore). So maybe two preferences: one for which files to include/clean and a second for which files to exclude, e.g. include_filter and exclude_filter.
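
To make the idea concrete, here is a minimal sketch of how a plugin might combine a configurable age limit with include and exclude filters. Everything in it is hypothetical: the selectExpiredFiles helper, its option names, and the sample file names are for illustration only, and a temporary directory stands in for the plugin's work area. It only selects candidates; the caller would decide whether to unlink them.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the files in $dir that are eligible for
# deletion. %opt keys (illustrative names, not TWiki settings):
#   now     - current time in epoch seconds
#   max_age - minimum age in seconds before a file may be removed
#   include - optional regex; only matching file names are considered
#   exclude - optional regex; matching file names are always kept
sub selectExpiredFiles {
    my ( $dir, %opt ) = @_;
    my $oldest = $opt{now} - $opt{max_age};
    my @expired;

    opendir( my $dh, $dir ) or die "Cannot open $dir: $!";
    foreach my $name ( sort readdir $dh ) {
        my $path = "$dir/$name";
        next unless -f $path;
        next if defined $opt{include} && $name !~ $opt{include};
        next if defined $opt{exclude} && $name =~ $opt{exclude};

        my $mtime = ( stat $path )[9];
        push @expired, $path if defined $mtime && $mtime < $oldest;
    }
    closedir $dh;
    return @expired;
}

# Demo: create files of various ages in a scratch directory.
my $dir = tempdir( CLEANUP => 1 );
my $now = time;
my %age = (
    'weekly_gauge.png' => 3 * 86400,    # old, but protected by exclude
    'old_gauge.png'    => 3 * 86400,    # old and unprotected
    'new_gauge.png'    => 60,           # too recent
    'old.conf'         => 3 * 86400,    # not matched by include
);
foreach my $name ( keys %age ) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
    my $t = $now - $age{$name};
    utime $t, $t, "$dir/$name" or die "Cannot set mtime on $name: $!";
}

my @expired = selectExpiredFiles(
    $dir,
    now     => $now,
    max_age => 86400,
    include => qr/\.png$/,
    exclude => qr/^weekly_/,
);
print "$_\n" for @expired;    # only the path ending in old_gauge.png
```

Returning the list instead of deleting directly makes it easy to log or dry-run the selection before anything is removed.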

-- TaitCyrus - 2011-05-25

Tait -

Everything that you requested is already possible in the proposal, which is a framework for plugins and extensions needing periodic tasks. File deletion is one such task.

You can either garbage collect on the system tick schedule (which is adjustable in configure), or you can specify any schedule that crontab can express for your plugin. Or you can have as many schedules and/or tasks as you like.

The sample plugin routine is just that - a sample. It's the plugin author's responsibility to understand what makes sense and what filters, age limits, file locks or other actions (like sending mail) are appropriate. You then code your task(s) in the plugin. As the example shows, they can be quite simple.

On the other hand, this can provide scheduling for mail and news notify, or database maintenance - which are very complex.

I do not want to provide a generic routine for file deletion in the framework. There is too much variety in how and where plugins store persistent data. Maintaining a generic routine would become an endless headache. It's better to distribute the problem to the code (and people) of each consumer.

-- TimotheLitt - 2011-05-26

Peter,

I will finish the code, document and test against the version of TWiki that I run. I will provide a tarball of the changes.

I will not check-in to SVN (you can remove my checkin access, I've never used it.) I don't expect merging into trunk or other releases will be difficult - I've gone to considerable effort to code this independent of the core. At present, tick_twiki.pl is completely replaced, and there's a patch to TWiki.spec. And a small number of new files.

Checking-in to SVN has an enormous overhead for the occasional user - as I've said in the past, too much for me. It's relatively easy for you, because you do it daily and have for years.

So my strategy is to do what I do best - create things that are useful and easy to merge. And let you, Sopan and others do what you have the expertise to do - handle the checkin overhead.

I appreciate your support.

-- TimotheLitt - 2011-05-26

I have a much closer to finished prototype available, as well as a good start on a real documentation page.

The change to TWiki.spec has been reverted, but a minor patch to a configure module is required.

This is snapshot 2.0-005. Although the code is TWiki-based, I posted on the Foswiki site.

Here are the links:

Documentation page: http://foswiki.org/Development/PeriodicTasks

Discussion page: http://foswiki.org/Development/PluginGarbageCollection

Code snapshot: http://foswiki.org/pub/Development/PluginGarbageCollection/periodic.tgz

The documentation page is worth reading - it has both admin and developer information, including install instructions for the current snapshot. There are screenshots of the configure interface.

Everything documented is running and has passed basic unit tests (manual, I'm afraid.)

Everything is documented - well, except for a patch to Schedule::Cron, but hopefully the owner will keep his promise to merge and release a kit late this week.

I think it's a pretty solid design that provides a framework that many can build on. It might evolve a little more, but I hope to hear that it meets the known requirements.

It's become more than my original simple proposal, but it opens the door to a lot of off-line processing in a persistent environment.

Enjoy.

-- TimotheLitt - 2011-05-31

Topic revision: r10 - 2011-05-31 - TimotheLitt
 