archive_me1Add my vote for this tag dakar1Add my vote for this tag create new tag
, view all tags

Bug: Dakar merge model problems

Dakar introduced a new model to avoid conflicts during concurrent topic edits: merge the resultant files and ask the users to resolve the conflict after the fact.

Note that the merge philosophy inherently has to rely on that in the editing mode you can see how the two versions conflict. This is relatively easy in the white board application, but impossible, for example, with TWikiForms:

  • You cannot see the difference between two different forms in the merged version when in edit mode.
  • Even if the form is the same, many input fields cannot show the changes (radio, checkbox, select).

As a consequence, with respect to the form (and any other kind of metadata that applications may use) we now have the situation that concurrent editing tramples on each other's data without possibility of easy recovery.

Merging only works (well) for the whiteboard application. Any TWikiApplication that is not whiteboard based (even the form application) is severly hampered by the new model.

At minimum, each of the applications will have to come up with its own merge model and an implementation. Not only does this put an enormous burden on the application developers, also we have no mechanism at this point of adding the code that would do the merging into the existing merge code dynamically.

I propose that the merge functionality is a configuration option and that users can decide to stick with the concurrent lockout strategy.

This is unfortunate, as the new model is a great idea for whiteboard applications. I just cannot see how this can work with any other TWikiApplication. (Note that this problem is the same in other areas of concurrent work. Whenever what you see is different from what the comparisons performed on, you have a problem. Compare the maturity of merging for program code with that for graphical modeling tools and the huge effort CASE vendors are currently putting into the capability of comparing and merging graphical models. Such features are just coming out... in the UML space compare Rational Rose with, iLogix Rhapsody or Telelogic Tau, for example.)

Test case


TWiki version: Dakar
TWiki plugins: DefaultPlugin, EmptyPlugin, InterwikiPlugin
Server OS:  
Web server:  
Perl version:  
Client OS:  
Web Browser:  

-- ThomasWeigert - 08 May 2005

Follow up

Another option is to fully implement the original proposal and redirect to a preview page if a merge would be required. The preview should show the differences in the form between the two versions, and support selecting which rev to keep. I didn;t do this because it's obviously a great deal of work. frown

-- CrawfordCurrie - 09 May 2005

also a great deal of work, but maybe a good idea as it's wanted for other reasons, would be to once and for all split metadata and topics into different containers (files, directories, db tables, whatever). This way the new model could be used on %TEXT%, where it is effective, and not on forms where it is not.

-- MattWilkie - 10 May 2005

Not sure how splitting up helps here... the new model relies on that you make it easy for the user to resolve conflicts. To do so, you need specific merge routines for each type of data and a way of showing the diffs later. Whether storage is combined or split does not change this.

Even a scheme where you merge as long as text is changed, but do lockout once you change metadata can be done independent of the storage model....

-- ThomasWeigert - 10 May 2005

oh, I see I misunderstood. Consider my comment retracted (and you can do so once you've read it.)

-- MattWilkie - 10 May 2005

The more I think about this, the more I think it's a question of usability for the end user. The whole forms system is designed to be as "obvious" as possible; for example, changing the form type does not reset all the fields; as much data as possible is retained.

The basic principles I used in coding up the mergeing algorithm were:

  1. If it's possible to merge without using conflict markers, do so.
  2. If it's possible to merge using conflict markers, do so.
  3. If it's not possible to merge, then the most recent checkin wins.
I tried the simplistic approach with forms, wanting to avoid loading the form definition. I think correcting that error, and making the merge algorithm sensitive to the type of form fields, is the correct approach, and a better approach than reverting to the locking approach (which was a PITA, IMHO). The key of the debate is whether or not to treat all form fields as non-mergeable or not.

How about another attribute in the form definition; if the "C" attribute is set (you already nicked "M" frown ) then the field can be merged using conflict markers. Otherwise the most recent field value is always taken. (I suppose you could have C+ for the most recent and C- for the least recent, but I suspect that would just confuse users).

  • If the form changes, choose the most recent version
  • Only merge form fields if the "C" attribute is set in the definition of the chosen form
How about it?

-- CrawfordCurrie - 11 May 2005

The first issue is that I don't see how you can merge radio, checkbox, and menu fields. The only option that I can see is that in a merged version you would render them as text box input in edit so that the user can resolve the problem.

If we cannot resolve that, I think we cannot merge forms.

I don't think the behavior in the face of conflict is a setting for the form. It is general behavior of the system. In other words, I don't think your "C" attribute is a solution.

But more importantly, forms are only one kind of meta data. What about all the application-specific metadata the merge algorithm knows nothing about? The only way of dealing with that would be to allow application-specific merge code to be added (e.g., another plugin hook).

I think the "lock on conflict" option was much simpler (KISS) and hardly ever created a problem in real life.

-- ThomasWeigert - 11 May 2005

Crawford, I did some further experiments.

Please let me apologize first. I realize that a lot of energy and good work has gone into the merge feature.

However, I am afraid that in the current state I could not deploy this feature.

Here are a view of my observations. These were all obtained with the following scenario:

  1. User A begins to edit
  2. User B begins to edit
  3. User B saves
  4. User A saves

The following I consider real problems:

  • When two users edit a topic with plugins such as EditTablePlugin, the version by user A wins, overwriting what user B had written earlier, without even knowing that this is overwritten. There is no indication that such had happened.
  • When two users edit a topic that has a work flow attached in the style of WorkFlowAddOn, user B can put the topic in a state that prevents user B from saving (if the topic is put into the terminal state of the workflow). It can also happen that the edits that both users make are inconsistent but the workflow ends up in a terminal state and this cannot be corrected any more, as no more edits are possible. There is no indication that this has even happened.
  • No application specific metadata is merged. The later version always wins. No indication that something went astray.
  • There is no way of telling change markers from real text which by accident uses the change marker notation.
  • User A may change the form and provide new values, but user B overwrites this with the original (or another) form.
  • One cannot tell the merge conflicts during edit with radio, checkbox, and select fields.

ALERT! I strongly feel that this feature is not ready for prime time. I suggest we pull it out of Dakar until we all agree on the behavior and have a working implementation. Maybe Edinburgh would be a better target?

At a minimum, we need to make this feature an option. I am ok with having a config flag that determines which model is utilized (lockout vs. merge). It is acceptable for the lockout to always release the lock right after save (to ensure we only need one edit template).

Further, when a plugin locks a topic, this must lock the topic, rather than being a no-op call.

Finally, if we implement the merge, we need to give every plugin a chance to implement a custom merge for its own data. Note that this may still be very hard for the plugin, depending on how it edits its data.

-- ThomasWeigert - 12 May 2005

On the other hand, I feel strongly that the merge capability is the most significant advance made in DakarRelease, and is a powerful addition to the arsenal. Look at TWiki.org, where CommentPlugin is frequently used in a "forum-like" mode to collect inputs. The inputs never conflict. Why should a merge be blocked? Similarly, I frequently need saves to never fail - for example, the MailInContrib should not have to deal with locks. And I should be able to attach a file to a topic someone else is editing, without the frustration of having to trust them to remember to cancel their edit. The locking model really gets in the way in these scenarios.

Perhaps what's really needed is a middle road. The save script can always "bounce" the save back to the browser, and display the differences with radio buttons to allow the user to select which version to use. This process generates a new "second version" that can be safely "force saved" (unless, of course, another user has saved in the interim).

During the merge process, the merge algorithm builds a set of "differences". These differences could be presented to plugins via a simple API. Any plugin that can resolve a difference should do so. Any differences left after the plugins have processed the differences would be merged according to the algorithm described above. Thus a ManualMergePlugin could present the differences to the user for resolution. I describe a useable handler below.

So I propose the following "way ahead"

  1. Disable merging of non-text form fields (radio, select, checkbuttons),
  2. Add the handler described below
  3. Reconstitute the locking model, with release-lock-on-save as the default behaviour, selectable on the configuration option described below.
  4. Target the ManualMergePlugin at EdinburghRelease (though I peresonally doubt I have the energy left to do it, so it will probably be someone else)

The new plugins handler would be as follows:

sub mergeTextHandler( \@differences, \%info )

Try to resolve differences encountered during merge. The differences array is an array of hash references, where each hash contains the following fields:
  • old => text from version currently saved
  • new => text from version being saved
  • diff => one of the characters '+', '-', 'c' or ' '.
    • '+' - new contains the text inserted in the new version
    • '-' - old contains text deleted from the old version
    • 'c' - old contains text from the old version, and new text from the version being saved
    • ' ' - new contains text common to both versions
Plugins should try to resolve differences by transforming '+', '-' and 'c' elements into ' ' elements. For example, a radio button field where we have { diff=>'c', old=>'Leafy', new=>'Barky' } might be resolved as { diff=>' ', new=>'Treelike' }

The merge handler may be called several times during a save; once for the body-text of the topic, and once for each field in the form data. The context of each call is given by the content of the \%info hash, which will be undef for the body text of the topic, but will point to the attributes hash of the field being processed otherwise (with fields 'name', 'title', 'value' etc.). The info hash is read-only and should not be modified.

If any merges are left unresolved after all plugins have been given a chance to intercede, the following algorithm is used to decide how to merge the data:

  1. new is taken for all radio, checkbox and select fields to resolve 'c' conflicts
  2. '+' and '-' text is always included in the the body text and text fields
  3. conflict markers are used to mark 'c' merges in text fields

The merge handler is only called if the Lock option is set to "merge" in the configration. Otherwise it will only be called if a user saves after the expiry of the lock period started when they first edited the topic, and someone else has saved in the interim.

The configuration option would work as follows:

# The 'Lock' option controls how TWiki handles conflicts when two users both try to modify the
# same topic. It can be set to 'merge', or to an integer lock time value.
#  * If it is set to 'merge' then TWiki will try to automatically merge the topic text
#    and form field values when it sees a conflict. Text, and text fields in forms, will be merged
#    using <ins> and <del> HTML change markers to highlight the changes. Where merging is not
#    possible (e.g. for radio buttons or select fields) then the version being saved is taken.
#  * If it is set to an integer value, that is taken to be the number of minutes that a topic is
#    'locked' after an edit of that topic starts. For example, $cfg{Lock} = 60 will lock a topic
#    for 60 minutes after an edit starts. The lock is automatically released when the topic is saved.
#    Note that if the browser is navigated away from the page, the lock will be kept. Users must
#    hit 'Cancel' on the edit screen to clear the lock. Note also that a 'checkpoint' save will
#    restart the lock period.
# The 'merge' model is designed for use in wiki environments - i.e. those that are mainly
# text-based, and may *break* some plugins that rely on strictly formatted text. It is generally
# safer where users are likely to use browser navigation buttons to navigate away from edits.
# The lock time model is better where TWiki is being used as a database to store formatted data.
$cfg{Lock} = 'merge';

-- CrawfordCurrie - 12 May 2005

Thanks, Crawford. This looks like a good plan. I was thinking about this also. Whe could leave the merge capability in but combine it with the lookoout in the following way: When another user is currently editing the topic the oops comes up. When the user says "edit anyway" she goes ahead, but then the merge will help whoever is the later one on save.

Then features like MailInContrib could rely on checkin never failing. (Note that there appear to be still periods of short lockout, e.g., during an attachment move or the actual save).

-- ThomasWeigert - 12 May 2005

Regarding tools such as CommentPlugin or MailInContrib...

The merging is a good solution for these, most of the time, as they are only generating input to the whiteboard application. Merge obviously works well for the text on the whiteboard, and we could deploy it there.

All that said, there are also issues with that. 2 examples:

  • Take an application where you allow users to add input through CommentPlugin only (the [edit] link has been taken away). If there is a conflict between two comments having been added, there is no ability to resolve the resultant conflicts, as no user is able to edit the topic afterwards.
  • Take a table of requirements. Say one user adds a row at the top stating, say, the requirement that the color of X is green. Another user concurrently adds a row at the bottom stating the requirement that the color of X is red. Given the current granularity of the merge, there is no conflict, but taken as a whole the topic is now inconsistent. (What is a conflict depends on the granularity of the information you are looking at.)

-- ThomasWeigert - 12 May 2005

Yes, there are issues; if two users add almost identical comments (I add potato and you add tomato) then you can get a conflict. As for your example; you clearly need ArtificialllyIntelligentRequirementsRationaliserPlugin. wink

-- CrawfordCurrie - 12 May 2005

Crawford, you are making another very good point above. There are areas of overlapping editing which we should merge no matter what. For example,

  • attaching a document to a topic that is currently being edited.

Here is another thought... if we were to separate editing of the form from editing the whiteboard, as enabled by SeparateWhiteBoardEditFromFormEdit, then we could always use a merge model when editing the whiteboard, but use lockout when editing the form.

Maybe we should make a table of situations of possible conflicts and decide how these should be resolved (merge model, lock model, etc.)?

-- ThomasWeigert - 12 May 2005

Crawford, one more thing... I feel bad as you obviously spent huge efforts on this feature. Let's see how we can have the best of both worlds and use as much of the DakarMergeModel as possible without causing problems for those applications that cannot fully leverage this model...

-- ThomasWeigert - 12 May 2005

Thomas, I need you to look at the changes I made in SVN 4331. I changed the merge to exclude radio, select and check, and made the save redirect to an oops instead of view if a merge happened. The oops template could carry an edit link to support resolution for the case you highlight above. There is still no way to veto a merge. I am really not motivated to put the locking back in, as it is an awful lot of work to do properly and I am struggling to understand how it is justified. At the moment I am thinking that (1) and (2) of the proposal above is enough.

Note that if CommentPlugin is used as you describe, users cannot delete content, only add it. So a merge conflict can never happen.

-- CrawfordCurrie - 30 May 2005

not only is locking a lot of work, just putting the old model back isn't good enough, anyway (many edge cases that aren't dealt with). you'd need a full transaction-based system (which would be cool) to be able to deal with, as one example, renaming a topic, which needs to change the text in another topic (to the new topic name), but is currently locked.

-- WillNorris - 30 May 2005

You will be pleased to hear I have not implemented locks. I had a brainwave last night, and realised that "twiki locks" are not, in fact, locks. The locks we use to make store operations atomic are very necessary, and are true locks (even though they are prone to race conditions).

What have traditionally been known as "locks" (i.e what is taken when a topic is edited) are not actually locks. They are not locks because they can be broken at a whim, and are ignored in so many important places, that they are at best advisory.

So I invented the concept of "leases". When a topic is edited, a lease is taken on the topic for a fixed period of time (default 1h). If someone else tries to edit, they are told that there is already a lease on the topic, but that doesn't stop them from editing. It isn't a lock, it's just a way of advising them. Mergeing is still the prime resolution mechanism; the lease is purely advisory. If a user - or a plugin - chooses to back away from a topic because someone has a lease out on it, well, that's up to the plugin.

The descriptive comment in TWiki.cfg is as follows:

# When a topic is edited, the user takes a "lease" on that topic.
# If another user tries to also edit the topic while the lease
# is still active, they will get a warning. The warning text will
# be different depending on whether the lease has "expired" or
# not i.e. if it was taken out more than LeaseLength seconds ago.
$cfg{LeaseLength} = 3600;
TWiki::Func topic "edit locks" now work on leases.

The mergeHandler is also in, spec mostly as above with minor changes and clarifications. See EmptyPlugin for actual spec.

The leasing is fairly comprehensively tested, except where there are edge cases I didn;t think of. The mergeHandler is totally untested!

SVN 4243

-- CrawfordCurrie - 31 May 2005

No feedback, so I have to assume this is acceptable. Setting to ReadyForMerge.

-- CrawfordCurrie - 05 Jun 2005

Fix record


Edit | Attach | Watch | Print version | History: r19 < r18 < r17 < r16 < r15 | Backlinks | Raw View | Raw edit | More topic actions
Topic revision: r19 - 2008-09-17 - TWikiJanitor
  • Learn about TWiki  
  • Download TWiki
This site is powered by the TWiki collaboration platform Powered by Perl Hosted by OICcam.com Ideas, requests, problems regarding TWiki? Send feedback. Ask community in the support forum.
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.