Read-Write Offline Wiki
A read-write offline Wiki is for people in the field who need to change content while offline. (Topic started in OfflineWiki.)
- Can edit and search content while offline.
- Webs shared this way cannot be censored or controlled by a single individual or small group of individuals without significant implementation overhead. MS
- Can be implemented as a bolt-on and "grown into", rather than raising the initial install hurdle. MS
- Setup issues: a web server and TWiki need to be installed on the client.
- Intelligent merge is necessary, like for example a TWikiWithCVS.
- Webs shared in this way lose the ability to perform access control, unless a heavyweight, tightly integrated approach is taken MS
- Work on content independently on different TWiki installations.
- Synchronize periodically.
- 23 May 2000
A web server on the client could be avoided if TWiki is (optionally) extended with a primitive httpd server mechanism that talks only to 127.0.0.1. Does anybody know of GPLed source we could use?
send me these URLs of Perl based web servers: -- PeterThoeny
- 07 Jul 2000
Merging issue: TWikiWithCVS is not necessarily needed, just the functionality of CVS. Or, it could be done in a primitive way by simply showing the diffs in case there is a conflict, and then letting the user merge the content manually with a cut&paste operation.
- 06 Jul 2000
No "primitive merge" is required; RCS has exactly the same merge
functionality as CVS. (In fact CVS uses the RCS merge. CVS is "just" a
framework for managing RCS revisions; the actual file manipulation is
done entirely in RCS.)
There may still be conflicts that the automatic merge cannot resolve,
so a manual merge facility is still needed.
We might want to add some special intelligence for common conflict
cases. Off the top of my head, if two people add stuff independently
to the end of a topic then there will be a conflict because RCS doesn't
know in which order the additions should be appended. In this case, we
could simply sort the new sections according to edit date. This could
be achieved by appending the signature lines in proper order and then
reapplying both changes (the signatures have a date, so it's easy
enough to sort them, and they should be enough to guide RCS; some
experimentation is needed though).
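To make the sort-by-signature idea concrete, here is a minimal Python sketch. It assumes that every appended section ends with a signature line of the form "-- WikiName - 23 May 2000" and that month names are in English; the function names are made up for illustration and are not part of TWiki or RCS.

```python
import re
from datetime import datetime

# Assumed signature format: "-- WikiName - 23 May 2000" on a line of its own.
SIGNATURE = re.compile(r"^-- \w+ - (\d{1,2} \w{3} \d{4})\s*$", re.MULTILINE)


def signature_date(section):
    """Return the date of the last signature line in an appended section."""
    dates = SIGNATURE.findall(section)
    if not dates:
        raise ValueError("section carries no dated signature")
    # %b expects English month abbreviations under the default C locale.
    return datetime.strptime(dates[-1], "%d %b %Y")


def append_in_date_order(base, additions):
    """Append independently written sections to base, oldest signature first."""
    return base + "".join(sorted(additions, key=signature_date))
```

Sections without a signature raise an error here, which would hand the case back to a manual merge.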
- 22 Nov 2000
Hi Peter, I have written a small Perl hack to get a web (see GetAWebAddOn)
as a zipped/tar.gz/tar.bz2 file, containing data, templates and
attachments of the web of choice.
I have not addressed the merge requirement above, so I am not sure if
this note shouldn't be moved to ReadOnlyOfflineWiki.
- 15 Aug 2000
Here's my list of ideas on the issue. This is all written strictly
from my personal perspective, with no attempt at generalizing the
design for other setups; I just hope that my setup is common enough to
My setup is the following:
I'm connected to the Internet with a
dial-up line. I'm offline most of the time; when I wish to exchange
data with a server, I go online, initiate the transfers, and go
offline again. (This setup is very common. Connection fees are
extremely high in Europe, at least in comparison to US rates.)
I'm also one of those Windows-challenged guys. This enforces some
peculiarities (setting up a WWW server or a Perl interpreter is a
major task, for example).
My need is the following:
I have a project with a TWiki hosted on
(actually it could be any WWW server with a Perl
engine). As I'm mostly living offline, and transmission times are at a
premium, I want to connect once per day, upload all changes that I
have made, download all changes that others have made, and disconnect,
leaving any conflicts to be resolved offline.
This requires mechanisms for the following areas:
Identify changes in the WWW server.
This is already done (the email notifications).
Identify local changes.
I'm doing an explicit connection build-up
anyway, so it would be reasonable to require that I click on a "wrap
my local changes up" button that ran a script over all TWiki topics
and compared them with previous versions.
This could be optimized:
Let the edit script take a note of which topics were edited, and have
the change wrap-up script touch just these topics.
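A rough Python sketch of such a "wrap my local changes up" script follows. It assumes that topics are stored as *.txt files in one data directory per web and that a snapshot of content hashes is taken at the last synchronisation; snapshot and changed_topics are hypothetical names.

```python
import hashlib
from pathlib import Path


def snapshot(data_dir):
    """Map each topic file name in data_dir to a hash of its content."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(data_dir).glob("*.txt"))
    }


def changed_topics(previous, current):
    """List topics that are new or whose content differs from the snapshot."""
    return sorted(
        name for name, digest in current.items()
        if previous.get(name) != digest
    )
```

Deleted topics are not reported by this sketch. The optimisation mentioned above would replace the full scan with a list written by the edit script.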
Extract changes so they can be merged later.
This one is
simple. Either manage the stuff locally using RCS
(this requires a
working local RCS
installation; that's not too difficult even under a
Windows installation), or just have the edit script keep a copy of the
original file around somewhere and let the wrap-up script do a
diff. (Installing diff essentially means installing the core of RCS,
so the difference isn't as large as it might seem at first sight. In
particular, we'll need diff3 locally later, and that is
definitely a part of RCS.)
Transmit change files to offline clients.
There are many options for this, each with advantages and disadvantages.
Email: Advantage is that everybody has it. It also offers the intriguing
possibility to run a DistributedTWiki without a central WWW server:
every offline client just sends the change emails to all
participants. Disadvantage is that getting the change mails out of the
mailbox can be a challenge: every mail client has different ideas
about how mails should be stored locally. Another disadvantage is that
mail is unreliable, so the offline client must have a way to detect
lost mails and to request a resend.
Email (2): Tell users to set an email account aside for change emails,
and to not tell their mail user agent about it. Use Mail::POP3 or
Mail::SMTP to retrieve the change emails. Advantages: Everybody has
email. Setting up a POP3 account on GMX or AOL is dead simple and
doesn't even cost money. Disadvantage: Retrieving the mail requires
that the offline client get along with the idiosyncrasies of the
various mail servers in the world. Some mailers use POP, others SMTP.
Some mail servers append advertisements to every message. Other than
that, this offers the same advantages and disadvantages as the first
Email variant.
: This protocol generally isn't very suitable for distributing
changes (been there, done that: it's really clunky). There's no easy
backchannel to tell the server to resend a change. The server needs a
good naming scheme for files waiting to be retrieved. And the server
never knows when it can delete a file. The other disadvantage is that
it definitely needs a server. HTTP
: Communicate changes with the
server via HTTP GET and POST requests. Advantage: No need to access
protocols that the TWiki scripts don't need to use anyway (keeping the
maintenance burden down). Disadvantage: Requires a central WWW server.
This is simple: Just call the appropriate RCS
Applying a merge may result in a conflict; in that
case, no unsupervised merge is possible.
The best strategy that I can think of is the
one taken by CVS: don't try anything fancy, just tell the author that
the merge failed due to a conflict, and send him an update of the
file. The user can then reapply his changes to the updated file and
Note that RCS
can automatically resolve merge
conflicts that apply to different parts of a topic. This works well
enough in practice; if people want to see what came of such a merge
they can always look at an RCS
diff to see what was done.
- 03 Nov 2000
Being oriented completely the other way - in no way Windows
Challenged (ie Unix on my laptop with webserver, CVS, RCS etc, and the
same on a desk machine at work), I'm taking the other approach - a
central repository of text controlled by CVS, edited using a Text
Editor called "TWiKI" (IYSWIM) which runs pretty much as normal. ie
many editors, each with local RCS locking/control, who periodically
update/merge their changes with the central text repository. The clear
win here is if the central repository dies, the text does not die with
it. It also means there's very little code I have to write.
Since I'm adding functionality (for my convenience !) to publish a
web, and also to allow me to intelligently add email messages
with auto thread/topic detection (beyond the normal header schemes -
is automatic folder creation rather than manual procmail
style rules), I'm doing it my way for my convenience
I suspect when a few people have implemented something that works
right for them, merging the ideas that work in practice will
result in something much better than a theoretical "I like doing it
this way"... Anyhow, what I'm implementing:
Twiki is great for a single server installation. In a multiple server
installation (eg as a notetaking tool on multiple laptops) syncing with
a central server, it's pretty naff. Idea decided upon:
- Central CVS repository of documents.
- Each TWiKi checks out its own copy. Moves CVS to a different directory, and creates an RCS directory in its place.
- DON'T need to check in the docs to the RCS. They will be checked in by TWiKI correctly.
- Then mv RCS to somewhere else
- Then mv CVS back
- Then commit changes.
Theory of operation:
- Documents are stored on a central repository.
- Every TWiKi server is a client of this repository. No master of the repository. It should be possible to make one TWiKi a master TWiKi though, with this logically being the "maintainer" of the repository. Therefore a TWiKi web simply becomes something stored in the repository, essentially not changing the semantics of TWiKi's code. (No huge change to code - allows incremental transition.)
- Therefore each server can make modifications to its heart's content. As normal only one user can be editing a document on a single TWiKi at a time using TWiKi's normal RCS locking.
- Periodically the TWiKi admin (or the server itself) performs updates to the central trees, informing the TWiKi admin of conflicts to be resolved.
The key points to deal with are:
- Conflict editing.
- Updates - automatic/manual.
- Change control notes. (Merging of modification data... - can simply log...)
- CVS Aware TWiKi needs to exist in order to perform online roll backs. (This can be done by the TWiKi attached to the central repository, rather than roving TWikis)
TWikis then become online roving editor applications for editing
documents grabbed via CVS. CVS TWiki (nb I don't necessarily mean what
the topic on CVS TWiKi means by a CVS TWiKi!) is an editing/publishing
point - in the short term, this isn't a vital thing for me... (But will
Editing for approval/content update. Publishing for the ability to
modify things online.
Required extra features:
- Ability to do access control à la Unix file system.
(ie access control on a per topic basis much like the UFS - preferably via an
access lists approach rather than just allow people to edit everything in a
web - enforcing control using Apache is possible, but currently very messy.)
- 21 Nov 2000)
There's one downside to this approach: It requires a central
administrator who's willing and able to resolve conflicts. This is
fine if the admin is (a) truly dedicated and willing to serve until
the TWiki is taken down and (b) able and willing to resolve conflicts
in a manner that none of the original authors has reasons to object.
If the participants are sloppy at exchanging their data, the central
administrator will have lots of conflicts to resolve. IOW this is a
scenario of "if I'm sloppy it won't hurt me", which tends to end in
A fully decentralized TWiki doesn't have this problem. If I'm sloppy and don't
exchange my data, I risk getting into a conflict and having to resolve
it. If getting connected is truly difficult (such as being mobile on
another continent for a while or such), then I have more options: I
can strike a personal balance between the hassle of setting up a
network and resolving the accumulated conflicts.
- 21 Nov 2000
There might be another way around the problems of off-line working,
that is making support of merges simple and straightforward.
Here is my basic idea. Introduce a MergeLink
that is rendered slightly differently from normal links
(for example followed by an exclamation mark).
Such a MergeLink! indicates that somebody is going to
change something here. In a (temporary) topic-page
there is some (tagged) intentional information
like a regular expression
characterizing a small fragment of text before
the MergeLink. A preview highlights this fragment in the page
to be changed.
Then one downloads the page to be changed together with its
temporary change-page. The replacement is now edited off-line
in the temporary page.
The on-line editor has 2 small windows instead of 1.
Being online again one does a copy/paste of the temporary page
in the second window and pushes a new merge button.
One can preview the result, do a little tuning if needed,
and then confirm the changes, after which the temporary page
is removed and the original page is updated.
The merge-algorithm first looks for the associated MergeLink,
then checks whether the characterizing regular expr is still
valid for the neighbourhood above the MergeLink. In case of failure
it is up to the user what to do.
The next step is to let the mechanism also work in case you
want to make changes off-line that you did not indicate before with
a MergeLink. It seems reasonable to assume that one has off-line
(a subset of) the same editing, merge and preview facilities as on-line.
Then the off-line and on-line procedure can be made quite symmetrical.
First add a MergeLink to the page to be changed,
then edit the temporary page, and check the preview.
Going back on-line, the algorithm now uses only the regular expression
for fragment resolution because the MergeLink is absent in the original
one. If resolution fails one has to inspect why, and take manual action like
skipping the change, tuning the regular expr or adding the missing MergeLink
for explicit resolution.
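As an illustration only, the two resolution paths (with and without a MergeLink in the on-line page) could look like the Python sketch below. The marker text "MergeLink!", the function name and the rule that only whitespace may sit between the fragment and the marker are all assumptions of this sketch, not a specification of the proposal.

```python
import re


def merge_fragment(page, context_regex, replacement, marker="MergeLink!"):
    """Replace the fragment described by context_regex, or return None on failure."""
    pattern = re.compile(context_regex)
    pos = page.find(marker)
    if pos >= 0:
        # Marker present: the fragment must still sit directly above it.
        matches = list(pattern.finditer(page[:pos]))
        if not matches or page[matches[-1].end():pos].strip():
            return None  # the neighbourhood has changed, leave it to the user
        start = matches[-1].start()
        return page[:start] + replacement + page[pos + len(marker):]
    # No marker in the on-line page: the regular expression alone must
    # identify exactly one fragment.
    matches = list(pattern.finditer(page))
    if len(matches) != 1:
        return None
    only = matches[0]
    return page[:only.start()] + replacement + page[only.end():]
```

A None result corresponds to the "up to the user" case: skip the change, tune the expression, or add the missing MergeLink.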
Well, I hope this explanation is clear enough and that the approach
makes sense as a step forwards into enabling an easy way to
support off-line working.
- 20 Nov 2000
Hmm... if I understand this correctly, you're proposing a way to annotate changes.
This is not necessary!
RCS is well-equipped to detect changes without assistance. Just let
it run over two different texts and tell you where the changes
are. No need to write a MergeLink, no need to design a regular
expression (which would be beyond the abilities of many people anyway).
It might be interesting to outline changed-but-not-yet-accepted text;
this could stand in for the visual aspect of the MergeLink (if I
understand that idea correctly, that is!). E.g. everything that's
changed between the last known server revision and the local copy
could be displayed as dark green instead of black. (Here's another
opportunity for a color configuration <grin>).
- 21 Nov 2000
No, not changes themselves but only a temporary annotation of
the 'intention' of a change.
You know you are going to change something, but have to think about it
(off-line in the sunshine).
You indicate the involved area (I prefer direct colouring, but thought that is
not generally available, so I suggested a regular expr (but just a number
of lines is presumably good enough)). The indication helps others
to take a little care about the future, and yourself with merging later on.
Or, another way of saying the same thing is: a mechanism for
outlining the area that should have been untouched at the moment of merging.
Then the term MergeLink is maybe better replaced by something like MyContext.
So, while RCS
is indeed taking full care of the past, my little idea is about
some preparation for a future change.
- 22 Nov 2000
This is called "pessimistic locking" in a revision control context: the MyContext marker essentially locks the stretch of text against changes by other people until you have made your changes.
Pessimistic locking has serious disadvantages:
- It's too easy to place a lock and forget to remove it. In a short time, all topics will be riddled with locks, and whenever you want to change a text you'll have to ask the lock holder to remove it. (The worst-case scenario is a drop-out who doesn't have the interest, time, or ability to remove the locks anymore.)
- You don't usually know in advance what you're going to change. At least not exactly enough to place the locks exactly where you're going to apply your changes. This means that people will have a tendency to lock more text than they actually need (and usually with a bad conscience, which means that the TWiki experience will be less fun).
- Whenever you have a change in mind, you'll have to take note of the change (to avoid forgetting about details), apply the lock(s), wait until they are confirmed, apply the change, and check the stuff back in. This is a very tedious process that's making the TWiki experience even less fun.
- If you apply multiple locks, two people can accidentally lock each other out. All that's required is that Ann applies a lock on text A, and days later decides that she also needs text B to complete her change. Unfortunately, Barbara has decided to lock B in the meanwhile, only to discover that she needs A as well. Now both will find that the text that they need is already locked, and wait indefinitely for the other to release the lock so they can apply the change. Well, dumb computers would go into an infinite wait (deadlock), humans will probably let the issue drift into oblivion, maybe negotiate to get the change through, or just quit the TWiki.
- And, last, it adds administrative overhead even if there's never a conflict.
It is possible to work around these problems. E.g. one can add a mechanism for storing a change and have it automatically carried through once the lock is approved, for example... but that's already achieved by the current resolve-conflicts-when-they-occur strategy, the only difference is that the change will be stored in the topic instead of in a separate file (and it's in the topic where I usually want to see the changes - if I work with TWiki, I don't want to bother about who's got my version unless there's really a conflict).
- 22 Nov 2000
Let me have another go at conflict resolution. If I have a fully decentralized offline TWiki, the worst conceivable situation is a "net split": The network is partitioned into two subnets that don't have any communication for a while.
While the split is in effect there's no problem: everybody makes modifications, and they're checked into RCS and (somehow) internally redistributed.
At the moment when the subnets are rejoined, the topic changes that have accumulated during the split must be merged. I.e. I have a timeline like this (assume all changes would be in conflict; nonconflicting changes are automatically merged by RCS and are thus uninteresting):
Revision history    Event log
1.1                 Topic created
1.2                 Topic changed
/ \                 Communication between subnets ceased
1.3A |              Some change in subnet A
|    1.3B           Some change in subnet B
|                   Communication reestablished
|                   1.3B revision is sent to A subnet
What happens next?
First, the A subnet repository must know which revision the
1.3B version was based on, so this information must be
transmitted together with the change itself. (I'm not sure whether
TWiki does this already, but it would be a good idea if it doesn't: it
prevents mixups if two people start editing the same topic.)
Now the A subnet knows what was done in B. It does a three-way diff:
1.3A vs. 1.2 vs. 1.3B, and sees the conflicts. (If it doesn't see a
conflict it's done: it just merges the changes. It should also send its
1.2-to-1.3A delta to the B subnet, which will be able to merge that
change without a conflict as well, and all's done.)
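For readers who want to see the three-way comparison spelled out, here is a simplified line-based sketch in Python. It is not how RCS or diff3 work internally; it only compares each subnet's revision against the common base revision and refuses to merge when the changed regions overlap or touch. The names edits and merge3 are made up for this example.

```python
import difflib


def edits(base, other):
    """Return (start, end, replacement) for each region of base that other changed."""
    matcher = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    return [
        (i1, i2, tuple(other[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def merge3(base, ours, theirs):
    """Merge two descendants of base (lists of lines); return None on conflict."""
    mine, yours = edits(base, ours), edits(base, theirs)
    for a in mine:
        for b in yours:
            # Different edits whose base regions overlap or touch are a conflict.
            if a != b and a[0] <= b[1] and b[0] <= a[1]:
                return None
    merged = list(base)
    # Apply from the end of the topic so earlier indices stay valid.
    for start, end, replacement in sorted(set(mine) | set(yours), reverse=True):
        merged[start:end] = replacement
    return merged
```

With this rule, changes to different parts of a topic merge silently, while two independent additions at the end of the same topic come back as a conflict, which matches the append case discussed near the top of this topic.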
The conflict must be resolved, but since all automated conflict
resolution has failed, the issue must be presented to a human. The
human will see a conflict warning and get a request to resolve the
conflict.
There are two human candidates to send the request to: the one
responsible for 1.3A and the one for 1.3B. I'd say the request should
go to the author of 1.3B; after all, 1.3A was done earlier, and it's more
likely that its author has forgotten about the details of his change. Besides,
I like to reward faster over slower changes; loading a page and keeping it
open for editing for days shouldn't become a tactic to avoid pesky
conflict change requests.
After resolution, the result is sent back to A, and we have a new common 1.4 version established in both subnets.
What if there are multiple changes, with potentially many interleaving
changes in both revisions? It would be possible to reconstruct a
series of interleaving changes and have the author of the later change
redo it in a merge, but this would be too much work. Often it's easier
to redo a bunch of changes in one go instead of painstakingly
retracing each change.
So if there's a revision history like this:
then the author of 1.5A will get a message with the deltas of 1.2->1.5A and 1.2->1.4B, with a request to merge the changes into a new, common 1.6 revision.
What if the 1.4A author is lazy (or ill or was run over by a bus) and ignores the request? Assume somebody has a 1.4B version on his machine and adds a change; in the moment that his new 1.5B change is distributed, a conflict will arise and the author of 1.5B will get a conflict resolution request.
Hmm... when reviewing my changes, I looked at the top of this topic and found that Peter had exactly this in mind in 6 Jul 2000. IOW we're back full circle, with the additional twist that this merge isn't necessarily restricted to a scenario with a central server, it should work reasonably well if all TWikis are offline. (Everybody should have an RCS
BTW all this merging and branching can be mapped directly to RCS. It has facilities to store and retrieve independent branches of change for the same document, and it has commands to automatically merge branches. It will just complain if there's a conflict during merge, asking the user to add enough changes to the head revision of either branch to make a merge possible. Sounds familiar...
- 22 Nov 2000
Of course, if my idea was really just 'pessimistic locking' your remarks
about the complications and a non-twikilian moral are quite right.
Well, let me try to give an example to illustrate my rationale.
A group of songwriters is working on a document containing songs.
A typical aspect of a song is that any change can unbalance
the whole song; therefore the absence of 'technical' conflicts
does not mean the absence of poetic or melodious conflicts.
So a songwriter working on a song indicates he is working on it.
That does not mean a lock, the others may edit it just in the usual
twikilian way. It is only offering a type of awareness, not forcing
behaviour. With awareness a partner could for example make a choice
in what order to make his changes (if # > 1) to enable a better
workflow in the cooperative writing.
On the other side (of the merger) the indication helps as a
precondition for an automatic merge; only when the precondition
fails does the merger have to inspect manually if the song is still balanced.
Without indication the merger always has to inspect manually if any
of the other changes since he checked out are touching the song.
If supporting a little bit of workflow is polluting the simple concept
of TWiki, then my idea can also be applied only as a little help
during merging (without any indication in an earlier stage). The hypothesis
behind 'help' is that marking what should be invariant is less work than
inspecting what might be changed and wrong (in particular in case of multiple changes).
One might say that each song should be a separate topic. But I
think that each well-written piece of human language is full of
small song-like fragments.
- 24 Nov 2000
Ah yes, now I understand. This should work. I'm not sure whether implementing it is worth the effort, but then one is never sure about new things.
However, if the change forewarnings are just advisory notes, I don't see how they help with offline editing and reconciling changes. Could you elaborate?
Re the song-like quality of human language: I fully agree, but the key point is "well-written". WikiWikis are used for quickly exchanging ideas and building consensus, and with that use, literary quality is not considered important enough to spend any effort in that area. (This doesn't mean that TWiki cannot be made into a form that attracts people with interests in cooperatively creating high-quality texts, so if you feel that your idea contributes, go ahead and implement it!)
- 25 Nov 2000
I used a poetic metaphor to explain a technical mechanism for proactive
help with semantic conflicts.
The mechanism is: making the granularity of a possible merge conflict tunable
by the author of a change. The granule (larger than the change itself)
is related to the document state that an author sees just before his change.
So indicating the granule can be done at any time the author
sees the document in that state.
Part of helping to reconcile conflicts by highlighting just the semantically delicate regions is also
making clearer that your own change is clumsy, irrelevant or impossible in relation
to what others did in the meantime.
A more down-to-digital-earth usage of the idea would be the domain of
cooperative literate programming (once upon a time invented by Knuth).
Not of course for exchanging tons of
ugly Perl or C++ code. But maybe for something
that is close to compact, readable (!) and (perhaps) executable specifications
(I use Python for that purpose as my poetic vehicle).
A typical granule-size for a change within a piece of Python-text
would be a method of a class.
But, I do realise that making usage of TWiki practical for such a domain also
requires an import/export facility to a specific syntax-directed editor.
There is an ironical paradox in saying: "... because TWiki is a tool
for building consensus 'literary' quality is not considered important ...".
The more a document represents consensus within a group of human actors,
the more precise (balancing on the edges of the ambiguities of natural language) it will have to be, and the harder to make any
change without introducing new (and increasingly) time-consuming 'conflicts'.
Just look (as an extreme) at the production process of documents by politicians, lawyers or committees.
- 26 Nov 2000
Has anyone updated GetAWebAddOn
- 27 Jun 2002
A few more comments to some older arguments from Joachim and Nicolas
- "I still don't see how a CVS backend helps with release tagging" -- it should save you implementation effort, because all the looping and low-level work is done within CVS. And (unless there is RcsLite) it saves tons of fork()s.
- "CVS' replication facilities don't give TWiki a serious advantage" -- replicating CVS repositories is a pain; that's true. But I'm dreaming of distributed check-out copies. TWiki could serve, say, 95% of requests from these local copies. For the other 5% (query versions + diffs) you need CVS. And it handles the propagation of changes up to the central repository (which could be your existing CVS / file / non-TWiki server) and back down to the other distributed servers.
- "Using CVS for TWiki replication is non-trivial" -- granted. Q: is there a trivial way to implement clustering, replication, offline usage?
I agree with PeterKlausner's comments above and support his desire to support replication using a CM system. This issue has also been discussed in TWikiWithClearCase.
- 08 Aug 2002
What about using a backend that is smarter about doing distributed version control? I'm thinking of something like arch. These guys have a lot of tools for allowing an isolated repository to make changes independent of the "main" repository and also sync them up.
- 09 Aug 2002
I've got the method I outlined above up and running:
- Use data and code separation
- Check in the entire contents of each web's directory (using -kb for the contents of the pub subdirectory)
- Use RCS directories rather than have ,v and .txt in the same directory - this avoids a number of problems.
Once checked in, and checked out on a satellite site, the RCS directory on each local node means that people have safety of edits and no reliance on external change control (cvs ignores an RCS directory).
Then periodically satellite sites can do a
cvs update; ci -l changed topics; cvs ci
type cycle, picking up the changes from the other sites participating in that discussion. One very key point - no single wiki becomes a single point of failure - or control. (Essentially this brings a usenet like quality to the wiki)
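The cycle above could be wrapped in a small script along the following lines. This is only a sketch: it assumes cvs and the RCS ci command are on the PATH, that it runs inside a checked-out web directory, and that the list of changed topics comes from somewhere else. The function names and the log message are invented for the example.

```python
import subprocess


def sync_commands(changed_topics, message="offline sync"):
    """Build the update / local RCS check-in / CVS commit command sequence."""
    commands = [["cvs", "update"]]
    # ci -l checks each changed topic into the local RCS directory
    # and keeps it locked for further local editing.
    commands += [["ci", "-l", f"-m{message}", topic] for topic in changed_topics]
    commands.append(["cvs", "commit", "-m", message])
    return commands


def run_sync(web_dir, changed_topics):
    """Run the cycle in web_dir and stop at the first failing command."""
    for argv in sync_commands(changed_topics):
        subprocess.run(argv, cwd=web_dir, check=True)
```

Stopping at the first failure is a deliberate choice here, so that nothing is committed before someone has looked at the problem.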
Currently synchronisation is a manual process - largely to check out what problems arise. (Biggest expected problem is conflicts in metadata - conflicts in data are much easier to deal with - render differently, and then leave to the user to resolve. The wiki can even mail the user who made the last local edit telling them to resolve the conflict)
Still a number of issues to be dealt with, but that's the case with any new feature. (If wiki was a "replacement" for email - not sure I buy that TBH - then this works pretty well as a "replacement" for usenet.) If you weren't using data and code separation then this functionality would be significantly harder.
Finally got round to it
I've wanted this feature for almost 4 years!
-- MS - 22 Jan 2004
Michael, this is fabulous - but you hedge around the conflict resolution problem a bit. I assume that's because you don't know yet what's in the can of worms. Conflict resolution scares me a bit; having used ClearCase for years, I know the problems of merging conflicting updates.
Personally I'd prefer that conflicts were resolved by the person doing the latest checkin, and the online wiki identified clearly as the "master". Here's a possible user story:
- User1 goes on business trip, takes laptop with offline wiki
- User2 makes changes in main wiki while they are gone
- User1 also makes guerrilla changes to the same topics
- User1 and User2 change the same form fields in the same topics
- User1 returns home, plugs in and hits the "synchronize" button on their offline wiki
- they are presented with a page containing a list of topics containing conflicts. All these topics are locked in the online wiki.
- non conflicting changes are performed silently and the online and offline versions synchronised
- when they click on one of the conflicts, they are taken to a page that displays the online page version open for edit, and their offline version next to it. This gives the opportunity for them to merge their changes into the online version.
- if they ignore the conflicts, then the offline wiki is updated to reflect the online version i.e. their changes are lost.
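A very coarse sketch of what the "synchronize" button in this story might compute is shown below. It works on whole topics only (no merging inside a topic) and assumes three dictionaries mapping topic names to text: the state at the last synchronisation, the online wiki and the offline wiki. plan_sync and the key names are hypothetical.

```python
def plan_sync(base, online, offline):
    """Sort topics into silent updates in either direction and conflicts."""
    plan = {"take_online": [], "push_offline": [], "conflicts": []}
    for topic in sorted(set(base) | set(online) | set(offline)):
        b, on, off = base.get(topic), online.get(topic), offline.get(topic)
        if on == off:
            continue  # already identical on both sides
        if off == b:
            plan["take_online"].append(topic)   # only the online wiki changed
        elif on == b:
            plan["push_offline"].append(topic)  # only the laptop changed
        else:
            plan["conflicts"].append(topic)     # both changed: show to the user
    return plan
```

The "conflicts" list is what would be presented to User1, while the other two lists are applied silently.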
- 23 Jan 2004
I was brief because it was relatively late (and because the conflict problem isn't actually that bad).
The "master" wiki in this implementation (after all I've done read only distributed TWiki for a long while now) is actually the CVS repository. This doesn't "belong" to any of the wiki servers directly. It's likely to be hosted by one of them, but could equally just be a sourceforge or savannah repository. None of the wiki servers uses the CVS repository as its local store. (ie no special TWiki::Store::CVS modules get written)
There are two kinds of conflicts that need consideration:
- Conflicts in topic text lines
- Conflicts in META lines
- User is editting on their laptop (or independent server), makes changes
- They (or their local admin, or cron) performs an update/checkin RCS/checkin CVS cycle.
- This may leave them locally with conflicts. (Standard CVS issue)
CVS marks up the conflicts using the usual:
- In topictext
- This can simply be handled by rendering the text differently until the conflict is resolved. A clash is a clash of edits - a difference of opinion. This means that both edits have equal weight. (Hence why CVS couldn't resolve the issue.) Either the person who was the source of the problem resolves the conflict, or they don't. If they don't, it's just two differing opinions that both get checked in next time. Unlike conflicts in programming code, presenting conflicting arguments in text is positive.
- Pages where it might cause problems are those with complex searches - pages part of an application, defining CSS in the topic and so on. The simplest solution there would IMO be take the same approach as META lines (below) and provide a META tag to tag pages as "conflict sensitive".
- In META lines
- This causes problems for the TWiki code. This means some work needs to be done. The simplest thing that could possibly work is this:
- Assume the master server is always correct.
- Take the conflicting META lines that are already in the CVS repository as "correct" - use that to resolve the conflict
- Tag the page as a conflict needing resolution in metadata
- Place the conflicting metadata into storage (the topic text is simplest) and checkin.
- The ideal scenario here is to allow the existence of alternates - which then allows the topic to have multiple sets of metadata.
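As a rough illustration of the META strategy above, here is a minimal Python sketch (TWiki itself is Perl; this is not code from any TWiki release). It assumes CVS's usual conflict markers, with the local side between `<<<<<<<` and `=======` and the repository side between `=======` and `>>>>>>>`. The function name `resolve_meta_conflicts` and the idea of returning the losing lines as a separate list are assumptions made for the example; how the page is tagged and where the stashed lines are stored is left to the caller.

```python
def resolve_meta_conflicts(text):
    """Resolve conflict hunks that consist only of %META: lines.

    The repository side wins; the local side is returned in `stash` so the
    caller can tag the page and store the lines (for example in topic text).
    Hunks containing ordinary topic text are left untouched, markers included.
    """
    out, stash = [], []
    local, repo = [], []
    start_line = ""
    state = "plain"
    for line in text.splitlines():
        if state == "plain" and line.startswith("<<<<<<< "):
            state, start_line, local, repo = "local", line, [], []
        elif state == "local" and line == "=======":
            state = "repo"
        elif state == "repo" and line.startswith(">>>>>>> "):
            both = local + repo
            if both and all(l.startswith("%META:") for l in both):
                out.extend(repo)      # repository metadata is taken as correct
                stash.extend(local)   # local metadata is kept for later review
            else:
                # Topic text conflict: keep the hunk exactly as CVS wrote it.
                out.extend([start_line, *local, "=======", *repo, line])
            state = "plain"
        elif state == "local":
            local.append(line)
        elif state == "repo":
            repo.append(line)
        else:
            out.append(line)
    if state != "plain":
        raise ValueError("unterminated conflict block")
    return "\n".join(out), stash
```

A non-empty `stash` is the signal that the page needs to be flagged as a conflict needing resolution before the next checkin.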
What this is likely to cause is the same situation that happens normally with CVS - the person who has to resolve the conflict is normally the one who performed the last edit. In either case the wiki server that discovers conflicts can find out who last edited the page locally, and mail them with a link to the conflict to resolve.
That might sound too simple, but in practice I'm pretty sure it is
that simple. (As I say I'm doing this using a manual process at present until I work out the kinks)
Setting it up isn't really that difficult either. (Conversion from RCS files in the same directory as the text to RCS files in an RCS directory is the bit with the most faff in fact)
Consider the degenerate case: each TWiki is used by one user only. In that situation each TWiki is just a text editor for that user. The conflict resolution scenario is exactly the same as standard CVS. So far from not being aware of what's in the can of worms, I'm fully aware of what's in this well-known can of worms, which you can get at any handy dandy sourceforge project.
(After all I'm mirroring TWiki.org changes into "my" wiki codebase and get conflict resolution issues on a regular basis - as a result I'm pretty certain this approach will a) work b) prove fruitful)
Note for anyone confuddled : this isn't using CVS as a local store backend - it's using CVS as a synchronisation tool - allowing global histories of edits from different wiki servers to be stored in one place, and local histories of edits to be stored locally. All the wiki servers are created equal, with no "master" wiki. (What's the master server for usenet?)
(Corollary: Anyone who puts their web into this
shared environment has to release absolute control over the system)
-- MS - 23 Jan 2004
Great that there is progress on this important feature,
although I've only been waiting for 3 years
Still, there are a few problems to solve:
- TWiki assumes that the checked-out .txt and the latest repository revision are identical. If you fiddle with the .txt (as I often do), then the revision and diff displays become inconsistent, confusing innocent users
- By synchronising only the current checked-out copy, you lose all history in the local TWiki's RCS
- Even if you don't lose a revision, you lose the precise time and author info
- Conflict resolution basically has to happen on shell level, not from within TWiki.
- Once you have fixed the conflicts under the hood, nobody will see this in WebChanges unless you reload and save the topic from TWiki. From each TWiki!
- A simple cvs ci or update transfers the whole tree, even if there are 0 changes! Depending on your Wiki size (those evil attachments, you know...) and bandwidth, this may be a problem.
I still think that it is important to "teach" TWiki the notion
of a checked-out, work-in-progress revision that is not
in the repository,
as laid out in PageCheckoutCheckinStrategy
This would make it much easier to do conflict resolution,
didn't see Michael's latest comment before save.
Good point to explain the degenerate case of 1 TWiki per user.
Still, I would really love to use CVS not only as a synchronisation aid,
but to integrate it into TWiki's revision history.
hack looks like it would be easy enough
to change TWiki's basic behaviour on checked-out copies.
- 23 Jan 2004
If you can think of a sensible way of dealing with checking in the specific
version histories I'm interested in that. However you need to think of each edit
location as essentially a branch, and each checkin as a merge. (Has version
control, separate store etc.) Generally the CVS trunk doesn't contain all the changes
that have been made in all the branches. (After all, how do you deal with the fact
that version histories will conflict? Editing is no longer a linear sequence.) One
approach is to go to the branch: having a pointer to the appropriate branch, if
it's not behind a firewall, is a possibility, assuming you know where that branch lives.
I've performed this setup now on 4 wiki servers (one on my laptop, two at work, one public),
with the laptop one taking feeds of different webs for local editing. This means I can
work on several independent wikis all locally on my machine and periodically down/upsync
automagically from/to the correct server. (In a similar way that you can run a news server and
take feeds from several news machines)
-- MS - 23 Jan 2004
The "it's all good stuff" approach with the document text I like, and can see working. It's the changes to meta-data, specifically form content, that worry me. Reiterating your alternatives:
1. Assume the master server is always correct.
2. Tag the page as a conflict needing resolution in metadata
3. Place the conflicting metadata into storage (the topic text is simplest) and checkin.
4. The ideal scenario here is to allow the existence of alternates - which then allows the topic to have multiple sets of metadata.
IMHO 2 and 4 are too complex. As you correctly point out, you are not trying to build a highly sophisticated CM system here. Just as well, given the difficulty of getting even a trivial change into the code. 1 is too simplistic. 3 is interesting. Metadata need to be treated as atomic units of change, in which case a simple strategy of "last come, best served" would work well. Consider the scenario that two
guerrillas make changes to the same form field; who takes precedence? He who synchronises last? She with the latest date stamp?
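To make the question concrete, here is a minimal Python sketch of one of the possible answers, "she with the latest date stamp", with each form field treated as an atomic unit of change. The function name `merge_form_fields` and the `(value, timestamp)` representation are assumptions for illustration only. It also assumes the clocks on the different installations can be compared, which is exactly the weak point of this option.

```python
def merge_form_fields(local, remote):
    """Merge two sets of form fields, field by field.

    Each argument maps a field name to a (value, unix_timestamp) pair.
    For every field the pair with the later timestamp wins; on a tie the
    local value is kept, because it is examined first.
    """
    merged = {}
    for name in local.keys() | remote.keys():
        candidates = [side[name] for side in (local, remote) if name in side]
        merged[name] = max(candidates, key=lambda pair: pair[1])
    return merged
```

Swapping the timestamp for a synchronisation sequence number would give the "he who synchronises last" variant instead.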
Your degenerate case is amusing. It makes you think of other possibilities, like only synchronising a subset of your content.
- 23 Jan 2004
A completely alternative approach for implementing this would be for all the wiki servers to listen to an NNTP feed looking for edit messages. When a wiki server has an edit and the lock is released (by whatever method), a diff is performed, and a unified diff with sufficient context is posted into a newsgroup. When the message is picked up by a remote site from the NNTP feed it is merged into the local edits of that site. The messages could be stamped with the username (+local edit page) of the editor.
Problems in this scenario revolve around lost NNTP messages - which can be a problem for some sites. This would cause the topic text on different sites to be permanently out of sync. The nice thing about this solution however is that it becomes completely decentralised.
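A minimal sketch of the sending side, in Python and using only the standard `difflib` module. It builds the text of an edit message from the old and new topic text. The function name `make_edit_message` and the `Subject` and `X-Wiki-Editor` header names are made up for the example, and actually posting to a newsgroup is deliberately left out.

```python
import difflib


def make_edit_message(topic, old_text, new_text, editor, context=3):
    """Return the text of an edit message, or None if nothing changed.

    The body is a unified diff with `context` lines of context around each
    change, which is what a receiving site would feed to its merge step.
    """
    diff = list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=topic,
        tofile=topic,
        n=context,
        lineterm="",
    ))
    if not diff:
        return None
    # Hypothetical headers: stamp the message with topic and editor.
    headers = [f"Subject: edit {topic}", f"X-Wiki-Editor: {editor}", ""]
    return "\n".join(headers + diff)
```

More context lines make the patch more likely to apply cleanly on a site whose copy has drifted, at the cost of larger messages.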
-- MS - 24 Jan 2004
does a three way merge of IMAP repositories...
- 14 Jan 2005
All this discussion is quite interesting, but I don't yet see it concretize into an implementation, be it a plugin or whatever else, or even just a detailed functional/technical specification. With a previous group I worked with, we had also planned something like this, but then we had no resources to implement our ideas. We did not want to use anything other than RCS, whereas the installation, especially on the satellites, had to be as easy as possible, and conflicts had to be solved on the satellite and presented in twiki format in the edited topics. (A non-technical user uses TWiki and does not want to learn another symbolic language for the conflicts. These are seen as rubbish created by TWiki, so it would be quite surprising for the user to see that twiki uses non-twiki syntax.)
Anyhow: the interaction is described in some detail on the page TWiki:Codev.TWikiWithIntermittentConnectivity?rev=1.1, but it is written in Italian. I'm slowly translating it into English. At the moment only the first lines are readable (the rest has been automatically translated), and the summary can be used too.
Since the basic ideas are probably clear enough, I wonder if there are any reactions (let alone the cry: "implement it and translate the docs!").
- 10 Mar 2005
I was thinking about this style of interaction: I want to take advantage of the presence of a web server on the satellites too, a server containing pages and scripts that can be called by the planet after it has received a set of patches.
One point of attention is that I'm not trying to keep version numbers on the satellites. All edits get shrunk into one version increment on the planet, and that is what is returned to the satellite. This is because the planet gets the differences all at once, after a possibly local history of (non-conflicting) modifications.
|| asks the user what s/he wants to synchronize
|| checks the last connection time, that no topic has unresolved conflicts, then prepares the rcs.diff file and sends them to the central patchweb script
| local:cycle (until planet calls)
|| receives the rcs.diff file, applies the patch, invokes satellite's getupdate
| local:still cycling,
| receives the complete differences between the last connection and the present, including its own data. this synchronizes topic versions. updates last connection time. sets flag so cycling can end.
|| shows the list of conflicting topics. these are kept in a recognizable format so that it cannot be later sent unresolved to the planet.
- 12 Mar 2005
There has been some discussion on Meatball about the concept of a
- an internet wiki where nodes are local computers
that can drop in and out. Not quite the same concept, but you might
find the discussion amusing, though rather academic.
These are all problems that have been addressed in the design of
distributed CM systems, such as ClearCase. Can you leverage
anything from there? For example, if each of the satellites had an SVN
checkout area, and a merge with the central server was equivalent to
checkin. Conflicts would have to be immediately flagged for
resolution. A satellite patching in to the central server would
effectively just do an
- Um, SVN is not a distributed VC system. SVK is (somewhat). -- AndyGlew Thu Oct 5 2006
- 12 Mar 2005
well, maybe it can be interesting. SVN does not look to me too
different from CVS, and I have no idea what this ClearCase is. I should
definitely give it a look...
My aim is to keep it as simple as possible. Maybe you disagree given the
results, but I was thinking of two different add-ons after a complete
twiki installation: after you have everything set up, you install a
satellite or planet plugin. The satellite plugin has to be configured
to know where the planet is (in the first instance I would not make the
planet too complex). Then we already have RCS, we already have a
web/twiki server on both ends; hence my proposed design. I think
we have been abstract enough for long enough to make a concrete
attempt. I recall having performed the steps described by hand; it
did not behave that badly, but it has to be automated. When we have a
running prototype, we can make it better.
- 12 Mar 2005
a) At least one key member of my team just plain out-and-out refuses
to use TWiki because it is not accessible while offline and
disconnected. He flies all the time, can write Word on the plane, can
write email on the plane. He can even write to Windows shared
directories on the plane, courtesy of replication and
synchronization. But he can't write wiki easily while disconnected.
b) If looking to change the underlying version control from RCS, I
recommend looking further, to one of the new generation of
distributed VC systems. BitKeeper is the best example, but
unfortunately has weird licensing. Open source distributed VC systems
include GNU Arch, Monotone and Darcs. Linus has started writing his
own, Git, now that Linux is no longer allowed to use BitKeeper.
I recommend against CVS and SVN, since they are both centralized VC
systems. Maybe it would be neat to have TWiki use CVS or SVN for
other reasons; kwiki does nearly all VC systems. But CVS and SVN
will not solve the offline wiking problem.
distributed stepchild SVK may be reasonable. However, if TWiki
goes that way, I will stop using TWiki.
Reason: the big reason why I use TWiki rather than Zope/Plone/Zwiki is
that TWiki uses ordinary files. I can use standard UNIX tools like
grep to manipulate them. And, yes, I frequently grep the ,v files.
uses a database, and SVK
is built on top of SVN
c) I started writing a tool to merge RCS ,v files. Basic idea is to
compute a content-based hash for each version, and then line up
versions between ,v files. Most of the time the merge is easy;
however, when there have been conflicting edits you need to do a merge
in much the way CVS does. And probably TWiki would want a WUI (Web
User Interface) to do such merge editing.
I.e. I merge the version history in the
files. The actual leaf
content merge is separate.
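The line-up step can be sketched in a few lines of Python. This is not the tool described above (which is not available), only an illustration of the idea under stated assumptions: the text of every revision has already been extracted from each ,v file, oldest first (for instance with `co -p`), and both histories start from a common ancestor. The names `digest` and `align_histories` are invented for the example.

```python
import hashlib


def digest(text):
    """Content-based identity of one revision."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def align_histories(local_revs, remote_revs):
    """Line up two revision histories by content hash.

    Both arguments are lists of revision texts, oldest first. Returns the
    number of leading revisions the histories share, plus the revisions
    that exist only locally and only remotely after that point.
    """
    local_hashes = [digest(t) for t in local_revs]
    remote_hashes = [digest(t) for t in remote_revs]
    shared = 0
    for a, b in zip(local_hashes, remote_hashes):
        if a != b:
            break
        shared += 1
    return shared, local_revs[shared:], remote_revs[shared:]
```

If one of the two returned tails is empty, the merge is the easy case: the other side's extra revisions are simply appended. If both are non-empty, the histories have diverged and the leaf content needs a CVS-style merge.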
It would probably be too much of a hassle to get Intel to allow my
code to be shared, and my code isn't good enough to warrant the
hassle. But the basic idea is straightforward, albeit slow.
d) Given a completed RCS file merge, and a leaf content merge
WUI, TWiki would only need to be extended to make the RCS
files available to download when an offline wiki is being spawned.
- 14 Jul 2005
Erm, just one thing....
does not use a database. it can be configured to use either
Berkeley DB, which many people would argue is not quite a database, OR
(more significantly) a fileSystem based store (though i think
binary). either way, the svnadmin dump command will give you a text
based dump of the repository, which makes it less likely that some
bugger edits your ,v files (which sadly happened with cvs)
that said, its a fair point that the ,v file system is
a useful bonus (i also have done similar).
do arch, monotone, darcs or git (linus') actually use plain text files
for the versioning info? as you are implying that they do....
in any case, it is extremely unlikely that we would drop support for
rcs (though its quality would depend on those that use it to test it)
when we start to add other backends, as my
plan is to allow
different backends to be used on the same twiki - that way freeform
data would be in some text format (with distributed data possible),
and some of the data would be in non-distributable database type
stores (potentially where twiki is not the main client of that data),
and other data sources that are a mixin of the two
All that said, any functionality in TWiki is highly dependent on the
code, docco, testing contributions of those users that need that
functionality - and this is a classic case where everyone that says
they have to have it, has not actually done it. (i don't need
but i sure would love to see it)
- 14 Jul 2005
I think the main obstacle is that the best distributed VC system is
, which is not really open. The others - darcs, arch,
monotone, git - aren't really ready for prime time.
- Thu Oct 5 2006: Git is now ready for prime time.
arch is widely used, but is highly idiosyncratic.
- 16 Jul 2005
- 26 Jul 2005
more or less implements this.
You'll only find it in SVN
- 23 Nov 2005