Generalizing TopicTrees to Facilitate Management of Document Relations

Topics such as TreePlugin and EditTopicTreeStructure discuss, and provide ways to edit, a tree structure imposed on wiki topic pages using the %META:PARENT{name="parent"}% facility.

META:PARENT tree structure insufficient

However, as I use these more and more, I realize that %META:PARENT{name="parent"}% is unsatisfactory. It's a good hint... It usually means "where a page was created from"... but it's often incomplete. For example, EmbedTopicPlugin does not (as of March 2006) use %META:PARENT{name="parent"}%.

Further, the parent link does not specify an order amongst children. You may want to imagine pages as

  • Parent
    • Child1
    • Child2
    • Child3

But TreePlugin, based on the parent link, is just as likely to give you some other order.

  • Parent
    • Child2
    • Child3
    • Child1

I have experimented with specifying the child order explicitly, via META:CHILD variables. But this just gives you more metainformation that you need to explicitly manage.

I have also experimented with inferring much of the rest of the tree structure. E.g. look at the %INCLUDE{...}% and %EmbedTopic{...}% structure. Also, infer structure from references to other pages. Such structure inference is promising, but I think that I need more...

Furthermore, it really seems that there are multiple views of the structure of topics. It might be nice to have a tree structure for an entire web; but it may also be nice to have a tree structure for a subweb. E.g. if I have a web on computer architecture, it might be nice to have a tree structure from the point of view of compilers, and a different tree structure from the point of view of hardware. Neither being a subtree of the other. These tree structures are just particular views or organizations of the knowledge.

Why TopicTrees?

Why do I want a TopicTree - a view of a set of topics - possibly an entire web, or multiple webs, or a subset of a web?

Several purposes come to mind:

  • Navigation tool
    • it can be convenient to see "all of the TopicTree" as a tree structure, rather than clicking around from page to page.
    • indeed, many hand made pages have little tree structured lists that help navigate to some sub-topics and related topics
    • TopicTrees are in most ways an attempt to automate such little navigation aids, and to ensure that they are complete.

  • Printing
    • You may want to print a set of topic pages in some order that makes it easy to read from front to back.
    • Yes - some people still read on paper printouts... :-)
    • Or... instead of printing to paper, we may still want to create a view of all the pages, using %INCLUDE{...}% and/or %EmbedTopic{...}%

Any more...?

The Basic Idea of TopicTrees

Here's my basic idea for TopicTrees:

Create a TopicTree that looks something like this

%BeginTopicTree{TopicTreeName}%
   * Root1
      * Topic1sub1
      * Topic1sub2
   * Root2
      * Topic2sub1
%EndTopicTree{TopicTreeName}%

Indeed, it might be more convenient not to have to use the %BeginTopicTree{...}% stuff:

  • TOPICTREE(TopicTreeName)
    • Root1
      • Topic1sub1
      • Topic1sub2
    • Root2
      • Topic2sub1

There may be more than one TopicTree in a page or a web - hence the TopicTreeName. (The TopicTreeName should be optional.)
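A minimal sketch of how the block notation above could be read into a data structure. It assumes the %BeginTopicTree{...}% / %EndTopicTree{...}% form with three-space indented bullets; the function names `parse_topic_trees` and `to_children` are hypothetical and are not taken from the attached scripts.

```python
import re

# Assumed notation: %BeginTopicTree{Name}% ... %EndTopicTree{Name}%, name optional.
BEGIN = re.compile(r"^%BeginTopicTree(?:\{(\w*)\})?%\s*$")
END = re.compile(r"^%EndTopicTree(?:\{(\w*)\})?%\s*$")
BULLET = re.compile(r"^((?:   )+)\* (\S+)")  # three spaces per nesting level


def parse_topic_trees(text):
    """Return {tree_name: [(depth, topic), ...]} for every block in text."""
    trees, current = {}, None
    for line in text.splitlines():
        if (m := BEGIN.match(line)):
            current = trees.setdefault(m.group(1) or "", [])
        elif END.match(line):
            current = None
        elif current is not None and (m := BULLET.match(line)):
            current.append((len(m.group(1)) // 3, m.group(2)))
    return trees


def to_children(entries):
    """Turn [(depth, topic)] into {parent: [children in order]}.

    Top-level topics are listed under the key None.
    """
    children, stack = {}, []  # stack holds the current path from a root
    for depth, topic in entries:
        del stack[depth - 1:]  # drop path entries at this depth or deeper
        parent = stack[-1] if stack else None
        children.setdefault(parent, []).append(topic)
        stack.append(topic)
    return children
```

Keeping the children as ordered lists is the point: unlike the parent link, this representation records the order amongst siblings.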

Given a TopicTree specified as above, the following operations should be available.

Find Pages Missing From TopicTree: it should be easy to create a report of topic pages that are in a web but not in a given TopicTree. Similarly for subsets of pages, e.g. those specified via a regexp, or via transitive closure from pages already in the TopicTree. (We don't need the tree structure to do this - a simple list would suffice - but we'll need the tree structure for subsequent operations.)

Such a missing pages report might look like:

  • TOPICTREE(TopicTreeName)
    • Root1
      • Topic1sub1
      • Topic1sub2
    • Root2
      • Topic2sub1
  • TOPICTREE pages not referenced in TopicTreeName
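The missing-pages check is essentially a set difference. The sketch below assumes the caller already has the list of topic names in the web and the list of names mentioned in the TopicTree; `missing_topics` and `reachable` are hypothetical names, and `links` is an assumed mapping from each topic to the topics it links to.

```python
import re


def missing_topics(web_topics, tree_topics, pattern=None):
    """Topics in the web that the TopicTree never mentions, sorted by name.

    pattern is an optional regexp restricting the check to a subset of pages.
    """
    candidates = set(web_topics)
    if pattern is not None:
        rx = re.compile(pattern)
        candidates = {t for t in candidates if rx.search(t)}
    return sorted(candidates - set(tree_topics))


def reachable(links, start):
    """Transitive closure: every topic reachable from start via links[topic]."""
    seen, todo = set(start), list(start)
    while todo:
        for target in links.get(todo.pop(), ()):
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return seen
```

To restrict the report to pages reachable from the tree, one could pass `reachable(links, tree_topics)` as `web_topics`.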

Infer a TopicTree structure from Parent, INCLUDE, and EmbedTopicPlugin structure: E.g. if Topic2sub1 INCLUDEs Topic2sub1sub1 and Topic2sub1sub2, in that order, we might infer where they should be placed under Topic2sub1. Similarly, if Topic1subUnknownOrder has Root1 as a META:PARENT, we might infer what subtree it should be placed in, but not in what order. Such a report might look like:

  • TOPICTREE(TopicTreeName)
    • Root1
    • Root2
      • Topic2sub1
        • Topic2sub1sub1 INFERRED FROM INCLUDE
        • Topic2sub1sub2 INFERRED FROM INCLUDE
  • TOPICTREE pages not referenced in TopicTreeName
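A sketch of the INCLUDE half of this inference, assuming the raw text of each topic is available as a string. It only recognizes the quoted form %INCLUDE{"TopicName" ...}% and makes no assumption about the syntax of the embed directive; the names `included_topics` and `infer_children` are hypothetical.

```python
import re

# Matches the first quoted argument of an INCLUDE directive.
INCLUDE = re.compile(r'%INCLUDE\{\s*"([^"]+)"')


def included_topics(raw_text):
    """Topics named by INCLUDE directives, in order of first appearance."""
    ordered = []
    for name in INCLUDE.findall(raw_text):
        if name not in ordered:
            ordered.append(name)
    return ordered


def infer_children(raw_by_topic):
    """Map each topic to the ordered children implied by its INCLUDEs."""
    inferred = {}
    for topic, raw in raw_by_topic.items():
        kids = included_topics(raw)
        if kids:
            inferred[topic] = kids
    return inferred
```

Because the directives are scanned in document order, the inferred children carry an order, which the parent link alone cannot supply.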

Report Inconsistencies Between Inferred and Explicit TopicTree Structure: ... obviously, these may be inconsistent. Such inconsistencies should be reported, although it should be possible to disable such a warning.

E.g. we may want to report when the parent of a node in the TopicTree is not the META:PARENT. But at other times the same node may have different parents in different TopicTrees.

Similarly, INCLUDE and EmbedTopicPlugin imply a tree structure. We could report when it is inconsistent, both parenthood and ordering.
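The parenthood half of such a report could be sketched as below. It assumes the explicit tree is held as a {parent: [children]} mapping and that the recorded parent of each topic has already been extracted into a dictionary; `parent_mismatches` is a hypothetical name. Ordering checks are not covered.

```python
def parent_mismatches(tree_children, meta_parent):
    """List (topic, tree_parent, recorded_parent) where the two sources disagree.

    tree_children: {parent: [children]} from an explicit TopicTree.
    meta_parent:   {topic: parent name recorded in the topic's metadata}.
    Topics with no recorded parent are skipped rather than reported.
    """
    mismatches = []
    for parent, kids in tree_children.items():
        for kid in kids:
            recorded = meta_parent.get(kid)
            if recorded is not None and recorded != parent:
                mismatches.append((kid, parent, recorded))
    return mismatches
```

Returning the mismatches as data, rather than printing them, leaves it to the caller to decide whether the warning is shown or suppressed.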

Merge TopicTrees: There may be several different TopicTrees specified in the document - hence the names. Where TopicTrees overlap - e.g. where the root of one tree is a leaf of the other - they may be merged to form a bigger effective TopicTree.
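One possible shape for the merge, as a sketch only. It assumes each tree is a {parent: [children]} mapping keyed by real topic names, and it does not decide how inconsistent trees should be resolved: it keeps the first tree's version and reports the conflict. `merge_trees` is a hypothetical name.

```python
def merge_trees(a, b):
    """Merge two {parent: [children]} maps; return (merged, conflicts).

    A parent described by only one tree is copied over, which is what grafts
    a tree whose root is a leaf of the other. A parent described by both with
    different child lists keeps the version from `a` and is listed in conflicts.
    """
    merged = {parent: list(kids) for parent, kids in a.items()}
    conflicts = []
    for parent, kids in b.items():
        if parent not in merged:
            merged[parent] = list(kids)
        elif merged[parent] != list(kids):
            conflicts.append(parent)
    return merged, conflicts
```

For example, merging {"Root1": ["X", "Y"]} with {"Y": ["Y1", "Y2"]} hangs Y1 and Y2 under the leaf Y of the first tree.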

Update META:PARENT information: we could update the META:PARENT information, based on explicit and inferred TopicTrees.

Create a master document that INCLUDEs or Embeds all the pages in a TopicTree: We could create such a big document, for printout, mailing, or your viewing pleasure. Here it is probably necessary to take into account inconsistencies between INCLUDE/EmbedTopicPlugin, and the TopicTree.

E.g. we would probably not want to print a topic page twice, once where it was embedded, and a second time according to the TopicTree.
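A sketch of generating such a master document while avoiding the double printing just described. It assumes a {parent: [children]} mapping and a set of topics that some other page already embeds; `master_document` and its parameters are hypothetical names.

```python
def master_document(children, roots, already_embedded=()):
    """Emit one INCLUDE line per topic, depth first, never repeating a topic.

    already_embedded: topics that some page in the tree embeds itself, so an
    extra INCLUDE here would print them twice. Their children are still visited.
    """
    embedded = set(already_embedded)
    seen = set()  # also guards against cycles in the tree
    lines = []

    def visit(topic):
        if topic in seen:
            return
        seen.add(topic)
        if topic not in embedded:
            lines.append('%INCLUDE{"' + topic + '"}%')
        for child in children.get(topic, []):
            visit(child)

    for root in roots:
        visit(root)
    return "\n".join(lines)
```

The depth-first walk follows the order given in the TopicTree, which is what makes the result readable from front to back.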

Other Ways of Viewing Structure

There are other ways of viewing structure - the relationships between pages. E.g. various graphs and visualization tools.

These are, many of them, great.

However, a serializable tree structure is still one of the most useful. If only because it can be printed.

-- AndyGlew - 27 Mar 2006

Discussion

I wrote the first step of TopicTree - a script to find files not mentioned in the TopicTree. Attached below.

This script is currently UNIX CLI. Written that way because it is easiest for me. I'll translate it to a TWiki plugin when I figure out FilesystemToolToTWikiPluginAdapter.

Unfortunately, this means the script has to be run externally, e.g. from a cron job. It really needs to be a plugin.

-- AndyGlew - 29 Mar 2006

Continuing, sporadic, work: now, instead of just printing a list of the missing files, I infer their tree structure from the INCLUDE and Embed directives within them. Doesn't help if you don't use INCLUDE or Embed, but it's a start.

Also has the ability to infer tree structure for an entire web based on INCLUDE and EmbedTopic.

Still a UNIX CLI script; not TWiki pluginified.

  • topictree.tgz: script and perl module to infer tree structure of files missing from topictree

-- AndyGlew - 04 Apr 2006

Andy - you probably want to write it as a twikishell CommandSet - this will smooth the transition between testing your script from the command line and exposing that same functionality over the web.

-- MartinCleaver - 04 Apr 2006

Try TWikiIRC and talking to RafaelAlverez or me.

-- MartinCleaver - 04 Apr 2006

I'll give this a try, Martin - I'm just not sure when I'll have time to work on it in the near future.

By the way, I seek advice, suggestions on two topics:

a) I am pretty happy with how TopicTree is currently working with the manual editing, but...

b) I want still more automatic inference of tree structure. E.g. if I have not got a manual TopicTree for a few topics, I would like to infer it, e.g. from parent or linkage

c) I am pretty sure that I want to be able to "merge" TopicTrees. E.g. I have often done things like having a mini TOC that points to related pages at the top of every subsection.

Merging is straightforward when all are consistent. I am not sure what to do if the trees are inconsistent. Any ideas?

d) I actually have enough right now to be able to set the META:TOPICPARENT link appropriately - which is why I started. But I have noticed that I often want the same topic to appear in more than one place in the TopicTree. If so, which "parent" should I use? The first encountered? The last? ...?

I am basically noticing that the TopicTree makes META:PARENT irrelevant.

e) But, the biggest design issue - i.e. the thing that takes the most manual work - is, what to do about DanglingLinks?

I spent a few hours a few days ago setting up my original topic tree. Most of the time was spent not organizing existing pages, but organizing the nonexisting pages. I took the DanglingLinks list that I generate through yet another of my scripts - where is the link for that, now? - and manually edited them into a TopicTree.

Now, though, I still get new DanglingLinks that are not in the TopicTree. But whereas new EmbedTopicPlugin links have a natural place in the TopicTree, the DanglingLinks do not. So I still have to move them around by hand.

I am thinking that maybe I should "guess" at a parent for any DanglingLink. E.g. if referred to in only one place, that page?
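The single-referrer guess could be sketched as follows, assuming the DanglingLinks list is available as a mapping from each nonexistent topic to the pages that link to it. `guess_parents` is a hypothetical name.

```python
def guess_parents(referrers):
    """referrers: {dangling_topic: [pages that link to it]}.

    Returns (guesses, undecided): a dangling topic linked from exactly one
    page is placed under that page; anything else is left for manual editing.
    """
    guesses, undecided = {}, []
    for topic, pages in referrers.items():
        unique = set(pages)
        if len(unique) == 1:
            guesses[topic] = next(iter(unique))
        else:
            undecided.append(topic)
    return guesses, sorted(undecided)
```

Topics referred to from several pages stay in the undecided list, so the manual work shrinks to the ambiguous cases.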

Anyway, suggestions appreciated.

-- AndyGlew - 06 Apr 2006

I continue to work occasionally and sporadically on TopicTree.

This most recent upload is a more complete directory tree - mainly sanitized, although it still contains some hardwired paths. Provided in the hopes that others will be interested. Features:
  • I am getting to the point where I am almost willing to depend on TopicTree.
  • I generate it from cron regularly, circa every half hour.
    • I would have cron overwrite the TopicTree file, instead of creating TopicTreeWithMissingFilesCron, if there was an easy way to make this atomic. But I don't know of such a way. It is strange that the hardware community is thinking about transactional memory, when the OS community does not...
  • I added the "noauto" flag to lines between %!BeginTopicTree%/%!EndTopicTree% to allow the noise level for various standard files like WebHome and TopicTree itself to be reduced
  • I added -print_style replace-topictree to, well, replace the topictree in-place - what I would want if cron were to overwrite.
  • Along the way, print the list of missing files, etc., as %!BeginTopicTreeAutoStuff%/%!EndTopicTreeAutoStuff%
  • Improved handling of cross-web links
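On the atomic overwrite question above: one common approach on POSIX systems is to write the new contents to a temporary file in the same directory and then rename it over the target, since a rename within one filesystem replaces the file atomically. A minimal sketch, with `atomic_write` as a hypothetical name; it writes the file directly and does nothing about the wiki's own revision history.

```python
import os
import tempfile


def atomic_write(path, text):
    """Replace path with text so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem)
    # as the target, otherwise the final rename is not atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".topictree-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)  # do not leave a stray temporary file behind
        raise
```

A cron job would build the regenerated tree in memory and call `atomic_write` once at the end.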

TBD:

  • make into a plugin
    • or twikish
    • I am thinking about using Perl's object-oriented I/O, so that any Perl script can be made to run "inside" twiki
  • add a button to say "regenerate topictree now!"
  • merge multiple topictrees

-- AndyGlew - 16 May 2006

Hi Andy, just to let you know that this work is greatly appreciated! I am following your progress.

-- ArthurClemens - 16 May 2006

Thanks for the comment, Arthur. It sometimes feels like I am just talking to myself.

Now, on to the next topic:

I use a lot of REDIRECTs, as in RedirectPlugin.

Or, rather, I would like to use a lot of redirects. But I find RedirectPlugin too hard to use.

Why use redirects? Well, for example: I often define NewTerm1 and NewTerm2. I drop links to these all over the place. But, I don't want to create a separate page for each - instead I want to create a single page NewTerms, that defines them both in contrast to each other. Hence I want NewTerm1 and NewTerm2 to redirect to NewTerms.

So, tonight, I added this to TopicTree:

I.e. I can use the TopicTree as a sort of centralized database from which I can create the redirects. (I use conventional RedirectPlugin for the actual work.)

It works well enough that I'll keep writing my documentation this way.

Note: the RedirectPlugin folk say that you should not overwrite a topic with a redirect. I find that I really want to. Maybe I should warn against overwriting a non-trivial topic. And certainly I should version control these files.

By the way, this leads to an undesirable proliferation of small redirect topics. Someday we will have to give in and put them all in a single file - a single file containing multiple redirects. This is an area where database-driven wikis have an advantage.

I'm not posting this new change yet. Maybe this weekend.

-- AndyGlew - 17 May 2006

In case anyone cares, I have code that assembles a nice PDF from a list of topics in a topictree. I am now using this to publish collections of wiki pages as standalone documents.

-- AndyGlew - 28 Jan 2008

That is a nice feature!

-- ArthurClemens - 28 Jan 2008

I'd be interested too.

-- StephaneLenclud - 31 Jan 2008

Topic attachments
| Attachment | History | Size | Date | Who | Comment |
| find-unwritten-wiki-pages.tgz | r1 | 30.0 K | 2006-05-16 - 07:56 | AndyGlew | more topictree stuff |
| topictree-find-unmentioned-pages.pl.txt | r1 | 3.3 K | 2006-03-29 - 20:37 | AndyGlew | script to maintain topictree. not TWiki integrated |
| topictree.tgz | r1 | 7.4 K | 2006-04-04 - 07:30 | AndyGlew | script and perl module to infer tree structure of files missing from topictree |
Topic revision: r15 - 2008-02-04 - PeterThoeny