r15 - 04 Feb 2008 - 06:01:37 - PeterThoeny

Generalizing TopicTrees to Facilitate Management of Document Relations

Topics such as TreePlugin and EditTopicTreeStructure discuss and provide ways to edit a tree structure imposed on wiki topic pages using the %META:PARENT{name="parent"}% facility.

META:PARENT tree structure insufficient

However, the more I use these, the more I realize that %META:PARENT{name="parent"}% is unsatisfactory. It is a good hint, and it usually means "where a page was created from", but it is often incomplete. For example, EmbedTopicPlugin does not (as of March 2006) use %META:PARENT{name="parent"}%.

Further, the parent link does not specify an order amongst children. You may want to imagine pages as

  • Parent
    • Child1
    • Child2
    • Child3

But TreePlugin, based on the parent link, is just as likely to give you some other order.

  • Parent
    • Child2
    • Child3
    • Child1

I have experimented with specifying the child order explicitly, via META:CHILD variables. But this just gives you more metainformation that you need to explicitly manage.

I have also experimented with inferring much of the rest of the tree structure, e.g. by looking at the %INCLUDE{...}% and %EmbedTopic{...}% structure, or by inferring structure from references to other pages. Such structure inference is promising, but I think that I need more...

Furthermore, it really seems that there are multiple views of the structure of topics. It might be nice to have a tree structure for an entire web, but it may also be nice to have a tree structure for a subweb. E.g. if I have a web on computer architecture, it might be nice to have one tree structure from the point of view of compilers and a different tree structure from the point of view of hardware, neither being a subtree of the other. These tree structures are just particular views or organizations of the knowledge.

Why TopicTrees?

Why do I want a TopicTree - a view of a set of topics - possibly an entire web, or multiple webs, or a subset of a web?

Several purposes come to mind:

  • Navigation tool
    • it can be convenient to see "all of the TopicTree" as a tree structure, rather than clicking around from page to page.
    • indeed, many hand made pages have little tree structured lists that help navigate to some sub-topics and related topics
    • TopicTrees are in most ways an attempt to automate such little navigation aids, and to ensure that they are complete.

  • Printing
    • You may want to print a set of topic pages in some order that makes it easy to read from front to back.
    • Yes - some people still read on paper printouts... :-)
    • Or... instead of printing to paper, we may still want to create a view of all the pages, using %INCLUDE{...}% and/or %EmbedTopic{...}%

Any more...?

The Basic Idea of TopicTrees

Here's my basic idea for TopicTrees:

Create a TopicTree that looks something like this

%BeginTopicTree{TopicTreeName}%
   * Root1
      * Topic1sub1
      * Topic1sub2
   * Root2
      * Topic2sub1
%EndTopicTree{TopicTreeName}%

Indeed, it might be more convenient not to have to use the %BeginTopicTree{...}% stuff:

  • TOPICTREE(TopicTreeName)
    • Root1
      • Topic1sub1
      • Topic1sub2
    • Root2
      • Topic2sub1

There may be more than one TopicTree in a page or a web - hence the TopicTreeName. (The TopicTreeName should be optional.)
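A minimal Python sketch of reading such a block, assuming the markers sit on their own lines and that bullets use TWiki's three-spaces-per-level indentation as in the example above. The function name `parse_topic_tree` and the `(depth, topic)` representation are hypothetical, not taken from the attached scripts.

```python
import re

def parse_topic_tree(text, name=None):
    """Return (depth, topic) pairs for bullets between the Begin/End markers."""
    suffix = "{%s}" % name if name else ""   # the TopicTreeName is optional
    begin = "%BeginTopicTree" + suffix + "%"
    end = "%EndTopicTree" + suffix + "%"
    entries, inside = [], False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == begin:
            inside = True
        elif stripped == end:
            inside = False
        elif inside:
            # Assumed bullet syntax: multiples of three spaces, "* ", topic name.
            m = re.match(r"^((?:   )+)\* (\S+)", line)
            if m:
                entries.append((len(m.group(1)) // 3, m.group(2)))
    return entries

SAMPLE = """%BeginTopicTree{Demo}%
   * Root1
      * Topic1sub1
      * Topic1sub2
   * Root2
      * Topic2sub1
%EndTopicTree{Demo}%"""

if __name__ == "__main__":
    for depth, topic in parse_topic_tree(SAMPLE, "Demo"):
        print("  " * (depth - 1) + topic)
```

Keeping the result as a flat list of depth/topic pairs preserves the child order, which is exactly the information the parent link alone does not carry.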

Given a TopicTree specified as above, the following operations should be available.

Find Pages Missing From TopicTree: it should be easy to create a report of topic pages that are in a web, but which are not in a given TopicTree. The same should work for subsets of pages, e.g. specified via a regexp, or via transitive closure from pages already in the TopicTree. (We don't need the tree structure to do this - a simple list would suffice - but we'll need the tree structure for subsequent operations.)

Such a missing pages report might look like:

  • TOPICTREE(TopicTreeName)
    • Root1
      • Topic1sub1
      • Topic1sub2
    • Root2
      • Topic2sub1
  • TOPICTREE pages not referenced in TopicTreeName
    • Topic2sub1sub1
    • Topic2sub1sub2
    • Topic1subUnknownOrder
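The missing-pages report can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the list of all topics is passed in by the caller (in practice it might come from listing the web's topic files), and the bullet syntax is assumed to be `* TopicName` with leading indentation.

```python
import re

def topics_in_tree(tree_text):
    """Collect every topic named on a bullet line, ignoring the tree shape."""
    return set(re.findall(r"^[ \t]+\* (\w+)", tree_text, flags=re.M))

def missing_topics(all_topics, tree_text, pattern=None):
    """Topics present in the web but absent from the tree.

    pattern: optional regexp restricting the report to a subset of pages.
    """
    mentioned = topics_in_tree(tree_text)
    candidates = [t for t in all_topics
                  if pattern is None or re.search(pattern, t)]
    return sorted(t for t in candidates if t not in mentioned)

TREE = """   * Root1
      * Topic1sub1
      * Topic1sub2
   * Root2
      * Topic2sub1
"""

WEB = ["Root1", "Topic1sub1", "Topic1sub2", "Root2", "Topic2sub1",
       "Topic2sub1sub1", "Topic2sub1sub2", "Topic1subUnknownOrder"]

if __name__ == "__main__":
    for topic in missing_topics(WEB, TREE):
        print(topic)
```

As noted above, only a flat set of names is needed for this step; the tree structure matters for the later operations.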

Infer a TopicTree structure from Parent, INCLUDE and EmbedTopicPlugin structure: E.g. if Topic2sub1 INCLUDEs Topic2sub1sub1 and Topic2sub1sub2, in that order, we might infer where they should be placed under Topic2sub1. Similarly, if Topic1subUnknownOrder has Root1 as a META:PARENT, we might infer what subtree it should be placed in, but not in what order. Such a report might look like:

  • TOPICTREE(TopicTreeName)
    • Root1
      • Topic1sub1
      • Topic1sub2
      • CHILDREN INFERRED FROM META:PARENT, UNKNOWN ORDER
        • Topic1subUnknownOrder
    • Root2
      • Topic2sub1
        • Topic2sub1sub1 INFERRED FROM INCLUDE
        • Topic2sub1sub2 INFERRED FROM INCLUDE
  • TOPICTREE pages not referenced in TopicTreeName
    • Topic2sub1sub1
    • Topic2sub1sub2
    • Topic1subUnknownOrder
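A rough Python sketch of the INCLUDE-based inference, assuming each topic's raw text is available as a string and that includes are written as `%INCLUDE{"Topic"}%` or `%INCLUDE{"Web.Topic"}%`. The helper names are hypothetical, and EmbedTopicPlugin directives are left out because their exact syntax is not given here.

```python
import re

# Matches the topic name in %INCLUDE{"Topic" ...}% or %INCLUDE{"Web.Topic" ...}%.
INCLUDE_RE = re.compile(r'%INCLUDE\{\s*"(?:\w+\.)?(\w+)"')

def included_children(topic_text):
    """Topics INCLUDEd by a page, in order of first appearance."""
    seen, ordered = set(), []
    for child in INCLUDE_RE.findall(topic_text):
        if child not in seen:
            seen.add(child)
            ordered.append(child)
    return ordered

def infer_tree(pages):
    """pages: dict of topic name -> raw text. Returns topic -> ordered children."""
    tree = {}
    for topic, text in pages.items():
        children = included_children(text)
        if children:
            tree[topic] = children
    return tree

PAGES = {
    "Topic2sub1": 'Intro\n%INCLUDE{"Topic2sub1sub1"}%\n'
                  '%INCLUDE{"Main.Topic2sub1sub2"}%\n',
    "Topic2sub1sub1": "leaf text",
}

if __name__ == "__main__":
    print(infer_tree(PAGES))
```

Unlike the parent link, the INCLUDE order gives both the parent and the position among siblings, which is why the two kinds of inferred children are reported differently above.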

Report Inconsistencies Between Inferred and Explicit TopicTree Structure: ... obviously, these may be inconsistent. Such inconsistencies should be reported, although it should be possible to disable such a warning.

E.g. we may want to report when the parent of a node in the TopicTree is not the META:PARENT. But at other times the same node may have different parents in different TopicTrees.

Similarly, INCLUDE and EmbedTopicPlugin imply a tree structure. We could report when it is inconsistent, both parenthood and ordering.
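One way the parenthood check could look, as a Python sketch. It assumes the explicit tree is available as `(depth, topic)` pairs in document order and that the META:PARENT values have already been extracted into a dict; both representations and both function names are hypothetical.

```python
def tree_parents(entries):
    """entries: (depth, topic) pairs in document order -> [(topic, parent)]."""
    stack, pairs = [], []
    for depth, topic in entries:
        del stack[depth - 1:]          # drop ancestors at this depth or deeper
        pairs.append((topic, stack[-1] if stack else None))
        stack.append(topic)
    return pairs

def parent_mismatches(entries, meta_parent):
    """Report (topic, parent in tree, META:PARENT) where the two disagree."""
    return [(topic, parent, meta_parent[topic])
            for topic, parent in tree_parents(entries)
            if parent is not None
            and topic in meta_parent
            and meta_parent[topic] != parent]

ENTRIES = [(1, "Root1"), (2, "Topic1sub1"), (2, "Topic1sub2"),
           (1, "Root2"), (2, "Topic2sub1")]
META = {"Topic1sub1": "Root1", "Topic2sub1": "Root1"}

if __name__ == "__main__":
    for topic, in_tree, in_meta in parent_mismatches(ENTRIES, META):
        print("%s: tree says %s, META:PARENT says %s" % (topic, in_tree, in_meta))
```

Because a topic may legitimately have different parents in different TopicTrees, the result is best treated as a list of warnings that can be switched off, not as errors.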

Merge TopicTrees: There may be several different TopicTrees specified in the document - hence the names. Where TopicTrees overlap - e.g. where the root of 1 tree is a leaf of the other - they may be merged to form a bigger effective TopicTree.
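A small Python sketch of the consistent case, assuming each TopicTree is represented as a dict mapping a topic to its ordered list of children (a hypothetical representation). When both trees give children for the same topic and the lists differ, the sketch keeps the first tree's list and reports the topic, since what to do with inconsistent trees is not settled here.

```python
def merge_trees(a, b):
    """a, b: dict of topic -> ordered children. Returns (merged, conflicts)."""
    merged = {topic: list(kids) for topic, kids in a.items()}
    conflicts = []
    for topic, kids in b.items():
        if not merged.get(topic):
            # Topic is absent from a, or is a leaf there: graft b's subtree.
            merged[topic] = list(kids)
        elif kids and merged[topic] != list(kids):
            conflicts.append(topic)    # both trees list children, differently
    return merged, conflicts

def roots(tree):
    """Topics that have children but are nobody's child."""
    children = {child for kids in tree.values() for child in kids}
    return [topic for topic in tree if topic not in children]

if __name__ == "__main__":
    big = {"Root1": ["Topic1sub1", "Topic1sub2"]}
    small = {"Topic1sub2": ["Deep1", "Deep2"]}
    merged, conflicts = merge_trees(big, small)
    print(merged, conflicts, roots(merged))
```

Here the root of the second tree is a leaf of the first, so the merged tree has a single root and no conflicts.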

Update META:PARENT information: we could update the META:PARENT information, based on explicit and inferred TopicTrees.

Create a master document that INCLUDEs or Embeds all the pages in a TopicTree: We could create such a big document, for printout, mailing, or your viewing pleasure. Here it is probably necessary to take into account inconsistencies between INCLUDE/EmbedTopicPlugin, and the TopicTree.

E.g. we would probably not want to print a topic page twice, once where it was embedded, and a second time according to the TopicTree.
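A Python sketch of generating such a master document while avoiding the double printing. It assumes the topics are already listed in TopicTree order and that a dict records which topics each page already INCLUDEs or embeds; the function name and both inputs are hypothetical.

```python
def master_document(ordered_topics, embedded_by):
    """Emit one %INCLUDE line per topic not already pulled in by an earlier page.

    ordered_topics: topics in TopicTree (depth-first) order.
    embedded_by: dict of topic -> topics it already INCLUDEs or embeds.
    """
    covered, lines = set(), []
    for topic in ordered_topics:
        if topic in covered:
            continue                   # already printed via an earlier page
        lines.append('%%INCLUDE{"%s"}%%' % topic)
        pending = [topic]
        while pending:                 # follow embeds transitively
            current = pending.pop()
            if current not in covered:
                covered.add(current)
                pending.extend(embedded_by.get(current, ()))
    return "\n".join(lines)

if __name__ == "__main__":
    order = ["Root2", "Topic2sub1", "Topic2sub1sub1", "Topic2sub1sub2"]
    embeds = {"Topic2sub1": ["Topic2sub1sub1", "Topic2sub1sub2"]}
    print(master_document(order, embeds))
```

The sketch only suppresses a page that appears after the page embedding it; a page listed earlier in the TopicTree than its embedder would still appear twice, which is one of the inconsistencies that would need reporting.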

Other Ways of Viewing Structure

There are other ways of viewing structure - the relationships between pages. E.g. various graphs and visualization tools.

These are, many of them, great.

However, a serializable tree structure is still one of the most useful, if only because it can be printed.

-- AndyGlew - 27 Mar 2006

Discussion

I wrote the first step of TopicTree - a script to find files not mentioned in the TopicTree. Attached below.

This script is currently UNIX CLI. Written that way because it is easiest for me. I'll translate it to a TWiki plugin when I figure out FilesystemToolToTWikiPluginAdapter.

Unfortunately, this means the script has to be run externally, e.g. from a cron job. It really needs to be a plugin.

-- AndyGlew - 29 Mar 2006

Continuing, sporadic, work: now, instead of just printing a list of the missing files, I infer their tree structure from the INCLUDE and Embed directives within them. Doesn't help if you don't use INCLUDE or Embed, but it's a start.

Also has the ability to infer tree structure for an entire web based on INCLUDE and EmbedTopic.

Still a UNIX CLI script; not TWiki pluginified.

  • topictree.tgz: script and perl module to infer tree structure of files missing from topictree

-- AndyGlew - 04 Apr 2006

Andy - you probably want to write it as a twikishell CommandSet - this will smooth the transition between testing your script from the command line and exposing that same functionality over the web.

-- MartinCleaver - 04 Apr 2006

Try TWikiIRC and talking to RafaelAlverez or me.

-- MartinCleaver - 04 Apr 2006

I'll give this a try, Martin - I'm just not sure when I'll have time to work on it in the near future.

By the way, I seek advice and suggestions on several topics:

a) I am pretty happy with how TopicTree is currently working with the manual editing, but...

b) I want still more automatic inference of tree structure. E.g. if I have not got a manual TopicTree for a few topics, I would like to infer it, e.g. from parent or linkage.

c) I am pretty sure that I want to be able to "merge" TopicTrees. E.g. I have often done things like having a mini TOC that points to related pages at the top of every subsection.

Merging is straightforward when all are consistent. I am not sure what to do if the trees are inconsistent. Any ideas?

d) I actually have enough right now to be able to set the META:TOPICPARENT link appropriately - which is why I started. But I have noticed that I often want the same topic to appear in more than one place in the TopicTree. If so, which "parent" should I use? The first encountered? The last? ...?

I am basically noticing that the TopicTree makes META:PARENT irrelevant.

e) But, the biggest design issue - i.e. the thing that takes the most manual work - is, what to do about DanglingLinks?

I spent a few hours a few days ago setting up my original topic tree. Most of the time was spent not organizing existing pages, but organizing the nonexisting pages. I took the DanglingLinks list that I generate through yet another of my scripts - where is the link for that, now? - and manually edited them into a TopicTree.

Now, though, I still get new DanglingLinks that are not in the TopicTree. But whereas new EmbedTopicPlugin links have a natural place in the TopicTree, the DanglingLinks do not. So I still have to move them around by hand.

I am thinking that maybe I should "guess" at a parent for any DanglingLink. E.g. if it is referred to in only one place, use that page?
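The "referred to in only one place" guess can be sketched as follows in Python. The WikiWord pattern is a rough approximation of TWiki's linking rules, and the function name and the page dict are hypothetical.

```python
import re

# Rough approximation of a WikiWord: capital(s), lowercase/digits, capital(s), rest.
WIKIWORD = re.compile(r"\b[A-Z]+[a-z0-9]+[A-Z]+[A-Za-z0-9]*\b")

def guess_parents(pages):
    """pages: dict of existing topic -> raw text.

    Returns dangling link -> guessed parent, or None when the link is
    referenced from more than one page and no single guess is possible.
    """
    referrers = {}
    for topic, text in pages.items():
        for word in set(WIKIWORD.findall(text)):
            if word not in pages:      # link target does not exist yet
                referrers.setdefault(word, set()).add(topic)
    return {word: (next(iter(refs)) if len(refs) == 1 else None)
            for word, refs in referrers.items()}

PAGES = {
    "CacheDesign": "See WriteBuffer and BranchPredictor.",
    "PipelineNotes": "BranchPredictor feeds CacheDesign.",
}

if __name__ == "__main__":
    for link, parent in sorted(guess_parents(PAGES).items()):
        print(link, "->", parent or "(ambiguous)")
```

Links referenced from several pages are left unplaced, so they would still need to be moved by hand.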

Anyway, suggestions appreciated.

-- AndyGlew - 06 Apr 2006

I continue to work occasionally and sporadically on TopicTree.

This most recent upload is a more complete directory tree - mainly sanitized, although it still contains some hardwired paths. Provided in the hopes that others will be interested. Features:
  • I am getting to the point where I am almost willing to depend on TopicTree.
  • I generate it from cron regularly, circa every half hour.
    • I would have cron overwrite the TopicTree file, instead of creating TopicTreeWithMissingFilesCron, if there were an easy way to make this atomic. But I don't know of such a way. It is strange that the hardware community is thinking about transactional memory, when the OS community does not...
  • I added the "noauto" flag to lines between %!BeginTopicTree%/%!EndTopicTree% to allow the noise level for various standard files like WebHome and TopicTree itself to be reduced
  • I added -print_style replace-topictree to, well, replace the topictree in-place - what I would want if cron were to overwrite.
  • Along the way, print the list of missing files, etc., as %!BeginTopicTreeAutoStuff%/%!EndTopicTreeAutoStuff%
  • Improved handling of cross-web links
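On the atomic overwrite mentioned in the list above: on POSIX file systems, renaming a file over an existing one within the same file system is atomic, so a common approach is to write a temporary file next to the target and rename it into place. A minimal Python sketch, with a hypothetical function name and temporary-file prefix:

```python
import os
import tempfile

def atomic_write(path, text):
    """Replace path with text so readers see the old or new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same file system).
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".topictree-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(tmp, path)          # atomic rename over the target
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

if __name__ == "__main__":
    scratch = tempfile.mkdtemp()
    target = os.path.join(scratch, "TopicTree.txt")
    atomic_write(target, "   * Root1\n")
    print(open(target).read())
```

This only guards against a reader seeing a half-written file. It does not prevent a lost update if someone edits the topic through the wiki between the cron job's read and its write, and the new file gets the temporary file's permissions and none of the wiki's revision history, so both would need separate handling.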

TBD:

  • make into a plugin
    • or twikish
    • I am thinking about using Perl's object oriented I/O, so that any Perl script can be made to run "inside" twiki
  • add a button to say "regenerate topictree now!"
  • merge multiple topictrees

-- AndyGlew - 16 May 2006

Hi Andy, just to let you know that this work is greatly appreciated! I am following your progress.

-- ArthurClemens - 16 May 2006

Thanks for the comment, Arthur. It sometimes feels like I am just talking to myself.

Now, on to the next topic:

I use a lot of REDIRECTs, as in RedirectPlugin.

Or, rather, I would like to use a lot of redirects. But I find RedirectPlugin too hard to use.

Why use redirects? Well, for example: I often define NewTerm1 and NewTerm2. I drop links to these all over the place. But, I don't want to create a separate page for each - instead I want to create a single page NewTerms, that defines them both in contrast to each other. Hence I want NewTerm1 and NewTerm2 to redirect to NewTerms.

So, tonight, I added this to TopicTree:

  • NewTerms
    • redirect-to-above-from: NewTerm1
    • redirect-to-above-from: NewTerm2

I.e. I can use the TopicTree as a sort of centralized database from which I can create the redirects. (I use conventional RedirectPlugin for the actual work.)
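A Python sketch of extracting the redirects from such entries. It assumes "above" means the nearest preceding bullet that is not itself a redirect line, and that the body of a redirect topic is a single `%REDIRECT{"Target"}%` directive in the style of RedirectPlugin; both are assumptions, and the function names are hypothetical.

```python
import re

PREFIX = "redirect-to-above-from:"

def redirects_from_tree(tree_text):
    """Return a dict of redirecting topic -> target topic."""
    target, redirects = None, {}
    for line in tree_text.splitlines():
        m = re.match(r"^\s+\* (.+?)\s*$", line)
        if not m:
            continue
        item = m.group(1)
        if item.startswith(PREFIX):
            source = item[len(PREFIX):].strip()
            if target and source:
                redirects[source] = target
        else:
            target = item              # an ordinary topic bullet
    return redirects

def redirect_topic_text(target):
    """Body for a redirect topic (assumed RedirectPlugin-style syntax)."""
    return '%%REDIRECT{"%s"}%%' % target

TREE = """   * NewTerms
      * redirect-to-above-from: NewTerm1
      * redirect-to-above-from: NewTerm2
   * OtherTopic
"""

if __name__ == "__main__":
    for source, target in sorted(redirects_from_tree(TREE).items()):
        print(source, "=>", redirect_topic_text(target))
```

A script writing these topics out would probably want to refuse to overwrite any existing topic that is not already a bare redirect.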

It works well enough that I'll keep writing my documentation this way.

Note: the RedirectPlugin folk say that you should not overwrite a topic with a redirect. I find that I really want to. Maybe I should warn against overwriting a non-trivial topic. And certainly I should version control these files.

By the way, this leads to an undesirable proliferation of small redirect topics. Someday we will have to give in and put them all in a single file - a single file containing multiple redirects. This is an area where database-driven wikis have an advantage.

I'm not posting this new change yet. Maybe this weekend.

-- AndyGlew - 17 May 2006

In case anyone cares, I have code that assembles a nice PDF from a list of topics in a topictree. I am now using this to publish collections of wiki pages as standalone documents.

-- AndyGlew - 28 Jan 2008

That is a nice feature!

-- ArthurClemens - 28 Jan 2008

I'd be interested too.

-- StephaneLenclud - 31 Jan 2008

 

Topic attachments
| Attachment | Size | Date | Who | Comment |
| find-unwritten-wiki-pages.tgz | 30.0 K | 16 May 2006 - 07:56 | AndyGlew | more topictree stuff |
| topictree-find-unmentioned-pages.pl.txt | 3.3 K | 29 Mar 2006 - 20:37 | AndyGlew | script to maintain topictree. not TWiki integrated |
| topictree.tgz | 7.4 K | 04 Apr 2006 - 07:30 | AndyGlew | script and perl module to infer tree structure of files missing from topictree |