
Feature Proposal: More actions page to be able to suppress "in all public webs" links

Motivation

If there are hundreds or thousands of webs on a site, "in all public webs" links on the "More actions" page are impractical.

Description and Documentation

A NOINALLPUBLICWEBS preference variable is introduced. If it is on, "in all public webs" links on the "More actions" page are suppressed.

Examples

Impact

Implementation

-- Contributors: HideyoImazu - 2012-09-14

Discussion

In the release meeting held on 2012-09-14 GMT, TimotheLitt pointed out the following issues.

  • Simply disabling the feature to other webs is too binary
  • A negative preference variable is difficult to grasp

He proposed defining clusters of webs. In a large site, instead of doing things "in all public webs", you would be able to do them "among the webs in FooCluster".

-- HideyoImazu - 2012-09-14

  • Feedback from JerusalemReleaseMeeting2012x09x14:
    • First considered no-brainer
    • Then idea to use web clusters instead; rename would look in all webs of web's cluster only

-- PeterThoeny - 2012-09-14

Brainstorming on WebClusters feature:

  • New configure setting {EnableWebClusters}
  • If set, an "all webs" action (rename, delete, backlink, ...) is done only on cluster(s) a web belongs to instead of all webs
  • WebPreferences setting to set clusters, as defined by topics in the Main web:
  • Cluster topics in Main web could list all webs that are in that cluster.
  • For performance, the cluster info should probably be cached

-- PeterThoeny - 2012-09-14

I suppose clustering is orthogonal to this proposal and should be a proposal of its own, because {EnableWebClusters} being true doesn't necessarily mean suppressing "in all public webs" links. Some site owners may need both.

If NOINALLPUBLICWEBS is confusing, how about SUPPRESSINALLPUBLICWEBS?

--

At first, I thought about cluster-defining topics in the %USERSWEB% web, but now I'm negative about it, because cluster-defining topics are redundant and cause inconsistency. What if a FooWeb has "Set WEBCLUSTERS = Main.FooCluster" but Main.FooCluster doesn't have FooWeb? Things get more complicated if we allow a web to belong to multiple clusters. So cluster memberships are defined only by "Set WEBCLUSTERS =" on WebPreferences files.

On a large site, visiting WebPreferences of all webs takes too long. If RepositoryForSiteAndWebMetadata is in use, the cluster field of the web metadata should be referred to instead of the WEBCLUSTERS preference value. Iterating through all webs' metadata doesn't take long.

In addition, a WEBCLUSTERMEMBERS function tag would be defined, which is expanded to the list of webs in the clusters the current web belongs to.
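
As an illustration only, here is a minimal Python sketch of how the proposed WEBCLUSTERMEMBERS expansion could be resolved when membership is declared solely on each web's WebPreferences. The dictionary stands in for the per-web WEBCLUSTERS values (or the cluster field of the web metadata); the function name, web names and cluster names are hypothetical and not part of TWiki.

```python
def cluster_members(current_web, web_clusters):
    """Return the sorted webs sharing at least one cluster with current_web.

    web_clusters maps a web name to the set of cluster names taken from its
    (hypothetical) "Set WEBCLUSTERS = ..." preference.
    """
    own = web_clusters.get(current_web, set())
    return sorted(web for web, clusters in web_clusters.items() if clusters & own)


# Illustrative data only; these web and cluster names are made up.
WEB_CLUSTERS = {
    "FooWeb": {"FooCluster"},
    "BarWeb": {"FooCluster", "OpsCluster"},
    "BazWeb": {"OpsCluster"},
    "Sandbox": set(),
}

print(cluster_members("FooWeb", WEB_CLUSTERS))  # ['BarWeb', 'FooWeb']
```

Because each web declares only its own clusters, there is no second list that can disagree with it; a web with no clusters simply expands to an empty list.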

-- HideyoImazu - 2012-09-14

With "Main web could list all webs that are in that cluster" I meant that they have a SEARCH listing all participating webs, i.e. no redundancy. For documentation purposes it is good to have clusters listed in the Main web; they don't actually do anything for the rename action.

I am OK with just one or the other proposal or both.

-- PeterThoeny - 2012-09-14

Now I think NOINALLPUBLICWEBS is the appropriate preference name for consistency, because we have long been using NOAUTOLINK.

-- HideyoImazu - 2012-09-18

Just because we have one negative option name doesn't mean that creating more is appropriate.

Further, the binary choice really doesn't scale well, and we're dealing with a scaling problem. (I'm not sure why one has thousands of webs on a single vhost, but I'm taking that as a given.)

The real problem here is that you want to limit the scope of a search to webs that are relevant so they don't time out. And the specific case is that of fixing backlinks in other webs when a topic is deleted, renamed or moved.

I still believe that the right approach is to note that links tend to be local and that grouping webs will address this, as well as the arbitrary %SEARCH case.

In smaller sites, "Public" is enough granularity, but as HideyoImazu-san points out, if one has a huge number of webs, it's a problem.

Here's what I suggest:

Each WEB is associated with one or more groups - I can accept "WEBCLUSTER" as a name. This is defined in the web's preferences, not in each topic. Backlink and user searches consider only webs whose WEBCLUSTER matches that of the initiating topic's web.

SEARCH also gets a webcluster selector (parameter) - this is a regexp that can be used in special situations to include other webclusters in a search. For example, if one wants to create an index of all webclusters, one would search for the main topic in any cluster.

A user could also have a preference variable that specifies webclusters of interest. If it's defined, SEARCH (and hence backlink fixups) would use it as the default value of its webcluster selector. So the manager of 2 webclusters would set her "WEBCLUSTERS" preference to "^(?:CLUSTER1|CLUSTER2)$". (The ^(?:)$ could be implicit.) Since most people will interact with a handful of WEBCLUSTERS, this will cut the search times down as desired.
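
A rough Python sketch of the selector idea, assuming the implicit ^(?:...)$ wrapping suggested above. The function names and the sample webs are hypothetical; this shows the matching rule only, not how SEARCH would be changed.

```python
import re


def compile_selector(selector):
    """Wrap the selector so it must match a whole cluster name."""
    return re.compile(r"^(?:%s)$" % selector)


def webs_in_scope(selector, web_clusters):
    """Return the sorted webs that have at least one cluster matching selector."""
    pattern = compile_selector(selector)
    return sorted(
        web
        for web, clusters in web_clusters.items()
        if any(pattern.match(cluster) for cluster in clusters)
    )


# Illustrative data only; these web and cluster names are made up.
WEBS = {
    "Eng": {"CLUSTER1"},
    "Ops": {"CLUSTER2"},
    "Hr": {"CLUSTER10"},
    "Main": {"PUBLIC"},
}

print(webs_in_scope("CLUSTER1|CLUSTER2", WEBS))  # ['Eng', 'Ops']
```

With the wrapping applied, "CLUSTER1" does not accidentally pick up "CLUSTER10", and a selector that already carries its own ^(?:...)$ still behaves the same.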

For creating webs, we have a site preference "WEBCLUSTER_DEFAULT". Any new web that's created inherits this preference as its "WEBCLUSTER" tag.

The distributed default value is "PUBLIC". Where we now have a check box for "search public webs", it becomes a select box, with PUBLIC as the selected default. So the look and feel stays the same. The other elements are the user's preference and the web default preference. (Consider BASEWEB/INCLUDINGWEB as usual in INCLUDEd material)

There is no central list of WEBCLUSTERS (though SEARCH could create one by visiting each web's preferences.). And no cross-links that were mentioned earlier. Each web simply says "I'm in these". The cluster effect happens when two or more webs have a name in common. The usual topic permissions on the web preference topic will keep that under control.

This seems like a general solution that's easy to implement. It's scalable. And it doesn't start with "NO" :-)

The other thing that one can do is to put cleanup of backlinks into a queue for an off-line task, which can also maintain an index. So if there are links between islands of webclusters (some user will do it if the permissions allow), they can be found and fixed in the background. The user doesn't have to wait.
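
A minimal Python sketch of the queueing idea, under heavy simplification: the queue is an in-memory deque, topics are a plain dictionary, and the fixup is a naive string replacement. None of these names come from TWiki or from the tasks framework; they are placeholders to show that the rename returns before any other topic is touched.

```python
from collections import deque

# Pending (old_name, new_name) fixup requests; a real queue would be persistent.
fixup_queue = deque()


def rename_topic(old, new):
    """Record the rename and return at once; link fixups are only queued."""
    fixup_queue.append((old, new))


def run_fixups(topics):
    """Off-line pass: apply every queued rename to every topic's text.

    Naive substring replacement is a simplification for this sketch.
    """
    while fixup_queue:
        old, new = fixup_queue.popleft()
        for name in list(topics):
            topics[name] = topics[name].replace(old, new)
    return topics
```

Until run_fixups has been called, other topics still show the old link, which is the stale-link window the off-line approach accepts in exchange for the user not waiting.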

-- TimotheLitt - 2012-09-30

Timothe's proposal is worth considering. Hideyo-san?

-- PeterThoeny - 2012-10-01

Peter and Timothe, I'm not against WEBCLUSTERS. My point is that introduction of WEBCLUSTERS doesn't make NOINALLPUBLICWEBS redundant.

  • On a middle sized site, you may need both "in all public webs" and "in the webs in the same cluster".
  • On a large site, you may need "in the webs in the same cluster" but you have to suppress "in all public webs".
As such, regardless of WEBCLUSTERS, some need NOINALLPUBLICWEBS.

Timothe, can you please create a separate feature proposal about WEBCLUSTERS if you want to pursue it? I need NOINALLPUBLICWEBS regardless of WEBCLUSTERS.

-- HideyoImazu - 2012-10-02

I am not going to pursue this. I don't have this problem.

However, your solution is not consistent with TWiki's mission of being Enterprise capable/quality.

If I were a gatekeeper on this project, I would reject this proposal. It is not scalable. And if implemented, it leaves dangling links behind in the case where you say "don't update all links", but there are some. That will cause support issues and user frustration. So in my opinion, it doesn't fit under the guideline that "he who does the work gets to decide."

Further, I disagree with your analysis. The "cluster" approach subsumes the current "all public webs" concept. PUBLICWEBCLUSTER is just a predefined preference that some sites may choose to use.

Your idea of site size is not the way I think of this. Clusters should be relatively small, reflecting a community of interest. Your problem statement suggests that in your environment, you have many isolated communities, which makes global searches, including that done for backlink fixups, very slow and unproductive. (They will not find anything.) If this is the case, grouping related webs ("clusters") will cut the scope of searches down to a manageable size, and solve the problem.

Conversely, if these webs are not isolated, the links need to be found. This option (like the existing one) gives the user the power to generate a corrupt database - that is, to leave erroneous backlinks in other webs. And since it is "fast", people will use it.

As I see it, there are three alternatives for you to consider:

  1. Implement clusters. This has the advantage of being a very minimal change to existing code, and it scales well. It also makes ALL global searches, not just those used in topic moves, work better.
  2. Move the topic link fixups to off-line processing. (Use my tasks framework, which can trigger on a topic update. Or do it in an ordinary cron job.) This has the advantage that the user never waits for link fixups, although while the off-line job is pending, there will be some stale links visible to others. It would benefit all webs - large or small. It should not be difficult to implement a functional version. All you need is a plugin that adds a fixup request to the off-line queue.
  3. For scaling to really large environments, I suggest an afterSave handler that scans a topic for all links, and adds them to an indexed database. Then the fixup processor (and search) can use the index to find all topics referencing this one. There probably needs to be a "repair" script that scans all topics and rebuilds the index to handle manual edits/some crash scenarios.
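
To make the third alternative more concrete, here is a small Python sketch of a link index maintained on save. It is an assumption-laden illustration: the class, its method names and the simplified Web.Topic pattern are hypothetical, the index lives in memory rather than in a database, and real TWiki link syntax is richer than this regular expression.

```python
import re
from collections import defaultdict

# Simplified pattern for fully qualified links such as Eng.OldSpec.
LINK = re.compile(r"\b[A-Z]\w*\.[A-Z]\w*\b")


class LinkIndex:
    def __init__(self):
        self.outgoing = {}                # topic -> set of targets it links to
        self.incoming = defaultdict(set)  # target -> set of referencing topics

    def after_save(self, topic, text):
        """Re-scan a saved topic and update both directions of the index."""
        new_links = set(LINK.findall(text))
        # Drop references that were removed by this save.
        for target in self.outgoing.get(topic, set()) - new_links:
            self.incoming[target].discard(topic)
        for target in new_links:
            self.incoming[target].add(topic)
        self.outgoing[topic] = new_links

    def backlinks(self, target):
        """Return the sorted topics that currently reference target."""
        return sorted(self.incoming.get(target, ()))
```

With such an index, a rename only needs to visit the topics returned by backlinks() instead of searching every web. A "repair" pass would amount to calling after_save for every topic.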

Ideally, you would do all three. But each one provides increasing capability. All are scalable. All benefit every global search, not just those involved in editing topics.

That's my perspective. You and Peter can decide what to do with it.

-- TimotheLitt - 2012-10-02

Timothe,

Now I understand the full-fledged WEBCLUSTERS feature makes NOINALLPUBLICWEBS redundant. But nobody is committed to implementing WEBCLUSTERS.

What do you mean by "not scalable"? I'm running a single TWiki site having 7,000 webs and 700,000 topics with the "in all public webs" option suppressed. My users are not taking a lot of time fixing interweb links.

I don't understand why NOINALLPUBLICWEBS violates the "Enterprise capable/quality". As of now, nobody being committed to a superior solution, NOINALLPUBLICWEBS is the only attainable solution to avoid "in all public webs" links which are destined to browser time-out on my TWiki site.

NOINALLPUBLICWEBS doesn't make TWiki inefficient - it would be just a preference variable referred to by page templates. It doesn't get in the way of the things currently possible. There is no downside.

When somebody is committed and then implements WEBCLUSTERS, I will be happy to see the NOINALLPUBLICWEBS feature deleted.

-- HideyoImazu - 2012-10-03

This is now regarded as accepted after 7 days of no feedback.

-- HideyoImazu - 2012-10-11

I understand why your patch works for you. I don't support it as a general solution. TWiki should be about engineered solutions, not an agglomeration of point solutions.

If your users fix ONE broken backlink because of this misfeature, it's too many. You and your users may be OK with that, but it's not an enterprise solution. I think it's a fine local patch/workaround for you.

I think that the time you spent implementing it could have produced a better solution - and I've tried to help you to understand how. And that would keep you from having to maintain your local patch.

My strong objections to including this in the product are on record. That's not "no feedback". I'm not sure about the propriety of closing one's own proposal in these circumstances. I've always let someone else close mine, unless there was truly no negative feedback. Maybe I'm too conservative.

Peter will have to decide about this feature; I don't (and don't wish to) control this product.

As for "somebody committed" - I do as I say. When I have a local patch, I keep it a local patch. When I've thought it would be generally useful, I've upgraded it - often with things I don't use, but know are required - before proposing it here.

E.G. X509Plugin has a fully general matching capability for certificate names, even though all I needed for my environment was a constant. And a lot less documentation.

IpPlugin is a fully general solution, even though I could have done just as well for my immediate needs with a /:/? 4 : 6 one-line test.

The off-line tasks framework meets a community need expressed here and on the other wiki. All I needed was a 4-line patch to have plugins clean up tempfiles. I ended up, at last count, with 139 source modules and a fully general solution. And having done this, I find that I use it too.

That's the difference between committing to the philosophy of an open source product and exploiting it as a repository of local patches that others will eventually maintain.

I know you have done other excellent work for TWiki. In this case, it's not up to your standard. My position is that you should keep it as a local patch. It's fine for that.

-- TimotheLitt - 2012-10-11

Timothe, I understand your concern. If a project keeps getting tiny enhancements for very special cases, the project will become bloated and difficult to maintain and enhance. Very special enhancements should be kept local and not incorporated into a project.

On the other hand, keeping local patches isn't desirable. It's cumbersome to keep patching new releases. Your patch may stop working, then you need to manually fix it.

Though my proposal is not as generic as yours, still it's worth putting into the project as the working solution to avoid "in all public webs" links, which are destined to browser time out. A large TWiki site needs this feature or an alternative.

Since NOINALLPUBLICWEBS is basically a site-level preference, it's trivial to switch to something else - it's a matter of modifying Main.TWikiPreferences. I'm looking forward to a more generic solution being implemented.

-- HideyoImazu - 2012-10-12

We discussed this at JerusalemReleaseMeeting2012x10x12. I am OK with the feature as proposed. However, it would be good to see the more generic web clusters feature. This requires a committed developer.

-- PeterThoeny - 2012-10-13

I submitted a new WebClusters feature proposal.

-- PeterThoeny - 2012-11-05

I've realized that a {NoInAllPublicWebs} configuration parameter is easier to deal with than a NOINALLPUBLICWEBS preference variable, so I am switching to it.

-- HideyoImazu - 2012-11-08

Topic revision: r18 - 2012-11-08 - HideyoImazu
 
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.