Bug: Rename script misses some ref-by topics
Test case
- In TopicOne, add a link to TopicTwo this way:
__See also TopicTwo__
- The link to TopicTwo is rendered correctly.
- Try to rename TopicTwo.
- TopicOne does not appear in the referrers list.
- Submit the new name.
- As a result, TopicOne isn't updated.
Environment
| TWiki version: | 20011201 |
| TWiki plugins: | |
| Server OS: | W2K |
| Web server: | Apache |
| Perl version: | |
| Client OS: | |
| Web Browser: | |
--
JeromeBouvattier - 26 Mar 2002
Follow up
This is probably caused by the bug mentioned in RefBySearchMissesLinks. The ref-by search fails if the link is the last word.
--
ArthurClemens - 17 Sep 2003
This may be similar in nature, but will not be in the same piece of software. Rename has different and more complex ways of detecting references. I'll try and see what's going on.
--
JohnTalintyre - 17 Sep 2003
Have a look at BugInTopicRenaming - I've put in a patch that makes this work with I18N again, and it worked OK when tested on my site. It may require a bit of modification to fix this as well, but I'm happy to help if needed.
--
RichardDonkin - 18 Sep 2003
Fix record
The above refers to:
__See also TopicTwo__
Interestingly,
__TopicTwo also interesting__
does not produce a link to TopicTwo. In other words, it's okay to have an underscore after a WikiWord, but not before.
The line that stops the rename from working is in Search.pm:
my $match = "(^|[^A-Za-z0-9_.])($originalSearch)(?=[^A-Za-z0-9_]|\$)";
Remove the underscore from the trailing lookahead class (the one after $originalSearch) and the reference will be picked up correctly. Unfortunately, a key problem is that rename detects links differently from the way TWiki renders them. It would be a lot neater if topics produced marked-up output that rename could scan, but there are many issues in getting to that point, e.g. performance.
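A minimal sketch of the effect, assuming the pattern is used as an ordinary Perl regex against one line of topic text. The topic name, the helper refers_to and the sample line are illustrative only and are not taken from Search.pm:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the name of the topic being renamed.
my $originalSearch = 'TopicTwo';

# Pattern as quoted above: "_" is excluded on both sides of the name.
my $before = "(^|[^A-Za-z0-9_.])($originalSearch)(?=[^A-Za-z0-9_]|\$)";

# Same pattern with "_" dropped from the trailing lookahead class.
my $after = "(^|[^A-Za-z0-9_.])($originalSearch)(?=[^A-Za-z0-9]|\$)";

# Returns 1 if $pattern finds a reference in $text, otherwise 0.
sub refers_to {
    my ( $text, $pattern ) = @_;
    return $text =~ /$pattern/ ? 1 : 0;
}

my $line = '__See also TopicTwo__';
print "before: ", refers_to( $line, $before ), "\n";
print "after:  ", refers_to( $line, $after ),  "\n";
```

With the original pattern the trailing underscore blocks the lookahead, so the line is not reported as a referrer. With the underscore removed it is.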
Anyone see any problem with this fix?
--
JohnTalintyre - 19 Sep 2003
Your fix looks OK, but should be made I18N-safe by eliminating the A-Z ranges. See my patch in BugInTopicRenaming for similar regexes that work with I18N (the bug there was due to incorrect quoting).
--
RichardDonkin - 21 Sep 2003
Fix put in with some (I hope) I18N fixes. Note that I accidentally put the wrong Codev topic in the fix and then got some strange CVS error message. It would still be better if the code for this were shared, i.e. not hard-wired in Search.pm and TWiki.pm.
--
JohnTalintyre - 23 Sep 2003
I have fixed the I18N part of this in CVS - using [^[:alpha:][:digit:]] is OK on Perl 5.6 or higher when locales are working, but the recommended [^${TWiki::alphaNum}] works on Perl 5.005, and with Perl 5.6+ when locales are broken. Generally, matching of A to Z should only be done with the regexes created in CVS:lib/TWiki.pm by the setupRegexes function - this will also simplify InternationalisationUTF8 work.
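As a rough illustration of the two styles, the sketch below compares a POSIX-class pattern with a class built from an interpolated string. The variable $alphaNum here is an ASCII-only stand-in chosen for the example; the real value is built by TWiki at startup and depends on the site's locale setup. The helper split_words is hypothetical.

```perl
use strict;
use warnings;

# Assumption: ASCII-only stand-in for the class string TWiki builds at startup.
my $alphaNum = 'A-Za-z0-9';

my $posixStyle = qr/[^[:alpha:][:digit:]]/;    # needs POSIX classes (Perl 5.6+)
my $classStyle = qr/[^${alphaNum}]/;           # plain interpolated class string

# Splits $text on runs of separator characters and drops empty pieces.
sub split_words {
    my ( $text, $separator ) = @_;
    return grep { length } split /$separator+/, $text;
}

my $text  = 'See also TopicTwo__ and WebHome.';
my @posix = split_words( $text, $posixStyle );
my @class = split_words( $text, $classStyle );

print join( ',', @posix ), "\n";
print join( ',', @class ), "\n";
```

On plain ASCII text both styles split identically. They only diverge once accented or other locale-specific letters are involved, which is where the interpolated class string can be extended without relying on working locales.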
It might be a good idea to factor out the 'not alphanumeric' pattern into a single compiled regex - what's really meant here is 'not a WikiWord', since WikiWords are now somewhat redefinable in format.
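One way such a factoring could look is sketched below. The names $notWikiWordChar and reference_pattern are hypothetical, $alphaNum is an ASCII-only stand-in, and the sketch deliberately ignores the extra handling of "." and "_" before the topic name that the Search.pm pattern has:

```perl
use strict;
use warnings;

# Assumption: ASCII-only class string; the real one depends on the locale.
my $alphaNum = 'A-Za-z0-9';

# Compiled once and reused wherever a "cannot be part of a WikiWord" test is needed.
my $notWikiWordChar = qr/[^${alphaNum}]/;

# Builds the pattern used to find references to $topic in topic text.
sub reference_pattern {
    my ($topic) = @_;
    my $name = quotemeta $topic;    # protect any regex metacharacters in the name
    return qr/(^|$notWikiWordChar)($name)(?=$notWikiWordChar|\z)/;
}

my $pattern = reference_pattern('TopicTwo');
for my $line ( '__See also TopicTwo__', 'See TopicTwoExtra' ) {
    printf "%-25s %s\n", $line, ( $line =~ $pattern ? 'match' : 'no match' );
}
```

Keeping the character class in one compiled regex means a change to what counts as a WikiWord character only has to be made in one place.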
I've also put some other
I18N fixes into
CVS:lib/TWiki/Search.pm
- not tested recently but they are quite simple and were tested when I first did them in January.
--
RichardDonkin - 24 Sep 2003
"since WikiWords are now somewhat redefinable in format"
Do you mean that it is configurable what the rules are for WikiWords? So I can make a rule that Wiki_word is a WikiWord, or Wiki|word, or .Word?
--
ArthurClemens - 24 Sep 2003
This flexibility is not yet really a feature, but it is much simpler now to change the code, as of TWikiRelease01Feb2003, due to the I18N work. In the setupRegexes routine of TWiki.pm, there are some lines that define regexes used in all other TWiki code:
# TWiki concept regexes
$wikiWordRegex = qr/[$upperAlpha]+[$lowerAlpha]+[$upperAlpha]+[$mixedAlphaNum]*/;
$webNameRegex = qr/[$upperAlpha]+[$mixedAlphaNum]*/;
$defaultWebNameRegex = qr/_[${mixedAlphaNum}_]+/;
$anchorRegex = qr/\#[${mixedAlphaNum}_]+/;
$abbrevRegex = qr/[$upperAlpha]{3,}/;
You can do things like this to allow (say) I18nSupport to be a WikiWord:
$wikiWordRegex = qr/[$upperAlpha]+[$lowerAlphaNum]+[$upperAlpha]+[$mixedAlphaNum]*/;
This has been tested for redefining Web names to be WikiWords, and worked surprisingly well - see Support.WebNameAsWikiName, now included in the alpha as WebNameAsWikiName (actually, any upper case string followed by a mixed case alphanumeric string). Also discussed at WebNameShouldBeMoreFlexible - there is some impact on skins but not too bad, just a matter of suitable <nop> or <autolink> tags.
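To see what the I18nSupport redefinition changes, here is a small sketch that compares the default WikiWord regex with the relaxed one. The class strings are ASCII-only assumptions for the example (TWiki derives the real ones from the locale), and is_wiki_word is a hypothetical helper:

```perl
use strict;
use warnings;

# Assumption: ASCII-only class strings standing in for the locale-derived ones.
my $upperAlpha    = 'A-Z';
my $lowerAlpha    = 'a-z';
my $lowerAlphaNum = 'a-z0-9';
my $mixedAlphaNum = 'A-Za-z0-9';

my $defaultWikiWord = qr/[$upperAlpha]+[$lowerAlpha]+[$upperAlpha]+[$mixedAlphaNum]*/;
my $relaxedWikiWord = qr/[$upperAlpha]+[$lowerAlphaNum]+[$upperAlpha]+[$mixedAlphaNum]*/;

# Returns 1 if the whole of $word matches $regex, otherwise 0.
sub is_wiki_word {
    my ( $word, $regex ) = @_;
    return $word =~ /\A$regex\z/ ? 1 : 0;
}

for my $word (qw(TopicTwo I18nSupport Wiki_word)) {
    printf "%-12s default=%d relaxed=%d\n", $word,
        is_wiki_word( $word, $defaultWikiWord ),
        is_wiki_word( $word, $relaxedWikiWord );
}
```

Under these assumptions only the relaxed regex accepts I18nSupport, and neither accepts Wiki_word, since no class in either pattern contains an underscore.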
The use of $upperAlpha etc. is because the code needs to work on Perl 5.6 or higher using locales in regexes, as well as on Perl 5.005 - hence these variables are defined earlier. This makes it a bit difficult to expose this feature externally, though it would be possible given sufficient documentation and a suitably complete set of $upperAlpha type variables. It's not that easy to use these variables rather than Perl regexes, as the discussion here shows.
--
RichardDonkin - 24 Sep 2003
Thanks Richard. I looked to do this, but didn't understand all the talk of escape sequences.
--
JohnTalintyre - 24 Sep 2003
Do you mean the ISO-2022-JP escape sequence discussion at JapaneseAndChineseSupport (quite esoteric, and after investigation it doesn't need to be supported by TWiki, since UTF-8 in the core is the best approach), or something else?
--
RichardDonkin - 25 Sep 2003