Problem

In topic WikiNotation, PeterThoeny states that a WikiName is a sequence of:
  • Capital letter(s)
  • Small letter(s)
  • Capital letter(s)
  • Optional small or capital letter(s) or number(s)

Well, speaking German most of the time, Ä, Ö, Ü are capital (uppercase) and ä, ö, ü, ß are small (lowercase) letters to me, but not to TWiki. Well, not yet :-)

Using all lowercase and uppercase characters that I can easily reach on my keyboard (I know there are a lot more), should be a valid WikiName.

ą č ę ė į ų ū

Sketch of solution

The basic idea is to replace occurrences of "A-Z" and "a-z" in wiki.pm with the desired string of uppercase and lowercase letters, wherever user names, topic names, and internal links are involved.

Since there may be different opinions on what are the uppercase and lowercase letters, these should be configurable.

Since my current browser (MSIE 5.00) garbles the "new" letters when it requests a page, the garbage has to be substituted back to the corresponding letters. (I later noticed that MSIE 4.01 does not have this problem; I will check whether MSIE 5.01 still has it.)

What I have done so far is sketched below in form of diffs to the 19990901 release of TWiki.

twiki/bin/wikicfg.pm

Define the lowercase and uppercase letters for TWiki:

61,64d60
< #                   upper case characters for WikiNames :
< $upperCase        = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
< #                   lower case characters for WikiNames :
< $lowerCase        = "abcdefghijklmnopqrstuvwxyz";

twiki/bin/wiki.pm

Use them to recognize user names and topic names. (The lengthy list of replacements in $topicName is due to a bug/feature of Internet Explorer 5.00; it is not necessary with IE 4.01.)

37d36
<         $upperCase $lowerCase
84,120d82
<         $topicName =~ s/?//g;
<         $topicName =~ s/-//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/ä//g;
<         $topicName =~ s/ö//g;
<         $topicName =~ s/ü//g;
<         $topicName =~ s/Y//g;
<         $topicName =~ s/á//g;
<         $topicName =~ s/é//g;
<         $topicName =~ s///g;
<         $topicName =~ s/ó//g;
<         $topicName =~ s/ú//g;
<         $topicName =~ s/â//g;
<         $topicName =~ s/ê//g;
<         $topicName =~ s/î//g;
<         $topicName =~ s/ô//g;
<         $topicName =~ s/û//g;
<         $topicName =~ s/ //g;
<         $topicName =~ s/è//g;
<         $topicName =~ s/ì//g;
<         $topicName =~ s/ò//g;
<         $topicName =~ s/ù//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/s//g;
<         $topicName =~ s/,//g;
<         $topicName =~ s/S//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/>//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/^//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/?//g;
<         $topicName =~ s/?//g;
226c188
<     my $result = `grep '\* [0-9$lowerCase$upperCase]* \- $loginUser' $userListFilename`;
---
>     my $result = `grep '\* [A-Za-z0-9]* \- $loginUser' $userListFilename`;
695,696d656
<     $text=~ s/%LOWERCASE%/$lowerCase/go;
<     $text=~ s/%UPPERCASE%/$upperCase/go;
781c741
<     if ( $name =~ /[$upperCase]+[$lowerCase]+(?:[$upperCase]+[0-9$lowerCase$upperCase]*)$/ )
---
>     if ( $name =~ /[A-Z]+[a-z]+(?:[A-Z]+[a-zA-Z0-9]*)$/ )
864,865c824,825
<        s/([\*\s][\(\-\*\s]*)([A-Z]+[a-z]*)\.([$upperCase]+[$lowerCase]+(?:[$upperCase]+[0-9$lowerCase$upperCase]*))/&internalLink($2,$3,"$TranslationToken$3$TranslationToken",$1,1)/geo;
<        s/([\*\s][\(\-\*\s]*)([$upperCase]+[$lowerCase]+(?:[$upperCase]+[0-9$lowerCase$upperCase]*))/&internalLink($webName,$2,$2,$1,1)/geo;
---
>        s/([\*\s][\(\-\*\s]*)([A-Z]+[a-z]*)\.([A-Z]+[a-z]+(?:[A-Z]+[a-zA-Z0-9]*))/&internalLink($2,$3,"$TranslationToken$3$TranslationToken",$1,1)/geo;
>        s/([\*\s][\(\-\*\s]*)([A-Z]+[a-z]+(?:[A-Z]+[a-zA-Z0-9]*))/&internalLink($webName,$2,$2,$1,1)/geo;
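
As a minimal sketch of the idea behind the diff above (not part of the patch itself): the WikiName test can be built once from the two configured strings. The extra letters in the sample values, the name $wikiNameRegex, and the helper is_wiki_name are assumptions for illustration; unlike the pattern in wiki.pm, this version is anchored at both ends.

```perl
use strict;
use warnings;

# Assumed sample configuration, mirroring $upperCase / $lowerCase in wikicfg.pm.
# The Latin-1 letters are written as escapes so the file encoding does not matter.
my $upperCase = "ABCDEFGHIJKLMNOPQRSTUVWXYZ\x{C4}\x{D6}\x{DC}";         # A-Z plus A/O/U umlaut
my $lowerCase = "abcdefghijklmnopqrstuvwxyz\x{E4}\x{F6}\x{FC}\x{DF}";  # a-z plus a/o/u umlaut, sharp s

# Compile the pattern once: upper, lower, upper, then optional letters or digits.
my $wikiNameRegex =
    qr/^[$upperCase]+[$lowerCase]+[$upperCase]+[0-9$lowerCase$upperCase]*$/;

# Hypothetical helper: returns 1 if $name is a WikiName under the configured sets.
sub is_wiki_name {
    my ($name) = @_;
    return $name =~ $wikiNameRegex ? 1 : 0;
}
```

With these sample sets, is_wiki_name("WikiName") and is_wiki_name("\x{C4}rger\x{DC}bung") are true, while an all-lowercase name is rejected.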

twiki/data/Main/TWikiRegistration

Use the new wiki variables %LOWERCASE% and %UPPERCASE% to build the WikiName in the registration form:

13c13
<   <td><input type="text" name="FirstLastName" size="40" value="" onBlur="var sIn = this.value; var sOut = ''; var chgUpper = true; for ( var i = 0; i < sIn.length; i++ ) { var ch = sIn.charAt( i ); if( ('%LOWERCASE%'.indexOf(ch) != -1) || ('%UPPERCASE%'.indexOf(ch) != -1) ) { if( chgUpper ) { ch = ch.toUpperCase(); chgUpper = false; } sOut+=ch; } else { if( ch==' ' ) { chgUpper = true; } } } this.form.WikiName.value=sOut;" > =<font color="red">**</font>= </td>
---
>   <td><input type="text" name="FirstLastName" size="40" value="" onBlur="var sIn = this.value; var sOut = ''; var chgUpper = true; for ( var i = 0; i < sIn.length; i++ ) { var ch = sIn.charAt( i ); if( ((ch>='a')&&(ch<='z')) || ((ch>='A')&&(ch<='Z')) ) { if( chgUpper ) { ch = ch.toUpperCase(); chgUpper = false; } sOut+=ch; } else { if( ch==' ' ) { chgUpper = true; } } } this.form.WikiName.value=sOut;" > =<font color="red">**</font>= </td>

Still to do

  • Discover why the strange replacement block in wiki.pm is necessary for MSIE 5.00 and not for MSIE 4.01.
  • Use (\w+) instead of [a-zA-Z0-9 ... ] in regular expressions for hyperlink recognition
    (or copy from http://language.perl.com/all_about/regexps.html where they describe a more general approach)
    (or wait to see whether PeterThoeny has brushed this up in the new TWiki release :-) )
  • "automatic" setting of the variables $upperCase and $lowerCase according to the web server locale (following a suggestion of Dave Smith, http://www.c2.com/cgi/wiki?WikiWikiSuggestions ):
      require 5.004;
      use locale;
      $lowerCase = (join '', grep { uc($_) ne $_ } map { chr($_) } 1..255) . "";
      $upperCase = (join '', grep { lc($_) ne $_ } map { chr($_) } 1..255);

-- RalfHandl - 26 Apr 2000


This looks like a fairly compelling reason to PortToPerl5dot6, to get the Unicode support.

If the valid WikiNames are limited by the server locale, then you can't get an accented capital letter accepted as a valid uppercase letter on most US servers. Wouldn't it be better to attempt to map every "uppercase" Unicode char and avoid locales altogether? The attempt might fail, of course... (does the notion of "uppercase" exist in the Kanji char set?)
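
A rough sketch of what such a locale-independent check could look like, assuming a Perl whose regex engine supports the Unicode properties \p{Lu} (uppercase letter) and \p{Ll} (lowercase letter). The names $unicodeWikiName and is_unicode_wiki_name are made up for illustration and are not part of TWiki.

```perl
use strict;
use warnings;

# Same shape as the WikiName rule (upper, lower, upper, optional rest),
# but using Unicode letter properties instead of a locale or a fixed list.
my $unicodeWikiName = qr/^\p{Lu}+\p{Ll}+\p{Lu}+[\p{Lu}\p{Ll}0-9]*$/;

# Hypothetical helper: returns 1 if $name looks like a WikiName in any cased script.
sub is_unicode_wiki_name {
    my ($name) = @_;
    return $name =~ $unicodeWikiName ? 1 : 0;
}
```

Scripts without case cannot satisfy this pattern at all, because their characters are neither \p{Lu} nor \p{Ll}.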

-- KevinKinnell - 27 Apr 2000

There is no notion of uppercase in Japanese. But it's not so important to write Japanese WikiNames (see what I wrote in KanjiCharacterSet). -- JohnBelmonte - 22 May 2000


Thanks Ralf for making this available for people who need it. I prefer not to take it into the distribution for now until it is tested for most platforms and also more browsers and languages.

Follow up also in KanjiCharacterSet.

-- PeterThoeny - 29 Apr 2000


I'm new to Wiki, but related to this issue I've been wondering why aren't WikiNames required to be escaped? (For example \ThisIsAWikiName\.) The current WikiNotation makes it difficult to discuss software which often uses names such as MyClass. Escapes are used for formatting such as italics, but why not for WikiNames themselves? It would seem to make WikiNotation more flexible and give authors more control over what is intended to be linked and what's not.

-- JohnBelmonte - 22 May 2000

The WikiName parsing originated in the first WikiWikiWeb at the Portland Pattern Repository (but please don't go there and try to discuss wiki issues--that wiki is for something else entirely.) The idea was to make it very easy for users to add simple links and basic formatting without needing to know any HTML. So, it's a tradition grounded in ease-of-use issues. It seems likely that most of the TWiki installations running behind firewalls are used by people who are not HTML savvy, so I doubt that this will change. Escaping NonWikiTopicsThatRunTogether isn't that difficult, anyway. :-) -- KevinKinnell - 22 May 2000

How about making the WikiNotation configurable? -- JohnBelmonte - 22 May 2000


I found this interesting information regarding umlauts: a way to handle them using the system locale.

From http://www.c2.com/cgi/wiki?WikiWikiSuggestions :

> The parser should understand the umlauts are letters, too.
> ShldWrk.
>
> How do you type umlauts?
>
> How do you match them in perl?
>
> By ASCII value if you need to. Besides, why wouldn't the Perl
> tokenizer accept extended ASCII characters?
>
> Here's a fragment that shows one way to do this. First determine
> the set of upper- and lower-case characters
>
>     =require 5.004;=
>     =use locale;=
>     =$lc = join '', grep { uc($_) ne $_ } map { chr($_) } 1..255;=
>     =$uc = join '', grep { lc($_) ne $_ } map { chr($_) } 1..255;=
>
> next build a regular expression
>
>     =$link = "((?:[$uc][$lc]+){2,})[^$uc$lc]";=
>
> then
>
>     =$text =~ m/$link/;=
>
> will match ShldWrk. --DaveSmith
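
The quoted fragment can be tried out with a sketch along these lines. To keep it independent of the server locale, the letter sets are fixed sample strings here (an assumption; on a real server they could come from the locale-based derivation quoted above). The helper find_wiki_words is hypothetical, and the pattern also accepts end of text after a link, which the quoted one does not.

```perl
use strict;
use warnings;

# Assumed letter sets: ASCII plus the German umlauts and sharp s (Latin-1).
my $uc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ\x{C4}\x{D6}\x{DC}";
my $lc = "abcdefghijklmnopqrstuvwxyz\x{E4}\x{F6}\x{FC}\x{DF}";

# Two or more "capital + small letters" groups, followed by a non-letter
# or by the end of the text.
my $link = qr/((?:[$uc][$lc]+){2,})(?:[^$uc$lc]|\z)/;

# Hypothetical helper: return every link-like word found in $text.
sub find_wiki_words {
    my ($text) = @_;
    my @found;
    while ($text =~ /$link/g) {
        push @found, $1;
    }
    return @found;
}
```

Note that, like the quoted pattern, this does not check what precedes a match, so a link candidate embedded in a longer word would also be picked up.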

-- PeterThoeny - 29 Aug 2000

Internationalised WikiWords are now supported, for any 8-bit alphabetic character set, including Cyrillic, through the recent InternationalisationEnhancements. These also enable the definition of a WikiWord to be customised. Also, DisableWikiWordLinks is now implemented, so it's possible to just use forced links (e.g. [[this wiki page]]) for all internal and external links, which is useful for JapaneseAndChineseSupport.

By the way, the 'strange replacement block' mentioned above is due to IE 5.5 UTF8-encoding URLs by default. The solution is to turn off this UTF8 encoding, which is documented at InternationalisationEnhancements#Browser_setup, and discussed in comments of end Dec 2002.
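
To illustrate where the garbage comes from: a browser that UTF8-encodes the URL sends two bytes for each accented Latin-1 letter, and a server reading them as Latin-1 sees two wrong characters. The sketch below converts such a topic name back on the server side using Perl's Encode module. It is only an illustration of the mechanism, not what TWiki does (the documented fix is the browser setting), and the helper name utf8_topic_to_latin1 is made up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: if $bytes is valid UTF-8, return the same text as
# Latin-1 bytes; otherwise assume it already is Latin-1 and return it as is.
sub utf8_topic_to_latin1 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a check value may modify its argument
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return $bytes unless defined $chars;    # not valid UTF-8: leave untouched
    # Characters that do not exist in Latin-1 are replaced by a substitution character.
    return encode('ISO-8859-1', $chars);
}
```

A pure ASCII topic name passes through unchanged, since ASCII is identical in both encodings.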

-- RichardDonkin - 07 Jan 2003

It seems to me that the above confuses the notion of a WikiName with that of a WikiWord. Is it true that WikiNames, if used for login, cannot contain non-latin1 characters due to a limitation in Apache's Basic Auth system?

-- MartinCleaver - 11 Oct 2004

This is rather an old topic, but it does confuse WikiName with WikiWord as you point out.

It would be best to delete this page in a few days as it's not very useful.

-- RichardDonkin - 13 Oct 2004

Topic revision: r14 - 2004-10-13 - RichardDonkin
 
Copyright © 1999-2018 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.