Question
- TWiki version: 01 Feb 2003
- Perl version: 5.005_03
- Web server & version: Apache/1.3.26 (Unix)
- Server OS: 4.6-STABLE FreeBSD i386
- Web browser & version: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3) Gecko/20030407
- Client OS: 4.8-RC FreeBSD i386
I am trying to use non-English words as WikiWords in topic names. The URLs for those non-English topics work only with Mozilla. Opera and IE send the URL in UTF-8 encoding, which results in a non-existent topic. Where should I look to apply a patch that decodes the URL correctly and perhaps applies an iconv transformation to my local charset?
--
SergeySolyanik - 07 Apr 2003
Answer
See InternationalisationEnhancements for details of how to support non-ASCII characters in topic names, which is built into TWikiRelease01Feb2003. As described in the I18N topic, you need to turn off browser features that automatically encode UTF-8 characters, i.e. you must choose ISO-8859-1, KOI8-U or whatever. Real support for UTF-8 is somewhat harder to do, but should be possible in a future release - see InternationalisationUTF8 for some discussion on this.
Let us know how you get on.
--
RichardDonkin - 07 Apr 2003
Thanks for the answer. I'll wait for the new release.
--
SergeySolyanik - 14 Apr 2003
Did you have a look at InternationalisationEnhancements? There's no need for a new release; you just need to make one change to the browser configuration.
--
RichardDonkin - 14 Apr 2003
Yes, I have seen InternationalisationEnhancements. Having to change every browser's configuration defeats the idea of collaboration and is not suitable for me.
There are also some incorrect assumptions about Unicode and i18n in InternationalisationUTF8 and in the TWiki code itself. Since I have no time to make correct patches to TWiki, I will wait for a new release.
Maybe I misunderstand, but TWiki needs to be rewritten from scratch to support Unicode and correct XML/HTML. In my view TWiki lacks modularity in URL/content handling, which makes it impossible to insert calls to iconv...
So I will wait for a new release while using and hacking the current version; maybe I will find a way to implement Unicode support in TWiki correctly...
--
SergeySolyanik - 14 Apr 2003
I just checked the Unicode::* Perl modules, and I think they are the correct way to decode the URL into the local charset. There should be a variable that holds the local charset setting, TWiki should map Unicode-encoded URLs into that charset, and that variable should match the locale environment of the TWiki script. It would be even better to determine it automatically. All conversions and encoding should be done with the appropriate Perl modules, not by hand!
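For illustration only, the mapping described here can be sketched in a few lines. This is not TWiki code: it is written in Python rather than Perl, the function name `url_path_to_local` is hypothetical, and it assumes the browser really sent UTF-8 and that the local charset is KOI8-R.

```python
from urllib.parse import unquote_to_bytes


def url_path_to_local(path: str, local_charset: str = "koi8-r") -> bytes:
    """Decode a %-encoded UTF-8 URL path and re-encode it in the local charset.

    Hypothetical helper: assumes the request path is UTF-8 and that every
    character exists in local_charset (otherwise .encode() raises).
    """
    raw = unquote_to_bytes(path)       # undo the %XX escapes, giving raw bytes
    text = raw.decode("utf-8")         # interpret those bytes as UTF-8 text
    return text.encode(local_charset)  # transcode to the filesystem's charset


# The UTF-8 form of a Cyrillic topic name, as Opera or IE might send it.
local_name = url_path_to_local("%D0%A2%D0%B5%D1%81%D1%82")
```

The result is a byte string in the local charset, which is what a lookup on a KOI8-R filesystem would need.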
--
SergeySolyanik - 14 Apr 2003
It would be helpful if you could outline your view of the incorrect assumptions about UTF-8 in InternationalisationUTF8, and post any patches there as well. I haven't looked into this in great detail, but it doesn't seem that hard if the whole system works in UTF-8. If you want to work in Unicode in the core and local charsets externally, or to use a local charset internally and Unicode externally, that's more complicated, but I wasn't planning on doing any transcoding to or from Unicode.
TWiki I18N is mainly targeted at intranets, where the assumption is that a browser config change is possible even if not very convenient. It also supports TWiki's 'any browser' philosophy, since not all browsers support UTF-8 and some that do support UTF-8 have related bugs.
--
RichardDonkin - 15 Apr 2003
Of course I'll share my experience, I just need time to do it. About the incorrect assumptions: it's not correct that we can't determine the encoding of the URL, because the browser supplies the encoding information. What we need is to decode the URL according to that encoding and map it into the local charset. For example, my local filesystem is transparent to any encoding of filenames, but every utility assumes that LANG is set appropriately. In my case that is ru_RU.KOI8-R. When TWiki gets a URL in UTF-8 encoding it should map it into my LANG, or else it fails to locate the file or directory.
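As a rough sketch of taking the target charset from LANG, the charset part of a locale name such as ru_RU.KOI8-R can be split off as follows. The helper name `charset_from_lang` and the ISO-8859-1 fallback are assumptions made for this example, not anything TWiki defines.

```python
import os


def charset_from_lang(lang: str, default: str = "iso-8859-1") -> str:
    """Extract the charset part of a POSIX locale name like 'ru_RU.KOI8-R'.

    Hypothetical helper; the default is an assumption used when the locale
    name carries no charset (for example 'C' or an empty LANG).
    """
    base = lang.split("@", 1)[0]         # drop an optional '@modifier' suffix
    if "." in base:
        return base.split(".", 1)[1]     # text after the first '.' is the charset
    return default


# Charset the script would transcode incoming topic names into.
site_charset = charset_from_lang(os.environ.get("LANG", ""))
```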
--
SergeySolyanik - 15 Apr 2003
I'm not convinced that browsers always supply the encoding information - see this message, as well as the links from InternationalisationEnhancements. POST transactions don't provide this information, not sure about other ones such as GET.
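If the encoding is not reliably supplied, one possible fallback is to guess it from the bytes themselves: try a strict UTF-8 decode first and fall back to the site charset when that fails. The sketch below is a heuristic under that assumption, not something TWiki does; `decode_topic_name` is a hypothetical name and KOI8-R is only an example site charset.

```python
from urllib.parse import unquote_to_bytes


def decode_topic_name(path: str, site_charset: str = "koi8-r") -> str:
    """Guess the encoding of a %-encoded topic name.

    Heuristic: bytes that form valid UTF-8 are treated as UTF-8, anything
    else is assumed to be in the site charset. A legacy-charset name can
    occasionally also be valid UTF-8, so the guess is not infallible.
    """
    raw = unquote_to_bytes(path)
    try:
        return raw.decode("utf-8")       # strict: raises on invalid sequences
    except UnicodeDecodeError:
        return raw.decode(site_charset)  # fall back to the configured charset
```

With this approach the same topic name resolves whether the browser sent it as UTF-8 or in the site charset, at the cost of an occasional wrong guess.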
There is another approach to the URL encoding issue, which is for TWiki to %-encode all URLs as they are generated as part of the TWiki page - this effectively forces them into the $siteLocale charset, e.g. KOI8-R.
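A minimal sketch of that idea, in Python for illustration: the topic name is %-encoded in the site charset when the link is generated, so the browser never has to choose an encoding for it. The function `topic_url`, its parameters, and the example path are hypothetical.

```python
from urllib.parse import quote


def topic_url(base: str, web: str, topic: str, site_charset: str = "koi8-r") -> str:
    """Build a link whose topic name is %-encoded in the site charset.

    Hypothetical helper: the result is pure ASCII, so the browser sends it
    back unchanged and the server sees bytes in site_charset.
    """
    encoded = quote(topic, safe="", encoding=site_charset)
    return f"{base}/{web}/{encoded}"


# A Cyrillic topic name rendered as an ASCII-only link.
link = topic_url("/bin/view", "Main", "Тест")
```

ASCII topic names pass through unchanged, so existing links would not be affected by this step.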
--
RichardDonkin - 16 Apr 2003
PatternSkin in TWiki 4.2.0 forbids creation of topics with Unicode names due to a bug in (or related to) webtopiccreator.js. I'd file a bug report, but I reset my password and am waiting for the mail to get through a greylisting hoop. A workaround is just to remove (or rename) webtopiccreator.js; it's not a critical piece of code.
--
AndrewPantyukhin - 18 Apr 2008
Andrew, send me an e-mail if you have login issues on twiki.org.
File a bug report on the topic creation issue.
--
PeterThoeny - 03 Jun 2008