Question

  • TWiki version: 01 Feb 2003
  • Perl version: 5.005_03
  • Web server & version: Apache/1.3.26 (Unix)
  • Server OS: 4.6-STABLE FreeBSD i386
  • Web browser & version: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3) Gecko/20030407
  • Client OS: 4.8-RC FreeBSD i386

Trying to use non-English words as WikiWords in topic names. The URL for such a non-English topic works only with Mozilla. Opera and IE send the URL in UTF-8 encoding, which resolves to a non-existent topic. Where should I look to apply a patch that decodes the URL correctly and perhaps applies an iconv transformation to my local charset?

-- SergeySolyanik - 07 Apr 2003

Answer

See InternationalisationEnhancements for details of how to support non-ASCII characters in topic names, which is built in to TWikiRelease01Feb2003. As described in the I18N topic, you need to turn off browser features that automatically encode UTF-8 characters, i.e. you must choose ISO-8859-1, KOI8-U or whatever. Real support for UTF-8 is somewhat harder to do, but should be possible in a future release - see InternationalisationUTF8 for some discussion on this.

Let us know how you get on.

-- RichardDonkin - 07 Apr 2003

Thanks for the answer. I'll wait for a new release.

-- SergeySolyanik - 14 Apr 2003

Did you have a look at InternationalisationEnhancements? There's no need for a new release, you just need to make one change to the browser configuration.

-- RichardDonkin - 14 Apr 2003

Yes, I have seen InternationalisationEnhancements. Having to change every browser in order to collaborate defeats the idea of collaboration and is not suitable for me. There are also some incorrect assumptions about Unicode and I18N in InternationalisationUTF8 and in the TWiki code itself. So, because I have no time to make correct patches to TWiki, I will wait for a new release. Maybe it's my misunderstanding, but TWiki needs to be rewritten from scratch to support Unicode and correct XML/HTML. In my view TWiki lacks modularity in URL/content handling, which makes it impossible to insert calls to iconv... So I will wait for a new release while using and hacking the current version; maybe I will find a way to implement Unicode support in TWiki correctly...

-- SergeySolyanik - 14 Apr 2003

Just checked the Unicode::* Perl modules and I think they are the correct way to decode a URL into the local charset. There should be a variable holding the local charset setting, TWiki should map Unicode-encoded URLs into that charset, and that variable should match the locale environment of the TWiki script. Even better would be to determine it automatically. All conversions and encoding should be done with the appropriate Perl modules, not by hand!
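
For illustration only, here is a minimal sketch of the mapping described above: undo the %-escapes, interpret the bytes as UTF-8, and re-encode them in the local charset. It is written in Python rather than with the Perl Unicode::* modules, and it is not TWiki code. The function name `utf8_url_to_local`, the `/Main/...` path layout and the KOI8-R setting are assumptions chosen for the example.

```python
from urllib.parse import unquote_to_bytes


def utf8_url_to_local(path: str, local_charset: str) -> bytes:
    """Decode a %-encoded UTF-8 URL path and re-encode it in local_charset."""
    raw = unquote_to_bytes(path)       # undo %XX escapes, giving raw bytes
    text = raw.decode("utf-8")         # raises UnicodeDecodeError if not UTF-8
    return text.encode(local_charset)  # raises UnicodeEncodeError if unmappable


# A topic named "Тест" as a UTF-8 browser would send it (hypothetical path).
encoded = "/Main/%D0%A2%D0%B5%D1%81%D1%82"
local = utf8_url_to_local(encoded, "koi8_r")
print(local)  # the KOI8-R bytes a filename lookup would need
```

Both conversion steps raise an exception on bad input instead of guessing, so a caller can decide what to do with a URL that is not valid UTF-8 or a name that the local charset cannot represent.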

-- SergeySolyanik - 14 Apr 2003

It would be helpful if you could outline your view of incorrect assumptions about UTF8 in InternationalisationUTF8, and post any patches there as well. I haven't looked into this in great detail, but it doesn't seem that hard if the whole system works in UTF-8. If you want to work in Unicode in the core and local charsets externally, or to use local charset internally and Unicode externally, that's more complicated, but I wasn't planning on doing any transcoding to or from Unicode.

TWiki I18N is mainly targeted at intranets, where the assumption is that a browser config change is possible even if not very convenient. This approach also supports TWiki's 'any browser' philosophy, since not all browsers support UTF-8 and some that do support UTF-8 have related bugs.

-- RichardDonkin - 15 Apr 2003

Of course I'll share my experience, I just need time to do it. About the incorrect assumptions: it's not correct that we can't determine the encoding of a URL, since the browser supplies the encoding information. What we need is to decode the URL according to that encoding and map it into the local charset. For example, my local filesystem is transparent to any encoding of filenames, but every utility assumes that LANG is set appropriately. In my case that is ru_RU.KOI8-R. When TWiki gets a URL in UTF-8 encoding it should map it into my LANG charset, or else it fails to locate the file or directory.
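
A small sketch of the LANG point, again in Python and only as an illustration: it pulls the charset out of a POSIX-style locale string such as ru_RU.KOI8-R and shows that the same topic name has different bytes in UTF-8 and in that charset, which is why a byte-for-byte filename lookup misses. The helper name `charset_from_locale` and the ISO-8859-1 default are assumptions.

```python
import codecs


def charset_from_locale(lang: str, default: str = "iso-8859-1") -> str:
    """Return a codec name from a locale string like 'ru_RU.KOI8-R'."""
    _, dot, rest = lang.partition(".")
    if not dot:
        return default                 # e.g. "C" or "ru_RU" carry no charset
    name = rest.split("@", 1)[0]       # drop a modifier such as "@euro"
    return codecs.lookup(name).name    # raises LookupError if unknown


charset = charset_from_locale("ru_RU.KOI8-R")
topic = "Тест"

# Same name, different bytes: a file stored under the KOI8-R bytes is
# not found when looked up with the UTF-8 bytes taken from the URL.
print(topic.encode("utf-8") == topic.encode(charset))  # False
```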

-- SergeySolyanik - 15 Apr 2003

I'm not convinced that browsers always supply the encoding information - see this message, as well as the links from InternationalisationEnhancements. POST transactions don't provide this information, and I'm not sure about others such as GET.
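
If the encoding really is not supplied, one possible fallback is to guess: try a strict UTF-8 decode first and fall back to the site charset when that fails. The Python sketch below shows the idea; it is a heuristic, not something TWiki is stated to do in this thread, and the name `decode_topic_name` is an assumption.

```python
from urllib.parse import unquote_to_bytes


def decode_topic_name(path: str, site_charset: str) -> str:
    """Guess the URL encoding: strict UTF-8 first, then the site charset."""
    raw = unquote_to_bytes(path)
    try:
        return raw.decode("utf-8")       # strict: fails on non-UTF-8 bytes
    except UnicodeDecodeError:
        return raw.decode(site_charset)  # assume the site's 8-bit charset


# The same topic name sent as UTF-8 and as KOI8-R decodes to the same text.
print(decode_topic_name("%D0%A2%D0%B5%D1%81%D1%82", "koi8_r"))
print(decode_topic_name("%F4%C5%D3%D4", "koi8_r"))
```

The guess can be wrong: some byte sequences are valid in both encodings (for example KOI8-R "пё" is the bytes D0 A3, which is also valid UTF-8 for a different letter), so this only reduces the problem rather than solving it.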

There is another approach to the URL encoding issue, which is for TWiki to %-encode all URLs as they are generated as part of the TWiki page - this effectively forces them into the $siteLocale charset, e.g. KOI8-R.
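
As a rough Python illustration of that approach (not the TWiki implementation): every generated link is %-encoded from the site charset, so the browser only ever sees ASCII and has nothing left to re-encode. The function name `topic_url` and the `/Web/Topic` path layout are assumptions.

```python
from urllib.parse import quote, unquote


def topic_url(web: str, topic: str, site_charset: str) -> str:
    """Build a link whose non-ASCII bytes are %-encoded in site_charset."""
    return "/" + "/".join(
        quote(part, safe="", encoding=site_charset) for part in (web, topic)
    )


url = topic_url("Main", "Тест", "koi8_r")
print(url)                               # pure ASCII, e.g. /Main/%F4...
print(unquote(url, encoding="koi8_r"))   # decodes back with the same charset
```

Because the escapes already spell out the site-charset bytes, the server can decode incoming URLs with that one charset regardless of the browser's own URL encoding setting, at least for links followed from generated pages. URLs typed by hand are not covered.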

-- RichardDonkin - 16 Apr 2003

PatternSkin in TWiki 4.2.0 forbids creation of topics with Unicode names due to a bug in (or related to) webtopiccreator.js. I'd file a bug report, but I reset my password and am waiting for the mail to get through a greylisting hoop. A workaround is just to remove (or rename) webtopiccreator.js; it's not a critical piece of code.

-- AndrewPantyukhin - 18 Apr 2008

Andrew, send me an e-mail if you have login issues on twiki.org.

File a bug report on the topic creation issue.

-- PeterThoeny - 03 Jun 2008

Topic revision: r13 - 2008-08-02 - PeterThoeny
 
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.