SID-01257: Page: UTF-8 or ISO-whatever?
| Status: | Answered | TWiki version: | 4.0.1 | Perl version: | 5.8 |
| Category: | CategoryInternationalization | Server OS: | Ubuntu Linux | Last update: | 14 years ago |
Hello,
I am working on a TWiki translation tool and have run into some formatting trouble.
Please share some wisdom:
What is the correct encoding of the Wiki text pages? UTF-8, or ISO-whatever?
The original TWiki web pages seem to be in UTF-8, whereas our German Main web seems to be in ISO. Why are they encoded differently? Which is correct?
Next question: How should the text within a MAKETEXT statement be encoded? UTF-8?
Thanks for any hints,
Detlef.
--
DetlefMarxsen - 2011-08-25
Discussion and Answer
The encoding of TWiki pages depends on the initial configuration. If you intend to use international characters, it is better to configure the site for UTF-8. If you have a site in ISO and want to switch to UTF-8, you need to convert the encoding of existing text to UTF-8. There is currently no automated way to switch if you have existing content. More on I18N at InstallationWithI18N.
If you use MAKETEXT, its text needs to be encoded in UTF-8.
--
PeterThoeny - 2011-08-26
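To see which encoding a given topic file actually uses, a strict UTF-8 decode is a reasonable first check. The following is a minimal sketch in Python, not a TWiki utility; the function name `classify_topic_bytes` is hypothetical. It can only say whether the bytes are plain ASCII, valid UTF-8, or something else. It cannot tell which ISO variant a non-UTF-8 file uses.

```python
def classify_topic_bytes(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'not-utf-8' for the raw bytes of a topic.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode("ascii")
        return "ascii"  # pure ASCII is identical in UTF-8 and ISO-8859-x
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")  # strict decoding rejects invalid byte sequences
        return "utf-8"
    except UnicodeDecodeError:
        return "not-utf-8"  # likely a single-byte encoding such as ISO-8859-15
```

A German page with umlauts stored in ISO-8859-15 would be reported as `not-utf-8`, while the same text saved as UTF-8 would be reported as `utf-8`. Pages without any non-ASCII characters come back as `ascii` and need no conversion either way.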
Hello,
thanks for your quick answer.
Do I understand correctly that MAKETEXT won't work with a web that is not stored in UTF-8 format?
And the text between the {"..."} of MAKETEXT must also be UTF-8?
Thanks,
Detlef.
--
DetlefMarxsen - 2011-08-27
Text inside MAKETEXT needs to be UTF-8; outside MAKETEXT it does not matter. For practical reasons it is probably better to have all text in UTF-8 if a page mixes text inside MAKETEXT with text outside it.
Details on MAKETEXT at VarMAKETEXT, UserInterfaceInternationalisation, and UserInterfaceLocalisation.
--
PeterThoeny - 2011-08-27
The major section of the tool works now, but I am stuck on encoding problems. I've read quite a few TWiki pages concerning I18N and Unicode, but I still have some questions:
1. Is it possible to have two webs with different encodings under 4.0.1 or later releases (I assume: no)?
2. Unicode has triggered quite some discussion on the TWiki pages. However, I couldn't find a hint in the release notes for the following question: Has full UTF-8 support been added in some release later than 4.0.1?
3. Is there a tool to do mass recoding of TWiki pages from ISO-8859-15 to UTF-8?
--
DetlefMarxsen - 2011-09-06
1. It is not possible to set the encoding per web; it is set for the whole site with the $TWiki::cfg{Site}{CharSet} = 'utf-8' configure setting.
2. Work needs to be done for full UTF-8 support. TWiki is open source; interested folks can get involved. See ProposedUTF8SupportForI18N.
3. A utility to re-encode between encoding standards is pending. The utility needs to change the .txt and .txt,v files. A logical place to do that is in the BackupRestorePlugin; see the blog post "TWiki Has a New Solution for Point & Click TWiki Upgrade".
--
PeterThoeny - 2011-09-06
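Until such a utility exists, the .txt part of the re-encoding can be sketched with a short script. This is only an illustration in Python under stated assumptions, not the pending TWiki utility: the function names `recode_topic` and `recode_tree` are hypothetical, the source encoding is assumed to be ISO-8859-15 as in the question, and the .txt,v revision histories are deliberately left untouched, so they would still need separate handling. Run something like this only on a backup copy of the data directory.

```python
from pathlib import Path


def recode_topic(path: Path, source_encoding: str = "iso-8859-15") -> bool:
    """Rewrite one topic file as UTF-8; return True if the file was changed."""
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8 (or plain ASCII): leave untouched
    except UnicodeDecodeError:
        pass
    # Assumption: anything that is not valid UTF-8 is in source_encoding.
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True


def recode_tree(data_dir: Path) -> int:
    """Re-encode every *.txt file below data_dir; return the number changed.

    Files ending in .txt,v do not match the pattern and are skipped.
    """
    changed = 0
    for topic in sorted(data_dir.rglob("*.txt")):
        if recode_topic(topic):
            changed += 1
    return changed
```

Because files that already decode as UTF-8 are skipped, running the script twice does not double-encode anything. After converting the content, the site-wide {Site}{CharSet} setting from item 1 would also have to be switched to 'utf-8'.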
Added a note to BackupRestorePluginDev on char-set re-encoding.
--
PeterThoeny - 2011-09-06
Closing this question after more than 30 days of inactivity. Feel free to reopen if needed. Consider engaging one of the TWiki consultants if you need timely help. We invite you to get involved with the community; you are more likely to get community support if you support the open source project!
--
PeterThoeny - 2012-01-23