Code smell stuff / questions
This is to respond to some CODE_SMELL comments re the
I18N code I added to TWiki.pm a while back.
Doesn't _setupLocale change the Apache process locale under mod_perl?
Yes, it probably does, and that might be a bad thing (e.g. it could break the attachment hack mentioned below). However, there's no way for TWiki to avoid doing this. Perhaps ModPerl users would care to comment on whether a frontend Apache reverse proxy, serving all pub and other static content, might be a good solution?
Why is nativeUrlEncode different to _urlEncode?
The idea of nativeUrlEncode is to implement %INTURLENCODE%, whose point is to force attachment links to be URL-encoded, freezing them in the native site character set (e.g. EUC-JP, ISO-8859-1) and thereby preventing the browser from re-encoding the URL from (say) ISO-8859-1 to UTF-8. This means that browsers that would otherwise send UTF-8 URLs (e.g. IE, Opera) can still access attachments. The freeze is needed because Apache and other web servers don't have UTF-8 to native character set conversions built in. See comments in EncodeURLsWithUTF8 and ProposedUTF8SupportForI18N for more background.
This function takes account of whether TWiki is running on an EBCDIC platform, because there the 'freeze in native' approach would break the URL completely, and such platforms have web servers that can handle UTF-8 URLs directly. See RewritingUrlsWithEscapedCharsUnderOs390 for more gory details.
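To make the 'freeze in native' idea concrete, here is a minimal sketch in Python rather than TWiki's Perl. It is not the code from TWiki.pm: the function name native_url_encode, its site_charset parameter and the sample filename are made up for illustration, and the EBCDIC special case is left out. It only shows how the same attachment name is escaped differently depending on which character set is used for the bytes.

```python
from urllib.parse import quote


def native_url_encode(text: str, site_charset: str = "iso-8859-1") -> str:
    """Percent-encode text using the bytes of the site character set.

    Hypothetical helper for illustration; site_charset stands in for
    whatever native character set the site is configured with.
    """
    # quote() first encodes the string to bytes in the given charset,
    # then escapes every byte that is not an unreserved character or "/".
    return quote(text, safe="/", encoding=site_charset)


name = "R\u00e9sum\u00e9.txt"  # sample attachment name with accented letters

# Frozen in the native charset: one escaped byte per accented letter.
print(native_url_encode(name))

# What a browser would send if it re-encoded the link as UTF-8:
# two escaped bytes per accented letter, which no longer match a
# filename stored on disk in ISO-8859-1.
print(quote(name, safe="/", encoding="utf-8"))
```

Because the link already consists only of ASCII characters after escaping, the browser has nothing left to re-encode, which is the effect %INTURLENCODE% is after.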
INTURLENCODE should only be used when generating attachment links within TWiki, and in some templates that use such links.
By contrast, _urlEncode has nothing to do with I18N, so it doesn't care about EBCDIC etc.
Although these functions are very similar, I think it's best to keep them separate since their purposes differ markedly.
-- RichardDonkin - 29 Oct 2004