
Bi-directional Text

From Wikipedia, http://en.wikipedia.org/wiki/Bi-directional_text:

Many of the major writing systems of the world, such as Arabic and Hebrew, are written in a form known as right-to-left (RTL), in which writing begins at the right-hand side of a page and concludes at the left-hand side. This is different from the left-to-right (LTR) direction in which languages using the Latin alphabet (such as English) are written. When LTR text is mixed with RTL in the same paragraph, each type of text should be written in its own direction, which is known as bi-directional text. This can get rather complex when multiple levels of quotation are used. Almost all writing systems originating in the Middle East are of this nature.

-- PeterThoeny - 18 Jan 2006

Discussion

To what level (if at all) does TWiki support right-to-left text, and as such, bi-directional text?

-- PeterThoeny - 18 Jan 2006

I did some experiments with Hebrew on my own website using the Feb2003 code, and it worked fine to my untutored eye.

I'm not sure if TWiki needs to do anything specific to support bidirectional text, but this is a complex area so there may well be some special requirements here.

Generally, supporting UTF-8 throughout TWiki would make it easier to experiment with this, since Unicode has good bidirectional support. It is also possible to support non-Unicode bi-di character sets (which is how I did my tests), as long as the 'logical ordering' variants of those character sets are used, i.e. those that assume the browser can handle bi-di text layout.

-- RichardDonkin - 18 Jan 2006
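
As a rough illustration of the Unicode bidirectional support mentioned above: every Unicode character carries a bidi class, and a common way to guess the base direction of a piece of text is to look for its first strongly directional character. The following is only a minimal Python sketch of that idea, not something TWiki does. The function name base_direction is made up for this example, and it ignores the finer points of the Unicode bidi algorithm such as explicit embeddings and isolates.

```python
import unicodedata


def base_direction(text: str) -> str:
    """Guess the base direction from the first strongly directional character.

    Returns 'rtl', 'ltr', or 'neutral' if no strong character is found.
    (Hypothetical helper for illustration only.)
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-type and Arabic-type letters
            return "rtl"
        if bidi_class == "L":  # Latin-type letters
            return "ltr"
    return "neutral"  # only digits, spaces, punctuation, etc.


hebrew_word = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word "shalom"

print(base_direction(hebrew_word + " TWiki"))  # rtl
print(base_direction("TWiki " + hebrew_word))  # ltr
print(base_direction("123 ..."))               # neutral
```

Text stored in logical order, as in the tests described above, is what this kind of per-character analysis assumes: the characters are kept in reading order and the renderer works out the visual layout.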

Is there a public site to test that?

-- PeterThoeny - 18 Jan 2006

Where does TWiki stand with respect to the definition at http://www.wikimatrix.org/wiki/feature:right-to-left_support ?

See also CSS 'direction' and 'unicode-bidi' properties at http://www.w3.org/TR/REC-CSS2/visuren.html#direction

-- PeterThoeny - 19 Jan 2006

My public site is down unfortunately - hope to get it back up with DakarRelease soonish...

The standard HTML pages and CSS stylesheets delivered with TWiki almost certainly don't use the bi-di CSS2 properties that WikiMatrix refers to. (I haven't checked that assertion, but nobody has investigated bi-di, so I don't see how we could.) By the way, those properties are supported only by FirefoxBrowser 1.5 or higher, and not by IE6. However, TWiki administrators/users could probably customise the CSS, and use HTML bi-di overrides, to get the bi-di effects they want, since we support both HTML embedding and use of CSS.

Having read the bi-di part of the CSS2 spec, I'm not sure where they are useful - probably one use case is when you want the paragraphs to be formatted starting from right hand side, and so on. There may well be other uses, but I'm not clear on what they are.

Some useful links:

The W3C guidelines are very good, with useful examples and explanations based on cases where RTL text quotes LTR text and so on, including multiple levels of quotes (aka 'embedding'). The general principle seems to be to rely on the Unicode bidi rendering approach (UAX#9), but to use HTML and/or CSS to override it in cases where it doesn't produce the effect the author intended. The guidelines seem to say that HTML overrides of Unicode bi-di rendering are generally preferable, because they are permanent and let the author control how text is rendered, whereas bi-di overrides specified in CSS could be changed by the end user (reader).

However, the guidelines do say that formatting that depends on the text's bi-di properties (e.g. right-aligning a paragraph of Hebrew) should be done in CSS rather than in HTML. This corresponds to general formatting recommendations for HTML, where CSS is preferred because it can easily be changed. I suspect that as long as HTML is used to set the overall RTL/LTR direction of a page, section or paragraph, the browser will render things properly as right or left aligned paragraphs, but there are clearly exceptions where formatting of bi-di text requires either HTML or CSS overrides.

Interesting area - we really need someone who knows Hebrew, Arabic, Yiddish or some other RTL language and has an interest in the standards to help navigate through this.

By the way, I recommend FirefoxBrowser for reading about I18N and particularly bi-directional issues, as it renders Hebrew and Arabic OK 'out of the box' without installation of the additional fonts (via Windows language packs) needed by InternetExplorer. The Yiddish support in Firefox is not great (doesn't render all the diacritical marks over the characters) - OperaBrowser has much better support for this.

I think WikiMatrix is being over-prescriptive here in focusing only on CSS2 for bi-di support. Wikis probably need a mix of features for good bi-di support, some of which can be embedded using normal HTML, and others of which could be handled through templates. Knowing the language in which a page is written could be useful to automatically set the RTL/LTR direction properly in the <html> tag for the page, but there's also a need for some overrides and probably some flexibility in the CSS stylesheets delivered with the Wiki.

Overall, I think we should rate ourselves as 'optional' on the WikiMatrix feature, since we can change the CSS and embed HTML to get the whole page or just particular sections, paragraphs or table parts to render in RTL/LTR as needed.

-- RichardDonkin - 22 Jan 2006
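
To make the idea of deriving the page direction from the page language a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the names RTL_LANGUAGES, direction_for, html_open_tag and embed are hypothetical, the language list is deliberately incomplete, and none of this reflects how TWiki templates actually work. It only uses the standard HTML dir attribute for the page-level direction and for an inline override.

```python
from html import escape

# Assumption: a small, incomplete set of primary language subtags that are
# normally written right-to-left (the three languages named in this thread).
RTL_LANGUAGES = {"he", "ar", "yi"}


def direction_for(lang_tag: str) -> str:
    """Return 'rtl' or 'ltr' for a language tag such as 'he' or 'he-IL'."""
    primary = lang_tag.replace("_", "-").split("-")[0].lower()
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


def html_open_tag(lang_tag: str) -> str:
    """Build an opening <html> tag with lang and dir set for the page."""
    lang = escape(lang_tag, quote=True)
    return f'<html lang="{lang}" dir="{direction_for(lang_tag)}">'


def embed(text: str, direction: str) -> str:
    """Wrap an inline fragment in a span with an explicit direction override."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    return f'<span dir="{direction}">{escape(text)}</span>'


print(html_open_tag("he-IL"))   # <html lang="he-IL" dir="rtl">
print(html_open_tag("en"))      # <html lang="en" dir="ltr">
print(embed("TWiki", "ltr"))    # <span dir="ltr">TWiki</span>
```

The page-level tag covers the common case, while something like embed corresponds to the per-section or per-phrase overrides that would still be needed where the default rendering is not what the author intended.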

Useful FAQ entry from W3C about which scripts and languages are written right-to-left.

-- RichardDonkin - 05 Feb 2006

Hi,

I'm in the middle of evaluating the extent of Hebrew support in TWiki. So far I have found the following obstacles:

  • RTL for the headers (should be easy to handle using the templates)
  • RTL for the whole text (probably a site CSS change)
  • Hebrew WikiWords - as I understand it, these are not supported? I would love to hear suggestions.
  • Filenames in Hebrew - these are treated as empty for some reason, so files with Hebrew names cannot be uploaded. Any suggestions?

Thanks!

-- MottiSorani - 07 Dec 2006

I haven't tried doing the RTL stuff, but it should be possible at the template/CSS level as you say.

Hebrew WikiWords should be supported, as long as you use a non-Unicode 8-bit character set and the locale is supported by Perl, e.g. one of the ISO-8859-* family. I'm not sure which is the best one, but probably the 'logical' version. I did get Hebrew pages working in an old version of TWiki, and there's no reason why the basics should not work.

The empty attachment filenames are possibly because the regex used to filter filenames is not I18N-aware - if it is changed to positively match alphabetic characters and the locale is set to Hebrew, that should work, but will require some minor code changes.

-- RichardDonkin - 11 Dec 2006
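
The empty-filename effect described above is easy to reproduce outside TWiki. The sketch below is in Python rather than TWiki's Perl, and both patterns are hypothetical stand-ins, not TWiki's actual filter. It also relies on Python's Unicode-aware \w instead of a Hebrew locale, which is a different mechanism from the locale-based fix suggested above, but it shows the same contrast: a filter that only allows ASCII letters strips a Hebrew name down to nothing, while one that positively matches alphabetic characters keeps it.

```python
import re

# Hypothetical filters, for illustration only (not TWiki's real regex).
# Each pattern matches the characters to REMOVE from a filename.
ASCII_ONLY = re.compile(r"[^A-Za-z0-9_.\-]")  # keeps ASCII letters and digits only
UNICODE_AWARE = re.compile(r"[^\w.\-]")       # \w also matches Hebrew letters in Python 3


def sanitize(name: str, pattern: re.Pattern) -> str:
    """Remove every character that the given filter pattern rejects."""
    return pattern.sub("", name)


hebrew_name = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word "shalom"

print(repr(sanitize(hebrew_name, ASCII_ONLY)))              # ''
print(repr(sanitize(hebrew_name + ".txt", ASCII_ONLY)))     # '.txt'
print(sanitize(hebrew_name + ".txt", UNICODE_AWARE) == hebrew_name + ".txt")  # True
print(sanitize("my file?.txt", UNICODE_AWARE))              # myfile.txt
```

With the ASCII-only filter the Hebrew name comes out empty (or as a bare extension), which matches the symptom reported for attachments.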

Topic revision: r11 - 2006-12-12 - RichardDonkin
 
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.