Should I use CSS or markup to correctly format Unicode-based bidirectional (bidi) text in HTML and XML-based markup languages?
Text written in the Arabic or Hebrew scripts flows predominantly from right to left (RTL), whereas text in most other scripts flows left to right (LTR). Much of the time the Unicode bidi algorithm takes care of the directionality of text, based on the properties of the characters used. However, there needs to be some way of indicating:
that the overall orientation of content and layout (such as table cell direction) should be RTL (LTR is the default) because the document is mainly written in a script such as Arabic or Hebrew
that part of a LTR document should be treated as RTL, and vice versa
what the expected flow of text is when the Unicode bidirectional algorithm alone cannot correctly order adjacent runs of mixed-direction text
that the inherent directionality provided by the Unicode bidi algorithm should be overridden.
Each example below is shown first using text in a real right-to-left script, followed immediately by an ASCII-only, visually ordered version in which English text is in lower case and Hebrew or Arabic text is in upper case. The order and position of the characters in the ASCII version reflect those of the original.
To clarify the third point above, the following sample sentence shows what you may get if you rely solely on the bidirectional algorithm. The result is incorrect: because the whole quote is in Hebrew, and therefore runs right to left, the text "W3C" and the comma should appear to the left of (i.e. after) the Hebrew text.
The title says "פעילות הבינאום, W3C" in Hebrew.
ASCII version:
The title says "YTIVITCA NOITAZILANOITANRETNI, w3c" in Hebrew.
The correct result when displayed would look like this:
The title says "פעילות הבינאום, W3C" in Hebrew.
ASCII version:
The title says "w3c ,YTIVITCA NOITAZILANOITANRETNI" in Hebrew.
XHTML/HTML provides markup to fulfill these purposes. These include the following:
Markup | Effect |
---|---|
dir attribute | Sets the directionality for the element to which it is attached and below. Possible values include rtl and ltr. |
bdo element | Overrides the directionality of text as defined by the Unicode bidi algorithm. |
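As a rough sketch of how this markup applies to the sample sentence above, the fragment below wraps the Hebrew quote in an inline element carrying dir="rtl". The choice of a span element and the short bdo demonstration at the end are illustrative assumptions, not part of the original example.

```html
<!-- No bidi markup: the paragraph's LTR base direction leaves "W3C" to the
     right of the Hebrew text, which is the incorrect result shown above. -->
<p>The title says "פעילות הבינאום, W3C" in Hebrew.</p>

<!-- dir="rtl" gives the whole quote an RTL base direction, so the comma and
     "W3C" are displayed to the left of (after) the Hebrew text. -->
<p>The title says "<span dir="rtl">פעילות הבינאום, W3C</span>" in Hebrew.</p>

<!-- bdo overrides the bidi algorithm: the characters are displayed strictly
     right to left, so "abc" appears as "cba". -->
<p>Overridden: <bdo dir="rtl">abc</bdo></p>
```

The same dir attribute can be placed on the html element to set the overall orientation of a document that is mainly written in Arabic or Hebrew.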
CSS also provides support for text direction as follows:
Property | Values | Effect |
---|---|---|
unicode-bidi | embed | The text to which this is applied will assume the directional flow indicated by the direction property. |
unicode-bidi | bidi-override | The text to which this is applied will override the Unicode bidi algorithm according to the directional flow indicated by the direction property. |
direction | ltr | Sets a base direction of LTR for the text to which the unicode-bidi property is applied. |
direction | rtl | Sets a base direction of RTL for the text to which the unicode-bidi property is applied. |
The question is whether you should use markup or CSS to indicate directionality in HTML and XML-based markup languages.
You should always use dedicated bidi markup to describe your content, where markup is available. Then CSS may or may not also be needed to describe the meaning of that markup. This depends on whether you are dealing with content that is handled by the user-agent as HTML or XML. (Note that XHTML may be served as either!)
&lrm; and &rlm; cannot be used to resolve directionality.
Let's look at this in a little more detail.
Because directionality is an integral part of the document structure, markup should be used to set the directionality for a document or chunk of information, or to identify places in the text where the Unicode bidirectional algorithm alone is insufficient to achieve desired directionality.
To produce the desired right-to-left or bidirectional effect, some people simply apply CSS to whatever general paragraph or inline elements surround the relevant text. However, styling applied by CSS is not permanent. It may be turned off, be overridden, go unrecognised, or be changed or replaced in different contexts. Although bidi markup is needed only for the visual rendering of text, it is not purely decorative in function. Markup remains integrated with the document content in a persistent fashion. Using dedicated bidi markup also makes the content significantly clearer.
You should therefore use dedicated bidi markup whenever it is available. Do not simply attach CSS styling to a general element to achieve the effect.
Note that this presupposes that documents written in markup languages always have recourse to markup specifically dedicated to the support of mixed direction text. People designing a DTD or Schema should be encouraged to add elements or attributes for that purpose.
text/html
Use markup only. The CSS2 Recommendation advises using markup for bidi text in HTML. In fact, it goes as far as to say that conforming HTML user agents may ignore the CSS bidi properties. This is because the HTML specification clearly defines the expected behaviour of user agents with respect to bidi markup.
application/xhtml+xml
XHTML 1.0 served as application/xhtml+xml is expected to use the same semantics as HTML. Therefore, it also makes sense to use markup only, and no CSS, for this.
application/xml or text/xml
Normally a user agent will not automatically recognize or know what to do with any bidi markup you use in XML documents. CSS properties should therefore be used to indicate the expected visual behaviour of text in your document.
The CSS, however, should always be linked to dedicated bidi markup in the text.
XHTML served as application/xml or text/xml is treated by user agents as XML, not HTML.
The following shows the CSS that would be appropriate for the set of markup available in XHTML:
*[dir="ltr"] { direction: ltr; unicode-bidi: embed }
*[dir="rtl"] { direction: rtl; unicode-bidi: embed }
bdo[dir="ltr"] { direction: ltr; unicode-bidi: bidi-override }
bdo[dir="rtl"] { direction: rtl; unicode-bidi: bidi-override }
There are situations in XHTML/HTML and possibly other XML-based markup languages where text appears in an attribute or an element that only supports character data. Neither markup nor CSS can be used to modify directionality of either attribute text or part of the text in an element that supports only character data. In these cases you will need to resort to the use of Unicode directional formatting codes. (See the FAQ (X)HTML & bidi formatting codes vs. markup for more details.)
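To illustrate, here is a minimal sketch of the sample sentence placed in a title element and in an attribute value, where no dir markup can be used. It assumes the embedding pair RLE (U+202B) and PDF (U+202C), written as numeric character references so that the otherwise invisible codes can be seen in the source. The file name cover.png is a placeholder.

```html
<!-- The title element holds character data only, so no span or dir markup
     can be placed inside it.
     &#x202B; is U+202B RIGHT-TO-LEFT EMBEDDING (RLE);
     &#x202C; is U+202C POP DIRECTIONAL FORMATTING (PDF). -->
<title>The title says "&#x202B;פעילות הבינאום, W3C&#x202C;" in Hebrew.</title>

<!-- The same pair of codes inside an attribute value. -->
<img src="cover.png"
     alt="The title says &quot;&#x202B;פעילות הבינאום, W3C&#x202C;&quot; in Hebrew.">
```

Every RLE needs a matching PDF to end the embedded run, in the same way that an opening tag needs its closing tag.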
For XML-based markup languages, bidi styling should be defined in a separate style sheet, and that style sheet should be included in your other style sheets or in your document. This simplifies the development of style sheets and reinforces the difference between bidi and other styling. Think of the bidi style sheet as part of the schema information that defines the meaning of specific bidi markup, rather than as decorative styling that can exist in several variants.
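One possible arrangement, sketched here for an XHTML document served as XML, attaches the bidi style sheet with its own xml-stylesheet processing instruction, separate from the decorative styling. The file names bidi.css and site.css are hypothetical. bidi.css is assumed to contain only the four dir and bdo rules listed earlier.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!-- bidi.css (hypothetical name) defines the meaning of the bidi markup. -->
<?xml-stylesheet type="text/css" href="bidi.css"?>
<!-- site.css (hypothetical name) holds the decorative styling and can be
     swapped for another variant without affecting text direction. -->
<?xml-stylesheet type="text/css" href="site.css"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Bidi example</title>
  </head>
  <body>
    <p>The title says "<span dir="rtl">פעילות הבינאום, W3C</span>" in Hebrew.</p>
  </body>
</html>
```

Keeping the bidi rules in their own file means a replacement for site.css cannot accidentally drop them.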