This letter is by no means a complete, precise description of our proposal -- many details are left out (most importantly, the complete proposed standard character and operator dictionary and the precise set of transformation rules for expanding the standard linear syntax macros). These details will be supplied later if the general direction of this proposal is accepted. I think enough of our proposal is explained to give a good idea of its flavor and to serve as the basis for further discussions.
Some important aspects remain to be discussed further by the group before they are well enough understood to be part of a formal proposal, notably how best to allow author extensions of the built-in character and operator dictionary and transformation rules; these aspects are left to be specified in future amendments to this proposal.
Note that this letter supersedes all prior proposals from Wolfram Research, including the "position papers" (which were in general more precise than I am trying to be in this summary, though they were at a less concrete level). Note also that this letter is not an "official document" but rather is part of our ongoing dialogue with the HTML-Math ERB.
HTML-Math is designed to be interpretable by code in an HTML browser, by a specialized "browser plugin" program, or by a standalone program via the "foreign notation mechanism" of a general SGML processing program. So that HTML-Math behaves compatibly in all of these implementation modes, the first step in processing HTML-Math source text is always the simultaneous parsing of SGML entities (used to represent extended characters by name) and embedded SGML markup (begin and end tags used to represent hierarchical structure and to provide a place for adding attributes to subelements of a document).
HTML-Math is always processed by the following sequence of steps. (Several of these steps make use of built-in information, consisting of a dictionary of character and operator properties, and a set of transformation rules; in a future amendment to this proposal, this information will be author-extensible for all or part of a document.)
1. Parsing of SGML-style entities (which represent extended characters by name) and markup tags.
2. Tokenization ("lexical analysis") of the non-markup source characters (including those represented by SGML entities parsed in step 1). (Each markup tag is treated as a single token; thus the output from this step is a single linear sequence of tokens.)
3. Operator-precedence-based parsing of the resulting token sequence, to generate an "expression tree". (When this letter needs to give examples of such an expression tree in a way distinct from the source notation, it will use the "display list representation", to be described.)
4. Application of transformation rules to the expression tree, to generate another expression tree, called the final display list.
5. Rendering of the final display list, in the medium and style chosen by the user of the rendering software.
Each of these steps is explained in more detail below. (This letter does not attempt to specify every detail, however; this will be done by subsequent addendums to this letter, if what is described here is accepted.)
But first I will show how the above steps unfold for a simple example, as a general orientation to this proposal.
    The solutions to the general quadratic equation <math mode=inline> ax^2+bx+c=0 </math> are given by <math> x = {-b ± &root;{b^2-4ac}} &over; 2a </math>

If this example were rendered into ASCII it might look something like this:

                                                      2
    The solutions to the general quadratic equation ax  + bx + c = 0 are given by:

                 ________
                / 2
        -b +- \/ b  - 4ac
    x = -----------------
               2a

Here is a brief description of each of the steps in the parsing of this example.
    (mi "x") (mo "=") (mb "{") (mo "-") (mi "b") (mo "±") (mo "&root;")
    (mb "{") (mi "b") (mo "^") (mn "2") (mo "-") (mn "4") (mi "a") (mi "c")
    (me "}") (me "}") (mo "&over;") (mn "2") (mi "a")

The division of source characters into tokens, and the token types, are determined from the dictionary of character and operator properties. Each token may also contain a list of attributes and values which are also defined by the dictionary, such as precedence for operator tokens, but these are not shown above, for the sake of clarity.
[The ways of "escaping" characters which would otherwise affect the tokenizer (like the double quote which delimits string literals (described below)) will be specified later. This can't be done with extended character notation in a straightforward way, since step 1 is free to replace it with the actual characters it represents; this is a necessary feature of an architecture in which an SGML tool might preprocess HTML-Math.]
The full details of tokenization are given below, including a way of
representing a multi-character identifier. It is also possible to give
any token directly using SGML markup, e.g.
Briefly, the token types mi, mn, and mo represent tokens which will be parsed as identifiers, numbers, and operators respectively; mb and me represent begin and end tags, in this case "invisible grouping" characters. (At this stage, the mo tokens which will typically be rendered as "linear" operators in a 2-dimensional graphic medium (i.e. shown between their operands in a horizontal row) are not distinguished from the ones which will never be rendered directly since the expressions containing them will be transformed before rendering. The mi and mn tokens will all be rendered directly by default. The desire to support other rendering media, and (eventually) both author- and user- defined transformation rules in addition to the built-in rules, is one of the reasons for not distinguishing these kinds of operators at this stage.)
The SGML entities (e.g. ±) each represent extended characters. They are treated the same as ordinary characters, in that their tokenization and subsequent parsing is determined entirely by their entries in the character and operator dictionary. The ones shown above happen to be single-character operators, but others, e.g. "α", are letters which would be tokenized as identifiers. (This proposal will be accompanied later by a complete list of extended characters and their properties, comprising the ones in standard character sets like ISOtech and many new ones (and new names for old ones). All old names will be case-sensitive, but all new names consisting of concatenated words (as most will) will be allowed with the contained words capitalized or not. These characters are used not only to represent hundreds of renderable special characters which can appear in typeset mathematics (not all of which are part of Unicode), but also some nonrendering operators and identifiers used to generate certain layout schemas or for "semantic disambiguation". Characters of all of these types are used in various examples throughout this letter.)
The parser decides as it forms each subexpression whether it is a term or an operator (and thus how it is used during further parsing). In most cases it is a term, but when operators are "embellished" (e.g. subscripted) the resulting expressions remain operators, and retain the precedence and other attributes of the base operator. (This doesn't occur in the present example.)
The parser also introduces new tokens where necessary to represent "missing terms" and "missing infix operators", and decides whether missing operators should be parsed as "multiplication" (as in the above example) or "named function application". (It's unfortunate that this decision can't be deferred until the transformation rule stage, but these two invisible operators have different precedences. Authors are free to insert them explicitly instead of letting the parser choose one. Once the present proposal is amended to allow author extensions, authors will also be free to add transformation rules which further transform expressions containing these invisible operators, or even entirely new invisible operators.)
The token inserted in place of a missing term is (mi "&MissingTerm;"). The token inserted in place of a missing infix operator is either (mo "&InvisibleTimes;") or (mo "&FunctionApplication;"), depending on how the parser interprets this invisible operator. The rule for deciding which one is inserted is precisely this: an invisible function application operator is inserted if and only if its left operand would be an identifier or a scripted identifier, and the token to its right is a left bracket operator (such as a left parenthesis). By a scripted identifier is meant any "left-nesting" of any number and type of scripting schemas (subscripts, superscripts, prescripts, under or overscripts) and non-directly-rendering schemas (e.g. font changes) around an identifier token -- that is, the parser descends into the base (first) argument of any such scripts (zero or more), then checks whether it has reached an identifier.
These inserted tokens will typically render invisibly; the reasons they are inserted explicitly by the parser are to allow them to be inserted instead by the author with identical effects, and to simplify the later use of transformation rules.
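As an illustration only, the rule for choosing between the two invisible operators might be sketched in Python as follows. The tuple encoding of expression trees, the names SCRIPT_OPERATORS and LEFT_BRACKETS, and the assumption that a scripted identifier appears in the parse tree as an mterm whose second element is a scripting operator token are all hypothetical choices made for this sketch; they are not part of the proposal, and the sketch ignores non-directly-rendering schemas such as font changes.

```python
# Illustrative sketch only. The tuple encoding and the two sets below are
# assumptions made for this example, not part of the proposal.
SCRIPT_OPERATORS = {"_", "^", "__", "^^", "___", "^^^"}
LEFT_BRACKETS = {"(", "["}


def is_scripted_identifier(expr):
    """Descend into the base (first) argument of zero or more scripts."""
    while (expr[0] == "mterm" and len(expr) >= 3
           and expr[2][0] == "mo" and expr[2][1] in SCRIPT_OPERATORS):
        expr = expr[1]
    return expr[0] == "mi"


def invisible_operator(left_operand, next_token):
    """Choose the token inserted in place of a missing infix operator."""
    starts_bracket = next_token[0] == "mo" and next_token[1] in LEFT_BRACKETS
    if starts_bracket and is_scripted_identifier(left_operand):
        return ("mo", "&FunctionApplication;")
    return ("mo", "&InvisibleTimes;")


f = ("mi", "f")
f_sub_1 = ("mterm", f, ("mo", "_"), ("mn", "1"))
print(invisible_operator(f, ("mo", "(")))            # identifier before "("
print(invisible_operator(f_sub_1, ("mo", "(")))      # scripted identifier before "("
print(invisible_operator(("mn", "4"), ("mi", "a")))  # the "4a" case from the example
```

Under these assumptions, only the first two calls yield the function application operator; a number followed by an identifier, as in "4ac", yields the invisible multiplication operator.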
The expression tree generated by the parser can be represented in "display list format" as follows (though it won't be suitable for display until some transformation rules are applied); the "leaf nodes" are tokens as described above, and the subexpressions grouped by the parser are lists headed by "mterm" (as in this example) or "moperator":
    (mterm
      (mi "x")
      (mo "=")
      (mterm
        (mterm
          (mterm (mo "-") (mi "b"))
          (mo "±")
          (mterm
            (mo "&root;")
            (mterm
              (mterm (mi "b") (mo "^") (mn "2"))
              (mo "-")
              (mterm (mn "4") (mo "&InvisibleTimes;") (mi "a") (mo "&InvisibleTimes;") (mi "c")))))
        (mo "&over;")
        (mterm (mn "2") (mo "&InvisibleTimes;") (mi "a"))))
The display list is a representation of a single "displayable (or renderable) object", which typically contains other displayable objects as components. Each sublist is headed by the name of a "layout schema", which can be thought of (in the terminology of object-oriented programming) as a "class" of displayable objects. The layout schema include the token types which can be rendered directly, as well as a small list of compound forms corresponding to the "expression constructors" used in most present typeset mathematics.
The complete list of layout schemas is given below in a separate section, including for each one the transformation rules used to interpret its linear syntax form, and an SGML markup form in which it can be given in full generality and with attributes. (Any HTML-Math expression can be given in full SGML markup form, so that every subexpression is a separate SGML element; or these forms can be mixed with the ordinary linear syntax forms used in this example.)
The present example is transformed by the built-in rules to give the following display list:
    (mrow
      (mi "x")
      (mo "=")
      (mfraction
        (mrow
          (mrow (mo "-") (mi "b"))
          (mo "±")
          (mroot
            (mrow
              (mscripts (mi "b") (mrow) (mn "2"))
              (mo "-")
              (mrow (mn "4") (mo "&InvisibleTimes;") (mi "a") (mo "&InvisibleTimes;") (mi "c")))))
        (mrow (mn "2") (mo "&InvisibleTimes;") (mi "a"))))

At the risk of excessive repetition: each list in a display list like the above comes in one of the forms

    (layout-schema-name argument-1 argument-2 ... )

where the layout-schema-name (e.g. mfraction, mrow, mroot) is one of the short fixed list of layout schemas (given below), or

    (token-type-name "token-character-string")

where the token-type-name (e.g. mi, mn, mo) is one of the short fixed list of token types (given below).
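To make the two forms concrete, here is a small Python sketch that prints a nested tuple in display list notation. The tuple encoding, the name to_display_list, and the restriction of TOKEN_TYPES to the directly renderable token types are assumptions made for this illustration; HTML-Math does not specify any internal data structure.

```python
# Illustrative sketch only: nested tuples stand in for display lists.
TOKEN_TYPES = {"mi", "mn", "mt", "mo"}


def to_display_list(node):
    """Format a nested tuple using the two display list forms."""
    head = node[0]
    if head in TOKEN_TYPES:
        # Token form: (token-type-name "token-character-string")
        if len(node) != 2 or not isinstance(node[1], str):
            raise ValueError("a token needs exactly one character string")
        return '(%s "%s")' % (head, node[1])
    # Schema form: (layout-schema-name argument-1 argument-2 ... )
    args = " ".join(to_display_list(arg) for arg in node[1:])
    return "(%s %s)" % (head, args) if args else "(%s)" % head


script = ("mscripts", ("mi", "b"), ("mrow",), ("mn", "2"))
print(to_display_list(script))
```

The example reproduces the superscripted b from the quadratic formula, including its empty mrow argument.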
The process by which parsing and transformation generates the above display list is not given for this example, but should be clear from the descriptions below of the general rules for each step and from the specific descriptions of the layout schemas involved.
However, HTML-Math does not specify or require any particular rendering behavior. This is because it is intended to represent expressions in a way that allows them to be rendered to various quite different media (including, for example, interactive speech), and even within one medium, to be rendered according to the style preferences of an individual user, and in a way which suitably fits the context provided by the surrounding document.
On the other hand, HTML-Math does specify some contextual information which must be available for the rendering of any subexpression. This information includes certain attributes from surrounding HTML-Math or HTML elements (or, in the future, attributes specified by author- or user- specified rendering rules), and also certain attributes inherited from the location of a MATH element in a surrounding document (such as the text font, fontsize, and baseline position), which may ultimately be determined by non-math browser code either from the document itself or from something about its display environment. If an HTML browser supports HTML-Math embedded in an HTML document by means of an external program (e.g. a "plugin" or "helper application"), it must supply these attributes to that external program in order to allow rendered expressions to reasonably fit with their surroundings.
(A complete list of rendering attributes is not given in this letter. The mode attribute (which can be display or inline) has been mentioned already; among the attributes not mentioned so far are whether subscripts and superscripts should be positioned as is conventional for math or for chemical formulas. These attributes can be given on any HTML-Math element (when it's expressed as SGML markup) and apply to it and to all enclosed elements.)
HTML-Math also specifies a few semantic conventions which the layout primitives are intended to convey, when this might be necessary for correct rendering; for example, 2-dimensional renderers may render fractions with horizontal fraction bars or infix slashes according to the width of the fraction elements and the available width of the display, but this would not be correct for "columns" or "vertical vectors" as opposed to fractions.
Note that there are two distinct ways in which source text for an expression might be copied -- either with or without the document-supplied contextual information which modifies its interpretation or rendering in the present environment. (Even entire expressions may be affected by contextual information in larger parts of the surrounding HTML document.) It is suggested that both kinds of commands be provided. The present standard is intended to make it clear exactly which such information needs to be copied and how it can be represented in the copied source text.
Some computer algebra systems should be able to accept HTML-Math input directly. For the sake of others, some renderers may provide copy commands which translate HTML-Math into the native input form for those systems. Such commands can be considered to be doing rendering into a special medium, which is intended to be displayed to a program rather than to a human being. When renderers allow their users to specify additional transformation rules for rendering into various media and/or in various styles, it is suggested that the list of supported media and styles (as well as the rendering rules for each one) be user-extensible, and that "copy rendered form" commands be provided for each medium and style for which the user has provided any rendering rules. This will allow the creators of computer algebra systems to publish lists of suggested "rendering rules" for translating HTML-Math expressions into the input formats of their systems, which can be easily installed by users for use in their renderers, and further modified or extended by users when desired.
The purpose of the requirement that errors be obvious in the rendering is to ensure that authors can use any HTML-Math browser for "testing" their HTML-Math source text, and can be sure that if it "appears to work right" in their test browser, it is correct standard HTML-Math and can therefore be expected to "work right" in other browsers.
    ∫ ⅆ x &over; x

where the extended characters used represent (respectively) the integral sign (a large operator with precedence somewhat higher than +), the "differential d" (a high-precedence prefix operator), and an infix operator for forming fractions with horizontal bars (with precedence near that of division).
(The integral sign character can also be called &integral; or &int;. The name &int; is provided since it is already part of the ISOtech character set.)
This source text is parsed into the form
    (mterm
      (mo "∫")
      (mterm
        (mterm (mo "ⅆ") (mi "x"))
        (mo "&over;")
        (mi "x")))

and then transformed into the form for rendering

    (mrow
      (mo "∫")
      (mfraction
        (mrow (mo "ⅆ") (mi "x"))
        (mi "x")))

When the result is rendered, the integral sign (being a large operator) is rendered in a larger font size.
If a definite integral were desired, this would be represented by embellishing the ∫ operator with a subscript and superscript, which could be done using their linear syntax forms by (for example)
∫_1%2 ⅆ x &over; x
The details of the features introduced with this example are given below.
Although the layout schemas and the typical notations rendered with them are described in this letter in 2-dimensional terms since they are most commonly understood that way, they can also be considered as abstract expression constructors, so that HTML-Math notations are not inherently tied to physically 2-dimensional media, but can equally well be rendered into other media such as interactive speech or computer algebra systems.
Primitive token types such as "variable" or "number" are also considered a form of layout schema, though they have no substructure, because each of these token types is conventionally rendered differently. In the terminology of object-oriented programming, the layout schemas can be considered the subclasses of the class of renderable objects.
Each layout schema has a name beginning with "m" (for "math"). This name is used in the "display list representation" (as shown in the example given above) as the head of the display list for an instance of a given schema, and is also the SGML element name for the SGML form of each schema (i.e. the name used in the begin and end tags). (The initial "m" is partly to avoid collisions with other HTML tag names, and to make it easy for a reader to tell which tags are specific to HTML-Math. Note that certain non-math-specific HTML tags may be embedded in HTML-Math expressions, e.g. links, anchors, or font changes.)
The tag names with just one letter after the "m" are token types; the others are layout schemas, or (in the case of mterm and moperator, which are not strictly layout schemas but are included in the following list anyway) part of the expression tree generated by the parser. (The names are not usually related to the linear syntax forms in which some instances of a layout schema can be given.)
The following list of token types and layout schemas includes for each one a description of its intended purpose, conventional rendering (not a formal part of the standard), and semantic connotations (if any). The SGML markup form and the linear syntax form are also given; for the token types, the tokenization rules are discussed.
There are also some more HTML-Math examples showing the processing steps for schemas not covered in the example discussed earlier.
    Name  Represents                   Some examples in HTML-Math source
    mi    variable or identifier       a    \sin    <mi>num-trees</mi>
    mn    number literal               3.1    <mn>3.1e10</mn>
    mt    text string                  "such that"    <mt>such that</mt>
    mo    operator (rendered or not)   +    <mo prefix=true>++</mo>
    mb    begin tag or {               {    <mterm>    <mfraction>
    me    end tag or }                 }    </mterm>    </mfraction>
A backslash followed by one or more letterlike characters or digits is tokenized into a single identifier (even if it starts with a digit); thus \sin and \3d are both single identifiers. The backslash is not part of the identifier name -- thus \x\y and xy are turned into the same pair of identifier tokens.
Note that the sequence of letterlike characters forming a single token after \ can't include whitespace. If it contains extended characters given in SGML entity notation, these must be letterlike, and the entity names should be terminated with ";" rather than with whitespace (except perhaps for the last one).
E.g., to specify a single identifier which looks like "cos" except that the middle letter is a hypothetical extended character, one might use
    \c&ExtendedLowercaseO;s

(followed by some non-letterlike non-digit character).
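The backslash rule for identifiers can be illustrated with a short Python sketch. It approximates "letterlike" by ASCII letters (the real set would come from the character dictionary), treats each unescaped letter as a one-character identifier, and silently skips everything else; the names IDENTIFIER and identifier_tokens are hypothetical.

```python
import re

# Illustrative sketch only. Assumption: ASCII letters stand in for the
# "letterlike" characters defined by the character dictionary.
IDENTIFIER = re.compile(r"\\([A-Za-z0-9]+)|([A-Za-z])")


def identifier_tokens(source):
    """Return the identifier tokens found in source, ignoring other text."""
    tokens = []
    for match in IDENTIFIER.finditer(source):
        # Group 1 is a backslash-introduced name (backslash dropped);
        # group 2 is a single unescaped letter.
        tokens.append(("mi", match.group(1) or match.group(2)))
    return tokens


print(identifier_tokens(r"\sin"))
print(identifier_tokens(r"\3d"))
print(identifier_tokens(r"\x\y") == identifier_tokens("xy"))
```

Because the backslash is dropped from the name, the last comparison holds: both spellings give the same pair of identifier tokens.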
Any character sequence may be designated as an identifier by enclosing it within <mi>...</mi>.
Identifiers are typically rendered (in a 2-dimensional graphic medium) by displaying the characters of the name in a closely-proportionally-spaced horizontal row, with single-character identifiers rendered in italic (except for certain characters such as double-struck capital letters like the Z often used to represent the set of integers).
In future amendments to this proposal, it will be possible to associate a semantic type and a locality of reference with a given identifier in a given scope of a source document, and to specify an instance of an identifier as a "defining instance" in some scope, but the issues involved are not discussed in this letter.
(The character dictionary determines which characters count as "digits" and as "decimal points"; in the standard dictionary these are "0123456789" and "." respectively. Note that the standard dictionary also declares "." as an operator, but its use as a decimal point in a legal number literal overrides its use as an operator.)
Neither commas nor minus signs (nor any form of "scientific notation") are automatically treated as parts of number literals. (E.g., -3 is parsed as a unary negation operator applied to 3.)
However, any character sequence may be designated as a number literal by enclosing it within <mn>...</mn>.
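A rough Python sketch of recognizing a number literal at the start of some source text is given below. The exact definition of a "legal number literal" is not stated in this letter, so the pattern used here (digits, optionally followed by a decimal point and more digits, or a decimal point followed by digits) is an assumption, as are the names NUMBER and leading_number.

```python
import re

# Illustrative sketch only. Assumption: a legal number literal is a run of
# the standard digits with at most one "." that is followed by a digit.
NUMBER = re.compile(r"[0-9]+(?:\.[0-9]+)?|\.[0-9]+")


def leading_number(source):
    """Return the number literal at the start of source, or None."""
    match = NUMBER.match(source)
    return match.group(0) if match else None


print(leading_number("3.1"))     # the whole text is one literal
print(leading_number("3.1e10"))  # "scientific notation" is not included
print(leading_number("-3"))      # the minus sign is a separate operator
```

With this pattern, "3.1e10" yields only "3.1" and "-3" yields no literal at its start, matching the statement that neither scientific notation nor minus signs are part of number literals unless <mn>...</mn> is used.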
Number literals are typically rendered as a closely spaced row of their constituent characters, not in italics.
String literals are typically rendered the same way as text which surrounds the MATH element. When exported into a computer algebra system, they should typically be represented as string literals in the format of that system.
The dictionary of character and operator properties defines certain characters as operator characters, and certain sequences of these characters as operators, with specific values of the properties listed below. The tokenizer turns maximal sequences of operator characters into operator tokens, and gives them attributes corresponding to the properties in the dictionary. When several potential operator tokens overlap, the leftmost one is chosen.
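One possible reading of this tokenization rule is a left-to-right scan that, at each position, takes the longest operator in the dictionary. The Python sketch below illustrates that reading; the contents of OPERATORS are a made-up fragment standing in for the standard operator dictionary (which is not given in this letter), and the function name is hypothetical.

```python
# Illustrative sketch only. OPERATORS is a hypothetical fragment of an
# operator dictionary; the real one is to be specified later.
OPERATORS = {"+", "-", "=", "<", "<=", "_", "__", "___", "^", "^^", "^^^"}
MAX_LENGTH = max(len(op) for op in OPERATORS)


def operator_tokens(chars):
    """Split a run of operator characters into operator tokens."""
    tokens = []
    i = 0
    while i < len(chars):
        # Try the longest candidate first, so the leftmost operator is
        # also the maximal one starting at this position.
        for length in range(min(MAX_LENGTH, len(chars) - i), 0, -1):
            candidate = chars[i:i + length]
            if candidate in OPERATORS:
                tokens.append(("mo", candidate))
                i += length
                break
        else:
            raise ValueError("no operator starts at %r" % chars[i:])
    return tokens


print(operator_tokens("<="))
print(operator_tokens("^^^^"))
```

Under this reading, "<=" is one token rather than "<" followed by "=", and four carets split into "^^^" followed by "^".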
(When the dictionary is made author extensible in a future version of this proposal, it may be possible for authors to declare character sequences as operators which would otherwise be tokenized as identifiers, but this will never be done in the standard dictionary. Rationale: authors must be able to write any sequence of letter-like characters without worrying that it will be tokenized as an operator by default in some future versions of HTML-Math.)
Any character sequence can be specified as an operator by enclosing it within <mo>...</mo> tags; the properties given below will have default values (to be specified later along with the full standard operator dictionary) unless specific values are specified using attributes within the begin tag, e.g. <mo prefix=true prec=400>++</mo>.
The properties of any operator token include:
The attribute names and values corresponding to these properties are:
prefix=true (means this form is allowed, not required)
postfix=true
infix=true
leftprec=number
rightprec=number
rightinfixprec=number
rightprefixprec=number
largeprec=number
embellisher=true (means use to embellish other operators is allowed, not required)
large=true (means always parsed and rendered as a large operator)
stretchy=true
The numbers used as precedences must be integers (positive or negative). Higher numbers mean higher precedences, i.e., stronger binding.
The parser groups a term with the adjacent operator which has the higher precedence (assuming it is being used in a form which takes an operand on that side). If these precedences are equal, it groups the term with both operators; this feature is used to define "bracketing operators" such as parentheses so that the token sequence parsed from
    (x)

namely

    (mo "(") (mi "x") (mo ")")

groups into the single expression tree
    (mterm (mo "(") (mi "x") (mo ")"))

In the standard dictionary, all kinds of left brackets are prefix operators with the same right precedence (which happens to be 0), and all right brackets are postfix operators with that same value (also 0) as their left precedence, which means that even brackets of different kinds can group together, e.g. in expressions like

    [0,1)

(Individual brackets can be prevented from grouping at all by enclosing them in <mterm>...</mterm> (see below). Note that the invisible grouping characters { and } are not operators at all, and can't be prevented from grouping (nor, of course, do they render as curly braces); extended characters are provided which do parse as regular brackets and render as curly braces.)
The same feature of grouping a term with both adjacent operators is used to allow certain operators to have "flat" or "n-ary" associativity, e.g. + and &InvisibleTimes;. This is what causes the source text "4ac" (in the example given far above) to parse to a single (mterm ...) subexpression containing three subterms (which are mn and mi tokens for 4, a, and c) separated by two (invisible) operator tokens.
Some operators are intended instead to be left or right associative; for example, the superscripting operator ^ (see the mscripts layout schema below) is right associative, meaning that a^b^c parses in the same way as a^{b^c}. This is achieved (in the standard dictionary) by giving ^ a slightly higher left precedence than right precedence. Similarly, left associative operators have a slightly higher right precedence than their left precedence.
Sometimes, more than one operator has the same left and right precedence; this is true, for example, of relational operators, so that sequences of inequalities turn into single subexpressions even when (e.g.) both < and <= (or &LessEqual;) are used in the same sequence.
Note that even infix + and infix - are flat-associative with the same precedences; this means that the source text "a - b + c" parses into a single subexpression of five tokens, which is appropriate for rendering even though it is not the most convenient structure for some other purposes such as evaluation (though it is not in any sense inconsistent with the semantic meaning of the expression).
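The grouping decision described in the preceding paragraphs can be summarized in a few lines of Python. The precedence values in PRECEDENCE are invented for this sketch (the standard dictionary's values are not given in this letter); only their relative order matters. The attribute names leftprec and rightprec are the ones listed above; the function name is hypothetical.

```python
# Illustrative sketch only. The numeric precedences are made up; only
# their relative order is meant to reflect the text.
PRECEDENCE = {
    "^": {"leftprec": 601, "rightprec": 600},  # right associative
    "+": {"leftprec": 300, "rightprec": 300},  # flat
    "-": {"leftprec": 300, "rightprec": 300},  # flat, same values as "+"
}


def term_groups_with(left_op, right_op):
    """Decide which neighbor a term between two operators groups with."""
    left_pull = PRECEDENCE[left_op]["rightprec"]   # operator on the left
    right_pull = PRECEDENCE[right_op]["leftprec"]  # operator on the right
    if left_pull > right_pull:
        return "left"
    if right_pull > left_pull:
        return "right"
    return "both"


print(term_groups_with("^", "^"))  # the b in a^b^c
print(term_groups_with("-", "+"))  # the b in a - b + c
print(term_groups_with("+", "^"))  # the b in a + b^c
```

With these values, the b in a^b^c groups to the right (giving a^{b^c}), while the b in a - b + c groups with both operators, which is what produces a single flat subexpression of five tokens.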
The properties described above are sufficient to generate all possible behaviors of any operator in HTML-Math. For convenience, alternative attributes are provided which set the above properties in typical ways. (These can presently be used only in <mo> tags, but in the future will be most commonly used when authors can add new operators to the dictionary.) These attributes have the default value "unused" so that they will have no effect unless set explicitly. These attributes are:
After the parser chooses between alternative forms of an operator token, it generates a modified token with only the appropriate attributes set, for passing to subsequent stages (transformation rules and rendering). This may be important if those stages use the attribute values in some way; e.g., the renderer may wish to add a different amount of spacing to the left of a prefix operator (which has no left operand) or an infix operator (which has one), or to make the spacing depend on the absolute precedence, or a user-specified transformation rule may depend on whether an operator was used in prefix form.
Some operators are normally never rendered directly; instead they are treated as "macros" for expressing other forms (like layout schemas) in an abbreviated way. For example, this happens to all the operators in the linear syntax forms of the layout schemas. This is implemented by built-in transformation rules.
Other operators will be rendered as "themselves". Typically they are rendered as if they were the same text characters (possibly extended characters) used to name them, with surrounding spacing adjusted by the renderer to best convey the structure of the expression.
Large operators are typically rendered specially: in a larger than normal font size, and with any embellishing scripts placed in different positions depending on whether the expression mode is inline or display. Stretchy operators are also typically rendered specially (as described earlier).
These tokens are never rendered directly; what they each mean is described under the element name. It is an error for these begin and end tokens not to match exactly. (HTML-Math does not allow end tags to be left out or to be given in abbreviated forms.)
The invisible grouping characters are equivalent to the tags <mg> and </mg>. Their behavior is described under the tag name mg.
(When authors can modify the built-in transformation rules, it may become possible for these schemas to be presented to the renderer, which the renderer should treat as an error, as described in the earlier section on rendering erroneous expressions.)
These schemas can be given directly in source text using SGML begin and end tags of the same name. In this case, the tokenizer produces mb and me tokens (described above) for the begin and end tags respectively, which are then parsed by the parser as if they were a special kind of brackets (different from ordinary brackets since they must always match, and in some cases don't prevent ordinary brackets from matching "across" them), producing a schema named with the tag name.
The { and } invisible grouping characters are treated by the tokenizer precisely the same as the <mg> and </mg> tags, respectively. (In the main example I showed them as generating the tokens (mb "{") and (me "}"), which was for the clarity of that description, but they act just as if they generated (mb "mg") and (me "mg") respectively. Of course, this fact has no visible effect once the parser has properly matched them; the actual internal representation used is of course not specified by HTML-Math (for these or any other data structures) provided the behavior is as specified here.)
The complete set of such schemas is:
    Name       Purpose                         Example use
    mg         invisible grouping (aka { })    {1-x} &over; {1+x}
    mterm      term-like expression sequence   a+b
    moperator  embellished operator            +_2
    mlargeop   makes regular operators large   <mlargeop>⋃</mlargeop>
In some cases this forces bracket operators not to match anything outside, but this explicitly does not happen to a bracket operator, or to an embellished bracket operator, which by itself constitutes the entire renderable contents of the {...} form.
In no case does use of invisible grouping, by itself, force an operator to be treated as a term.
Source text enclosed in <mterm>...</mterm> tags is parsed normally, but then explicitly "forced" to be treated as a single term (for the purposes of further parsing) even if it would otherwise not be.
For example, <mterm>+</mterm> is parsed (before transformation rules are applied) as (mterm (mo "+")) and <mterm>+_2</mterm> is parsed (also before transformation rules) as (mterm (moperator (mo "+") (mo "_") (mn "2"))).
The explicitly added mterms will be removed by transformation rules before rendering, so their only effect on rendering is the indirect one of producing layout schemas which have operators in positions that would normally be used for terms; this may, for example, affect the spacing around those expressions, depending on the spacing rules used in the renderer.
For example, in the expression
    a +_2 b

the + is embellished by the _ (the infix operator for subscript) and the 2. Thus the parser generates (before transformation rules are used) (mterm (mi "a") (moperator (mo "+") (mo "_") (mn "2")) (mi "b")).
<mlargeop>⋃</mlargeop>_{s∈S} s
where the extended character ∈ is the set-membership operator (which looks something like a small ε).
The precedence of a large operator generated by mlargeop is determined by the first successful method from among:
Name          Represents                                        Some examples in HTML-Math source
mrow          horizontal sequence                               a+b    [0,1)    ∫ ⅇ^-x^2 ⅆ x
mfraction     fraction                                          2 &over; 3    {1-x} &over; {1+x}
mroot         radical (nth root)                                &root; 2    &root; 2 % n
mscripts      subscript or superscript or aligned pair          a_1    x^2    ∑_{x=1}%n
munderscript  underscript                                       →__"word"
moverscript   overscript                                        x^^⋒
mprescripts   presubscript or presuperscript or aligned pair    F___0    F^^^1    F___0%%%1
mbox          hides all internal structure from renderer        <mbox>x^2</mbox>^2
All of these except mrow are typically rendered in a "2-dimensional" form when rendering into 2-dimensional graphical media.
An operator like + has no special transformation rule to specify its layout, so a "default" rule is used which turns any (mterm ...) schema which remains after other rules have been tried into an (mrow ...) layout schema with the same arguments. For example, a+b will be parsed into (mterm (mi "a") (mo "+") (mi "b")) and then transformed by this rule into the renderable form (mrow (mi "a") (mo "+") (mi "b")).
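As a rough illustration, the default rule can be sketched in Python. This sketch assumes that expression trees are held as nested tuples whose first item is the schema name, e.g. ("mterm", ("mi", "a"), ...); that encoding and the name default_rule are illustrative choices only, since the proposal does not specify any internal representation.

```python
def default_rule(expr):
    """Turn any remaining (mterm ...) into (mrow ...) with the same arguments.

    Assumption: trees are nested tuples whose first item is the schema
    name; string payloads such as "a" or "+" are returned unchanged.
    """
    if not isinstance(expr, tuple):
        return expr  # a string payload inside a leaf like ("mi", "a")
    head, *args = expr
    new_args = tuple(default_rule(arg) for arg in args)
    if head == "mterm":
        head = "mrow"  # same arguments, renderable schema
    return (head, *new_args)


# The a+b example from the text.
parsed = ("mterm", ("mi", "a"), ("mo", "+"), ("mi", "b"))
rendered = default_rule(parsed)
```

In a full implementation this step would run only after every more specific transformation rule had been tried.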
Renderers typically use spacing rules within an mrow which are sensitive to whether the constituents are terms or operators (including embellished operators), to the type of operator (e.g. prefix or infix), and sometimes to the relative precedence of nested operators.
An mrow can be specified directly as an SGML element by source text which looks like
<mrow> arg1 <mc> arg2 <mc> ... <mc> argn </mrow>
where "argi" means the source text for the ith argument, and <mc> ("c" stands for "comma") is a special HTML-Math empty element used only to separate multiple arguments of the SGML forms of schemas. (It can be used in any schema which allows more than one argument.)
This form can be used to specify any mrow with one or more arguments.
A missing argument (e.g. in <mrow></mrow> or between two <mc>s) is replaced by a nested empty mrow (i.e. one with no arguments), which is neither an operator nor a term, and (typically) renders invisibly with zero width. There is no way to specify an empty mrow "by itself" in source text. (Empty mrows are used internally to represent certain other missing constructs, e.g. in the mscripts layout schema. For this use, it is important that an empty mrow is not equivalent to the missing terms or operators sometimes inserted by the parser, and that it can't be represented except as an argument to the SGML form of a schema.)
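The handling of missing arguments can be sketched as follows. The sketch assumes that the children of an SGML-form schema have already been parsed into a flat Python list in which the string "<mc>" marks each separator; the tuple encoding, the name split_arguments, and the choice to wrap several children of one slot in an mterm are all assumptions made for illustration.

```python
EMPTY_MROW = ("mrow",)  # an mrow with no arguments


def split_arguments(items):
    """Split already-parsed children into arguments at each "<mc>" marker.

    A slot with nothing in it becomes a nested empty mrow, so the
    result always has (number of <mc> markers + 1) entries.
    """
    groups, current = [], []
    for item in items:
        if item == "<mc>":
            groups.append(current)
            current = []
        else:
            current.append(item)
    groups.append(current)

    args = []
    for group in groups:
        if not group:
            args.append(EMPTY_MROW)          # missing argument
        elif len(group) == 1:
            args.append(group[0])
        else:
            # Assumption: several children in one slot form one term.
            args.append(("mterm", *group))
    return args


# <mrow></mrow> has one (missing) argument.
empty_source = ("mrow", *split_arguments([]))
# <mscripts> x <mc> <mc> b </mscripts> has a missing middle argument.
superscript = ("mscripts", *split_arguments([("mi", "x"), "<mc>", "<mc>", ("mi", "b")]))
```

Under these assumptions the source text <mrow></mrow> yields an mrow containing one empty mrow, which matches the statement that an empty mrow cannot be written "by itself".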
numerator &over; denominator
making use of the extended character &over; which is an infix operator with precedence near that of division (the / infix operator).
It can also be specified in the SGML form
<mfraction> numerator <mc> denominator </mfraction>
It's an error if there are other than two arguments given in this form (i.e. other than one <mc>).
[In SGML form, certain rendering attributes which will be described later can be added to the begin tag, e.g. to modify the appearance of the horizontal bar.]
This layout schema carries the semantic connotation of a fraction, i.e. something which is semantically equivalent to division. This is important, because it means renderers are allowed to render it as if it were an mrow containing the / operator, e.g. if this is necessary because the display width is too small (or whenever their user prefers it that way).
&root; x
or
&root; x % n
In the first case there is typically no "n" shown (in the place of the "nth root").
The semantic connotations include: n is equivalent to 2 if it's missing, and this expression can be rendered instead as a 1/nth power if necessary or desired.
The SGML form can be either of
<mroot> x </mroot>
<mroot> x <mc> n </mroot>
It's an error if there are other than one or two arguments given in this form (i.e. other than zero or one <mc>).
[In SGML form, certain rendering attributes which will be described later can be added to the begin tag, e.g. to modify the appearance of the horizontal bar above the expression whose root is being extracted.]
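The semantic connotations of mfraction and mroot permit alternative renderings, which might be sketched like this. The nested-tuple encoding and the name linear_fallback are assumptions for illustration; a real renderer would also have to add brackets wherever operator precedence requires them, which this sketch ignores.

```python
EMPTY_MROW = ("mrow",)  # stands for a missing script position


def linear_fallback(expr):
    """Return an alternative form permitted by the semantic connotations.

    mfraction becomes an mrow around the / operator; mroot becomes a
    1/n power, with n taken to be 2 when it is missing.
    """
    head, *args = expr
    if head == "mfraction":
        numerator, denominator = args
        return ("mrow", numerator, ("mo", "/"), denominator)
    if head == "mroot":
        radicand = args[0]
        n = args[1] if len(args) == 2 else ("mn", "2")
        exponent = ("mrow", ("mn", "1"), ("mo", "/"), n)
        # Arguments of mscripts: base, subscript, superscript.
        return ("mscripts", radicand, EMPTY_MROW, exponent)
    return expr


half = linear_fallback(("mfraction", ("mn", "1"), ("mn", "2")))
square_root = linear_fallback(("mroot", ("mi", "x")))
```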
The following linear syntax forms generate the same thing as the corresponding SGML forms (each of which corresponds to an expression tree (in display list format) of the form (mscripts arg1 arg2 arg3) for some arguments):
Form     SGML form                                 Explanation
x_a      <mscripts> x <mc> a <mc> </mscripts>      subscript
x^b      <mscripts> x <mc> <mc> b </mscripts>      superscript
x_a%b    <mscripts> x <mc> a <mc> b </mscripts>    aligned pair
x^b%a    <mscripts> x <mc> a <mc> b </mscripts>    aligned pair
It's an error if the mscripts element has other than three arguments (separated by <mc>s).
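The argument-count rules stated for the SGML forms of mfraction, mroot, and mscripts could be checked with a small table-driven function. This is only a sketch: the names ALLOWED_ARG_COUNTS and check_arg_count are hypothetical, and the proposal does not say how such errors should be reported.

```python
# Allowed numbers of <mc>-separated arguments, as stated in the text.
ALLOWED_ARG_COUNTS = {
    "mfraction": {2},   # numerator, denominator
    "mroot": {1, 2},    # x, optionally n
    "mscripts": {3},    # base, subscript, superscript
}


def check_arg_count(name, args):
    """Raise ValueError if a schema has a disallowed number of arguments.

    Schemas not listed in the table are not checked by this sketch.
    """
    allowed = ALLOWED_ARG_COUNTS.get(name)
    if allowed is not None and len(args) not in allowed:
        expected = " or ".join(str(count) for count in sorted(allowed))
        raise ValueError(
            f"<{name}> takes {expected} argument(s), got {len(args)}"
        )


check_arg_count("mscripts", ["x", "a", "b"])  # passes silently
```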
The _ and ^ operators are each right-associative. [Full details of their precedences, and those of all other linear syntax operators related to scripts of all kinds, including the % used above, will be described later with the full table of operators and precedences.]
How the _ ^ and % operators interact to add multiple "scripts" to one "base" is described in a separate section below.
"Left-nested" mscript layout schemas (i.e. where the first argument of each one is the next one, from outermost to innermost in a chain) are interpreted as representing vertically aligned pairs of tensor indices (from farthest away to closest to the base expression). (This means that the source text will contain the indices from left to right.)
It is acceptable for some indices to be missing, and these will be rendered invisibly.
(This special interpretation of left-nested mscripts schemas also extends through left-nested mprescripts schemas in the same nested chain, and through any schemas which have no effect on rendering (such as font changing schemas), but not through any other schemas (in particular, not through moverscript, munderscript, or mbox schemas). The order of adding a new layer of scripts and a new layer of prescripts doesn't matter.)
A typical rendering algorithm for this case (ignoring the possibility of unusually tall subscripts or superscripts) would determine the horizontal positions of the script arguments of an mscripts layout schema normally (i.e. based on the horizontal position and width of the entire base argument), but to determine their vertical positions would "burrow down" into the base (through any left-nested mscripts, mprescripts, and schemas with no effect on rendering) and depend only on the vertical position and height of whatever it found inside. A more careful algorithm might make use of a general grid or table layout facility to position all the scripts at once. Effectively, all the layout schemas in one of the left-nested chains being considered here form a single renderable object. (The reason they are not represented as a single level in one layout schema object is to make the transformation rules which form them from the linear syntax operators much simpler to express than would be possible that way, especially when scripts are mixed with prescripts.)
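The "burrow down" step might look like the following sketch. It assumes nested tuples of the form (name, base, ...) and uses "mfont" as a hypothetical stand-in for a schema with no effect on rendering, since no such schema is named here. For brevity it collects only mscripts columns; prescripts are burrowed through but not collected.

```python
# Schemas the burrowing step passes through. "mfont" is a hypothetical
# name standing in for any schema with no effect on rendering.
CHAIN_SCHEMAS = {"mscripts", "mprescripts", "mfont"}


def burrow(expr):
    """Follow first arguments down a left-nested chain.

    Returns (innermost_base, columns), where columns lists the
    (subscript, superscript) pairs of each mscripts layer, ordered
    from the one closest to the base outward.
    """
    columns = []
    while isinstance(expr, tuple) and len(expr) > 1 and expr[0] in CHAIN_SCHEMAS:
        if expr[0] == "mscripts":
            _, _base, sub, sup = expr
            columns.append((sub, sup))
        expr = expr[1]  # the base is always the first argument
    columns.reverse()   # the outermost layer was collected first
    return expr, columns


E = ("mrow",)  # empty script position
x, two = ("mi", "x"), ("mn", "2")
plain = ("mscripts", ("mscripts", x, E, two), E, two)
boxed = ("mscripts", ("mbox", ("mscripts", x, E, two)), E, two)
```

With these definitions burrow(plain) reaches x and reports two index columns, while burrow(boxed) stops at the mbox and reports one, which is the separation that mbox is meant to provide.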
Although the present proposal defines no "official" appearance or format for those rules (nor even their properties in general), there is some use for such a format in order to document the built-in rules (and, perhaps, to allow them in practice to be read from a file rather than hardwired into the rendering code, if desired). (Furthermore, it is expected that a future amendment to this proposal will provide a way for authors to add such rules themselves, for which a format will be needed.)
To these ends, here is an example of some built-in rules and a description of their format and operation.
The actions of the infix scripting operators can be described (and are in fact implemented) using the following transformation rules:
$base _ $sub -> <mscripts> $base <mc> $sub <mc> </mscripts>
$base ^ $super -> <mscripts> $base <mc> <mc> $super </mscripts>
$base %_ $sub -> $base _ $sub
$base %^ $super -> $base ^ $super
<mscripts> $base <mc> <mc> $super </mscripts> % $sub -> <mscripts> $base <mc> $sub <mc> $super </mscripts>
<mscripts> $base <mc> $sub <mc> </mscripts> % $super -> <mscripts> $base <mc> $sub <mc> $super </mscripts>
The format of these rules in general makes use of the following two constructs:
Construct             Purpose
template -> result    infix operator "->" for representing one rule
$name                 formal parameter or "pattern variable"
Such a rule is used by finding a subexpression which matches its template (or pattern) (which generates a necessary set of bindings of the pattern variables to subexpressions of the matched expression), and replacing that subexpression with the "result" after substituting the same pattern variable bindings in the result.
A list of such rules is used by using the first rule which matches, and repeatedly transforming an expression until no rules match. (But in the above example, the order of the rules doesn't matter.)
A list of rules should actually be repeatedly applied to all the subexpressions of an expression tree, deepest first; and whenever a rule is used, applied recursively and immediately to the result generated (after substitution of bindings for pattern variables). This matters in the present example (but explaining why it matters here is left as an exercise for the reader).
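The matching and application procedure just described can be sketched as a small rewriting engine. Everything about the encoding is an assumption made for illustration: trees are nested tuples, an infix use of an operator is written ("infix", op, left, right), an empty script position is the empty mrow ("mrow",), and pattern variables are strings beginning with "$". None of this is a proposed rule file format.

```python
EMPTY = ("mrow",)  # empty script position


def match(template, expr, bindings):
    """Return extended bindings if template matches expr, else None."""
    if isinstance(template, str) and template.startswith("$"):
        if template in bindings:  # a repeated variable must agree
            return bindings if bindings[template] == expr else None
        return {**bindings, template: expr}
    if isinstance(template, tuple) and isinstance(expr, tuple):
        if len(template) != len(expr):
            return None
        for sub_template, sub_expr in zip(template, expr):
            bindings = match(sub_template, sub_expr, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if template == expr else None


def substitute(result, bindings):
    """Replace each pattern variable in result by its binding."""
    if isinstance(result, str) and result.startswith("$"):
        return bindings[result]
    if isinstance(result, tuple):
        return tuple(substitute(part, bindings) for part in result)
    return result


def rewrite(expr, rules):
    """Rewrite subexpressions deepest first, then try the rules here.

    The first matching rule is used, and its result is immediately
    rewritten again, until no rule matches.
    """
    if isinstance(expr, tuple):
        expr = tuple(rewrite(child, rules) for child in expr)
    for template, result in rules:
        bindings = match(template, expr, {})
        if bindings is not None:
            return rewrite(substitute(result, bindings), rules)
    return expr


# The six scripting rules, in the assumed tuple encoding.
RULES = [
    (("infix", "_", "$base", "$sub"), ("mscripts", "$base", "$sub", EMPTY)),
    (("infix", "^", "$base", "$super"), ("mscripts", "$base", EMPTY, "$super")),
    (("infix", "%_", "$base", "$sub"), ("infix", "_", "$base", "$sub")),
    (("infix", "%^", "$base", "$super"), ("infix", "^", "$base", "$super")),
    (("infix", "%", ("mscripts", "$base", EMPTY, "$super"), "$sub"),
     ("mscripts", "$base", "$sub", "$super")),
    (("infix", "%", ("mscripts", "$base", "$sub", EMPTY), "$super"),
     ("mscripts", "$base", "$sub", "$super")),
]

x, a, b = ("mi", "x"), ("mi", "a"), ("mi", "b")
# x_a%b, grouped as (x _ a) % b
aligned_pair = rewrite(("infix", "%", ("infix", "_", x, a), b), RULES)
```

In this encoding the % rules have an mscripts element in their templates, so they can only match after the inner _ has already been rewritten; that is one place where the deepest-first order is visible.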
How the rules of the example behave specifically (i.e. what they are for) is described in the next section. [End of digression.]
Here is how these rules actually work: The first set of rules just say that the infix operators _ and ^ each make a new mscripts element from their arguments, in each case leaving the unused script position empty:
$base _ $sub -> <mscripts> $base <mc> $sub <mc> </mscripts>
$base ^ $super -> <mscripts> $base <mc> <mc> $super </mscripts>
The second set of rules simply say that the %_ and %^ operators do precisely the same thing:
$base %_ $sub -> $base _ $sub
$base %^ $super -> $base ^ $super
(The differences between these operators and _ and ^ are entirely in their precedences and associativities.)
The final set of rules say what the % operator does: it fills in an empty script position remaining in the outermost mscripts element:
<mscripts> $base <mc> <mc> $super </mscripts> % $sub -> <mscripts> $base <mc> $sub <mc> $super </mscripts>
<mscripts> $base <mc> $sub <mc> </mscripts> % $super -> <mscripts> $base <mc> $sub <mc> $super </mscripts>
By the use of these operators, a piece of source text can add new subscripts and superscripts to a given base from innermost to outermost, alternating subscripts with superscripts as it pleases, but once adding a script to a farther-right index position than before, can never "go back" to an empty position farther to the left.
A typical pattern for entering a tensor would be: for each pair of vertically aligned index positions, enter them in one of the forms
$base %_ $sub
$base %^ $super
$base %_ $sub % $super
depending on which indices are present. Since the %_ and %^ operators always skip to the next index position whereas % never does (which is all evident from the above rules), this pattern will always work (unless both of a pair of vertically aligned index positions are empty, which is presumably a very rare case!). (Authors who prefer entering superscripts first can use the $base %^ $super % $sub form when both scripts are present.)
For example, the tensor which should render something like
 ab
x
  cd
(with four indices in three aligned columns) could be entered as either
x %^ a %^ b % c %_ d
or
x %^ a %_ c % b %_ d
(using only left-associative operators).
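That both entries build the same structure can be checked with a short sketch. It assumes that the whole line groups left to right (as the remark about left-associative operators suggests), represents leaves as plain strings, and applies the scripting rules directly as a function; the names apply_op and enter are illustrative only.

```python
EMPTY = ("mrow",)  # empty script position


def apply_op(base, op, arg):
    """Apply one scripting operator, following the transformation rules."""
    if op in ("_", "%_"):
        return ("mscripts", base, arg, EMPTY)
    if op in ("^", "%^"):
        return ("mscripts", base, EMPTY, arg)
    if op == "%" and isinstance(base, tuple) and base[0] == "mscripts":
        _, inner, sub, sup = base
        if sub == EMPTY:   # first % rule: fill the empty subscript
            return ("mscripts", inner, arg, sup)
        if sup == EMPTY:   # second % rule: fill the empty superscript
            return ("mscripts", inner, sub, arg)
    # Assumption: this sketch reports an error where no rule applies.
    raise ValueError(f"no rule applies for operator {op!r}")


def enter(source):
    """Fold 'base op arg op arg ...' from left to right."""
    tokens = source.split()
    expr = tokens[0]
    for op, arg in zip(tokens[1::2], tokens[2::2]):
        expr = apply_op(expr, op, arg)
    return expr


first = enter("x %^ a %^ b % c %_ d")
second = enter("x %^ a %_ c % b %_ d")
```

Both calls produce three left-nested mscripts layers: the innermost carries only the superscript a, the middle one the aligned pair c and b, and the outermost only the subscript d.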
Other than that, mbox has no effect on rendering (it's an invisible wrapper around its argument).
Its main use is to separate left-nested mscripts or mprescripts layout schemas from being interpreted as specifying scripts in successive tensor index positions on the same base; i.e. it forces (e.g.) <mbox>x^2</mbox>^2 to look more like
   2
  2
 x
than
  2 2
 x
However, since a renderer is allowed in principle to use arbitrary rules to allow subexpression structure to affect rendering of a whole expression, use of mbox may have other effects as well.