Intended audience: XHTML/HTML coders (using editors or scripting), script developers (PHP, JSP, etc.), Web project managers, and anyone who needs a gentle introduction to quirks mode vs. standards mode and to the part the DOCTYPE and XML declarations play in choosing between them. It is also useful background reading for anyone who wants to know how to declare the character encoding of their documents.
This article briefly describes some often surprising aspects of how servers send XHTML to the user agent (e.g. a browser), and how common user agents handle the markup they receive. It describes implementation-specific issues rather than W3C standards.
This material is taken from a tutorial about how to declare the character encoding of an HTML or XHTML document. These topics have an important bearing on that decision. This information is also helpful in explaining why some aspects of CSS styling do not appear as expected, or vary from user agent to user agent.
When a server sends a document to a user agent (e.g. a browser) it also sends information about the document's data format in the Content-Type field of the accompanying HTTP header. This information is expressed using a MIME type label. Here is an example of an HTTP header for an HTML file using the MIME type 'text/html'. Note that the Content-Type entry can also express the character encoding of the document.
HTTP/1.1 200 OK
Date: Wed, 05 Nov 2003 10:46:04 GMT
Server: Apache/1.3.28 (Unix) PHP/4.2.3
Content-Location: CSS2-REC.en.html
Vary: negotiate,accept-language,accept-charset
TCN: choice
P3P: policyref=http://www.w3.org/2001/05/P3P/p3p.xml
Cache-Control: max-age=21600
Expires: Wed, 05 Nov 2003 16:46:04 GMT
Last-Modified: Tue, 12 May 1998 22:18:49 GMT
ETag: "3558cac9;36f99e2b"
Accept-Ranges: bytes
Content-Length: 10734
Connection: close
Content-Type: text/html; charset=utf-8
Content-Language: en
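As a rough illustration of how the Content-Type entry above carries both pieces of information, the following Python sketch splits a header value into its MIME type and its charset parameter. The function name parse_content_type is hypothetical and chosen only for this example; the parsing itself relies on the standard library's email.message module.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type header value into (MIME type, charset or None).

    parse_content_type is a hypothetical helper name used for illustration.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the lowercased type/subtype without parameters.
    mime_type = msg.get_content_type()
    # get_param() returns None when the charset parameter is absent.
    charset = msg.get_param("charset")
    return mime_type, charset


# The value taken from the example header above.
print(parse_content_type("text/html; charset=utf-8"))
```

For the header shown above this yields the MIME type text/html and the charset utf-8; a value with no charset parameter yields None for the second item.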
A server normally sends HTML 4.01 files with a MIME type of text/html. HTML is an SGML application.
Things are not so straightforward when dealing with XHTML 1.0, which is XML-based.
Many people prefer to use XHTML because of the advantages XML brings for editing or processing of documents. However, there is still a lack of support for XML files in mainstream browsers, so many XHTML 1.0 files are actually served using the text/html MIME type. In this case, the user agent will treat the file as HTML.
To ensure that the slight differences between XML and HTML do not trip up older user agents, you should always follow the compatibility guidelines in Appendix C of the XHTML specification when serving XHTML as HTML. Among other things, these guidelines recommend that you leave a space before the '/>' at the end of an empty tag (such as img, hr or br), and that you always use both id and name attributes for fragment identifiers.
XHTML 1.0 can also be served as XML, and XHTML 1.1 is always served as XML. To serve XHTML as XML you use one of the MIME types application/xhtml+xml, application/xml or text/xml. The W3C recommends that you serve XHTML as XML using only the first of these MIME types, i.e. application/xhtml+xml.
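One way a script might decide between the two ways of serving XHTML 1.0 is to look at the Accept header sent by the user agent. The sketch below is an assumption for illustration only, not something this article prescribes: the function name choose_mime_type is hypothetical, and the parsing is deliberately simplified (it ignores wildcards and does not rank q values). A document served as text/html this way should still follow the Appendix C guidelines.

```python
def choose_mime_type(accept_header: str) -> str:
    """Pick a MIME type for an XHTML 1.0 document (simplified sketch).

    Returns application/xhtml+xml only if the user agent lists it
    explicitly with a non-zero q value; otherwise falls back to text/html.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # HTTP default quality value when none is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        if media_type == "application/xhtml+xml" and q > 0:
            return "application/xhtml+xml"
    return "text/html"


# A user agent that advertises XHTML support explicitly.
print(choose_mime_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
# A user agent that does not mention it.
print(choose_mime_type("text/html, */*"))
```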
The fact that XHTML may be served as HTML or XML makes a difference to the way encoding information needs to be declared.
Current mainstream browsers may display an HTML file in either standards mode or quirks mode. This means that different rules are applied to the display of the file, one conforming to the W3C standards interpretation of expected behavior, the other to expectations based on the non-standard behavior of older browsers.
The screen captures below illustrate some of these differences.
A document rendered in standards mode. | The same document rendered in quirks mode. |
---|---|
[screen capture] | [screen capture] |
Differences illustrated above include the following:
In standards mode the width setting in CSS does not incorporate any padding and border settings, whereas in quirks mode it does, which is why the large box is narrower in the second picture.
CSS is used to set the font size quite large for the body tag (and all other elements through inheritance), and reduced by 50% within any p element. In quirks mode the table has not inherited the font size setting from the body element, so the text looks smaller. (Note that the text in the large box is the same size, since this is not in a table, but is in a p element.)
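The first difference can be checked with a little arithmetic. The sketch below uses the values from the div rule in the test file's style sheet (width 170px, padding 50px, border 6px) and follows the box model behaviour as this article describes it; the function names are hypothetical, and margins are left out because they lie outside the border in both modes.

```python
def rendered_box_width(width: int, padding: int, border: int, mode: str) -> int:
    """Width in pixels from outer border edge to outer border edge."""
    if mode == "standards":
        # width covers the content only; padding and border are added on both sides.
        return width + 2 * padding + 2 * border
    if mode == "quirks":
        # width already includes padding and border.
        return width
    raise ValueError("mode must be 'standards' or 'quirks'")


def content_width(width: int, padding: int, border: int, mode: str) -> int:
    """Width in pixels left over for the content itself."""
    if mode == "standards":
        return width
    if mode == "quirks":
        # Never negative, even if padding and border exceed the declared width.
        return max(0, width - 2 * padding - 2 * border)
    raise ValueError("mode must be 'standards' or 'quirks'")


for mode in ("standards", "quirks"):
    print(mode,
          rendered_box_width(170, 50, 6, mode),
          content_width(170, 50, 6, mode))
```

With these values the box is 282px wide in standards mode (170px of content), but only 170px wide in quirks mode (58px of content).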
The two pictures show the same page with exactly the same markup and CSS styling. The only difference between the source of the two files is that the one on the left has a DOCTYPE declaration at the top, and the other doesn't. A file with an appropriate DOCTYPE declaration should normally be rendered in standards mode by recent versions of most browsers. No DOCTYPE, and you get quirks.
Browsers that switch in this way between standards and quirks modes are often said to do 'doctype switching'.
The following shows the source text with the DOCTYPE declaration at the top (the first two lines).
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>xhtml document</title>
<style type="text/css">
body { background: white; color: black; font-family: arial, sans-serif; font-size: 25px; }
p { font-size: 50%; }
h1 { font-size: 16px; }
div { margin: 20px; width: 170px; padding: 50px; border: 6px solid teal; }
</style>
</head>
<body>
<h1>Test file for Standards Mode</h1>
<div>
<p> Here is some text in a p in a div. </p>
</div>
<table border="1">
<tr><td><p>Text in p tag.</p></td>
<td><p>Text in p tag.</p></td>
</tr>
<tr><td>No p tag.</td>
<td>No p tag.</td>
</tr>
</table>
</body>
</html>
It is generally a good idea to always serve your pages in standards mode, i.e. to always include a DOCTYPE declaration.
Because XHTML 1.0 is based on XML, it is common to add an XML declaration at the beginning of the markup, even if it is served as HTML. This would make the top of the above file look like this (the XML declaration is the first line):
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
...
In browsers such as Mozilla, Netscape and Opera, a page served with a DOCTYPE declaration will be rendered in standards mode, with or without the XML declaration.
With Internet Explorer, however, if anything appears before the DOCTYPE declaration the page is rendered in quirks mode. Because Internet Explorer users account for a very large proportion of browser users, this is a significant issue. If you want to ensure that your pages are rendered in the same way on all standards-compliant browsers, you need to think carefully about how you deal with this.
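The switching behaviour described above can be summarised in a few lines of Python. This is a deliberately simplified sketch: the function name predicted_mode and the ie_like flag are hypothetical, leading whitespace is assumed to be harmless, and the contents of the DOCTYPE are ignored, whereas real browsers also inspect the identifiers inside it before deciding on a mode.

```python
import re

# An XML declaration, optionally preceded by whitespace.
XML_DECL_RE = re.compile(r"\s*<\?xml\s.*?\?>", re.DOTALL)
# A DOCTYPE declaration, optionally preceded by whitespace.
DOCTYPE_RE = re.compile(r"\s*<!DOCTYPE\b", re.IGNORECASE)


def predicted_mode(source: str, ie_like: bool) -> str:
    """Guess the rendering mode for markup served as text/html.

    ie_like=True models a browser that falls back to quirks mode when
    an XML declaration precedes the DOCTYPE, as described for Internet Explorer.
    """
    xml_decl = XML_DECL_RE.match(source)
    rest = source[xml_decl.end():] if xml_decl else source
    if not DOCTYPE_RE.match(rest):
        return "quirks"      # no DOCTYPE at the top: quirks everywhere
    if ie_like and xml_decl:
        return "quirks"      # something precedes the DOCTYPE
    return "standards"


doctype = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
           '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n<html>')
with_decl = '<?xml version="1.0" encoding="utf-8"?>\n' + doctype

for label, text in (("DOCTYPE only", doctype),
                    ("XML declaration + DOCTYPE", with_decl),
                    ("no DOCTYPE", "<html>")):
    print(label, predicted_mode(text, ie_like=True), predicted_mode(text, ie_like=False))
```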
Here are the options. If your document contains no constructs that are affected by the difference between standards and quirks mode, this is a non-issue. Otherwise, you will have to either add workarounds to your CSS to overcome the differences, or omit the XML declaration.
The XHTML specification also warns that processing instructions are rendered on some user agents. Also, some user agents interpret the XML declaration to mean that the document is unrecognized XML rather than HTML, and therefore may not render the document as expected. You should do testing on various user agents to decide whether this will be an issue for you.
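Note that if you decide to omit the XML declaration you should choose either UTF-8 or UTF-16 as the encoding for the page, because those are the encodings an XML processor assumes when a document has no encoding declaration and no encoding information is supplied by other means, such as the HTTP header. (See Character sets & encodings in XHTML, HTML and CSS for more information about the impact on encoding declarations.)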
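If you do omit the XML declaration with a script, it may be worth guarding against the encoding problem just mentioned. The following sketch is one possible approach under stated assumptions: the function name omit_xml_declaration is hypothetical, the input is assumed to be already decoded text, and only the encoding label inside the declaration is checked, not the actual bytes of the file.

```python
import re

# An XML declaration at the very top, plus any whitespace that follows it.
XML_DECL_RE = re.compile(r"^\s*<\?xml\s[^>]*\?>\s*")
# The encoding pseudo-attribute inside the declaration.
ENCODING_RE = re.compile(r"encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']")


def omit_xml_declaration(source: str) -> str:
    """Remove a leading XML declaration, but only when that is safe.

    Raises ValueError if the declaration names an encoding other than
    UTF-8 or UTF-16, since that information would be lost.
    """
    decl = XML_DECL_RE.match(source)
    if decl is None:
        return source  # nothing to remove
    enc = ENCODING_RE.search(decl.group(0))
    # With no encoding pseudo-attribute, XML already implies UTF-8 or UTF-16.
    encoding = enc.group(1).lower() if enc else "utf-8"
    if encoding not in ("utf-8", "utf-16"):
        raise ValueError(
            "declared encoding %r would be lost; convert the file to "
            "UTF-8 or UTF-16 first" % encoding)
    return source[decl.end():]


print(omit_xml_declaration(
    '<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE html PUBLIC "...">'))
```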
XHTML 1.0 can be served as HTML or XML. If you serve it as XML, use the MIME type application/xhtml+xml.
It is generally a good idea to use a DOCTYPE declaration at the top of an HTML or XHTML file so that the document is rendered in standards mode by more recent user agents.
The presence of an XML declaration in an XHTML 1.0 file served as HTML will cause your file to be rendered in quirks mode on Internet Explorer (and therefore for a potentially large proportion of your audience).
For more detail on these topics, follow the Related Links, and check out the pages that they point to.
Content first published 2004-03-18. Last substantive update 2005-07-04 11:36 GMT. This version 2006-10-06 09:06 GMT