Re: HTML != XML (was Re: [ANN] Kludgey workarounds for xt)

David Megginson (david@megginson.com)
Wed, 9 Sep 1998 14:28:05 -0400


Andrew Bunner writes:

> <stating-the-obvious>Java Script engines are not easy things to
> write.</stating-the-obvious> I think it's unlikely that developers
> are going to redefine the Java Script language to interpret &lt; as
> < ... my opinion (hope) is that the standard should accommodate
> this.

The problem is that the HTML 4.0 DTD defines the <SCRIPT> element as
follows:

<!ELEMENT SCRIPT - - CDATA>

This is perfectly legal SGML, and HTML 4.0 is based on SGML. It would
actually be *wrong* to use &lt; and &amp; in a <SCRIPT> element in
HTML, because SGML does not recognize entity references inside CDATA
content -- the browsers, probably by accident, have it right (at
least so far).
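
To make the difference concrete, here is a minimal Python sketch. It
is an illustration only: the helper names (ScriptText,
script_text_as_html, script_text_as_xml, is_well_formed_xml) are
hypothetical, and Python's html.parser is not a full SGML parser, but
it does treat the content of a script element as CDATA in the sense
described above, while xml.etree.ElementTree applies XML rules.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ScriptText(HTMLParser):
    """Hypothetical helper: collects character data as the HTML parser reports it."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def script_text_as_html(source):
    # HTML rules: the content of <script> is CDATA, so entity references
    # such as &lt; are passed through untouched.
    parser = ScriptText()
    parser.feed(source)
    parser.close()
    return "".join(parser.chunks)


def script_text_as_xml(source):
    # XML rules: there is no CDATA declared content, so &lt; is expanded.
    return ET.fromstring(source).text


def is_well_formed_xml(source):
    # Returns False when the XML parser rejects the input.
    try:
        ET.fromstring(source)
    except ET.ParseError:
        return False
    return True


ESCAPED = "<script>if (a &lt; b) go();</script>"
RAW = "<script>if (a < b) go();</script>"

html_escaped = script_text_as_html(ESCAPED)  # script source keeps "&lt;"
xml_escaped = script_text_as_xml(ESCAPED)    # script source gets a real "<"
html_raw = script_text_as_html(RAW)          # a raw "<" is fine under HTML rules
raw_ok_as_xml = is_well_formed_xml(RAW)      # but the same text is rejected as XML
```

Under these assumptions the same bytes mean two different scripts:
read as HTML, the escaped form hands the script engine a literal
"&lt;", while read as XML it yields "<". The unescaped form works as
HTML but is not well-formed XML at all.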

Here's the crux, though: HTML 4.0 is based on a non-XML subset of
SGML. That means that XML cannot represent (and was never intended to
represent) an HTML <= 4.0 document; trying to make it do so is just
wrong. If you need to work with such documents, why bother with XML
when there are perfectly good HTML/SGML tools out there?

XML is *not* an extension of HTML, and there is no safe way to include
XML in an HTML <= 4.0 page (except by reference, using a <LINK>
element or something similar).

All the best,

David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/