Canonical Encoding for XML Elements

David Megginson (ak117@freenet.carleton.ca)
Fri, 9 Jan 1998 06:48:55 -0500


Chris Smith writes:

> In particular, do parsers keep CDATA sections distinct from
> character data?

CDATA sections are part of the document's physical representation
rather than of its logical structure, so they would likely be reported
only by a specialised parser designed for authoring tools or
repositories. For other purposes, the distinction does not matter; the
following two are exactly equivalent:

<example><![CDATA[
<sample>text</sample>
]]></example>

<example>
&lt;sample>text&lt;/sample>
</example>

Switching between the two should produce exactly the same rendered
output from a formatting engine, exactly the same entries in a
database, and so on. SAX would report both as

start element: example
characters: "<sample>text</sample>"
end element: example

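(some parsers might break the characters event into several smaller
ones; the line breaks surrounding the text are also part of the
character data and are left out here for brevity).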
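
The equivalence can be checked with a short sketch. This one uses
Python's standard xml.sax module rather than any particular parser
discussed on this list; the TextCollector class and the sax_events
function are hypothetical names made up for the illustration. Adjacent
characters events are joined before comparison, since a parser is free
to split character data into several chunks.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: records SAX events, merging adjacent character chunks."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._buf = []

    def _flush(self):
        # Emit any buffered character data as one "characters" event.
        if self._buf:
            self.events.append(("characters", "".join(self._buf)))
            self._buf = []

    def startElement(self, name, attrs):
        self._flush()
        self.events.append(("start", name))

    def characters(self, content):
        # Parsers may deliver text in several pieces, so buffer it.
        self._buf.append(content)

    def endElement(self, name):
        self._flush()
        self.events.append(("end", name))


def sax_events(document):
    """Parse a document string and return the list of recorded events."""
    handler = TextCollector()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.events


# The two documents from the message, line breaks included.
WITH_CDATA = "<example><![CDATA[\n<sample>text</sample>\n]]></example>"
WITH_ESCAPES = "<example>\n&lt;sample>text&lt;/sample>\n</example>"

cdata_events = sax_events(WITH_CDATA)
escaped_events = sax_events(WITH_ESCAPES)

for event in cdata_events:
    print(event)
print("identical:", cdata_events == escaped_events)
```

With the default handler set, nothing in the recorded events says
whether the text came from a CDATA section or from escaped character
data, so the two lists should compare equal.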

All the best,

David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)