Re: PCDATA vs CDATA

Tim Bray (tbray@textuality.com)
Tue, 30 Jun 1998 22:55:59 -0700


At 02:32 PM 6/30/98 -0400, Tom Otvos wrote:
>Why can an element's mixed content only be declared as PCDATA, not CDATA?

SGML compatibility, sigh. CDATA content is broken as designed in
SGML; for example, you can't declare element <X> to be of type CDATA
and then do:

<X>xxx<Z>..</Z>xx</X>

The rules say that a CDATA element is terminated not, as you might
expect, by its own end tag, but by the first occurrence of ETAGO,
a.k.a. "</".
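
To make the failure concrete, here is a minimal sketch in Python of
the termination rule as stated above. The function name cdata_content
is made up for illustration, and the sketch models only the simplified
"stop at the first </" rule; a real SGML parser has more
delimiter-recognition rules than this.

```python
def cdata_content(text: str, start: int) -> tuple[str, int]:
    """Return (content, index of the terminating "</").

    Hypothetical helper: models only the simplified rule that CDATA
    content ends at the first "</", whatever tag name follows it.
    """
    end = text.find("</", start)
    if end == -1:
        raise ValueError("no ETAGO found")
    return text[start:end], end


doc = "<X>xxx<Z>..</Z>xx</X>"
content, end = cdata_content(doc, len("<X>"))
print(content)    # xxx<Z>..
print(doc[end:])  # </Z>xx</X>
```

The content stops at the "</" of </Z>, so the trailing "xx" never
becomes part of <X>'s content.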

Anyhow, the more I think about it, the more any scheme that depends
on a magic end-delimiter, e.g. "</anything>" or "]]>", looks
amateurish and broken. We've known for years how to do this:
either escape the delimiters or put in a byte count.
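
Both alternatives can be sketched in a few lines of Python. The
function names and the "length:payload" wire format below are
assumptions made up for illustration, not anything defined by SGML or
XML.

```python
def frame(payload: bytes) -> bytes:
    """Byte count: prefix the payload with its length and a colon."""
    return str(len(payload)).encode("ascii") + b":" + payload


def unframe(buf: bytes) -> tuple[bytes, bytes]:
    """Return (payload, rest) from a buffer that starts with a frame."""
    header, sep, rest = buf.partition(b":")
    if not sep:
        raise ValueError("missing length header")
    n = int(header.decode("ascii"))
    if len(rest) < n:
        raise ValueError("truncated payload")
    return rest[:n], rest[n:]


def escape(text: str) -> str:
    """Escaping: replace & first so the & in &lt; is not escaped again."""
    return text.replace("&", "&amp;").replace("<", "&lt;")


def unescape(text: str) -> str:
    """Undo escape(): restore < first, then &."""
    return text.replace("&lt;", "<").replace("&amp;", "&")


payload = b"xx</X>]]>xx"
assert unframe(frame(payload) + b"tail") == (payload, b"tail")

text = "xxx<Z>..</Z>xx"
assert "<" not in escape(text)
assert unescape(escape(text)) == text
```

In neither case does the reader have to scan the payload for a magic
terminator, so the payload can contain "</" or "]]>" freely.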

The idea of a built-in signal for a base64 encoding is starting
to look better and better to me. -Tim
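
A minimal sketch of why base64 is attractive here, using Python's
standard base64 module. The sample bytes are made up; how a document
would signal that content is base64-encoded is left open.

```python
import base64

# Arbitrary bytes containing every delimiter that caused trouble above.
raw = b"xx</X>]]>&xx"

encoded = base64.b64encode(raw).decode("ascii")

# The base64 alphabet is A-Z, a-z, 0-9, "+", "/" and "=", so the
# encoded text cannot contain "<", ">", "&" or "]".
assert not set(encoded) & set("<>&]")

# Decoding gives back the original bytes unchanged.
assert base64.b64decode(encoded) == raw
```

Since "<" never appears in the encoded text, neither "</" nor "]]>"
can occur inside it.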