Re: An incompatible CData idea

Toby Speight (tms@ansa.co.uk)
27 Oct 1997 18:20:05 +0000


-----BEGIN PGP SIGNED MESSAGE-----

David> David G. Durand <URL:mailto:david@dynamicdiagrams.com>

> In article <9710271217.ZM22092@iris.dynamicdiagrams.com>, David
> wrote:

David> It is. CDATA is not legal in XML DTDs at all (mostly because
David> it's rather broken -- any occurrence of the "</" delimiters
David> closes the element).

David> Instead, you must use CDATA marked sections:
>> <![ CDATA [ <contents> to </be> quoted ]]>

David> If you need the ]]> delimiter in the CDATA area, you must
David> escape it via the use of entities.

I thought you couldn't use entities within CDATA marked sections
(since '&' and '&#' are no longer recognised as markup).

IIRC, the only way to write ']]>' in a CDATA section is to re-write it
as two sections, with literal text in between:

> <![ CDATA [ <contents> to </be> quoted, a literal]]> ]]><![ CDATA
> [, and more <contents> to </be> quoted.]]>

(although I might choose to write the text between the two sections
using entities, i.e. as ]]&gt;, if it helps my editor's
parenthesis-matching ;-)

If you want to automate this on stuff you're including into XML
source, try this sed incantation:

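sed -e 's/]]>/]]>&<![ CDATA [/g' -e '1s/^/<![ CDATA [/' -e '$s/$/]]>/'
# The & in the first replacement stands for the matched ]]>, so each
# occurrence becomes: close the section, literal ]]>, reopen a section.
# I didn't use i (insert) or a (append), since they introduce
# line-breaks :-(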
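
A variant of the incantation above, offered only as a sketch. It
assumes the output has to satisfy the XML 1.0 Recommendation as later
published, where the section opener is exactly "<![CDATA[" with no
internal whitespace, and where a literal "]]>" may not appear in
ordinary character data either. Rather than leaving "]]>" between two
sections, it splits the sequence itself, so that "]]" ends one section
and ">" begins the next. The function name wrap_cdata is hypothetical
and does not come from the original post.

```shell
# Hypothetical helper: wrap stdin in CDATA sections, splitting every
# "]]>" across two sections so it never appears intact.
wrap_cdata() {
    # 1st expression: "]]>" becomes "]]" + close + reopen + ">"
    # 2nd expression: open a section at the start of the first line
    # 3rd expression: close the last section at the end of the last line
    sed -e 's/]]>/]]]]><![CDATA[>/g' \
        -e '1s/^/<![CDATA[/' \
        -e '$s/$/]]>/'
}

printf '%s\n' 'if (a[b[0]]>c) x = "</be>";' | wrap_cdata
# prints: <![CDATA[if (a[b[0]]]]><![CDATA[>c) x = "</be>";]]>
```

A parser reading that output sees two adjacent sections whose contents
concatenate back to the original line. As with the original one-liner,
empty input produces no output at all, so a caller would have to handle
that case separately.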

(Using entities may or may not be a better move, depending on one's
assessment of the source material. Either way, all included material
needs a stage of processing before insertion. In fact, the more I
think about it, the more dangerous I think CDATA marked sections are:
they let one skip that processing and get away with it *almost every
time*, which makes it more likely that a quick-fix solution will
appear to work despite containing an (obscure) bug.)
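
To make the "almost every time" failure concrete, here is a minimal
sketch of such a quick fix together with a guard that catches the one
input class it mishandles. Both function names (naive_wrap and
safe_for_naive_wrap) are hypothetical, and the strict "<![CDATA["
spelling without whitespace is an assumption rather than the form used
in the examples quoted earlier in this message.

```shell
# Hypothetical quick fix: wrap the text without any escaping.
naive_wrap() {
    printf '<![CDATA[%s]]>\n' "$1"
}

# Guard: succeed only if the text does not contain the terminator.
safe_for_naive_wrap() {
    case $1 in
        *']]>'*) return 1 ;;
        *)       return 0 ;;
    esac
}

for text in 'x < y && y > z' 'if (a[b[0]]>c) done();'; do
    if safe_for_naive_wrap "$text"; then
        naive_wrap "$text"
    else
        printf 'unsafe, contains the section terminator: %s\n' "$text" >&2
    fi
done
```

The first sample passes through untouched, which is exactly why the
quick fix looks correct in casual testing. Only the second sample,
with its nested array subscript followed by a comparison, shows the
problem.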


-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv
Comment: Processed by Mailcrypt 3.4, an Emacs/PGP interface

iQB1AwUBNFTbJedsuUurvcRtAQFu7AL/UScZDEG/GLyff/TTP5H1gKCw3GB+FEZi
XfiSSi6NaktwRu8rgduuo4Lq7otWQDEHgUryGNug6oz19aq9XPMTF0r96MmpgUyL
5PaoW9zzXSRGrJ89+YFnygGdXJRCHjLG
=+UfT
-----END PGP SIGNATURE-----