Attribute value normalization

MURATA Makoto (murata@apsdc.ksp.fujixerox.co.jp)
Wed, 27 May 1998 16:17:27 +0900


While translating the XML specification, I found that I do not understand
XML's attribute value normalization mechanism.

I made an example XML document (shown below) and ran it through the latest
versions of expat, Lark, Aelfred, xp, and MSXML. I used the SAX DemoHandler
to invoke Lark, Aelfred, and xp.

xp says that the type of the attribute "a" is CDATA. MSXML reports a
fatal error. Aelfred says that the attribute value is always "test test".
Lark and expat normalize some of the values but not all. Which one is correct?

<?xml version="1.0"?>
<!DOCTYPE test
[
<!ELEMENT test (#PCDATA|test)*>
<!ATTLIST test
a NMTOKENS #IMPLIED>
<!ENTITY D "&#xD;">
<!ENTITY A "&#xA;">
<!ENTITY DA "&#xD;&#xA;"> ]>
<test>
<test a="

test

test

"/>
<test a="&D;&A;&D;&A;test&D;&A;&D;&A;test&D;&A;&D;&A;"/>
<test a="&DA;&DA;test&DA;&DA;test&DA;&DA;"/>
<test a="&#xD;&#xA;&#xD;&#xA;test&#xD;&#xA;&#xD;&#xA;test&#xD;&#xA;&#xD;&#xA;"/>
<test a="&#xD;&#xD;test&#xD;&#xD;test&#xD;&#xD;"/>
<test a="&#xA;&#xA;test&#xA;&#xA;test&#xA;&#xA;"/>
</test>
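
For comparison with the parser outputs, here is a minimal Python sketch of
attribute value normalization as the rules are worded in later editions of
XML 1.0 (section 3.3.3, with line-end handling from section 2.11). It is an
illustration under stated assumptions, not a statement of what any of the
parsers above implement. The function names and the "entities" dictionary
are hypothetical; the dictionary maps each entity name to its replacement
text, in which character references from the declaration are assumed to be
already expanded (so D maps to a literal CR). Entity names are matched with
a simplified ASCII pattern.

```python
import re

# White space characters that normalization turns into a single #x20.
WHITESPACE = "\x20\x0d\x0a\x09"

# Hex character reference, decimal character reference, or entity reference.
# The entity-name pattern is a simplified ASCII approximation of XML's Name.
_REF = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([A-Za-z_:][\w.\-:]*);")


def normalize_line_ends(text):
    """Literal CR LF pairs and lone CRs become LF before anything else."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def expand(value, entities):
    """Expand references and map literal white space to #x20."""
    out = []
    pos = 0
    while pos < len(value):
        ch = value[pos]
        if ch == "&":
            m = _REF.match(value, pos)
            if m is None:
                raise ValueError("malformed reference at offset %d" % pos)
            if m.group(1) is not None:
                # Character reference: the referenced character is kept as is.
                out.append(chr(int(m.group(1), 16)))
            elif m.group(2) is not None:
                out.append(chr(int(m.group(2))))
            else:
                # Entity reference: process the replacement text recursively,
                # so literal white space inside it becomes #x20.
                out.append(expand(entities[m.group(3)], entities))
            pos = m.end()
        elif ch in WHITESPACE:
            out.append(" ")
            pos += 1
        else:
            out.append(ch)
            pos += 1
    return "".join(out)


def normalize_attribute(raw, entities, att_type="CDATA"):
    """Return the normalized value of one attribute."""
    value = expand(normalize_line_ends(raw), entities)
    if att_type != "CDATA":
        # Non-CDATA types: trim #x20 at both ends and collapse runs of #x20.
        value = re.sub(" +", " ", value.strip(" "))
    return value


# Replacement text of the three entities declared in the example document.
ENTITIES = {"D": "\r", "A": "\n", "DA": "\r\n"}
```

Under this reading, the literal line breaks and the &D;, &A;, and &DA;
references all normalize to "test test" for an NMTOKENS attribute, while
direct character references such as &#xD; and &#xA; leave CR and LF in the
value, because the final trimming step only touches #x20.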

Makoto

Fuji Xerox Information Systems

Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp