Character encodings

Chris von See (cvonsee@onramp.net)
Mon, 07 Dec 1998 14:49:57 -0600


Hi,

I am relatively new to XML and am trying to develop a program that can
generate XML in various encodings. In section 4.3.3, the XML spec implies
that support of ISO 10646 UCS-2 encoding (i.e. Unicode) is valid, but in
the section on autodetection of encodings (Appendix F) there's no mention
of how to detect UCS-2 encoding. I would *assume* that UCS-2 would start
with \x00 \x3c \x00 \x3f ("<?") - is that right? If so, then is the spec
wrong in not including this in Appendix F as valid? Is it reasonable to
expect that many people will use UCS-2 because of its similarity to Unicode?
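
To make the guess above concrete, here is a minimal Python sketch of the bytes I would expect at the start of a UCS-2 document. It rests on two assumptions: that for characters like "<" and "?" Python's UTF-16 codecs produce the same bytes as UCS-2, and that a byte order mark may or may not be present. The helper name ucs2_prefix is hypothetical and is not taken from the spec.

```python
import codecs


def ucs2_prefix(text="<?", big_endian=True, with_bom=False):
    """Return the bytes that `text` would occupy at the start of a UCS-2 stream.

    Assumption: every character in `text` is in the Basic Multilingual Plane,
    where UTF-16 and UCS-2 code units are identical.
    """
    codec = "utf-16-be" if big_endian else "utf-16-le"
    data = text.encode(codec)
    if with_bom:
        # Optionally prepend the byte order mark for the chosen byte order.
        bom = codecs.BOM_UTF16_BE if big_endian else codecs.BOM_UTF16_LE
        data = bom + data
    return data


if __name__ == "__main__":
    print(ucs2_prefix().hex())                   # big-endian, no BOM
    print(ucs2_prefix(big_endian=False).hex())   # little-endian, no BOM
    print(ucs2_prefix(with_bom=True).hex())      # big-endian, with BOM
```

Under those assumptions the big-endian form without a byte order mark is 00 3c 00 3f, the little-endian form is 3c 00 3f 00, and a big-endian byte order mark would put fe ff in front.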

Thanks,
Chris von See
TechAdapt, Inc.