Mix encodings in a document?

Deke Smith (deke@tallent.com)
Mon, 21 Sep 98 09:46:08 -0500


I think I know the answer I am going to get, but I'll ask anyway.

Within a single XML document, is it possible to have the text encoding
change from element to element?

For example:

<?xml version="1.0"?>
<PHRASES>
<PHRASE encoding="ISO-8859-1" xml:lang="en">Hello!</PHRASE>
<PHRASE encoding="X-EUC-TW" xml:lang="zh-TW"><!--chinese language text here--></PHRASE>
</PHRASES>

At the least, I can imagine XML browsers and parsers will cough up a
hairball on this. My feeling is that this should NOT be valid, but I don't
know for sure. The way I see the spec allowing for mixed-language text is
to use a single character encoding, such as UTF-16, for the whole document:

<?xml version="1.0" encoding="UTF-16"?>
<PHRASES>
<PHRASE xml:lang="en">Hello!</PHRASE>
<PHRASE xml:lang="zh-TW"><!--chinese language text here--></PHRASE>
</PHRASES>
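
For what it's worth, here is a rough Python sketch of that second approach,
assuming the standard library's xml.etree.ElementTree parser. The Chinese
greeting is only a stand-in for the text elided above, and the names
(document, raw, phrases) are just for illustration. It encodes the whole
document as UTF-16 and reads both phrases back with their xml:lang values:

```python
import xml.etree.ElementTree as ET

# Namespace that the reserved "xml:" prefix is always bound to.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Assumption: the Chinese greeting is an illustrative placeholder for the
# "chinese language text here" comment in the original example.
document = (
    '<?xml version="1.0" encoding="UTF-16"?>\n'
    '<PHRASES>\n'
    '<PHRASE xml:lang="en">Hello!</PHRASE>\n'
    '<PHRASE xml:lang="zh-TW">你好</PHRASE>\n'
    '</PHRASES>\n'
)

# One encoding for the whole document. Python's "utf-16" codec prepends a
# byte order mark, which a UTF-16 encoded XML entity is required to have.
raw = document.encode("utf-16")

# Parse the bytes; the parser picks up the encoding from the byte order
# mark and the XML declaration, not from anything on individual elements.
root = ET.fromstring(raw)

# Map each xml:lang value to the text of its PHRASE element.
phrases = {p.get(XML_NS + "lang"): p.text for p in root.findall("PHRASE")}
print(phrases)
```

Nothing here varies per element: the language is carried by xml:lang, and
the bytes are decoded once for the whole document.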

Deke

-----------------------------------------------------------------
Deke Smith
Tallent Communications Group, Brentwood TN
deke@tallent.com, 615-661-9878
-----------------------------------------------------------------
" The best way to predict the future is to invent it. "
- Alan Kay