RE: How best to represent unrepresentable characters in NAME toke

Andrew Layman (andrewl@microsoft.com)
Fri, 14 Nov 1997 14:05:49 -0800


Thank you all for the suggestions you have made to me (many privately)
regarding this question. Here is the policy I intend to follow and to
recommend:

Sometimes you will want to use a character in a name, but that character is
not an XML NameChar. In that case, encode it using a sequence of the form
"_#xHHHH_", where "HHHH" is the character's Unicode code point written as
four hexadecimal digits. For example, "Two Words" would encode as
"Two_#x0020_Words". Such encoding (and subsequent decoding) is an
application function, not part of the XML specification per se.
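
As an illustration only, the encoding and decoding described above might be
sketched in Python as follows. The function names encode_name and decode_name
are hypothetical, and the NameChar test is a deliberately simplified
ASCII-only approximation (the real XML production also admits many non-ASCII
letters, digits, combining characters and extenders). The sketch follows the
"_#xHHHH_" form literally; note that "#" is not itself an XML NameChar, so an
application would need to take that into account. It also assumes every
character fits in four hexadecimal digits.

```python
import re

# ASSUMPTION: simplified, ASCII-only stand-in for the XML NameChar production.
_NAME_CHAR = re.compile(r"[A-Za-z0-9._:-]")

# Matches one escape of the form _#xHHHH_ (exactly four hex digits).
_ESCAPE = re.compile(r"_#x([0-9A-Fa-f]{4})_")


def encode_name(text):
    """Replace each character that is not a (simplified) NameChar with _#xHHHH_."""
    out = []
    for ch in text:
        if _NAME_CHAR.match(ch):
            out.append(ch)
        elif ord(ch) > 0xFFFF:
            # The four-digit form cannot hold code points above U+FFFF.
            raise ValueError("character does not fit in four hex digits: %r" % ch)
        else:
            out.append("_#x%04X_" % ord(ch))
    return "".join(out)


def decode_name(name):
    """Reverse encode_name by expanding every _#xHHHH_ escape."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)


print(encode_name("Two Words"))         # Two_#x0020_Words
print(decode_name("Two_#x0020_Words"))  # Two Words
```

Because "#" is always escaped by this sketch, an input that already contains
text resembling an escape still decodes back to the original string.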

(This is the closest mapping I could make to using character entities in
names.)

--Andrew Layman
AndrewL@microsoft.com