Char & Java implementation

Jeni Tennison (jft@Psychology.Nottingham.AC.UK)
Wed, 4 Mar 1998 09:13:18 +0000


I've been attempting Ye Olde XML-Parser-in-Java exercise, mainly to help me
get to grips with Java. The last week has seen me struggling over
character encodings, and I wondered if you could put my mind at rest by
confirming whether I'm understanding the XML recommendation correctly.

The definition of Char in the XML recommendation is:

From: http://www.w3.org/TR/REC-xml#charsets -
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF]
             | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
                                 ^^^^^^^^^^^^^^^^^^
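
As a rough illustration of the production above, the ranges can be checked against an int code point. This is only a sketch: the class name XmlChar and the method name isXmlChar are made up for this example and do not come from the recommendation or from any library.

```java
// Hypothetical helper (names are assumptions, not from the XML recommendation).
public final class XmlChar {
    private XmlChar() {}

    /**
     * Returns true if the given code point matches production [2] Char.
     * An int is used so that values above 0xFFFF can be tested directly.
     */
    public static boolean isXmlChar(int codePoint) {
        return codePoint == 0x9
            || codePoint == 0xA
            || codePoint == 0xD
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }
}
```

The surrogate range #xD800-#xDFFF and the values #xFFFE and #xFFFF fall outside every alternative, so the check rejects them.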

Am I right in thinking that, since the indicated characters need more than
16 bits, they can't be represented in Java with the char data type, and an
int must be used instead? And does this also mean that you can't use normal
java.lang.Strings for things like entity and attribute values?
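
For what it's worth, a Java String is a sequence of 16-bit UTF-16 code units, so a character in the range #x10000-#x10FFFF can be held in a String as a pair of surrogate chars, even though a single char cannot hold it. The sketch below illustrates this. It assumes Java 5 or later, because Character.toChars, String.codePointAt and String.codePointCount were only added in that release, which postdates this message. The class name SurrogateDemo and its method names are invented for the example.

```java
// Hypothetical demo class (names are assumptions for illustration only).
// Requires Java 5 or later for the code point methods used below.
public final class SurrogateDemo {
    private SurrogateDemo() {}

    /** Encodes one code point as a String of one or two UTF-16 chars. */
    public static String encode(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    /** Counts code points in a String, rather than 16-bit chars. */
    public static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = encode(0x10000);
        // Number of 16-bit chars used to store the character.
        System.out.println(s.length());
        // Number of code points those chars represent.
        System.out.println(countCodePoints(s));
        // The code point read back from the String, in hex.
        System.out.println(Integer.toHexString(s.codePointAt(0)));
    }
}
```

The trade-off is that s.length() and s.charAt(i) then work in 16-bit units, so code that walks a String one character at a time has to treat a surrogate pair as a single Char.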

Thanks for your help in advance,

Jeni

Jenifer Tennison
Department of Psychology, University of Nottingham
University Park, Nottingham NG7 2RD, UK
tel: +44 (0) 115 951 5151 x8352
fax: +44 (0) 115 951 5324
url: http://www.psychology.nottingham.ac.uk/staff/Jenifer.Tennison/