Re: Characters having an ASCII value > 127

Richard Tobin (richard@cogsci.ed.ac.uk)
Fri, 18 Sep 1998 13:45:14 +0100 (BST)


> I guess, to correctly interpret and display those characters I have to
> know the character set which was used to encode the original text file.
> How can I communicate this character set to an XML parser?

You can do this by putting an encoding declaration in the XML
declaration at the start of the file. For example, if the document
is in ISO Latin 1, officially named ISO-8859-1, you can use

<?xml version="1.0" encoding="ISO-8859-1"?>

Without an encoding declaration (or a charset given with the MIME type
if the document comes from an HTTP server) a conforming parser will
treat it as UTF-8 (or as UTF-16 if it starts with a byte order mark),
and any character above 127 will be misinterpreted or rejected.
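
To see the difference for yourself, here is a small sketch in Python
using the standard library's xml.etree.ElementTree (my choice purely
for illustration; any parser should show the same thing). The sample
document and the helper names root_text and try_parse are made up for
this example. It parses the same Latin 1 bytes twice, once with the
encoding declaration and once without.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the e-acute in "cafe" is stored as the
# single ISO-8859-1 byte 0xE9, which is not valid UTF-8 on its own.
BODY = "<doc>caf\u00e9</doc>".encode("iso-8859-1")
DECLARED = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY


def root_text(data: bytes) -> str:
    """Parse raw bytes and return the text of the root element."""
    return ET.fromstring(data).text


def try_parse(data: bytes) -> str:
    """Return the root text, or a short note if the parser rejects the input."""
    try:
        return root_text(data)
    except ET.ParseError as err:
        return f"rejected: {err}"


if __name__ == "__main__":
    # With the declaration, byte 0xE9 is decoded as U+00E9.
    print(ascii(try_parse(DECLARED)))
    # Without it, the bytes are read as UTF-8 and 0xE9 is not well-formed there.
    print(ascii(try_parse(BODY)))
```

With this particular parser the undeclared version is rejected outright
rather than silently misread; other parsers may behave differently.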

Of course, any particular parser may not support the character set you
happen to be using; only UTF-8 and UTF-16 are required of every parser.

-- Richard