Thanks. I am also aware of it now :-). Can I make the assumption that:
- ISO-8859-1 and UTF-8 look identical to not-very-experienced humans.
- in principle I should be able to sort this by adding something like
<?xml version="1.0" encoding="ISO-8859-1"?>
to the top of the document
- in practice this fails because, by the time the parser gets to the
encoding declaration, it has already assumed the encoding is UTF-8 and has
crashed :-)
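A quick way to check these assumptions is a small sketch like the one below. It uses Python's standard xml.etree.ElementTree purely as an illustration; the tool that actually crashed is not named here, so its behaviour may differ. The element name and sample text are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: e-acute stored as the single ISO-8859-1 byte 0xE9.
body = b"<name>caf\xe9</name>"
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body

# Pure-ASCII text is byte-for-byte identical in both encodings...
ascii_same = "molecule".encode("iso-8859-1") == "molecule".encode("utf-8")
# ...but an accented character is not (one byte versus two).
accent_same = "\u00e9".encode("iso-8859-1") == "\u00e9".encode("utf-8")

# With the declaration, this parser decodes the byte as ISO-8859-1.
with_decl = ET.fromstring(declared).text

# Without it, this parser assumes UTF-8 and rejects the lone 0xE9 byte.
try:
    ET.fromstring(body)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
```

With this particular parser the declaration is honoured, presumably because the declaration itself contains only ASCII characters and so can be read before the encoding is settled. Other tools may not behave the same way.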
I am not quite clear why we have this problem. Do different tools emit
different encodings? If so, which should I work with? Can I convert this
document?
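On the conversion question, one possible approach is sketched below, assuming the document really is ISO-8859-1. The helper name latin1_xml_to_utf8 and the sample bytes are hypothetical, and the declaration rewrite only handles the exact double-quoted spelling shown.

```python
def latin1_xml_to_utf8(data: bytes) -> bytes:
    """Re-encode an ISO-8859-1 XML document as UTF-8 (hypothetical helper)."""
    # Every byte value is a valid ISO-8859-1 character, so decoding cannot fail.
    text = data.decode("iso-8859-1")
    # Keep the declaration in step with the new encoding, if one is present.
    text = text.replace('encoding="ISO-8859-1"', 'encoding="UTF-8"', 1)
    return text.encode("utf-8")


# Made-up sample document containing one non-ASCII byte (0xE9, e-acute).
source = b'<?xml version="1.0" encoding="ISO-8859-1"?><name>caf\xe9</name>'
converted = latin1_xml_to_utf8(source)
```

Because decoding as ISO-8859-1 never raises an error, this will also "succeed" on a file that is in some other encoding and silently produce wrong characters, so it is only safe when the source encoding is known.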
I know there have been lots of important discussions about encodings (which
I have not always read very carefully), so an authoritative statement from
a WG member would help at least one human :-)
P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg