How is SAX going to handle Unicode, especially sending 16-bit chars
(UTF-16) to callback functions? Sending void*'s and/or char*'s in the
callbacks will leave the application and/or parser guessing what was sent.
Sending byte order marks in every string seems rather impractical,
especially since UTF-16 text can contain zero bytes, which makes most
null-terminated string types useless anyway.
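
To make the concern concrete, here is a minimal C sketch of one possible callback shape, not anything SAX defines. It assumes the parser hands over text as 16-bit code units with an explicit length instead of a bare char* or void*. The names characters_cb, count_units, deliver and naive_byte_length are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type (an assumption, not part of SAX): text arrives
   as 16-bit code units plus an explicit count, so the receiver never has to
   guess the unit width or scan for a terminator. */
typedef void (*characters_cb)(void *user_data,
                              const uint16_t *text,
                              size_t length);

/* Example handler: adds the number of code units received to a counter. */
static void count_units(void *user_data, const uint16_t *text, size_t length)
{
    (void)text; /* the content itself is not inspected here */
    *(size_t *)user_data += length;
}

/* Stand-in for the parser side: delivers one chunk of text to the handler. */
static void deliver(characters_cb cb, void *user_data,
                    const uint16_t *text, size_t length)
{
    assert(cb != NULL);
    cb(user_data, text, length);
}

/* What a byte-oriented consumer sees if the same buffer is treated as a
   char*: strlen() stops at the first zero byte, which for ASCII-range
   characters in UTF-16 is inside or right next to the first code unit. */
static size_t naive_byte_length(const uint16_t *text)
{
    return strlen((const char *)text);
}
```

With the two code units 0x0048 and 0x0069 ("Hi") followed by a zero unit, the length-based handler receives both units, while naive_byte_length reports fewer than the four bytes actually present, whatever the byte order of the machine.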
(Sorry, my Java is NULL. But from what I can tell, the String and
StringBuffer classes do not support 16- or 32-bit chars. Correct me if
I'm wrong.)
As a developer, it would be very nice not to have to re-code support into
my applications. I would like to see some implementation of Unicode in SAX
that is compatible with most systems and is extensible for when new
standards come along. (Wide character and encoding support is lacking in
most programming languages.)
I do have a couple of ideas if you would like them (omitted for brevity).
- Matthew
<<<<<<< | >>>>>>>
Matthew J. Evans
Professional Hobbyist
Santa Fe, New Mexico
mailto:mje@shakha.com