Re: SAX and Unicode question

Tim Bray (tbray@textuality.com)
Mon, 05 Jan 1998 16:51:07 -0800


At 05:36 PM 05/01/98 -0700, Matthew J. Evans wrote:
>How is SAX going to handle Unicode, especially sending 16-bit chars
>(UTF-16) to callback functions? Sending void*'s and/or char*'s in the
>callbacks will leave the application and/or parser guessing what was sent.
>Sending byte order marks in every string seems rather impractical,
>especially since UTF-16 can have null bytes making most string objects
>useless anyway.

SAX is a Java interface. Thus the Strings, chars, and so on are
all 16-bit-only; the parser will have taken care of all the BOMs
and other encoding jiggery-pokery before the callbacks are made.
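
A rough sketch of what that means in practice, not actual SAX code:
the CharHandler interface below is a made-up stand-in for a SAX-style
character callback, and the class and method names are assumptions for
illustration only. The decoding step uses the standard Java "UTF-16"
charset, which reads the BOM to pick the byte order and drops it from
the result, so the callback only ever sees 16-bit chars.

```java
import java.nio.charset.StandardCharsets;

public class Utf16CallbackSketch {
    // Hypothetical stand-in for a SAX-style character callback.
    interface CharHandler {
        void characters(char[] ch, int start, int length);
    }

    // The "parser" side: turn raw UTF-16 bytes into Java chars.
    // The UTF-16 charset consumes the BOM and uses it to choose byte order.
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16);
    }

    // Decode, hand the chars to a handler, and return what the handler saw.
    static String deliver(byte[] raw) {
        StringBuilder seen = new StringBuilder();
        CharHandler handler = (ch, start, length) -> seen.append(ch, start, length);
        char[] chars = decode(raw).toCharArray();
        handler.characters(chars, 0, chars.length);
        return seen.toString();
    }

    public static void main(String[] args) {
        // "Hi" as UTF-16 with a little-endian BOM, then with a big-endian BOM.
        byte[] littleEndian = {(byte) 0xFF, (byte) 0xFE, 0x48, 0x00, 0x69, 0x00};
        byte[] bigEndian = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x48, 0x00, 0x69};
        System.out.println(deliver(littleEndian));
        System.out.println(deliver(bigEndian));
    }
}
```

Both inputs reach the handler as the same two chars, with no BOM and
no byte-order guessing left for the application to do. The null bytes
in the raw data are not a problem either, since a Java String is
length-counted rather than null-terminated.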

On the IDL end of things, not sure what the right way to do it is. -Tim