Re: SAX: Byte Streams and Character Streams
James Clark (jjc@jclark.com)
Sat, 18 Apr 1998 09:28:42 +0700
Let's not forget other languages, in particular C and C++. In C terms, a
character stream would be a stream of wchar_t's, and a byte stream would
be a stream of char's. It's very common to pass information around
internally in char's (i.e. UTF-8 encoded) rather than in a stream of
wchar_t's (i.e. UTF-16 encoded): for example, expat, which is being used
both in Netscape 5 and in Perl, passes data to the application in UTF-8
as a sequence of bytes, not as a sequence of wchar_t's. Supporting byte
streams only in the C/C++ world causes no inefficiency: if you have the
data as an array of wchar_t's, you can simply cast your wchar_t* to a
char* and you get an array of UTF-16 encoded bytes. Byte streams give
you all you need in the C/C++ world.
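
A minimal sketch of the cast described above, not taken from any real
parser API: consume_bytes and feed_wide are hypothetical names invented
for illustration. Note that the C standard does not fix the width or
encoding of wchar_t (it is 16 bits on some platforms and 32 bits on
others), so the byte count is computed with sizeof(wchar_t), and the
bytes come out in the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical byte-stream consumer: a stand-in for an interface that
 * accepts only bytes. Here it just reports how many bytes it was given. */
static size_t consume_bytes(const char *data, size_t len)
{
    assert(data != NULL);
    return len;
}

/* Hypothetical helper: hand a wide-character buffer to the byte-stream
 * consumer without copying. Any object may be inspected through a
 * char pointer, so the cast itself is well defined. */
static size_t feed_wide(const wchar_t *text, size_t nchars)
{
    const char *bytes = (const char *)text;   /* same memory, viewed as bytes */
    return consume_bytes(bytes, nchars * sizeof(wchar_t));
}
```

Calling feed_wide(buf, n) on an array of n wide characters passes
n * sizeof(wchar_t) bytes to the consumer. The consumer still has to
be told, or has to detect, which encoding and byte order those bytes
are in.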
James