Re: How to process Japanese Code with XMLDSO(MS-XML)

MURATA Makoto (murata@apsdc.ksp.fujixerox.co.jp)
Thu, 06 Aug 1998 12:52:36 +0900


TAKAHASHI Masayoshi wrote:
> Can we define a standard (or recommended) conversion (mapping
> table) between Unicode (UCS-2) and the encodings used in XML?

JIS X 0221, Java, and Microsoft each appear to have their own
mapping tables. Terrible confusion will result in the near future.
This is not at all the fault of the Unicode Consortium or ISO. If
anybody should be criticized, it is the Japanese.

An interesting document about this issue is available at:

http://hp.vector.co.jp/authors/VA001240/article/ucsnote.html

> I know that the "Japanese profile" in the KAISETSU (commentary) of
> TR X 0008-1998
> (http://www.y-adagio.com/public/standards/xml/tutr.htm)
> defines the encodings used in Japanese XML documents, but it doesn't
> define conversions between encodings. I think that's not enough
> to guarantee exchange.

Quite.

TAKAHASHI Masayoshi wrote:
>
> # ...surely the Japanese profile is not quietly implying that
> # "encodings other than UTF-{8|16} should not be used in practice,
> # because nothing guarantees they can be exchanged safely"? :-)

So, do you plan to contribute? I can give you an SJIS XML file
containing a number of problematic characters. If you convert
it to UTF-16 and then back to SJIS with a number of software
tools and publish the results, that would be a great
contribution.
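
A round-trip test of this kind could be sketched as follows. This is
only an illustration, not a tool mentioned in this thread: it assumes
Python's built-in "shift_jis" and "cp932" codecs as stand-ins for two
different conversion tools, and the names round_trip and report are
hypothetical. The sample byte pair 0x8160 is used because those two
codecs decode it to different Unicode code points.

```python
# Hypothetical round-trip check: SJIS bytes -> Unicode -> UTF-16 -> SJIS bytes.
# The codec names are Python's; they stand in for "a number of software tools".


def round_trip(data, decode_with, encode_with):
    """Return the re-encoded bytes, or None if any conversion step fails."""
    try:
        text = data.decode(decode_with)
        utf16 = text.encode("utf-16-be")
        return utf16.decode("utf-16-be").encode(encode_with)
    except UnicodeError:
        return None


def report(data, codecs=("shift_jis", "cp932")):
    """Map each (decoder, encoder) pair to True when the bytes survive unchanged."""
    return {
        (dec, enc): round_trip(data, dec, enc) == data
        for dec in codecs
        for enc in codecs
    }


if __name__ == "__main__":
    sample = b"\x81\x60"  # JIS X 0208 WAVE DASH, a commonly cited problem character
    for pair, ok in report(sample).items():
        print(pair, "ok" if ok else "CHANGED OR FAILED")
```

Running the same report over every character in a real test file, with
real tools in place of the two codecs, would give the kind of table
that could be reported publicly.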

Makoto

Fuji Xerox Information Systems

Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp