Re: text/xml vs. application/xml

MURATA Makoto (murata@apsdc.ksp.fujixerox.co.jp)
Mon, 22 Dec 1997 10:32:06 +0900


David Megginson writes:
>
>I have two important queries:
>
>1) Are you certain that ignoring the encoding declaration is
> conforming behaviour?

Yes, I am certain that ignoring the encoding declaration for text/xml
is conforming behaviour. This is to allow transcoding.
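To illustrate the transcoding case, here is a minimal sketch. It assumes a hypothetical intermediary that re-encodes the body and updates the charset parameter but never looks inside the document; the sample document and the transcode() helper are made up for illustration and come from no specification.

```python
# Hypothetical document: declared as Shift_JIS and originally encoded that way.
# The escaped characters are the Japanese greeting "konnichiwa".
original_text = (
    '<?xml version="1.0" encoding="Shift_JIS"?>'
    "<greeting>\u3053\u3093\u306b\u3061\u306f</greeting>"
)
original_bytes = original_text.encode("shift_jis")


def transcode(body: bytes, src: str, dst: str) -> bytes:
    """Re-encode the body without touching its contents (illustrative helper)."""
    return body.decode(src).encode(dst)


# The intermediary converts the bytes to UTF-8 and relabels the transport.
utf8_bytes = transcode(original_bytes, "shift_jis", "utf-8")
content_type = "text/xml; charset=utf-8"

# Trusting the charset parameter recovers the text; the declaration inside
# the document is now stale and would mislead a parser that trusted it.
decoded_by_charset = utf8_bytes.decode("utf-8")
decoded_by_declaration = utf8_bytes.decode("shift_jis", errors="replace")
```

In this sketch only the charset parameter still describes the bytes after transcoding, which is why a receiver of text/xml may ignore the encoding declaration.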

>It seems to me that it would make more sense
> to report an error if the charset parameter and the encoding
> declaration differ (especially since the PR requires any document
> without a BOM or encoding declaration to be in UTF-8).

The HTTP 1.1 draft
(http://www.w3.org/Protocols/HTTP/1.1/draft-ietf-http-v11-spec-rev-01.txt)
says:

The "charset" parameter is used with some media types to define the
character set (section 3.4) of the data. When no explicit charset
parameter is provided by the sender, media subtypes of the "text" type
are defined to have a default charset value of "ISO-8859-1" when
received via HTTP. Data in character sets other than "ISO-8859-1" or its
subsets MUST be labeled with an appropriate charset value. See section
19.8.2 for compatibility problems.
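The rule quoted above, combined with ignoring the encoding declaration for text/xml, can be sketched as follows. This is only an illustration: the function name effective_charset is an assumption, the standard-library email.message.Message class is used merely as a convenient Content-Type parser, and other media types such as application/xml are deliberately not covered.

```python
from email.message import Message

HTTP_TEXT_DEFAULT = "iso-8859-1"  # default for text/* received via HTTP


def effective_charset(content_type: str) -> str:
    """Return the charset a receiver would use for a text/* entity over HTTP.

    The encoding declaration inside the document is never consulted here.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_maintype() != "text":
        raise ValueError("this sketch only covers text/* media types")
    charset = msg.get_param("charset")
    if isinstance(charset, str) and charset:
        return charset.lower()
    # No explicit charset parameter: fall back to the HTTP default.
    return HTTP_TEXT_DEFAULT
```

Under these assumptions, effective_charset("text/xml") falls back to the HTTP default, while an explicit charset parameter always wins.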

>2) Why pick a default encoding that conforming XML parsers are not
> required to support? Alfred does accept encoding="ISO-8859-1", but
> some other parsers do not. It seems to me that either the RFC or
> the PR needs to be amended.

The HTTP people have stuck to the ISO-8859-1 default in spite of a *lot*
of effort from the W3C. On the other hand, the IETF (RFC 2130) recommends
UTF-8 as a default.

>I can also anticipate a different problem: few private people (as
>opposed to companies or organisations) have any control at all over
>what their HTTP servers send out.

I am sympathetic to this.

Rick Jelliffe proposed in the XML SIG that only application/xml should
be used. I will follow the consensus of the XML SIG or WG.

Makoto

Fuji Xerox Information Systems

Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp