Re: Why XML data typing is hard

David Brownell (db@Eng.Sun.COM)
Mon, 30 Nov 1998 14:40:30 -0800


Toby Speight wrote:
>
> Henry> Henry S. Thompson <URL:mailto:ht@cogsci.ed.ac.uk>
>
> Henry> In other words, if there are (natural) language/culture dependent
> Henry> aspects to our documents, then if we are good citizens we should
> Henry> use the xml:lang attribute to signal this.

A good "if". A related one: "if" the document is directly created or
consumed by humans, we should use some locale tagging. (Note that
"xml:lang" identifies language, not locale!)

> It looks attractive at first sight, but think of the burden you're
> placing on processors ... readers now
> need to know (enough about) *all* the locales from which they might
> receive data.

I'd contend that the "float" (or "r4") data _type_ doesn't have any
such localization issues. IEEE floating point is a binary spec, and
nobody has proposed assuming anything other than IEEE floating point.

Rather, this is an encoding issue: how a value gets written as text.
By and large, having just one (canonical) form is a lot easier for
programs to deal with: smaller code, easier to debug, faster in
normal cases, and so on.
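
To make that concrete, here's a rough sketch in Python. The function
names and the sample separators are invented for illustration and
aren't from any proposal; the "canonical" form is assumed to be a '.'
decimal point with no digit grouping. The value is the same IEEE
double either way; only the text encoding differs, and only the
localized reader needs per-locale knowledge.

```python
import struct


def parse_canonical(text: str) -> float:
    """Read the single agreed-upon form: '.' decimal point, no grouping.

    Assumption for this sketch: Python's float() is close enough to the
    canonical grammar; a real reader would validate more strictly.
    """
    return float(text)


def parse_localized(text: str, decimal_sep: str, group_sep: str) -> float:
    """Hypothetical locale-aware reader.

    The caller must supply the separators for the sender's locale,
    which is exactly the extra knowledge a canonical form avoids.
    """
    normalized = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(normalized)


value = 1234.5

# The data type itself: one binary layout (IEEE 754 double, big-endian).
bits = struct.pack(">d", value)

# One canonical encoding, one code path.
assert parse_canonical("1234.5") == value

# Localized encodings: same value, but the reader needs locale details.
assert parse_localized("1.234,5", decimal_sep=",", group_sep=".") == value
assert parse_localized("1,234.5", decimal_sep=".", group_sep=",") == value

# Whichever text form was used, the resulting bits are identical.
assert struct.pack(">d", parse_canonical("1234.5")) == bits
```

The canonical reader is a one-liner; the localized one grows a table
of separators (and more) for every locale it might hear from.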

But: is this data being generated by/for programs, or people? People
have different priorities than programs.

- Dave