Of course XML can be used to create unambiguous
transfer formats (data schlepping). But Paul,
a lot of the information that needs to be mined
is not in relational formats. Depending on the query
language and implementation, there is no reason one
cannot build an industrial strength data repository
over generalized markup. Some IETM applications
(e.g., 87269, MID) are designed to do that.
Even with IADS some years ago, we had some capabilities
for this, although they were primitive and immature then;
they are probably much improved now. In those designs it was
always assumed that the client language (e.g., MID) was
essentially a navigation system over a set of
notations whose processors are known. It was also
assumed that the client language included a query
language or could call one. So the term "data warehouse"
may need further clarification. The applications
I work with have to have both document frameworks
and relational systems, as well as all of the
ad hoc, in-transit data used to interface the
live (sensor-derived) data to the database that
is collecting and warehousing it.
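
As a rough illustration of the design assumption described above (a client
language that navigates over notations whose processors are known, and that
can call a query language), here is a minimal Python sketch. The registry,
the notation names, and the function names are all hypothetical; none of
them is taken from 87269, MID, or IADS.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical registry: notation name -> processor that can answer a query.
PROCESSORS = {}

def processor(notation):
    """Register a query function as the processor for a known notation."""
    def register(func):
        PROCESSORS[notation] = func
        return func
    return register

@processor("xml")
def query_xml(data, query):
    # For this notation the query is an ElementTree path expression.
    root = ET.fromstring(data)
    return [el.text for el in root.findall(query)]

@processor("csv")
def query_csv(data, query):
    # For this notation the query is simply a column name.
    return [row[query] for row in csv.DictReader(io.StringIO(data))]

def navigate(notation, data, query):
    """Client-side dispatch: hand the query to the processor for the notation."""
    try:
        handler = PROCESSORS[notation]
    except KeyError:
        raise ValueError(f"no known processor for notation {notation!r}")
    return handler(data, query)
```

With this in place, navigate("xml", "<task><step>Remove panel</step></task>", "step")
hands the path to the XML processor, while an unregistered notation is
rejected up front rather than guessed at.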
However, let me ask a technical
question that you can probably answer from a deeper
technical perspective than mine. How well can one query
data (or convert it, for that matter) for which one
has no rigorous schema of some kind? (Note:
I consider a file with a self-identifying type (e.g., a magic
number) to be a pre-validated file of that notation.)
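
To make the question concrete, here is a minimal sketch (Python standard
library only; the sample document and its element names are invented for
illustration) of querying well-formed XML that has no schema at all.
Structural queries work, but every value comes back as an untyped string,
and a misspelled path simply returns nothing because no schema exists to
say the path is invalid. The declaration check is only one assumed reading
of what "self-identifying" could mean in practice.

```python
import xml.etree.ElementTree as ET

# Invented sample data; no DTD or schema accompanies it.
SAMPLE = b"""<?xml version="1.0"?>
<readings>
  <reading sensor="a1"><value>20.5</value></reading>
  <reading sensor="a2"><value>n/a</value></reading>
</readings>"""

def looks_like_xml(data):
    """Crude self-identification: does the data start with an XML declaration?"""
    return data.startswith(b"<?xml")

def values(data, path):
    """Return the text of every element matching path, as raw strings."""
    root = ET.fromstring(data)
    return [el.text for el in root.findall(path)]

def numeric_values(data, path):
    """Without a schema, types must be guessed; skip whatever does not parse."""
    result = []
    for text in values(data, path):
        try:
            result.append(float(text))
        except (TypeError, ValueError):
            pass  # no schema warned us this field could be non-numeric
    return result
```

Here values(SAMPLE, "reading/value") yields both strings, the misspelled
values(SAMPLE, "reading/valu") yields an empty list with no error, and
numeric_values(SAMPLE, "reading/value") silently drops the "n/a" entry.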
len bullard