Re: XSchema question

Peter Murray-Rust (peter@ursus.demon.co.uk)
Wed, 05 Aug 1998 09:33:19


At 23:54 04/08/98 -0700, Don Park wrote:
>
>'dynamic-schema' is not driven by the contents but by the needs. It is
>somewhat similar to namespaces except with versions. *sigh* I am not
>explaining this too well, am I?

Let me have another go. [This is important for XSchema since we haven't yet
thought *how* we are going to use it.]

One simple way to use XSchema is:

myschema.xml --XCS2DTD--> myschema.dtd + mydoc.xml
                                   |
                                   |
                                   V
                         myValidatingSAXParser
                                   |
                                   |
                                   V
                                my tree
                                   |
                                   |
                                   V
                             my application

In this there is a static schema which is transformed into a DTD because
that is what the parser software wants. [In the future parsers might be
written to accept schemas directly ... opinions on this will differ :-)]
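For concreteness, here is a minimal sketch of the "parse, build a tree,
hand it to the application" stages - in Python with the standard xml.sax
module standing in for a validating Java SAX parser, and without the
XCS2DTD step, so purely illustrative:

```python
# Illustrative sketch only: a non-validating SAX parse that builds a
# simple (name, children) tree for the application to walk.  The real
# pipeline would first run XCS2DTD and use a validating parser.
import xml.sax

class TreeBuilder(xml.sax.ContentHandler):
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.stack = [("#root", [])]          # sentinel root node

    def startElement(self, name, attrs):
        node = (name, [])                     # (element name, child list)
        self.stack[-1][1].append(node)
        self.stack.append(node)

    def endElement(self, name):
        self.stack.pop()

def parse_doc(xml_text):
    handler = TreeBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.stack[0][1][0]             # the document element

root = parse_doc("<MyLog><logEvent><Time>t</Time></logEvent></MyLog>")
# root is ("MyLog", [("logEvent", [("Time", [])])])
```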

I think what you want to do is something like:

phase1:

<MyLog>
<xschema>
... initial schema ... (time, date) ...
</xschema>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
... zillions of these with more coming in every millisecond ...
</MyLog>

This is how I think many people see XSchema being used. However, we
haven't yet defined the syntax and behaviour. IOW maybe we need a
trigger of some sort in a SAX application that specifically picks up an
XSchema event. That would probably mean a two-pass system - one pass to
determine that there is a schema in the file and another to re-run the
parse using it.
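A sketch of what pass one might look like (Python's xml.sax standing in
for a SAX parser; the element name "xschema" is just the convention used
in the examples here, not settled XSchema syntax):

```python
# Pass 1 of a hypothetical two-pass scheme: scan the document once to
# see whether it carries an embedded <xschema> element at all.
import xml.sax

class SchemaDetector(xml.sax.ContentHandler):
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.has_schema = False

    def startElement(self, name, attrs):
        if name == "xschema":
            self.has_schema = True

def needs_schema_pass(xml_bytes):
    """Return True if a second, schema-aware parse would be required."""
    detector = SchemaDetector()
    xml.sax.parseString(xml_bytes, detector)
    return detector.has_schema

needs_schema_pass(b"<MyLog><xschema/><logEvent/></MyLog>")   # True
needs_schema_pass(b"<MyLog><logEvent/></MyLog>")             # False
```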

I think one implication of your approach - which again XSchema has not
addressed, but has to - is that the validation or other XSchema activity
applies to what comes after it in the document. Of course, with namespaces
we may be able to restrict XSchema to processing only a small part of the
document.

Then your log info changes. You don't want to make a separate log but you
need to change the schema to reflect the new info. A human changes the
schema to include (say) email:

phase2:

<MyLog>
<xschema>
... initial schema ... (time, date) ...
</xschema>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date></logEvent>
... info changes ...
<xschema>
... new schema ... (time, date, email) ...
</xschema>
<logEvent><Time>...</Time><Date>...</Date><Email>...</Email></logEvent>
<logEvent><Time>...</Time><Date>...</Date><Email>...</Email></logEvent>
<logEvent><Time>...</Time><Date>...</Date><Email>...</Email></logEvent>
<logEvent><Time>...</Time><Date>...</Date><Email>...</Email></logEvent>
<!-- whoops - this site forgot - invalid!!! -->
<logEvent><Time>...</Time><Date>...</Date></logEvent>
<logEvent><Time>...</Time><Date>...</Date><Email>...</Email></logEvent>
<logEvent><Time>...</Time><Date>...</Date><Email>...</Email></logEvent>
<logEvent><Time>...</Time><Date>...</Date><Email>...</Email></logEvent>
... zillions of these with more coming in every millisecond ...
</MyLog>

This represents quite a complex processing model. It requires a processor
to run through the first few gigabytes of log, then switch schemas and
run through the next. I think you would be better off with a separate file
for each schema change, linking them together with XLink/entities.
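The entity version of that might look something like this (filenames
invented for illustration):

```xml
<!-- Hypothetical sketch: one file per schema phase, pulled in as
     external entities so the master log stays small. -->
<!DOCTYPE MyLog [
  <!ENTITY phase1 SYSTEM "log-phase1.xml">
  <!ENTITY phase2 SYSTEM "log-phase2.xml">
]>
<MyLog>
  &phase1;  <!-- logEvents valid against the (time, date) schema -->
  &phase2;  <!-- logEvents valid against the (time, date, email) schema -->
</MyLog>
```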

>Am I being any clearer?

Yup. Let's solve the single schema problem first. Simon?

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg