Re: Internal subset equivalent in new schema proposals?

Joel Bender (joel@spooky.emcs.cornell.edu)
Tue, 1 Dec 1998 14:36:16 -0500


Ketil Z Malde wrote:

> What could be useful and relatively simple, is a restriction
> of the *form* of the data, e.g. forcing the <name type="city">
> to contain only letters and start with a capital, or LC
> subjects to be two upper case letters (if that's what they are).
> Phone numbers, dates, sort keys, there are many cases where it
> would be helpful to have the parser catch these things, I think.

I was thinking along similar lines. I've been adding something like this
to my XML documents:

<prop name="state" xml:regexp="[A-Z]+">NY</prop>

So the parser can verify that the character data matches the regular
expression. This works OK for content, but I don't see how I can add this
meta-meta-data for attributes. That is to say, how can I tell the parser
that the 'name' attribute value for the 'prop' element must be of the form
"[a-zA-Z_][0-9a-zA-Z_]*"?
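
To make the idea concrete, here is a minimal sketch in Python of what such a check might look like. It is only an illustration under some assumptions: it uses Python's `re` dialect for the patterns, it runs as a separate pass after parsing rather than inside the parser, and the `ATTR_RULES` side table (keyed by element and attribute name) is a hypothetical way to handle the attribute case, not something the markup above provides.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree expands the built-in "xml:" prefix to this namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"
REGEXP_ATTR = "{%s}regexp" % XML_NS


def check_content(root):
    """Collect (tag, text, pattern) for elements whose text fails xml:regexp."""
    failures = []
    for elem in root.iter():
        pattern = elem.get(REGEXP_ATTR)
        if pattern is None:
            continue  # element carries no constraint
        text = elem.text or ""
        # fullmatch: the whole content must match, not just a prefix
        if re.fullmatch(pattern, text) is None:
            failures.append((elem.tag, text, pattern))
    return failures


# Hypothetical side table: (element name, attribute name) -> pattern.
ATTR_RULES = {("prop", "name"): r"[a-zA-Z_][0-9a-zA-Z_]*"}


def check_attributes(root, rules=ATTR_RULES):
    """Collect (tag, attribute, value) for attribute values that fail a rule."""
    failures = []
    for elem in root.iter():
        for (tag, attr), pattern in rules.items():
            if elem.tag == tag and attr in elem.attrib:
                value = elem.attrib[attr]
                if re.fullmatch(pattern, value) is None:
                    failures.append((tag, attr, value))
    return failures


doc = ET.fromstring('<prop name="state" xml:regexp="[A-Z]+">NY</prop>')
print(check_content(doc), check_attributes(doc))
```

Keeping the attribute rules outside the instance document sidesteps the problem of where to hang them, at the cost of no longer being self-describing the way the `xml:regexp` attribute is.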

Of course, this also brings up the murky waters of regular expression (grep)
syntax, which I've been avoiding.

Joel