RE: XML-Data: advantages over DTD syntax? (and some wishes)

Jarle Stabell (jarle.stabell@dokpro.uio.no)
Wed, 1 Oct 1997 19:02:18 +0200


Paul Prescod wrote:
-------
"In other words, as Len has pointed out, the DTDs-as-instances efforts
are primarily focused on making life easier for vendors, perhaps at the
cost of simplicity for users. If this results in more tools, then it
might be a net win, but we don't know yet."
</>

I think the DTDs-as-instances approach also benefits new users: why
should they have to learn two syntaxes instead of one?
One of the first questions that entered my mind when I saw a DTD for
the first time was:
"Why didn't they code this using SGML? Why a completely separate syntax
for this?"
(People who already know SGML may of course object to the claim that DTD
syntax is not SGML syntax, but I think XML users will have this
"feeling" even more strongly.)

I assume most people won't edit DTDs (either today's version or
XML-Data or similar versions) in the "raw" text format. They will of
course use tools that visualize the hierarchy (XML-Data's
extends/implements), offer values in combo boxes, etc.
I think some advanced functionality is very difficult with today's DTDs,
since (to my very limited SGML knowledge) many things are
"simulated/hacked" by using parameter entities.
I think this parameter entity (macro) approach is much less "semantic"
and much more difficult for a tool to handle.
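
To illustrate the macro point, here is a minimal Python sketch. It
assumes parameter entities behave as pure text substitution, which is a
simplification of the real SGML/XML rules. The entity names and the
expand_parameter_entities helper are hypothetical, made up for this
example.

```python
import re

# Hypothetical parameter entities, as a DTD author might declare them.
PARAM_ENTITIES = {
    "inline": "em | strong | code",
    "block": "p | ul | pre",
}

# Matches a parameter entity reference such as %inline;
PE_REFERENCE = re.compile(r"%([A-Za-z_][\w.-]*);")


def expand_parameter_entities(declaration, entities):
    """Replace each %name; reference with its text (single pass, no nesting)."""
    return PE_REFERENCE.sub(lambda m: entities[m.group(1)], declaration)


raw = "<!ELEMENT div (#PCDATA | %inline; | %block;)*>"
expanded = expand_parameter_entities(raw, PARAM_ENTITIES)
print(expanded)
```

After expansion the tool sees only one flat list of names. The grouping
into "inline" and "block" that the author had in mind lives in the macro
names and is gone from the content model, which is what I mean by less
"semantic".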

Mr. Prescod also wrote:
-------
"Of course particular DTDs-as-instances proposals may have other benefits,
but those benefits could as easily be added to DTD syntax as to some
new syntax."
</>

Adding new constructions to DTD syntax would force parser builders to
update the "lower parts" of their parsers/lexers, but in a
DTD-as-instances version the upgrade would only affect the "semantic"
part of the engine.
More importantly, it would be easier to tell users that "this (DTD)
element now has this new attribute, which means X, etc." than to
introduce new syntax for DTD encoding and then explain its semantics.
(This is why we like SGML/XML in the first place: we don't need to use
more or less nonstandard syntactic encodings.)
I don't view XML-Data as the new syntax. Quite the opposite: I find it
to be completely "XML syntax". I view the DTD syntax as another
"non-XML" syntax (although this is of course technically incorrect
according to the draft).
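
A rough Python sketch of the point about parsers: if the schema is itself
an XML instance, an ordinary XML parser reads it, and a newly added
attribute needs no lexer change. The vocabulary below (schema,
elementType, element, occurs, open) is hypothetical and only loosely
inspired by XML-Data; it is not the actual XML-Data syntax, and
read_element_types is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema-as-instance vocabulary (NOT the real XML-Data syntax).
# The "open" attribute stands in for a feature added in a later revision.
SCHEMA = """
<schema>
  <elementType id="book" open="false">
    <element type="title" occurs="required"/>
    <element type="author" occurs="oneOrMore"/>
  </elementType>
</schema>
"""


def read_element_types(schema_text):
    """Map each elementType id to its own attributes and its child declarations."""
    root = ET.fromstring(schema_text)
    result = {}
    for element_type in root.iter("elementType"):
        children = [
            (child.get("type"), child.get("occurs"))
            for child in element_type.findall("element")
        ]
        result[element_type.get("id")] = {
            "attributes": dict(element_type.attrib),
            "children": children,
        }
    return result


element_types = read_element_types(SCHEMA)
print(element_types["book"])
```

A tool that doesn't know about "open" still parses the schema and can
simply ignore that attribute. Only the code that interprets the
declarations (the "semantic" part) has to learn what it means.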

A few XML wishes:

1. Please incorporate the </> tag. It would take a parser writer 5
minutes to implement, and it would save bandwidth, disk space and
typing, and in some cases make documents easier to read. (It could also
be used to write hard-to-understand/maintain documents, but that's up to
the user.)

2. Allow non-quoted attribute values. I guess support for this is also a
5-minute project for the parser writer.

3. Add a paragraph to the XML standard document explaining why character
references should be resolved before the string is stored as the value
of the entity.
Is it to allow <sarcastic>useful</> tricks like the example in the "C.
Expansion of Entity and Character References" section, while making it
rocket science to use character references in entity declarations?
(Or is it only for compatibility with SGML?)
I don't know if this is "theoretically" possible, but it could save
weeks of implementation time if all entity declarations could be parsed
locally, without forcing expansion and reparsing at every occurrence.
If no essential idioms are lost in such a simplification, I definitely
think such a simpler model would make life simpler for end users as
well, not only for the software vendors.
(Perhaps people wouldn't need to play parser (and know many detailed
rules) to read/write/debug their documents?)
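
To make wish 3 concrete, here is a small Python sketch of the behaviour I
am complaining about. It assumes the expat-based parser in Python's
standard xml.etree.ElementTree module expands internal entities the way
the draft describes; the entity names "markup" and "text" and the
surrounding document are made up for the example.

```python
import xml.etree.ElementTree as ET

# Character references in an entity value are resolved when the declaration
# is read; the resulting text is then parsed again at every reference.
DOC = """<!DOCTYPE r [
  <!ENTITY markup "&#60;b>bold&#60;/b>">
  <!ENTITY text "&#38;#60;b>">
]>
<r><x>&markup;</x><y>&text;</y></r>"""

root = ET.fromstring(DOC)

# "markup": &#60; became "<" at declaration time, so the stored value is
# "<b>bold</b>", which is recognized as an element when referenced.
bold = root.find("x/b")

# "text": &#38; became "&" at declaration time, leaving "&#60;b>" as the
# stored value; only the reparse at the reference turns it into "<b>".
y_text = root.find("y").text

print(bold is not None, repr(y_text))
```

So the author has to escape once for plain text and not at all for
markup, and has to know which of the two parsing passes each "&" will be
consumed in. With locally parsed entity declarations, neither the
implementer nor the reader would need to simulate both passes.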

Cheers,
Jarle Stabell