Murray Altheim writes:
As a document type designer, a parser writer, and a document author,
I think one of the main advantages of XML is the requirement of
explicitly-named end tags.
[JS] Agree.
[MA] The save-typing argument is moot in that most people will
probably not hand-edit tags.
[JS] Maybe. But I know I will. Therefore I would like it. :-)
I really think
<LASTNAME>Doe</>
<FIRSTNAME>John</>
are faster/easier to read than
<LASTNAME>Doe</LASTNAME>
<FIRSTNAME>John</FIRSTNAME>
and I keep seeing lots of things like this.
Having this possibility would perhaps also prevent people from using "cryptic" abbreviations as element type names/IDs.
I agree that closing an element having subelements with a </> would be a "bad thing" for a document writer to do.
[MA] For those that do, having the explicit end
tags is probably a Very Good Thing, in that it saves confusion. And while it
may only take '5 minutes' (NOTHING takes five minutes) to add in a parser,
suddenly a simple parser must build a document tree in order to know which
element is being closed by '</>', which makes simple parsers into much more
complicated ones. This is not a benefit.
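To make the extra state concrete, here is a minimal Python sketch of what resolving '</>' requires. It assumes a drastically simplified tag syntax (no attributes, comments, or empty-element tags), and the names `TAG` and `expand_empty_end_tags` are hypothetical, invented for illustration. It keeps only a stack of currently open element names rather than a full tree, which is the least bookkeeping that lets a parser tell which element '</>' closes.

```python
import re

# Matches <NAME>, </NAME>, </> and <>. Hypothetical simplification:
# no attributes, comments, or empty-element tags are recognized.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)?>")

def expand_empty_end_tags(text):
    """Rewrite every '</>' as an explicitly named end tag."""
    open_elements = []  # names of the elements that are currently open
    out = []
    pos = 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])  # copy character data unchanged
        pos = m.end()
        is_end = m.group(1) == "/"
        name = m.group(2)  # None for '</>' and '<>'
        if not is_end:
            if name is None:
                raise ValueError("empty start tag '<>' is not handled here")
            open_elements.append(name)
            out.append(m.group(0))
        else:
            if not open_elements:
                raise ValueError("end tag with no open element")
            expected = open_elements.pop()
            if name is not None and name != expected:
                raise ValueError(f"expected </{expected}>, found </{name}>")
            out.append(f"</{expected}>")
    out.append(text[pos:])
    return "".join(out)

if __name__ == "__main__":
    print(expand_empty_end_tags("<LASTNAME>Doe</>\n<FIRSTNAME>John</>"))
```

A parser that only tokenizes and reports tags as they appear needs none of this state when end tags are always named, since each end tag identifies itself.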
[JS] Ok, I didn't think of the possibility of anyone building XML parsers without building the document tree. (I won't disclose any estimate for building the document tree...:-) )
> [JS] 2. Allow non-quoted attribute values. I guess support for this is also a 5-minute project for the parser-writer.
[MA] We're up to ten minutes. Actually, this makes the parser more complicated,
since knowing that attribute values are delimited allows a simple 'scan-literal'
approach, i.e., if the first character after the equals sign is a single quote,
one scans to the next single quote. If a double, scan to the next double. If
they are optional, things get much more complicated, and we now must care about
what type of characters are in the content of the literal. Options and
minimization features generally add a lot of work for parser writers.
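The 'scan-literal' approach can be sketched in a few lines of Python, shown next to one possible handling of optional quotes for comparison. The function names and the `UNQUOTED_CHARS` set are assumptions made up for this illustration; in particular, the set of characters allowed in an unquoted value is a guess, and deciding what belongs in it is exactly the extra question the quoted-only rule avoids.

```python
import string

def scan_quoted_value(text, pos):
    """Scan a quoted attribute value that starts at text[pos].

    Returns (value, index just past the closing quote).
    """
    if pos >= len(text) or text[pos] not in ('"', "'"):
        raise ValueError(f"expected a quote at offset {pos}")
    quote = text[pos]
    end = text.find(quote, pos + 1)  # scan to the next matching quote
    if end == -1:
        raise ValueError("unterminated attribute value")
    return text[pos + 1:end], end + 1

# Assumption: a hypothetical set of characters permitted in unquoted values.
UNQUOTED_CHARS = set(string.ascii_letters + string.digits + ".-_:")

def scan_value(text, pos):
    """Variant with optional quotes: the content of the value now matters."""
    if pos < len(text) and text[pos] in ('"', "'"):
        return scan_quoted_value(text, pos)
    end = pos
    while end < len(text) and text[end] in UNQUOTED_CHARS:
        end += 1
    if end == pos:
        raise ValueError(f"expected an attribute value at offset {pos}")
    return text[pos:end], end

if __name__ == "__main__":
    print(scan_quoted_value('a="x y">', 2))
    print(scan_value("a=foo>", 2))
```

In this sketch the added code is indeed local to one routine, but it also introduces a character class that every parser has to agree on.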
[JS] I think the complexity this adds for the parser writers is negligible; it's a very local thing, typically confined to a single method/routine.
If having the possibility of omitting the quotes would benefit users, perhaps by making it more SGML compatible, I definitely think one should allow this. I've already seen documents on the web stated as being XML documents without the quotes. If some parsers allow it (I don't know!), then the other parsers would seem unnecessarily "stubborn" from a user's perspective.
> [JS] 3. Add a paragraph to the XML standard document explaining why character references should be resolved before storing the string as the value of the entity.
[MA] I believe we would lose an enormous amount of expressive power and
impose unnecessary restrictions.
[JS] This may very well be true. I'm not an SGML expert.
I'd love to see an example of this. I think a good example of this would make XML parser writers much more motivated when implementing it! :-)
[MA] Recursive entity resolution is not programmatically
that much extra work
[JS] Perhaps not the resolution itself.
But making it possible to give the user good error messages (and displaying the location(s) where the error takes place) is, I assume, quite a lot of work. Perhaps not so much in the direct coding, but in coming up with the necessary architecture/design.
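As a rough illustration of both points, the Python sketch below expands entity references recursively and carries along the chain of entity names being expanded, so that an error can at least report how it was reached. Everything here is hypothetical: entities are a plain dict, references use a simplified `&name;` syntax, and `expand` and `EntityError` are names invented for this example. It reports the reference chain only; mapping an error back to line and column positions in the original document would need additional bookkeeping.

```python
import re

# Simplified reference syntax: &name;
ENTITY_REF = re.compile(r"&([A-Za-z][\w.-]*);")

class EntityError(Exception):
    """Raised for undefined or recursive entity references."""

def expand(text, entities, chain=()):
    """Recursively replace &name; references using the `entities` dict.

    `chain` holds the names of the entities currently being expanded,
    so an error message can show the path that led to the problem.
    """
    def replace(m):
        name = m.group(1)
        path = " -> ".join(chain + (name,))
        if name in chain:
            raise EntityError(f"recursive entity reference: {path}")
        if name not in entities:
            raise EntityError(f"undefined entity '{name}' (via {path})")
        return expand(entities[name], entities, chain + (name,))
    return ENTITY_REF.sub(replace, text)

if __name__ == "__main__":
    print(expand("&a;!", {"a": "x&b;", "b": "y"}))
    try:
        expand("&a;", {"a": "&b;", "b": "&a;"})
    except EntityError as err:
        print(err)
```

The resolution itself is a handful of lines; the design effort goes into deciding what location information to keep and how to expose it.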
I also think this simpler model would make for simpler APIs for tool builders, at least for tools needing to have info about where entities were invoked in the original document. (E.g., tools which update/synchronize documents need this info in order not to "flatten it out".)
[MA] and allows for various important SGML facilities. And
remember that one of the explicit goals for XML is SGML compatibility.
[JS] Yes. But it would be very sad if this made XML substantially more complex (without any other benefit than compatibility), both for users and tool vendors.
Cheers,
Jarle Stabell