Jarle Stabell wrote:
> * The ability of using non-XML-aware SGML tools on XML documents.
>
> How long will this benefit be of any substantial value?
David Megginson writes:
<<<<
That's hard to say. There is an enormous number of SGML document
systems now in place, and (as we know now with the Y2K pseudo-crisis)
companies are _very_ reluctant to change their software once they've
installed a new system, especially if the system is the result of an
expensive and difficult project.
>>>>
[JS] Then I'd say they should stick with SGML: they won't be bothered
by XML not being SGML-compatible, and they have nothing to gain from
switching to XML other than better tools. (Their SGML documents are
likely not well-formed XML documents in the first place.)
> [JS] One of the benefits may be that parsing XML documents will run
> noticeably faster with an XML-specific parser than with a general
> SGML parser.
<<<<
I don't know enough about automata theory to know if this statement is
true (or even verifiable) -- it seems to me, though, that the number
of productions in the grammar shouldn't affect parsing speed, and I do
know that SGML is designed explicitly to avoid backtracking by
requiring no more than one look-ahead token (to everyone's
annoyance).
>>>>
[JS] I don't know the theory well enough myself to tell how grammar
size influences the speed of parsers built with (LA)LR[1] tools, but I
guess the typical XML parser won't be built with such tools at all,
because an XML parser is probably quite simple to build "manually", and
because one generally gets faster parsers this way.
(Hopefully some of the implementors will read this; they could easily
falsify my view.)
I haven't checked all the public XML parsers. I see that NXP uses
JavaCC, which is a parser generator, but not a typical Yacc/Bison sort
of thing.
I would be *very* surprised if someone were able to write a general
SGML parser as fast as the fastest XML parser (in, for instance, 3
years' time).
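To make the "simple to build manually" point concrete, here is a minimal
sketch of a hand-written parser in Python. It is only an illustration
under strong assumptions: it handles a tiny subset (start tags, end tags,
empty-element tags and text) and ignores attributes, entities, comments,
DTDs and everything else a real XML parser must do. The function name
parse and the tuple-based tree are hypothetical choices, not taken from
any existing parser.

```python
def parse(text):
    """Parse a tiny XML-like subset into nested (name, children) tuples."""
    pos = 0
    # The stack holds the currently open elements; "#root" is a sentinel.
    stack = [("#root", [])]
    while pos < len(text):
        if text[pos] == "<":
            end = text.find(">", pos)
            if end == -1:
                raise ValueError("unterminated tag")
            tag = text[pos + 1:end].strip()
            if tag.startswith("/"):
                # End tag: it must match the innermost open element.
                name = tag[1:].strip()
                if len(stack) == 1 or stack[-1][0] != name:
                    raise ValueError(f"mismatched end tag: {name}")
                node = stack.pop()
                stack[-1][1].append(node)
            elif tag.endswith("/"):
                # Empty-element tag such as <c/>.
                stack[-1][1].append((tag[:-1].strip(), []))
            else:
                # Start tag: open a new element.
                stack.append((tag, []))
            pos = end + 1
        else:
            # Character data up to the next tag (whitespace-only runs dropped).
            end = text.find("<", pos)
            if end == -1:
                end = len(text)
            chunk = text[pos:end]
            if chunk.strip():
                stack[-1][1].append(chunk)
            pos = end
    if len(stack) != 1:
        raise ValueError(f"unclosed element: {stack[-1][0]}")
    return stack[0][1]


if __name__ == "__main__":
    print(parse("<a><b>hi</b><c/></a>"))
```

Because every element must be closed explicitly, a single stack and one
pass over the input are enough here; nothing has to be inferred from a
DTD. Whether that translates into a real speed advantage is exactly the
open question above.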
You also state "to everyone's annoyance". This is exactly what I mean:
is SGML compatibility worth so much that we will instead force upon
perhaps millions of users, over the next 10-20 years, syntactic design
"flaws" which are well known to us today?
<<<<
1) Credibility: by tying itself to a well-established international
standard (ISO 8879), XML can win over conservative users in
important areas like financial services and EDI.
>>>>
[JS] Yes. But I'm not old/wise enough to understand why some minor
syntactic "fixes" should scare those users away, as long as XML will be
an international standard with the great ideas of SGML intact.
<<<<
2) Implementation: the XML standard will live and die partly based on
the enthusiasm of early implementors; piggy-backing on SGML gives
it a good, experienced implementor-base right from the start.
>>>>
[JS] Yes. But to speak for one possible implementor (myself), I would
be much more enthusiastic about it if I believed it was as well-designed
as it could be.
I really believe in the "semantic" beauty of SGML: the tree
structure/groves, the separation between document type and instance,
DSSSL/XSL etc., and also the "general" concrete syntax. But I also think
the general user would be better off if XML were *simplified* SGML, not
only a well-defined subset/fragment of it.
Paul Prescod wrote:
<<<<
The language that we use to encode humanity's knowledge should be a true
standard and not merely a "recommendation." That means that it should
be built upon our democratic institutions and not vendor consortiums.
>>>>
[JS] Agree (with your whole mail). But why must we be so "selfish" as
to let humanity struggle with SGML compatibility? (No reply necessary...
:-) )
As current SGML (and HTML) documents typically won't be well-formed XML
documents, I just can't see the big practical win in ensuring that
well-formed XML documents are SGML documents.
What I feel *would* have practical value would be SGML documents being
wf XML documents, and/or HTML documents being wf XML documents, but
neither of these will of course be true.
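The claim that typical HTML is not well-formed XML is easy to check
mechanically. The sketch below uses the XML parser in Python's standard
library; the helper name is_well_formed and the sample strings are made
up for illustration and are not taken from any real document.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# HTML-style markup: omitted end tags and an unquoted attribute value.
html_style = "<body><p>One<br><p>Two<td width=10>x</td></body>"

# The same content with every element closed and the attribute quoted.
xml_style = '<body><p>One<br/></p><p>Two</p><td width="10">x</td></body>'

if __name__ == "__main__":
    print(is_well_formed(html_style))
    print(is_well_formed(xml_style))
```

Tag omission and unquoted attribute values are both legal in many SGML
applications (HTML included), which is why existing documents would
generally need conversion before an XML parser accepts them.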
Cheers,
Jarle
----
"Syntax is arbitrary"
[some French linguist or philosopher whose name I can't remember the
spelling of]