Here's what a language with no semantics looks like:
a -> b"q"c
a -> ca
b -> "d"
c -> "e"
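To make the point concrete, here is a minimal sketch of a parser for the grammar above, in Python. The names parse_a and parse and the nested-tuple tree shape are made up for illustration. All it can ever return is a tree of nonterminal labels and literals; nothing in the grammar says what to do with that tree.

```python
def parse_a(s, i=0):
    """Parse nonterminal `a` starting at index i.

    Returns (tree, next_index) or raises ValueError.
    The tree is a nested tuple: (label, child, child, ...).
    """
    if s.startswith("d", i):
        # a -> b "q" c, where b -> "d" and c -> "e"
        if s[i + 1:i + 3] != "qe":
            raise ValueError(f"expected 'qe' at position {i + 1}")
        return ("a", ("b", "d"), "q", ("c", "e")), i + 3
    if s.startswith("e", i):
        # a -> c a, where c -> "e"
        subtree, j = parse_a(s, i + 1)
        return ("a", ("c", "e"), subtree), j
    raise ValueError(f"unexpected input at position {i}")


def parse(s):
    """Parse a whole string as `a`, rejecting trailing input."""
    tree, j = parse_a(s)
    if j != len(s):
        raise ValueError(f"trailing input at position {j}")
    return tree


if __name__ == "__main__":
    print(parse("edqe"))
```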
Even given a parse tree, you can't do anything interesting with this
language, because it has no semantics. But you can do lots of interesting
stuff with a "raw" XML parse tree, even if you do not know its DTD. For
instance you can build a DOM from it, apply a stylesheet to it, check its
validity, check its conformance to an XML-Data schema and so forth.
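As a rough illustration of the first item, the sketch below builds a DOM from a document that has no DTD at all, using Python's standard xml.dom.minidom module. The memo document and its element names are invented for the example. The parser knows nothing about what a "memo" is, yet the notions of element type, attribute and character data are already there to work with.

```python
from xml.dom.minidom import parseString

# A made-up document with no DOCTYPE declaration and no DTD.
doc_text = (
    "<memo priority='high'>"
    "<to>Tim</to><to>Jon</to>"
    "<body>Hello</body>"
    "</memo>"
)

dom = parseString(doc_text)
root = dom.documentElement

# Element types, attributes and text come straight out of the raw parse.
root_type = root.tagName
priority = root.getAttribute("priority")
recipients = [node.firstChild.data for node in root.getElementsByTagName("to")]

if __name__ == "__main__":
    print(root_type, priority, recipients)
```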
I think that what Tim and Jon mean to avoid is a battle royale over how
elements, attributes etc. fit into various ontological philosophies. I
don't think that that avoidance is useful, but I understand the
motivation. Nevertheless, I feel it is not accurate to claim that XML is
semantic-free. There are tons of semantics, both subtle ("element type")
and explicit ("initiate this network transaction in response to this
markup").
Consider:
"validity constraint: A rule which applies to all valid XML documents.
Violations of validity constraints are errors; they must, at user option,
be reported by validating XML processors"
How can we tell a processor that it must trigger a *side effect* for a
well-formed (but not valid) document, and then claim that we are not
describing semantics? There are other things like this in the XML spec:
"When an XML processor recognizes a reference to a parsed entity, in order
to validate the document, the processor must include its replacement text"
-- now we're initiating network transactions. That's a semantic.
"If there are no external markup declarations, the standalone document
declaration has no meaning." -- which implies that it otherwise does have
meaning. I don't believe that there is a distinction between "meaning" and
"semantics."
"If a non-validating parser does not include the replacement text, it must
inform the application that it recognized, but did not read, the entity."
-- a constraint on the interface between processors and applications.
That's a semantic.
Paul Prescod - http://itrc.uwaterloo.ca/~papresco
"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html