I agree that defining what is and is not well-formed and valid XML ought to
be a readily achievable goal, and it is a little surprising to find an area
where the spec is ambiguous on the matter. Hence my suggestion for a formal
analysis to discover whether there are other unsuspected problems.

I also agree that defining what a conformant XML processor should do with
that XML (not to mention what it should do with erroneous XML) is
considerably harder, though I think the problem becomes tractable if the
behaviour is defined in terms of a concrete API such as SAX or DOM.
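
As a rough illustration of what "behaviour defined in terms of a concrete
API" could mean, here is a minimal sketch that uses Python's xml.sax module
as a stand-in for SAX. The names EventRecorder and observe are made up for
this example, and nothing in it is taken from the spec; the only point is
that a processor's behaviour becomes an observable sequence of callbacks
plus an error report, which is something a specification could pin down.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Hypothetical helper: records the SAX events a parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Ignore whitespace-only text to keep the trace short.
        if content.strip():
            self.events.append(("text", content))


def observe(document):
    """Return (events, error) for a document given as bytes.

    error is None if parsing completed, otherwise the parser's message.
    events holds whatever was reported before parsing stopped.
    """
    recorder = EventRecorder()
    try:
        xml.sax.parseString(document, recorder)
        return recorder.events, None
    except xml.sax.SAXParseException as exc:
        return recorder.events, exc.getMessage()
```

Calling observe(b"<a><b>hi</b></a>") should report no error, whereas
observe(b"<a><b>hi</a>") returns the events delivered before the mismatched
tag together with the parser's error message. Which events a conformant
processor must deliver in the second case is the kind of question that
would need an explicit answer.
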
I agree with those who have pointed out that formalisms like Z are not a
good vehicle for communicating a standard to a wide audience. In my own
experience, however, the kind of thinking required to produce a formal
specification in Z is invaluable when trying to produce an unambiguous one
in clear English. I don't believe that precision and readability are
incompatible goals.

There is information about Z, by the way, on
http://www.non.com/news.answers/z-faq.html
Mike Kay