SAX, non-XML Documents, and Legal Characters

David Megginson (ak117@freenet.carleton.ca)
Mon, 13 Apr 1998 10:00:41 -0400


While we're on the topic of character streams, here's another
question: should the SAXDocumentHandler.characters() method be allowed
to deliver only XML characters?

At first, the answer "yes" might seem self-evident, but what if someone
decides to build a LaTeX or RTF parser that implements the SAX
interface? Should we require the parser to strip out non-XML
characters before delivering the SAX events, or should we allow SAX to
be a general structured-document interface, and require applications
to strip out non-XML characters when exporting an XML document?
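
To make the second option concrete, here is a rough sketch of the kind of filter an application might run over character data before writing it out as XML. It is only an illustration: the class XmlCharFilter and its methods are hypothetical names and not part of SAX, and the (ch, start, length) arguments are assumed to mirror what a characters() callback would receive. The ranges are those of the Char production in the XML 1.0 specification.

```java
// Hypothetical helper (not part of SAX): keeps only characters that
// match the Char production of XML 1.0.
final class XmlCharFilter {
    private XmlCharFilter() {}

    /** Returns true if the code point is a legal XML 1.0 character. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Copies the given slice of a character buffer, silently dropping
     * anything that is not a legal XML character. Unpaired surrogates
     * are dropped as well, since they fall outside the legal ranges.
     */
    static String strip(char[] ch, int start, int length) {
        StringBuilder out = new StringBuilder(length);
        int end = start + length;
        int i = start;
        while (i < end) {
            // Read a whole code point so surrogate pairs stay together.
            int cp = Character.codePointAt(ch, i, end);
            if (isXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

An exporter would call XmlCharFilter.strip(ch, start, length) on each chunk of text it receives and write the result. Dropping characters silently is just one possible policy; an application could equally well substitute a placeholder or report an error.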

The question is, of course, moot for XML parsers, since they will have
to report a fatal error anyway if they find non-XML characters. It
would be interesting, though, to build an RTF parser with a SAX driver
and then hook it up to Don Park's SAXDOM.

Any thoughts?

All the best,

David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)