Re: XML API specification

Peter Murray-Rust (Peter@ursus.demon.co.uk)
Thu, 27 Feb 1997 14:54:40 GMT

Next message: Peter Murray-Rust: "Multiple mails"
Previous message: Peter Murray-Rust: "Re: XML API specification"
Maybe in reply to: Tim Bray: "XML API specification"
Next in thread: Gavin Nicol: "Re: XML API specification"

In message <199702271419.JAA24115@nathaniel.ebt> gtn@ebt.com (Gavin Nicol) writes:
> I think that for the *parser*, we should define an event-handling
> interface, as it is much simpler to build certain applications
> that way, and because you can build a tree from a stream of
> events if you need to.

In CoST Joe English supported both eventStreams and trees (I'm sure Joe will
have some wisdom on this one). I started off using the event mechanism
and switched to a tree-based one, but I suspect that this was down to the
nature of the application.
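
To make Gavin's point concrete (a tree can always be built on top of an
event stream), here is a minimal sketch in Python using the standard
xml.parsers.expat event interface. The Node class, the build_tree function
and the MOL/ATOM element names are assumptions made up for illustration;
they are not taken from CoST, Lark or CML.

```python
import xml.parsers.expat


class Node:
    """Hypothetical tree node: an element name, attributes and children."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = attrs
        self.children = []  # child Node objects or text strings


def build_tree(xml_text):
    """Build a tree purely from start/end/character-data events."""
    root = Node("#document", {})
    stack = [root]  # stack[-1] is the element currently open

    def start(name, attrs):
        node = Node(name, attrs)
        stack[-1].children.append(node)
        stack.append(node)

    def end(name):
        stack.pop()

    def chars(data):
        stack[-1].children.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return root.children[0]


tree = build_tree('<MOL><ATOM ID="a1">C</ATOM><ATOM ID="a2">O</ATOM></MOL>')
```

An application that only wants the events can install its own handlers and
skip the stack entirely, which is why the event interface is the more
general of the two.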

My current problem may highlight this. A CML document is highly
tree-structured and contains no mixed content, so that eventStreams don't
contribute much. BUT it also includes chunks of HTML where a tree structure
is quite inappropriate. If I take a Lark-based approach (or my own
parser) the HTML gets rendered into a tree. I am now hacking this
back into an event stream to render the hypertext. Not only does it
take more effort, but I'm sure that holding HTML as a tree has a
memory hit. Ideally, when I'm parsing CML and come to the
tag <XHTML> (sic), which contains <BODY>, I'd like to tell the parser
'stop parsing as a tree and just hold a hypertext string until </XHTML>'.
We *could* do this with a PI, but we would all have to agree on it.
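
A rough sketch of the behaviour I have in mind, written in Python on top of
the standard xml.parsers.expat event interface: build tree nodes as usual,
but on entering <XHTML> stop creating nodes and accumulate a flat string
until the matching end tag. The names parse_hybrid, Node and raw_element are
hypothetical, and the held string is re-serialised from events rather than
copied from the input, so it is an approximation of the original markup.

```python
import xml.parsers.expat
from xml.sax.saxutils import escape, quoteattr


class Node:
    """Hypothetical tree node: an element name, attributes and children."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = attrs
        self.children = []  # child Nodes, or str for text / held markup


def parse_hybrid(xml_text, raw_element="XHTML"):
    """Tree outside raw_element; a single held string inside it."""
    root = Node("#document", {})
    stack = [root]
    raw_parts = []  # pieces of the held string
    raw_depth = 0   # 0 = tree mode, >0 = depth inside raw_element

    def start(name, attrs):
        nonlocal raw_depth
        if raw_depth:
            # Inside the hypertext: emit markup text, create no nodes.
            raw_depth += 1
            attr_text = "".join(
                f" {key}={quoteattr(value)}" for key, value in attrs.items()
            )
            raw_parts.append(f"<{name}{attr_text}>")
            return
        node = Node(name, attrs)
        stack[-1].children.append(node)
        stack.append(node)
        if name == raw_element:
            raw_depth = 1

    def end(name):
        nonlocal raw_depth
        if raw_depth > 1:
            raw_depth -= 1
            raw_parts.append(f"</{name}>")
            return
        if raw_depth == 1:
            # Leaving raw_element: store the held string as its only child.
            stack[-1].children.append("".join(raw_parts))
            raw_parts.clear()
            raw_depth = 0
        stack.pop()

    def chars(data):
        if raw_depth:
            raw_parts.append(escape(data))
        else:
            stack[-1].children.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return root.children[0]


doc = (
    '<CML><MOL ID="m1"/>'
    '<XHTML><BODY><P>See <A HREF="x.html">this</A> note.</P></BODY></XHTML>'
    "</CML>"
)
tree = parse_hybrid(doc)
```

Here the switch is keyed on the element name; a PI or some other agreed
signal could set raw_depth in the same way.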

>
> Some questions that will affect the API are whether one sees empty
> elements as elements containing nothing, or as elements unable to

Yes.
> contain anything, and whether entity/attribute type information needs
> to be passed across the API.

I have been convinced that entity information needs to be preserved and I
assume there are people who are concerned about attribute_type. If
nothing else, this is probably critical for ID/IDREF.
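
To show why the declared type matters for ID/IDREF, here is a small sketch
in Python. It assumes the parser reports attribute declarations to the
application, which xml.parsers.expat does through its AttlistDeclHandler
callback; the function collect_id_links and the MOL/ATOM/BOND vocabulary are
invented for the example. Without the declared types, FROM="a1" is just
another string and the application cannot tell that it is a link.

```python
import xml.parsers.expat


def collect_id_links(xml_text):
    """Return (ids, refs) using attribute types declared in the DTD."""
    attr_types = {}  # (element name, attribute name) -> declared type
    ids = {}         # ID value -> name of the element carrying it
    refs = []        # (referring element name, IDREF value)

    def attlist_decl(elname, attname, type_, default, required):
        # Called once per declared attribute in each ATTLIST.
        attr_types[(elname, attname)] = type_

    def start(name, attrs):
        for attname, value in attrs.items():
            declared = attr_types.get((name, attname))
            if declared == "ID":
                ids[value] = name
            elif declared == "IDREF":
                refs.append((name, value))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = attlist_decl
    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return ids, refs


doc = """<!DOCTYPE MOL [
  <!ATTLIST ATOM ID ID #REQUIRED>
  <!ATTLIST BOND FROM IDREF #REQUIRED TO IDREF #REQUIRED>
]>
<MOL><ATOM ID="a1"/><ATOM ID="a2"/><BOND FROM="a1" TO="a2"/></MOL>"""

ids, refs = collect_id_links(doc)
# Every IDREF should point at a declared ID.
dangling = [value for _, value in refs if value not in ids]
```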
>
> What do people think? How much information must the parser pass
> along?

At least what comes out of sgmls/ESIS, probably with general entities
added. We also need to know the DOCTYPE info.
[...]

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
