Re: XML processing experiments

Jarle Stabell (jarle.stabell@dokpro.uio.no)
Fri, 07 Nov 1997 18:43:37 +0100


David McKelvie wrote:
>>> "<!ENTITY name 'richard'> ... <p>my name is &name;</p>"
>
>It's worth pointing out that Richard wants ALL of the PCDATA of the
><p> element to be returned as one string of characters "my name is
>Richard", rather than as two strings "my name is " and "Richard".

Yes, but this requires a copy (of at least the first string) and a
concatenation.

Some applications may be more interested in the speedup which may result
from not doing this copying/concatenation, and will happily accept the
small increase in complexity of handling the separate strings themselves.

I'm playing with a design involving two pluggable "ESIS-handlers". The
first is "low-level": the GIs, attribute names, attribute values, comments
etc. it receives point directly into the source (typically via a file
mapping or an in-memory buffer).
The "low-level" ESIS-handler may copy the data into "real" strings,
concatenate consecutive PCDATA sections, build the tree, do validation
etc., and pass the events on to an optional "higher-level" ESIS-handler.
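
A minimal sketch of this layering, to make the idea concrete. Everything
here is an assumption for illustration, not the actual design: the event
shape (kind, start, end), the class name LowLevelHandler, and the event
kinds are all hypothetical. The low-level events carry only offsets into
the source buffer; the handler copies and concatenates consecutive PCDATA
spans before passing "real" strings upward.

```python
from typing import Callable, List, Optional, Tuple


class LowLevelHandler:
    """Hypothetical low-level handler: receives (kind, start, end) spans into
    `source`, merges consecutive PCDATA spans, and forwards real strings."""

    def __init__(self, source: bytes,
                 higher: Optional[Callable[[str, str], None]] = None):
        self.source = source
        self.higher = higher  # optional higher-level handler
        self._pending: List[Tuple[int, int]] = []  # PCDATA spans not yet copied

    def event(self, kind: str, start: int, end: int) -> None:
        if kind == "pcdata":
            # No copy yet: just remember where the text lives in the source.
            self._pending.append((start, end))
            return
        self._flush()
        self._emit(kind, self.source[start:end].decode("utf-8"))

    def close(self) -> None:
        self._flush()

    def _flush(self) -> None:
        # The copy and the concatenation both happen here, and only here.
        if self._pending:
            text = b"".join(self.source[s:e] for s, e in self._pending)
            self._pending.clear()
            self._emit("pcdata", text.decode("utf-8"))

    def _emit(self, kind: str, text: str) -> None:
        if self.higher is not None:
            self.higher(kind, text)


# Demo with the example from the quoted message. The offsets are found by
# searching the buffer; a real parser would produce them while scanning.
source = b"<!ENTITY name 'richard'> ... <p>my name is &name;</p>"
entity = source.index(b"richard")      # replacement text inside the declaration
text = source.index(b"my name is ")
open_gi = source.index(b"<p>") + 1
close_gi = source.index(b"</p>") + 2

received: List[Tuple[str, str]] = []
handler = LowLevelHandler(source, lambda kind, s: received.append((kind, s)))
handler.event("start_tag", open_gi, open_gi + 1)
handler.event("pcdata", text, text + len(b"my name is "))
handler.event("pcdata", entity, entity + len(b"richard"))
handler.event("end_tag", close_gi, close_gi + 1)
handler.close()
```

In this sketch the higher-level handler sees a single PCDATA string for the
<p> element, while an application that plugs in at the low level gets the
two spans and never pays for the copy.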

I think/hope the layer which triggers the low-level events won't be very
different from Mr Clark's "quick and dirty" parser.

(Not sure yet whether the low-level handler should just receive events, or
whether it should query for the next event/token.)
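
To illustrate the two options, here is a rough sketch under stated
assumptions: tokens() and push() are hypothetical names, and the scanner is
a toy that only distinguishes tags from text (no entities, attributes,
comments or error handling). It shows one possible relationship between the
styles: if the bottom layer is written in the "query for the next token"
style, a "just receive events" interface is a small loop on top of it.

```python
from typing import Callable, Iterator, List, Tuple

Span = Tuple[str, int, int]  # hypothetical (kind, start, end) offsets into the source


def tokens(source: bytes) -> Iterator[Span]:
    """Pull style: the caller asks for the next token when it wants one."""
    pos = 0
    while pos < len(source):
        if source[pos:pos + 1] == b"<":
            close = source.index(b">", pos)  # raises ValueError on an unclosed tag
            yield ("tag", pos + 1, close)
            pos = close + 1
        else:
            nxt = source.find(b"<", pos)
            if nxt == -1:
                nxt = len(source)
            yield ("pcdata", pos, nxt)
            pos = nxt


def push(source: bytes, handler: Callable[[str, int, int], None]) -> None:
    """Push style: the scanner drives, and the handler just receives events."""
    for kind, start, end in tokens(source):
        handler(kind, start, end)


doc = b"<p>my name is</p>"

# Pull: the consumer controls the loop.
pulled: List[Span] = list(tokens(doc))

# Push: the consumer only supplies a callback.
pushed: List[Span] = []
push(doc, lambda kind, start, end: pushed.append((kind, start, end)))
```

Going the other way (building a pull interface on top of a callback-only
scanner) generally needs buffering or a separate thread of control, which
may be one argument for making the lowest layer pull-based.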

Cheers,
Jarle