 > >  attribute(XmlParser, String, String, boolean)
 >
 > It seems completely wrong to have an attribute event separate from
 > start-element events.
I have worried about this myself.  My design goal with Ælfred has been
to limit myself to two class files: one for the parser itself, and one
for the interface for the callbacks -- hence the separate event for
attributes.  This decision has forced some pretty severely hacked-up
internal code accompanied by very careful documentation.
I could send a hashtable of attribute names and values with the
startElement() callback, and let users look up types (etc.) with my
query methods, but I would have to lose a bit on two counts:
1) Allocating a new hashtable for every start tag will slow down the
   parser a fair bit.
2) I'd have no way to show which attributes were specified and which
   were defaulted (see below).
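
A minimal sketch of that trade-off, for illustration only.  The class
and member names are hypothetical (not Ælfred's), and it assumes the
attribute events for an element arrive before its start-element event:

```java
import java.util.Hashtable;

// Hypothetical sketch; none of these names come from Ælfred itself.
public class AttributeTableSketch {
    // Attributes collected for the element that is about to start.
    private Hashtable<String, String> pending = new Hashtable<String, String>();

    // What the last startElement() delivered, kept so the sketch can be inspected.
    public String lastElement;
    public Hashtable<String, String> lastAttributes;

    // Separate attribute event (assumed to precede startElement).
    public void attribute(String name, String value, boolean isSpecified) {
        // Cost 2: a name -> value table has no room for isSpecified.
        pending.put(name, value);
    }

    // Start-element event that hands over the collected table.
    public void startElement(String name) {
        lastElement = name;
        lastAttributes = pending;
        // Cost 1: a fresh table is allocated for every start tag.
        pending = new Hashtable<String, String>();
    }
}
```

Both costs listed above show up directly: one allocation per start
tag, and the specified/defaulted flag has nowhere to go.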
 > What's the boolean?  I don't think the application author should
 > have to deal with anything but the name and value of attributes.
The boolean tells whether the attribute was specified or defaulted.  I
include this to allow people to do useful XML-to-XML transformations.
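
As a rough illustration of why the flag matters for XML-to-XML work,
here is a sketch of a writer that copies only the attributes the
document author actually typed, so DTD defaults are not baked into the
output.  The class and method names are hypothetical, the parser
argument is omitted, and attribute events are assumed to arrive before
the start tag is written:

```java
// Hypothetical sketch; not part of Ælfred.
public class SpecifiedOnlyWriter {
    private final StringBuilder attrs = new StringBuilder();

    // Called once per attribute; defaulted attributes are skipped.
    public void attribute(String name, String value, boolean isSpecified) {
        if (isSpecified) {
            attrs.append(' ').append(name)
                 .append("=\"").append(escape(value)).append('"');
        }
    }

    // Builds the start tag from the attributes seen so far, then resets.
    public String startTag(String name) {
        String tag = "<" + name + attrs + ">";
        attrs.setLength(0);
        return tag;
    }

    // Escapes the characters that cannot appear raw in a quoted value.
    private static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace("\"", "&quot;");
    }
}
```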
 > >  data(XmlParser, String)
 >
 > I feel that the 2nd argument should not be a String.  It is a recipe
 > for disastrous inefficiency if the processor has to cook up a
 > java.lang.String object for every little chunk of text.
The overhead isn't that bad with Ælfred because I coalesce my data
into the largest chunks possible before allocating the String.  I
think that returning a char[] array would be confusing for users, and
would lead to many bugs in their code as they ignored our warnings not
to rely on the value in the char[] array outlasting the callback.
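
The kind of bug I have in mind can be sketched as follows.  This is
not code from Ælfred or Lark; the parser stand-in, the handler
interface, and the buffer size are all made up for the example:

```java
// Hypothetical sketch of a parser that reuses one buffer across callbacks.
public class BufferReuseSketch {
    interface CharHandler {
        void charData(char[] ch, int length);
    }

    // Stand-in for the parser's internal buffer (texts must fit in 16 chars).
    private final char[] buffer = new char[16];

    // Copies the text into the shared buffer and reports it to the handler.
    void emit(String text, CharHandler handler) {
        text.getChars(0, text.length(), buffer, 0);
        handler.charData(buffer, text.length());
    }

    // Returns { text seen through a kept reference, text copied in time }.
    public static String[] demo() {
        BufferReuseSketch parser = new BufferReuseSketch();
        final char[][] kept = new char[1][];
        final String[] copied = new String[1];
        CharHandler handler = new CharHandler() {
            private boolean first = true;
            public void charData(char[] ch, int length) {
                if (first) {
                    kept[0] = ch;                           // bug: keeps the array
                    copied[0] = new String(ch, 0, length);  // safe: copies now
                    first = false;
                }
            }
        };
        parser.emit("hello", handler);
        parser.emit("WORLD", handler);  // overwrites the shared buffer
        return new String[] { new String(kept[0], 0, 5), copied[0] };
    }
}
```

After the second callback the kept reference reads "WORLD" while the
copied String still reads "hello".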
 > Lark uses two
 > arguments, a char[] array and a character count; the app can
 > make a String if it needs to.  If you find this awkward, create
 > a new data type called Text so that if you need a String you
 > can make it with lazy-evaluation in Text.toString(), but if you
 > don't need it you don't build it.
Again, I'm reluctant to create new classes beyond XmlParser and
XmlProcessor.
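
For reference, the lazy-evaluation idea quoted above could look
roughly like this.  It is only a sketch: the name LazyText stands in
for the suggested Text class, and the members are my guesses rather
than anything in Lark:

```java
// Hypothetical sketch of a text chunk that builds its String on demand.
public final class LazyText {
    private final char[] chars;  // shared with the parser, not copied
    private final int length;
    private String cached;       // created by the first toString() call

    public LazyText(char[] chars, int length) {
        this.chars = chars;
        this.length = length;
    }

    public int length() {
        return length;
    }

    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chars[index];
    }

    // Allocates the String only if the application actually asks for it.
    @Override
    public String toString() {
        if (cached == null) {
            cached = new String(chars, 0, length);
        }
        return cached;
    }
}
```

Because the array is shared rather than copied, toString() would still
have to be called before the callback returns.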
 > Also, it shouldn't be named "data" - it should be named
 > characterData or charData or text or some such term that can
 > be mapped directly to the spec.
Agreed.  I will not change Ælfred now, but I think that this is a good
idea.
 > >  resolveEntity(XmlParser, String, String, URL)
 >
 > I don't think entities have any place in the first cut of this
 > interface.  The processor exists to make these problems go away.
Normally, you should just return the URL argument; however, this
callback gives users a chance to do public-identifier resolution, URL
substitution, etc., and to return a different URL if desired.  For
example, if we had a DTD at
  http://www.microstar.com/XML/msldoc.dtd
and you had a local copy, you could substitute a local URL on your own
computer.  Likewise, you could do a catalogue lookup on the public
identifier "-//microstar//DTD Microstar Sample Document//EN" and
choose a different system identifier than the default supplied in the
document.
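
A small sketch of that kind of substitution, under stated assumptions:
the class, its methods, and the local file path are hypothetical, and
the argument list is simplified to a public identifier plus the default
URL rather than the full callback signature:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; class and method names are not from Ælfred.
public class LocalCatalogSketch {
    private final Map<String, URL> byPublicId = new HashMap<String, URL>();
    private final Map<String, URL> bySystemId = new HashMap<String, URL>();

    // Builds a URL, reporting a malformed string as an unchecked error.
    public static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }

    public void mapPublicId(String publicId, String replacement) {
        byPublicId.put(publicId, url(replacement));
    }

    public void mapSystemId(String systemId, String replacement) {
        bySystemId.put(systemId, url(replacement));
    }

    // Public-identifier lookup first, then URL substitution, else the default.
    public URL resolveEntity(String publicId, URL defaultUrl) {
        URL hit = (publicId == null) ? null : byPublicId.get(publicId);
        if (hit == null) {
            hit = bySystemId.get(defaultUrl.toExternalForm());
        }
        return (hit != null) ? hit : defaultUrl;
    }
}
```

With no mappings registered this simply returns the URL it was given,
which is the normal case described above.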
That said, I agree that this probably doesn't belong in the common
event API.
 > Generalities:
 > Lark has a thing where if any callback returns 'true', the
 > parser drops out of its loop... which is awfully useful and easy
 > I think.  Lark will also re-enter, but this need not be a requirement.
Awfully easy with a DFA-driven parser, but trickier with a
recursive-descent parser like Ælfred.  I'd probably have to throw an
exception, and could not allow any kind of re-entry.
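
To make the difficulty concrete, here is a toy recursive-descent
parser over nested parentheses (not XML, and not Ælfred's code; every
name in it is invented for the example).  A callback returning true is
turned into an exception that unwinds the recursion:

```java
// Hypothetical sketch: stopping a recursive-descent parser from a callback.
public class StopParsingSketch {
    public interface Handler {
        // Return true to ask the parser to stop.
        boolean startGroup(int depth);
    }

    // Private signal used to unwind the recursion in one step.
    private static class StopParsing extends RuntimeException {
    }

    private final String input;
    private final Handler handler;
    private int pos = 0;

    public StopParsingSketch(String input, Handler handler) {
        this.input = input;
        this.handler = handler;
    }

    // Returns true if all input was parsed, false if the handler stopped it.
    public boolean parse() {
        try {
            while (pos < input.length()) {
                parseGroup(0);
            }
            return true;
        } catch (StopParsing stop) {
            // The call stack that held the parse state is gone by now,
            // so there is nothing left to re-enter.
            return false;
        }
    }

    // group ::= '(' group* ')'
    private void parseGroup(int depth) {
        expect('(');
        if (handler.startGroup(depth)) {
            throw new StopParsing();
        }
        while (pos < input.length() && input.charAt(pos) == '(') {
            parseGroup(depth + 1);
        }
        expect(')');
    }

    private void expect(char c) {
        if (pos >= input.length() || input.charAt(pos) != c) {
            throw new IllegalArgumentException("expected '" + c + "' at " + pos);
        }
        pos++;
    }
}
```

A DFA-driven parser keeps its state in a few variables and can simply
return from its loop; here the state lives on the call stack, which
the exception discards.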
 > Also, for application programmers, especially dealing with smallish
 > objects, a tree interface is very natural.  I've written both
 > event-stream and tree apps using Lark, and the trees are a lot
 > easier to use for anything even moderately complex.  So the API
 > should have Element, Attribute, and Text classes.
Perhaps -- I may have to give in and allow Ælfred to use more than one
class file; alternatively, these could be an optional extra, along
with the SAX-J layer.
 > And it shouldn't (sorry Peter) be called YAXPAPI - how about SAX, Simple
 > API for XML?  Maybe SAX-J for the Java bindings. -Tim
How about RUSTY?
All the best,
David
-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/