Re: Announcement: SAX 1998-01-12 Draft

Tyler Baker (tyler@infinet.com)
Thu, 02 Oct 1997 00:56:06 -0400


David Megginson wrote:

> I am happy to announce the first draft of SAX, the Simple API for XML,
> together with a Java reference implementation and drivers for the
> major Java-based XML parsers.
>
> SAX is a simple, common, event-based API for XML parsers written in
> object-oriented languages like Java, C++, or Perl5 (the reference
> implementation is in Java). SAX is similar in philosophy to
> JavaSoft's JDBC -- it allows you to write an application once, then
> plug in any XML parser that has a SAX driver, just as the JDBC allows
> you to plug in any SQL database that has a JDBC driver. The SAX API
> was developed collaboratively during a month of discussion on the
> XML-DEV mailing list.
>
> As an event-based interface, SAX is complementary to the proposed
> (tree-based) Document Object Model interface; in fact, it should be
> possible to implement a basic DOM interface on top of SAX, or a basic
> SAX interface on top of DOM. Event-based interfaces provide very
> simple, low-level access to parsing events, without straining system
> resources.
>
> For SAX documentation, a draft spec, a reference implementation of the
> SAX interfaces in Java, SAX front-end drivers for the major Java XML
> parsers (NXP, Lark, MSXML, and Ælfred), and a sample SAX application,
> please see
>
> http://www.microstar.com/XML/SAX/
>
> I would like people to play with this for a month or two, during which
> time I'll collect suggestions and bug reports; after that, with luck,
> we can come up with a final draft. I may continue to work on the SAX
> drivers during that time, but I want to leave the rest alone for a
> while.
>
> All the best,
>
> David

In about an hour I did my best to map the initial SAX draft to CORBA 2.0 IDL,
since past discussion of an IDL form of SAX on this mailing list seemed to
generate interest in the idea. The mapping is not exact, and it has some
serious design flaws for distributed computing: because it is event-based, it
may generate a lot of remote invocations via the callbacks to the client
application when the server is the XMLProcessor.
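
To give a rough feel for the callback volume, here is a minimal sketch that counts handler calls for a tiny document. It assumes the org.xml.sax API that ships with current JDKs, whose method signatures differ from the draft discussed here; CallbackCounter and countCalls are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: counts the handler callbacks one document produces.
// Uses the JDK's bundled SAX API, not the draft interfaces from this thread.
class CallbackCounter extends DefaultHandler {
    int calls = 0;

    @Override public void startDocument() { calls++; }
    @Override public void endDocument() { calls++; }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        calls++;
    }

    @Override
    public void endElement(String uri, String local, String qName) { calls++; }

    // The parser may deliver one run of text in several chunks,
    // so this can fire more than once per text node.
    @Override
    public void characters(char[] ch, int start, int length) { calls++; }

    // Parses the given XML text and returns the number of callbacks seen.
    static int countCalls(String xml) throws Exception {
        CallbackCounter counter = new CallbackCounter();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), counter);
        return counter.calls;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<list><item>one</item><item>two</item></list>";
        System.out.println("callbacks: " + countCalls(xml));
    }
}
```

With the handler on the far side of an ORB, each counted call would be a separate remote invocation, so the total grows with the number of elements and text runs in the document.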

Returning some sort of tree-based "struct" representation of the XML would
probably be a much more scalable solution for large documents. Nonetheless,
this is a simple attempt at mapping SAX-J to IDL. Any comments would be
greatly appreciated.
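
As a sketch of the tree-based alternative, the handler below folds parse events into a plain value tree on the parsing side, so that the whole result could in principle be handed back in a single call. It is only an illustration: it uses the org.xml.sax API bundled with current JDKs rather than the draft interfaces, and TreeBuilder, Node, and build are hypothetical names. Node stands in for whatever IDL struct one might define.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that turns parse events into a plain value tree.
class TreeBuilder extends DefaultHandler {
    // Hypothetical by-value node, standing in for an IDL struct.
    static class Node {
        final String name;
        final StringBuilder text = new StringBuilder();
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Elements that have been opened but not yet closed, innermost first.
    private final Deque<Node> open = new ArrayDeque<>();
    Node root;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        Node node = new Node(qName);
        if (open.isEmpty()) {
            root = node;
        } else {
            open.peek().children.add(node);
        }
        open.push(node);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) {
            open.peek().text.append(ch, start, length);
        }
    }

    // Parses the given XML text and returns the root of the built tree.
    static Node build(String xml) throws Exception {
        TreeBuilder builder = new TreeBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), builder);
        return builder.root;
    }

    public static void main(String[] args) throws Exception {
        Node root = build("<list><item>one</item><item>two</item></list>");
        for (Node child : root.children) {
            System.out.println(child.name + ": " + child.text);
        }
    }
}
```

The trade-off is the usual one: a single round trip instead of many, at the cost of holding the whole document in memory on both sides.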

Thanx,

Tyler

// This is an initial attempt to map the current SAX-J draft to CORBA 2.0
// IDL. The motivation is that many people may want to do their XML
// processing on a remote server rather than on the client, especially
// if the client is a thin NC or some other computing device.
// Most of the mappings are essentially identical. The only real changes
// are that java.lang.Exception is mapped to an exception called
// XMLException, and that AttributeMap is mapped to a sequence of structs
// called Attributes. The reason is that in CORBA the only way to pass
// things by value is with structs, and I think it is better to have this
// information returned in a struct than in a CORBA object. I used
// Visigenic's idl2java compiler to check that the IDL is syntactically
// correct. You can also use Sun's IDL2Java compiler, which will generate
// identical Java interfaces but different stub, helper, and holder
// classes.

module org {
module xml {
module sax {
typedef sequence <wchar> Chars;

exception XMLException {};

struct Attribute {
wstring name;
wstring value;

boolean entity;
boolean notation;
boolean id;
boolean idRef;

wstring entityPublicID;
wstring entitySystemID;
wstring notationNameID;
wstring notationPublicID;
wstring notationSystemID;
};
typedef sequence <Attribute> Attributes;

interface EntityHandler {
wstring resolveEntity(in wstring ename, in wstring publicID, in wstring systemID)
raises(XMLException);

void changeEntity(in wstring systemID) raises(XMLException);
};

interface DocumentHandler {
void startDocument() raises(XMLException);

void endDocument() raises(XMLException);

void docType(in wstring name, in wstring publicID, in wstring systemID)
raises(XMLException);

void startElement(in wstring name, in Attributes attributes)
raises(XMLException);

void endElement(in wstring name) raises(XMLException);

// It would be more straightforward if "char[] ch" were instead "String s".
void characters(in Chars ch, in long start, in long length)
raises(XMLException);

// It would be more straightforward if "char[] ch" were instead "String s".
void ignorable(in Chars ch, in long start, in long length)
raises(XMLException);

void processingInstruction(in wstring name, in wstring remainder)
raises(XMLException);
};

interface ErrorHandler {
void warning(in wstring message, in wstring systemID, in long line, in long column)
raises(XMLException);
void fatal(in wstring message, in wstring systemID, in long line, in long column)
raises(XMLException);
};

interface Parser {
void setEntityHandler(in EntityHandler handler);
void setDocumentHandler(in DocumentHandler handler);
void setErrorHandler(in ErrorHandler handler);

void parse(in wstring publicID, in wstring systemID)
raises(XMLException);
};
};
};
};
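
For anyone who wants to see roughly how client code might consume these callbacks from Java, here is a minimal sketch. The types are hand-written stand-ins that mirror a small subset of the IDL above; they are not the output of idl2java, and SaxIdlSketch, Recorder, and demo are hypothetical names. No ORB is involved: demo() fires the events by hand in place of a remote parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written stand-ins for a subset of the IDL types; NOT idl2java output.
class SaxIdlSketch {
    static class XMLException extends Exception {}

    // Mirrors only the name and value members of the IDL Attribute struct.
    static class Attribute {
        final String name;
        final String value;
        Attribute(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    // Mirrors three of the operations on the IDL DocumentHandler interface.
    interface DocumentHandler {
        void startElement(String name, Attribute[] attributes) throws XMLException;
        void characters(char[] ch, int start, int length) throws XMLException;
        void endElement(String name) throws XMLException;
    }

    // Records each event as one line of text.
    static class Recorder implements DocumentHandler {
        final List<String> log = new ArrayList<>();

        @Override
        public void startElement(String name, Attribute[] attributes) {
            StringBuilder line = new StringBuilder("start ").append(name);
            for (Attribute a : attributes) {
                line.append(' ').append(a.name).append('=').append(a.value);
            }
            log.add(line.toString());
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Only the slice [start, start + length) belongs to this event.
            log.add("text " + new String(ch, start, length));
        }

        @Override
        public void endElement(String name) {
            log.add("end " + name);
        }
    }

    // Stand-in for a parser: fires the events for <p class="x">hi</p> by hand.
    static List<String> demo() throws XMLException {
        Recorder recorder = new Recorder();
        char[] buffer = "..hi..".toCharArray();
        recorder.startElement("p", new Attribute[] { new Attribute("class", "x") });
        recorder.characters(buffer, 2, 2);
        recorder.endElement("p");
        return recorder.log;
    }

    public static void main(String[] args) throws XMLException {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

The characters() case shows why the comment in the IDL prefers a string: with a buffer plus offsets, every receiver has to slice out its own portion, and over the wire the whole sequence would be marshalled even if only part of it is used.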