REPOST: XML Java API-an idea

Chris Lloyd (clloyd@gorge.net)
Sat, 21 Jun 1997 13:35:34 -0400


This is a repost because some of the original post was clipped.

I'm just entering this thread, so I don't know what solutions have already
been discussed. There is already an API to draw from in the DSSSL spec, and
a definition of the SGML property set, which gives us a common language to
work from. The problem is that an XML API to a grove should be simple, with
a small interface, and should leverage the object-oriented power and syntax
of Java.

Personally, when working with groves I find some abstractions very useful
in an API. I would rather have an API based on iterators than one based on
a set of navigation function calls. I'm talking about navigating the grove
rather than building it. An iterator API would be extremely simple, well
abstracted, and more in line with C++ and Java programming patterns than
the SDQL API found in DSSSL. The iterators could also adhere to the syntax
of the SGML property set.

Here is an example, although my naming probably does not correspond to the
SGML property set.

// Assuming we have an object provided by the parser that is a grove,
// instantiate an iterator and navigate to the first element that is a
// TITLE tag.

// A Factory is an object that defines which SGML/XML constructs the
// iterator knows how to iterate. It provides the grove iterator with a
// different node iterator for each property node that it knows how to walk.

ForwardGroveIterator XMLIter(OurGrove, XMLPropertySetFactory(),
                             StartNodePropertyHandle);

while (XMLIter++ != XMLIter.end())
{
    // In C++ we would use the dereference operator instead:
    //   XMLBaseProperty Prop = *XMLIter;
    XMLBaseProperty Prop = XMLIter.Object();
    if (Prop.GetClass() == Element.Class) // is this an element?
    {
        // Convert the property from a base class object to its concrete
        // class. Now we have an element object and can call all its
        // member functions.
        Element aElement = Prop;
        if (aElement.GetIdent() == String("TITLE"))
            break;
    }
}

// OK, let's instantiate a new iterator to walk back up to the root of the
// grove. Use the copy constructor to produce a reverse iterator from our
// forward iterator.
ReverseGroveIterator XMLReverseIter(XMLIter);

while (XMLReverseIter++ != XMLReverseIter.end())
{
    // do stuff here
}

The navigation itself is not the same as defined in SDQL, but the property
set could be made to conform to the SGML property set. This might offer a
compromise.
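
For comparison, here is a minimal sketch of the same walk in plain Java. It
is only an illustration under simplifying assumptions: GroveNode,
ForwardGroveIterator, ReverseGroveIterator and GroveWalkDemo are
hypothetical classes written for this example, every node is reduced to a
name plus parent and child links, and the reverse iterator is built from a
node rather than copy-constructed from the forward iterator.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical grove node: a name, a parent link, and ordered children.
class GroveNode {
    final String name;
    final GroveNode parent;
    final List<GroveNode> children = new ArrayList<>();

    GroveNode(String name, GroveNode parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }
}

// Walks a subtree in document (pre-order) order, starting at the given node.
class ForwardGroveIterator implements Iterator<GroveNode> {
    private final Deque<GroveNode> pending = new ArrayDeque<>();

    ForwardGroveIterator(GroveNode start) {
        pending.push(start);
    }

    @Override
    public boolean hasNext() {
        return !pending.isEmpty();
    }

    @Override
    public GroveNode next() {
        if (pending.isEmpty()) {
            throw new NoSuchElementException();
        }
        GroveNode node = pending.pop();
        // Push children in reverse so the first child is visited first.
        for (int i = node.children.size() - 1; i >= 0; i--) {
            pending.push(node.children.get(i));
        }
        return node;
    }
}

// Walks from the parent of the given node up to the root.
class ReverseGroveIterator implements Iterator<GroveNode> {
    private GroveNode current;

    ReverseGroveIterator(GroveNode start) {
        this.current = start.parent;
    }

    @Override
    public boolean hasNext() {
        return current != null;
    }

    @Override
    public GroveNode next() {
        if (current == null) {
            throw new NoSuchElementException();
        }
        GroveNode node = current;
        current = current.parent;
        return node;
    }
}

class GroveWalkDemo {
    // Returns the first node with the given name in document order, or null.
    static GroveNode findFirst(GroveNode root, String name) {
        Iterator<GroveNode> it = new ForwardGroveIterator(root);
        while (it.hasNext()) {
            GroveNode node = it.next();
            if (node.name.equals(name)) {
                return node;
            }
        }
        return null;
    }

    // Collects ancestor names, nearest parent first.
    static List<String> ancestorNames(GroveNode node) {
        List<String> names = new ArrayList<>();
        Iterator<GroveNode> up = new ReverseGroveIterator(node);
        while (up.hasNext()) {
            names.add(up.next().name);
        }
        return names;
    }

    public static void main(String[] args) {
        GroveNode doc = new GroveNode("DOC", null);
        GroveNode front = new GroveNode("FRONT", doc);
        new GroveNode("TITLE", front);
        new GroveNode("BODY", doc);
        GroveNode title = findFirst(doc, "TITLE");
        System.out.println(ancestorNames(title)); // prints [FRONT, DOC]
    }
}
```

Both iterators implement java.util.Iterator, so calling code sees only
hasNext() and next() and never touches the parent or child links directly.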

The factory concept is very powerful because extending an iterator is as
simple as adding a new factory class and a node iterator class for each new
property being added to the grove. If someone wanted to inherit from the
XML property set and put metadata in their grove, they could easily extend
the base iterators to support their new properties. Because the iterator
class has a small interface, it's easy to plug and play new iterators into
existing code. You can read more about iterators and factories in Design
Patterns by Gamma, Helm, Johnson, and Vlissides (Addison-Wesley).
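
To make the extension point concrete, here is a rough Java sketch of one way
the factory idea could look. All names in it (PNode, NodeIteratorFactory,
XmlPropertySetFactory, MetadataFactory, FactoryGroveWalker, and the
"content" and "metadata" property names) are assumptions made up for this
example, not names from DSSSL or the SGML property set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical node: a label plus named properties that hold child nodes.
class PNode {
    final String label;
    private final Map<String, List<PNode>> properties = new HashMap<>();

    PNode(String label) {
        this.label = label;
    }

    PNode add(String property, PNode child) {
        properties.computeIfAbsent(property, k -> new ArrayList<>()).add(child);
        return this;
    }

    List<PNode> property(String name) {
        return properties.getOrDefault(name, Collections.<PNode>emptyList());
    }
}

// The factory decides which properties of a node the walker can descend into.
interface NodeIteratorFactory {
    Iterator<PNode> childrenOf(PNode node);
}

// Base factory: only knows how to walk the "content" property.
class XmlPropertySetFactory implements NodeIteratorFactory {
    @Override
    public Iterator<PNode> childrenOf(PNode node) {
        return node.property("content").iterator();
    }
}

// Extension: also walks a "metadata" property, then defers to the base factory.
class MetadataFactory extends XmlPropertySetFactory {
    @Override
    public Iterator<PNode> childrenOf(PNode node) {
        List<PNode> all = new ArrayList<>(node.property("metadata"));
        super.childrenOf(node).forEachRemaining(all::add);
        return all.iterator();
    }
}

class FactoryGroveWalker {
    // Returns the labels of all nodes reachable through the given factory.
    static List<String> walk(PNode root, NodeIteratorFactory factory) {
        List<String> visited = new ArrayList<>();
        visit(root, factory, visited);
        return visited;
    }

    private static void visit(PNode node, NodeIteratorFactory factory, List<String> out) {
        out.add(node.label);
        Iterator<PNode> it = factory.childrenOf(node);
        while (it.hasNext()) {
            visit(it.next(), factory, out);
        }
    }

    public static void main(String[] args) {
        PNode doc = new PNode("DOC")
                .add("content", new PNode("TITLE"))
                .add("metadata", new PNode("author"));
        System.out.println(walk(doc, new XmlPropertySetFactory())); // prints [DOC, TITLE]
        System.out.println(walk(doc, new MetadataFactory()));       // prints [DOC, author, TITLE]
    }
}
```

The walker itself does not change when the metadata property is introduced;
only a new factory is passed in.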

Once we have the appropriate iterators, we can create an API of functions
and algorithms, perhaps based on SDQL, that perform higher-level operations
like this:

// Find the first parent object that is an element
Algorithm::find(ReverseIter, classid<Element>()); // C++ syntax with templates
Algorithm::find(ReverseIter, classid(ELEMENT));   // Java syntax without templates

// Find the first object that is an element and whose name is TITLE
if (Algorithm::find(ReverseIter, AND(classid<Element>(), name("TITLE"))))
{
    Element aElementFound = *ReverseIter; // get the element and use it
}
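
As a hedged illustration, the find algorithm above could be sketched in
present-day Java with java.util.function.Predicate, whose and() method
plays the role of AND(...). BaseProperty, ElementProperty, DataProperty and
GroveAlgorithms are hypothetical names invented for this sketch, and any
java.util.Iterator stands in for the grove iterator.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical property classes: a common base and two concrete kinds.
class BaseProperty { }

class ElementProperty extends BaseProperty {
    final String name;

    ElementProperty(String name) {
        this.name = name;
    }
}

class DataProperty extends BaseProperty {
    final String text;

    DataProperty(String text) {
        this.text = text;
    }
}

final class GroveAlgorithms {
    private GroveAlgorithms() { }

    // Advances the iterator until an item satisfies the test; empty if none does.
    static <T> Optional<T> find(Iterator<T> it, Predicate<? super T> test) {
        while (it.hasNext()) {
            T candidate = it.next();
            if (test.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    // Matches properties that are instances of the given class.
    static Predicate<BaseProperty> classid(Class<? extends BaseProperty> type) {
        return type::isInstance;
    }

    // Matches element properties with the given name.
    static Predicate<BaseProperty> name(String wanted) {
        return p -> p instanceof ElementProperty
                && ((ElementProperty) p).name.equals(wanted);
    }

    public static void main(String[] args) {
        List<BaseProperty> path = Arrays.asList(
                new DataProperty("some text"),
                new ElementProperty("PARA"),
                new ElementProperty("TITLE"));

        // Find the first object that is an element.
        Optional<BaseProperty> first =
                find(path.iterator(), classid(ElementProperty.class));
        System.out.println(((ElementProperty) first.get()).name); // prints PARA

        // Find the first element whose name is TITLE.
        Optional<BaseProperty> title = find(path.iterator(),
                classid(ElementProperty.class).and(name("TITLE")));
        System.out.println(title.isPresent()); // prints true
    }
}
```

Because find only needs an Iterator and a Predicate, the same function works
with a forward or a reverse iterator.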

Why we need iterators
1.) Iterators hide the details of how a grove is actually linked together,
    whether in memory, in an object database, etc.
2.) Iterators have the same interface regardless of the types of
    properties in the grove.
3.) Iterators are extensible and can provide read-only as well as
    read-write functionality.
4.) Iterators are a well-known and accepted design pattern and are