RE: SAX drivers bug ... or feature !

David Megginson (david@megginson.com)
Sat, 21 Nov 1998 13:09:05 -0500 (EST)


Toivo Lainevool writes:

> Instead of clearing and reusing the AttributeList object,
> wouldn't it be better to create a new attribute list object? If
> the old Attribute list isn't being referenced, it will be garbage
> collectible. If the old Attribute list is being referenced, it
> won't be changed out from under the client. This way of doing it
> seems to offer the best of both worlds.

Depending on the virtual machine, this could be a performance killer. Remember
that a medium-sized XML document (such as a book) might have 10,000
elements: that would mean an extra 10,000 attribute lists allocated
and then garbage collected in what should be only a few seconds of
parsing.

In fact, since AttributeList is an interface, drivers often implement
it themselves rather than allocating an object for it, as in

import org.xml.sax.Parser;
import org.xml.sax.AttributeList;

public class MySAXDriver implements Parser, AttributeList
{
  ...
}

SAX was designed to be flexible so that people could write
highly-optimised implementations like this if they wanted. It's also
designed to work with languages that don't support GC out-of-the-box,
like C++.
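
To make the reuse concrete, here is a minimal sketch of a toy driver that
acts as its own AttributeList and overwrites the same storage for every
element. ReusingDriver, emitElement and ReuseDemo are hypothetical names
invented for illustration, and the driver does not implement the full
Parser interface or do any real parsing; it only shows what a client sees
if it holds on to the list it was handed.

```java
import org.xml.sax.AttributeList;
import org.xml.sax.DocumentHandler;
import org.xml.sax.HandlerBase;
import org.xml.sax.SAXException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy driver: it is its own AttributeList and reuses the same
// storage for every element instead of allocating a list per start tag.
class ReusingDriver implements AttributeList {
    private final List<String> names = new ArrayList<>();
    private final List<String> values = new ArrayList<>();
    private DocumentHandler handler = new HandlerBase();

    void setDocumentHandler(DocumentHandler handler) {
        this.handler = handler;
    }

    // Stand-in for real parsing (an assumption of this sketch): report one
    // element whose attributes are passed as name/value pairs.
    void emitElement(String element, String... pairs) throws SAXException {
        names.clear();                 // the previous element's data is lost here
        values.clear();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            names.add(pairs[i]);
            values.add(pairs[i + 1]);
        }
        handler.startElement(element, this);
        handler.endElement(element);
    }

    // AttributeList methods, answered from the driver's own storage.
    public int getLength() { return names.size(); }
    public String getName(int i) { return inRange(i) ? names.get(i) : null; }
    public String getType(int i) { return inRange(i) ? "CDATA" : null; }
    public String getValue(int i) { return inRange(i) ? values.get(i) : null; }
    public String getType(String name) { return names.contains(name) ? "CDATA" : null; }
    public String getValue(String name) {
        int i = names.indexOf(name);
        return i < 0 ? null : values.get(i);
    }

    private boolean inRange(int i) { return i >= 0 && i < names.size(); }
}

class ReuseDemo {
    public static void main(String[] args) throws SAXException {
        ReusingDriver driver = new ReusingDriver();
        final AttributeList[] kept = new AttributeList[1];
        driver.setDocumentHandler(new HandlerBase() {
            @Override
            public void startElement(String name, AttributeList atts) {
                if (kept[0] == null) kept[0] = atts;   // keeps a reference, not a copy
            }
        });
        driver.emitElement("chapter", "id", "ch1");
        driver.emitElement("para", "class", "note");
        // The kept reference now describes the second element, not the first.
        System.out.println(kept[0].getValue("id"));
        System.out.println(kept[0].getValue("class"));
    }
}
```

No attribute list is allocated per element here, which is the point of the
optimisation; the cost is that a reference kept past startElement is only
a view onto whatever the driver reported most recently.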

However, we also recognised that people might want to keep around
attribute lists sometimes, so we made it very easy by adding the
org.xml.sax.helpers.AttributeListImpl class with a copy constructor:

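import org.xml.sax.AttributeList;
import org.xml.sax.helpers.AttributeListImpl;

AttributeList persistentAtts;

public void startElement (String name, AttributeList atts)
{
  // Copy now: the driver may reuse atts once this method returns.
  persistentAtts = new AttributeListImpl(atts);
}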

This approach does give you the best of both worlds -- it is trivially
easy to make a persistent copy when you need to, but you're not stuck
with the allocation/gc overhead when you don't need it.
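
As a rough illustration of copying only when needed, the sketch below keeps
a snapshot just for elements that carry an "id" attribute and allocates
nothing for the rest. IdCollector and IdCollectorDemo are hypothetical
names, the "id" rule is an arbitrary choice made for the example, and a
single AttributeListImpl stands in for the list a driver would reuse.

```java
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.helpers.AttributeListImpl;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical handler: copies the attribute list only for elements that
// have an "id" attribute; every other element costs no allocation.
class IdCollector extends HandlerBase {
    final Map<String, AttributeList> byId = new LinkedHashMap<>();

    @Override
    public void startElement(String name, AttributeList atts) {
        String id = atts.getValue("id");
        if (id != null) {
            // Snapshot via the copy constructor; safe to keep after return.
            byId.put(id, new AttributeListImpl(atts));
        }
    }
}

class IdCollectorDemo {
    public static void main(String[] args) {
        IdCollector handler = new IdCollector();

        // Stands in for the single list a driver might reuse (an assumption).
        AttributeListImpl reused = new AttributeListImpl();

        reused.addAttribute("id", "ID", "ch1");
        handler.startElement("chapter", reused);

        reused.clear();                               // driver moves on
        reused.addAttribute("class", "CDATA", "note");
        handler.startElement("para", reused);

        // The snapshot for ch1 is unaffected by the later reuse.
        AttributeList saved = handler.byId.get("ch1");
        System.out.println(saved.getLength() + " " + saved.getValue("id"));
    }
}
```

Only one copy is made for the two elements reported, so the allocation
cost is paid only by the handler that actually wants to keep the data.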

All the best,

David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/