Re: finalising org.sax.xml.Parser

Michael Kay (M.H.Kay@eng.icl.co.uk)
Tue, 24 Feb 1998 11:00:50 -0000


>>From: David Megginson [SMTP:ak117@freenet.carleton.ca]
[heavily cut]
>>After considering the various discussions over the past few weeks, I
>>propose that we make the following changes:
>>
>>1) Add a parse() method that accepts a stream.
>>2) Add a parse() method that accepts a character buffer.
>>With these changes, the interface would look like this in Java:
>>

>> public void parse (InputStream is, String baseURI)
>> throws java.lang.Exception;
>> public void parse (char ch[], int start, int length, String baseURI)
>> throws java.lang.Exception;
>>NOTES:
>>
>>a. The baseURI argument is necessary for streams and character buffers
>> in case either contains a relative URI. You can supply a null
>> value if the document entity will not contain relative URIs.
>>
Comments:
1. Is the (ch, start, length) method really necessary, given that one can
supply a StringReader or whatever to the parse(InputStream) method?
2. If my "main" XML document is in a record in a database, then it is very
likely that any other entities referred to will be in the database as well.
Therefore, I think the logical approach in this situation is for the
application to resolve all URIs encountered: the parser should call the
application supplying a URI and the application should return an InputStream
to allow the parser to read it. This should presumably be done via the
EntityHandler interface.

And a question: is there a recommended way to abort a parse once the
application has got the information it needs (e.g extracting the contents of
the TITLE element)? Would an interface like parser.abort() be cleaner than
playing around with exceptions? I ask because in handling the results of a
free text search, I am parsing all the retrieved documents when I only need
a bit of text from the beginning of each, and this is obviously wasteful. I
thought perhaps of supplying a stream and generating a premature
end-of-file, and then trapping the exception that comes back.

Regards, Mike Kay