http://www.docuverse.com/htmlsdk/index.html
It is very small right now (about a 10K ZIP file), but it contains
something I am quite sure all SAX users will want: an HTML parser with a
SAX driver. Actually, it does not contain an HTML parser of its own;
instead, it uses the HTML parser in the latest Swing release (1.1 Beta 2).
Docuverse's own HTML parser is being written, but that is a painful
process, so this will have to do for now.
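
To give a rough idea of what "Swing's HTML parser with a SAX driver" can
look like, here is a minimal sketch. It is not the code in the HTML SDK.
It assumes the javax.swing package names and the SAX2 ContentHandler
interface found in current Java releases (Swing 1.1 Beta 2 and the SAX of
that time use different names), and the class name SwingHtmlSaxBridge is
made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Enumeration;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical bridge: forwards Swing HTML parser callbacks as SAX events. */
public class SwingHtmlSaxBridge {

    /** Receives Swing callbacks and replays them on a SAX ContentHandler. */
    private static final class Bridge extends HTMLEditorKit.ParserCallback {
        private final ContentHandler handler;
        SAXException failure; // first error raised by the handler, if any

        Bridge(ContentHandler handler) { this.handler = handler; }

        @Override public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            if (failure != null) return;
            try {
                handler.startElement("", t.toString(), t.toString(), toSax(a));
            } catch (SAXException e) { failure = e; }
        }

        @Override public void handleEndTag(HTML.Tag t, int pos) {
            if (failure != null) return;
            try {
                handler.endElement("", t.toString(), t.toString());
            } catch (SAXException e) { failure = e; }
        }

        @Override public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            // Unknown end tags are also reported here, flagged with ENDTAG; skip them.
            if (failure != null || a.isDefined(HTML.Attribute.ENDTAG)) return;
            try {
                // Empty elements such as <br> become a start/end pair.
                handler.startElement("", t.toString(), t.toString(), toSax(a));
                handler.endElement("", t.toString(), t.toString());
            } catch (SAXException e) { failure = e; }
        }

        @Override public void handleText(char[] data, int pos) {
            if (failure != null) return;
            try {
                handler.characters(data, 0, data.length);
            } catch (SAXException e) { failure = e; }
        }
    }

    /** Copies Swing attributes into a SAX attribute list, dropping parser markers. */
    private static AttributesImpl toSax(MutableAttributeSet a) {
        AttributesImpl out = new AttributesImpl();
        Enumeration<?> names = a.getAttributeNames();
        while (names.hasMoreElements()) {
            Object key = names.nextElement();
            if (HTMLEditorKit.ParserCallback.IMPLIED.equals(key)) continue;
            String name = key.toString();
            out.addAttribute("", name, name, "CDATA", String.valueOf(a.getAttribute(key)));
        }
        return out;
    }

    /** Parses HTML from the reader and reports it to the handler as SAX events. */
    public static void parse(Reader in, ContentHandler handler)
            throws IOException, SAXException {
        Bridge bridge = new Bridge(handler);
        handler.startDocument();
        // true = ignore any charset declared in a <meta> tag
        new ParserDelegator().parse(in, bridge, true);
        if (bridge.failure != null) throw bridge.failure;
        handler.endDocument();
    }

    public static void main(String[] args) throws Exception {
        parse(new StringReader("<p>Hello <b>SAX</b> users</p>"), new DefaultHandler() {
            @Override public void startElement(String uri, String local, String qName,
                                               Attributes atts) {
                System.out.println("start: " + qName);
            }
        });
    }
}
```

Handler errors are stored and rethrown after parsing because the Swing
callback methods cannot declare SAXException. Note also that the Swing
parser reports elements it considers implied (html, head, body), so the
handler may see more elements than appear in the input text.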
A DOMReader implementation is also included. Note that with the HTML SDK
and the DOM SDK together, you can now build a DOM tree from any HTML file.
Best,
Don Park
Docuverse