The Docuverse HTML SDK makes it easy to build such a converter. It
provides a SAX parser interface to Swing's HTML parser, which means you
can use your XML tools on HTML documents.
However, Swing's HTML parser mishandles unknown tags, so it is not a
perfect solution.
You can find the HTML SDK at http://www.docuverse.com/htmlsdk
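
To illustrate the general idea of putting a SAX interface on top of Swing's
HTML parser, here is a minimal sketch. It is not the HTML SDK's own code or
API: the class name SwingSaxBridge and its helper methods are hypothetical,
and it uses only the standard javax.swing.text.html and org.xml.sax classes.
It assumes namespaces are not needed and reports every attribute as CDATA.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical adapter (not the HTML SDK API): forwards the callbacks of
// Swing's HTML parser to a SAX ContentHandler.
class SwingSaxBridge extends HTMLEditorKit.ParserCallback {
    private final ContentHandler handler;

    SwingSaxBridge(ContentHandler handler) {
        this.handler = handler;
    }

    // Copy Swing's attribute set into a SAX Attributes object.
    private static Attributes toSax(MutableAttributeSet a) {
        AttributesImpl out = new AttributesImpl();
        for (Enumeration<?> names = a.getAttributeNames(); names.hasMoreElements();) {
            Object name = names.nextElement();
            String n = name.toString();
            out.addAttribute("", n, n, "CDATA", String.valueOf(a.getAttribute(name)));
        }
        return out;
    }

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        try {
            handler.startElement("", t.toString(), t.toString(), toSax(a));
        } catch (SAXException e) {
            // ParserCallback methods cannot throw checked exceptions.
            throw new RuntimeException(e);
        }
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos) {
        try {
            handler.endElement("", t.toString(), t.toString());
        } catch (SAXException e) {
            throw new RuntimeException(e);
        }
    }

    // Tags without a separate end tag (such as <br>) arrive here; report
    // them as an element that opens and closes immediately.
    @Override
    public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        handleStartTag(t, a, pos);
        handleEndTag(t, pos);
    }

    @Override
    public void handleText(char[] data, int pos) {
        try {
            handler.characters(data, 0, data.length);
        } catch (SAXException e) {
            throw new RuntimeException(e);
        }
    }

    // Parse an HTML string and deliver SAX events to the given handler.
    static void parse(String html, ContentHandler handler) throws IOException, SAXException {
        handler.startDocument();
        new ParserDelegator().parse(new StringReader(html), new SwingSaxBridge(handler), true);
        handler.endDocument();
    }

    // Demo helper: record the SAX events as readable strings.
    static List<String> collectEvents(String html) throws IOException, SAXException {
        List<String> events = new ArrayList<>();
        parse(html, new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("text:" + new String(ch, start, length));
            }
        });
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : collectEvents("<p>Hello <b>world</b></p>")) {
            System.out.println(event);
        }
    }
}
```

Any SAX-based tool that accepts a ContentHandler could be passed to parse()
in place of the demo handler. Swing's parser may also report elements it
implies on its own, such as html, head and body, so the event stream can
contain more elements than the input text shows.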
Best,
Don Park
Docuverse