Re: Converting HTML to well formed XML

Simon St.Laurent (simonstl@simonstl.com)
Mon, 09 Nov 1998 09:07:37 -0500


At 07:17 PM 11/9/98 +0530, Ajay Gangwar wrote:

>I need to convert HTML document to well formed XML
>document. Can someone please guide me how to go
>about doing this? Are any utilities available for this?
>
>- Ajay Gangwar
> majayg@iitk.ac.in

I'm in the process of converting my site (http://www.simonstl.com) to
well-formed syntax, though still using an HTML vocabulary. I'm keeping
sort of a diary at http://www.simonstl.com/projects/html2xml/ - it lists
some helpful utilities for cleaning up the code, like Dave Raggett's TIDY,
and XML.com's RUWF well-formedness checker.
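
For a quick well-formedness check of your own, any XML parser will do. The
following is only a minimal sketch, assuming Python 3 and its standard
xml.etree.ElementTree module; the function name is_well_formed and the
sample strings are made up for illustration, and this is not how RUWF
itself works.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as a well-formed XML document."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # Unclosed tags, unquoted attributes, bad nesting, etc.
        return False
    return True


# Hypothetical samples: typical legacy HTML vs. the same content cleaned up.
legacy = "<p>First line<br>Second line"
cleaned = "<p>First line<br/>Second line</p>"

print(is_well_formed(legacy))
print(is_well_formed(cleaned))
```

Note that this only tests well-formedness; it says nothing about whether
the document is valid against any DTD.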

Unfortunately, no one (so far as I know) has yet created a friendly
one-step converter from legacy HTML to HTML in well-formed XML syntax.
That's a nice opportunity for some publicity, if not necessarily $$$...
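
To give a rough idea of what such a converter has to do, here is a small
sketch, assuming Python 3 and its standard html.parser module. The names
WellFormedWriter, to_well_formed, and VOID are hypothetical, the list of
empty elements is deliberately incomplete, and the approach is naive: it
handles fragments only, drops comments and doctypes, and closes any tags
left open simply by nesting order, which may not match what a browser
would do. It is not a substitute for a real cleanup tool.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Assumed (incomplete) set of HTML elements that never have content.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class WellFormedWriter(HTMLParser):
    """Re-serialize HTML with lowercase tags, quoted attributes,
    self-closed empty elements, and every open tag closed."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of output markup
        self.open_tags = []  # stack of elements still awaiting a close

    def _attrs(self, attrs):
        # Minimized attributes (e.g. "checked") arrive with value None;
        # expand them to name="name".
        return "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            self.out.append(f"<{tag}{self._attrs(attrs)}/>")
        else:
            self.out.append(f"<{tag}{self._attrs(attrs)}>")
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag}{self._attrs(attrs)}/>")

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray close tag: ignore it
        # Close everything opened since the matching start tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))

    def result(self):
        self.close()  # flush any buffered text
        while self.open_tags:
            self.out.append(f"</{self.open_tags.pop()}>")
        return "".join(self.out)


def to_well_formed(html_text: str) -> str:
    writer = WellFormedWriter()
    writer.feed(html_text)
    return writer.result()


print(to_well_formed("<P>Hello<BR>world<IMG SRC=a.gif>"))
# prints: <p>Hello<br/>world<img src="a.gif"/></p>
```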

Of course, if you want to use a non-HTML vocabulary, you're talking about
something much larger than syntax, and I'd recommend investing in a book -
my _XML: A Primer_, John Simpson's _Just XML_, or Elliotte Rusty Harold's
_XML: Extensible Markup Language_ - to get started.

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer
Cookies / Sharing Bandwidth (November)
Building XML Applications (December)
http://www.simonstl.com