PR.xml

Peter Murray-Rust (peter@ursus.demon.co.uk)
Fri, 16 Jan 1998 14:21:05


I thought it would be a useful exercise to parse the XML version of the XML
PR, but I've run into minor problems. Using Netscape, I downloaded the
version bearing Tim Bray's name and dated 3 December, and have tried running
it through Lark and AElfred.

The first problem is that both throw errors because the DOCTYPE contains a
reference to "spec.dtd". No spec.dtd was available, so the only remedies are:
- edit out the reference to spec.dtd (i.e. use <!DOCTYPE spec [ )
- make a dummy spec.dtd

When I tried the first with AElfred, it reported an illegal character (-96),
which I take to be 160 (== nbsp). My edit was done with WordPad. Where did
this come from? [Note that AElfred seems quite happy with Lark.xml, Tim's
documentation for Lark in XML.]
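
To illustrate the -96 / 160 reading: a minimal sketch, assuming the parser
printed the offending byte as a signed 8-bit value. The function names and
the sample bytes are hypothetical and are not taken from AElfred or from the
PR itself. The second helper lists where such a byte sits in a file's raw
contents, which may help show whether the download or the editor put it there.

```python
def signed_to_unsigned_byte(value: int) -> int:
    """Map a signed 8-bit value (-128..127) to its unsigned form (0..255)."""
    return value & 0xFF


def find_byte_offsets(data: bytes, target: int = 0xA0) -> list[int]:
    """Return every offset in data at which the byte value target occurs."""
    return [i for i, b in enumerate(data) if b == target]


# -96 read as an unsigned byte is 160 (0xA0), the Latin-1 no-break space.
assert signed_to_unsigned_byte(-96) == 160

# Hypothetical sample: a DOCTYPE line with a stray 0xA0 after the "[".
sample = b"<!DOCTYPE spec [\xa0\n"
print(find_byte_offsets(sample))
```

Running find_byte_offsets over the file as downloaded and again over the
file as saved from the editor would show which step introduced the byte.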

I tried the second with Lark, using the line:

<!DOCTYPE spec SYSTEM "spec.dtd" [

but although I created spec.dtd both in the directory containing pr.xml and
in the directory from which I was running Lark, it didn't work. I had to
explicitly hardcode something like:

<!DOCTYPE spec SYSTEM "file:/C:/xmlstuff/spec.dtd" [
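
For comparison, here is a sketch of one possible resolution rule, assuming
a relative system identifier is resolved against the URI of the document
that contains the declaration. It makes no claim about what Lark actually
does. The function name resolve_system_id is hypothetical, and the base URI
simply reuses the directory from the line above.

```python
from urllib.parse import urljoin


def resolve_system_id(document_uri: str, system_id: str) -> str:
    """Resolve a system identifier against the URI of the containing document.

    An identifier that is already absolute in a different scheme is
    returned as given; a relative one replaces the last path segment
    of document_uri.
    """
    return urljoin(document_uri, system_id)


# Assumed location of the downloaded document (hypothetical).
base = "file:/C:/xmlstuff/pr.xml"
resolved = resolve_system_id(base, "spec.dtd")
print(resolved)

# The DTD is looked up next to pr.xml, not in the parser's working directory.
assert resolved.endswith("/C:/xmlstuff/spec.dtd")
```

Under this rule the current directory of the parser process is irrelevant;
only the location of pr.xml matters.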

I expect I have made some silly errors, but if so they are typical of the
sort of errors that people will make.
- Could DavidM confirm that AElfred does/does not work with the XML
version of the PR?
- Is it reasonable to expect parsers to throw fatal errors if they can't
find *.dtd? If so, we are going to have to work very hard to make sure that
*.dtd is always instantly available.
- Should parsers accept relative filenames in documents? If so, relative
to what?
- Where has character -96 come from? [The download, WordPad?]
- Or is it just a bad day?

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg