Re: Entities in Attribute Values

Arjun Ray (aray@q2.net)
Fri, 13 Jun 1997 01:02:55 -0400 (EDT)


On Thu, 12 Jun 1997, David Schach wrote:

> Because entity references are allowed inside of attribute values, it is
> not possible to store an unmodified URL with data in an attribute. For
> example, the following XML is not valid because the '&''s are not
> escaped inside of SELF's HREF value.

This is a problem only if '&' *must* be the field separator. Why not
something else, like ';'?

> <SELF
> HREF="http://someserver/scripts/oleisapi2.dll/comics.custom.cdf?comics=o
> n&dilbert=on&calvin=on&peanuts=on" />
>
> This makes it inconvenient to store URLs in XML files. Would anyone
> be interested in changing entity processing to fix this?

IMHO, there's no need for that. Or, at any rate, there shouldn't be. Using
'&' as a field separator in "query URLs" is a historical artefact of lack
of RTFM. The problem was recognized reasonably early too, and a fix was
proposed, but no HTML browser implementor of, ah, consequence ever got a
Round Tuit.

From RFC 1866, Section 8.2.1 "The form-urlencoded Media Type":

    NOTE - The URI from a query form submission can be
    used in a normal anchor style hyperlink.
    Unfortunately, the use of the `&' character to
    separate form fields interacts with its use in SGML
    attribute values as an entity reference delimiter.
    For example, the URI `http://host/?x=1&y=2' must be
    written `<a href="http://host/?x=1&#38;y=2"' or `<a
    href="http://host/?x=1&amp;y=2">'.

    HTTP server implementors, and in particular, CGI
    implementors are encouraged to support the use of
    `;' in place of `&' to save users the trouble of
    escaping `&' characters this way.
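
To make the RFC's suggestion concrete, here is a minimal sketch in Python of
what "supporting ';' in place of '&'" could look like on the server side. The
function name parse_query and the two sample strings are illustrative
assumptions, not taken from any particular server or CGI library; the query
fields are the ones from the example URL quoted earlier.

```python
import re
from urllib.parse import unquote_plus
from xml.sax.saxutils import escape


def parse_query(query):
    """Split a form-urlencoded query string on either '&' or ';'.

    Hypothetical helper for illustration only; returns a list of
    (name, value) pairs in the order they appear.
    """
    pairs = []
    for field in re.split(r"[&;]", query):
        if not field:
            continue  # ignore empty fields, e.g. from a trailing separator
        name, _, value = field.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


# The same fields written with each separator (assumed sample data).
amp_form = "comics=on&dilbert=on&calvin=on&peanuts=on"
semi_form = "comics=on;dilbert=on;calvin=on;peanuts=on"

# A server that accepts both separators sees identical fields.
assert parse_query(amp_form) == parse_query(semi_form)

# Only the '&' spelling is altered by markup escaping; the ';' spelling
# can be placed in an attribute value unchanged.
print(escape(amp_form))
print(escape(semi_form))
```

With a parser along these lines, a document author could write the ';' form
directly in an HREF value, with no entity escaping needed for the separator.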

We're not committed to perpetuating mistakes, are we?

Arjun