Re: Too much latitude in Nmtoken characters?

Rick Jelliffe (ricko@allette.com.au)
Mon, 14 Jul 1997 15:34:32 +1000

Previous message: Martin Bryan: "Re: Why do XML and OFX have subtle differences?"
Maybe in reply to: Eric Baatz - Sun Microsystems Labs BOS: "Too much latitude in Nmtoken characters?"

> This seems to allow Nmtokens that aren't visible to the human eye,
> for example, consisting of a single zero width non-joiner.

I think XML name tokens are better detected by exclusion, not inclusion:
this is a sensible way when you have to deal with lots of potential naming
characters. In other words, you detect the end of the name by the
presence of a sepchar or a delimiter, rather than by testing if each
character is a name character. At the reading end, such simple
token-detection is all that is needed if your document is well formed.
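
A minimal sketch of detection by exclusion, in Python. The separator and delimiter sets here are assumptions chosen for illustration, not the sets defined by XML or by any SGML declaration, and scan_name is a hypothetical helper name:

```python
# Assumed, illustrative sets; a real parser would take these from its
# SGML declaration or from the XML grammar.
SEPCHARS = {" ", "\t", "\r", "\n"}
DELIMS = {"<", ">", "/", "=", '"', "'", "&", ";"}


def scan_name(text, start=0):
    """Return (name, end): the run of characters from `start` up to,
    but not including, the first separator or delimiter."""
    end = start
    while (end < len(text)
           and text[end] not in SEPCHARS
           and text[end] not in DELIMS):
        end += 1
    return text[start:end], end
```

The scanner never asks whether a character is a name character, so it accepts native-language names without any table of naming characters. It also accepts a name consisting of a lone ZWNJ, which is the trade-off raised in the original question.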

To stop silly tags, the SGML declaration should declare the ZWNJ character
(which I think has to do with cursive joining in Arabic scripts,
and is as much required as accent characters) as NAMECHAR, not NAMESTRT.
So, in context, ZWNJ and RTL & LTR have visible effects. They are not
usually undetectable. But it is better to allow silly tags than
disallow native-language markup: only about 1/4 of the world can make
sense of English/Latin tags.
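
The NAMECHAR-not-NAMESTRT idea can be sketched in Python as a check on the first character only. NAMECHAR_ONLY and has_valid_start are hypothetical names, and the set holds just ZWNJ for illustration; a real declaration would list more characters:

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Hypothetical set: characters allowed inside a name but not as its
# first character (NAMECHAR but not NAMESTRT).
NAMECHAR_ONLY = {ZWNJ}


def has_valid_start(name):
    """True if the name is non-empty and does not begin with a
    character that is only allowed in non-initial position."""
    return bool(name) and name[0] not in NAMECHAR_ONLY
```

Under this rule a name made of a single ZWNJ is rejected, while a word with ZWNJ in the middle is still accepted.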

Apparently the WG is waiting till August to finalise the naming
discipline.

Rick Jelliffe
