Whitespace

Sean Mc Grath (digitome@iol.ie)
Tue, 19 Aug 1997 11:11:52 +0100


>> Peter Murray-Rust wrote:
>>
>>I think - along with TimB - that it is unrealistic to come up with a single
>>set of rules that will serve every application. There was an enormous amount
>>of discussion on the XML group last year and I take it as axiomatic that we
>>cannot produce a set of rules which everyone agrees are:
>> - simple to state
>> - unambiguous
>> - intuitive and easy to learn
>> - universal (i.e. cover every situation)
>

**Warning:** Rush of blood to the head follows. Get those flame throwers
ready...

I know this whole white space thing was thrashed out at length some time ago but
it worries me greatly that on XML-DEV the whole issue seems to be as problematic
as it was before XML-Lang's rulings on whitespace handling were decided upon.
It seems that the problem was not really solved - just pushed up a layer:-)

It just sounds wrong to me that white space handling is to be the subject of
application conventions rather than part of the core XML parsing activity.

Anyway, I think everyone should be allowed to over-simplify the "White Space
Problem" once in their lives! Here is my contribution:-

Ban mixed content. Mixed content is a markup minimization feature.

If you want a chunk of PCDATA in an XML doc, use the <pcdata>
reserved element name.

<foo>
<pcdata>I am data 1</pcdata>
<pcdata>I am data 2</pcdata>
</foo>

Becomes
<foo><pcdata>I am data 1</pcdata><pcdata>I am data 2</pcdata></foo>
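
To make the claimed equivalence concrete, here is a rough Python sketch. It
is only an illustration under stated assumptions, not part of any XML spec:
the function name drop_ignorable_whitespace is made up for this post, and the
rule "only <pcdata> may carry character data" is simply my proposal above
taken literally. It uses the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

def drop_ignorable_whitespace(elem):
    """Hypothetical helper: strip layout whitespace outside <pcdata>."""
    if elem.tag != "pcdata":
        # Assumption from the proposal: mixed content is banned, so any
        # non-whitespace text directly inside another element is an error.
        if elem.text and elem.text.strip():
            raise ValueError("character data outside <pcdata> in <%s>" % elem.tag)
        elem.text = None
    for child in elem:
        drop_ignorable_whitespace(child)
        # Text following a child (its "tail") is also just layout.
        if child.tail and child.tail.strip():
            raise ValueError("character data after <%s>" % child.tag)
        child.tail = None
    return elem

pretty = "<foo>\n<pcdata>I am data 1</pcdata>\n<pcdata>I am data 2</pcdata>\n</foo>"
compact = "<foo><pcdata>I am data 1</pcdata><pcdata>I am data 2</pcdata></foo>"

normalized = ET.tostring(
    drop_ignorable_whitespace(ET.fromstring(pretty)), encoding="unicode"
)
print(normalized == compact)
```

The point of the sketch is that the parser needs no DTD and no application
convention to decide which whitespace is significant: everything outside
<pcdata> is layout, everything inside is data.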

If you need whitespace to be something other than whitespace, i.e. a
newline to be a real newline that is passed on to the application, use an
empty element type to represent it.

<foo>
<pcdata>I am data 1</pcdata><newline/>
<pcdata>I am data 2</pcdata>
</foo>
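
What the application would see can be sketched in a few lines of Python.
Again this is a sketch under assumptions, not a definitive implementation:
application_text is a hypothetical name, and <pcdata> and <newline/> are the
element names proposed in this post, not anything reserved by XML itself.

```python
import xml.etree.ElementTree as ET

def application_text(root):
    """Hypothetical helper: build the text handed to the application."""
    parts = []
    # iter() walks the tree in document order, root included.
    for elem in root.iter():
        if elem.tag == "pcdata":
            parts.append(elem.text or "")
        elif elem.tag == "newline":
            # The empty element is the only source of a "real" newline.
            parts.append("\n")
    # Whitespace between elements is never collected, so it is dropped.
    return "".join(parts)

doc = """<foo>
<pcdata>I am data 1</pcdata><newline/>
<pcdata>I am data 2</pcdata>
</foo>"""

print(repr(application_text(ET.fromstring(doc))))
```

The line breaks used to lay out the document never reach the application;
only the one marked with <newline/> does.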

Give me five minutes to put on the asbestos suit and then you flame
away....