(a) the content of the element to which this is attached must be
pure #PCDATA, no child elements and no references, and
(b) the content is encoded in base64, leading and trailing spaces allowed.
This obviously couldn't retroactively become part of XML 1.0, but
if it went through a process and became a W3C recommendation, I bet
every parser author in the world would support it in about 15 minutes.
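
To make rules (a) and (b) concrete, here is a rough Python sketch of what a
consumer might do. The attribute name xml:packed and its value "base64" are
placeholders invented for this example, not part of any spec, and unpack() is
a hypothetical helper. ElementTree expands references before the application
sees the content, so the "no references" half of rule (a) is not checked here.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical attribute name; the xml: prefix itself is predeclared in XML.
XML_NS = "http://www.w3.org/XML/1998/namespace"
PACKED_ATTR = f"{{{XML_NS}}}packed"


def unpack(elem):
    """Return the decoded bytes of a flagged element, or None if it is not flagged."""
    if elem.get(PACKED_ATTR) != "base64":
        return None
    # Rule (a): no child elements. References were already expanded by the
    # parser, so this sketch cannot detect them.
    if len(elem) != 0:
        raise ValueError("packed element must not contain child elements")
    # Rule (b): base64 content, leading and trailing whitespace allowed.
    text = (elem.text or "").strip()
    return base64.b64decode(text, validate=True)


doc = ET.fromstring('<img xml:packed="base64">  aGVsbG8=  </img>')
print(unpack(doc))  # b'hello'
```

An element without the attribute comes back as None, so existing documents
would be unaffected.
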
Base64 (a 4-for-3 encoding) inflates the data by 33%, so I thought about
perhaps inventing Base128 (8-for-7) or maybe even a higher level to cut down
wastage, but Base64 has the advantage that it avoids UTF-8/ISO-8859
confusion and I bet Mr. LZW will eat that 33% anyhow...
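
The arithmetic behind those numbers, plus a quick check of the compression
bet, can be sketched as follows. Note the substitution: this uses zlib
(DEFLATE) as a stand-in because Python ships no LZW codec, and random bytes
as a stand-in for real binary data, so take the result as indicative only.
The expansion() helper is made up for the example.

```python
import base64
import os
import zlib


def expansion(units_out, bytes_in):
    """Fractional size increase of an N-for-M encoding."""
    return units_out / bytes_in - 1


base64_cost = expansion(4, 3)   # 4 characters per 3 bytes
base128_cost = expansion(8, 7)  # 8 characters per 7 bytes

raw = os.urandom(3000)               # incompressible stand-in for binary data
encoded = base64.b64encode(raw)      # 4000 bytes of base64 text
squeezed = zlib.compress(encoded, 9)  # DEFLATE, not LZW

print(f"Base64 overhead:  {base64_cost:.1%}")
print(f"Base128 overhead: {base128_cost:.1%}")
print(f"raw {len(raw)}, base64 {len(encoded)}, compressed {len(squeezed)}")
```

Each base64 character carries only 6 bits of information, so a general-purpose
compressor should be able to win back a good part of the expansion.
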
I also thought about xml:encoding=, but that conflicts with
encoding= in the XML declaration in a confusing way.
Are there any gotchas I'm missing? Don't know if I could persuade
one of the WGs to take it up, but it seems pretty obvious that there
is not only industry demand but in fact people doing this already, so
the case is pretty strong I think. -Tim