Re: XML processing experiments

James Clark (jjc@jclark.com)
Sat, 08 Nov 1997 12:27:51 +0700


David McKelvie wrote:

> We started off doing something like this in LTNSL, but stopped doing
> filemapping (a) because it wasn't very portable and

What systems did you have problems with? Win32 supports it and I thought
most modern Unix systems now did.

> (b) either you do
> some tricky decisions about when you free these pointers into the
> source or it makes reading huge corpora like the 2 gigabyte BNC corpus
> impossible which we wanted to be able to do.

Yes, I can see that's a problem. How common do people think XML files
bigger than 1 gigabyte or so are going to be? How hard would it be to
use external entity references to split it up into files smaller than 1
gigabyte?
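
To make the splitting idea concrete, here is a rough sketch (in Python,
purely for illustration) of generating a small driver document that pulls
in each piece through an external general entity. The function name, the
entity names (part1, part2, ...) and the file names are all hypothetical,
and it assumes each piece is itself a well-formed external parsed entity,
i.e. the split falls on element boundaries.

```python
def build_driver(root, chunk_files):
    """Return the text of a driver XML document that includes each chunk
    file via an external general entity declared in the internal subset."""
    decls = []
    refs = []
    for i, name in enumerate(chunk_files, start=1):
        entity = f"part{i}"  # hypothetical naming scheme
        decls.append(f'  <!ENTITY {entity} SYSTEM "{name}">')
        refs.append(f"  &{entity};")
    lines = (
        [f"<!DOCTYPE {root} ["]
        + decls
        + ["]>", f"<{root}>"]
        + refs
        + [f"</{root}>"]
    )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical chunk names; each would hold under 1 gigabyte of markup.
    print(build_driver("corpus", ["chunk1.xml", "chunk2.xml"]))
```

With a driver like this, a parser that maps files into memory would only
need one piece mapped at a time, provided it releases each entity's
mapping once it has finished with it.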

James