Re: XML tools and big documents

Don Park (donpark@quake.net)
Wed, 2 Sep 1998 17:37:17 -0700


>For my implementation, for ot.xml (a 4 meg document) only about 1-2 megs of
>RAM is used to store the 4 meg file in RAM due to all Names being cached at
>the parser level. It also takes only 10-12 seconds with a P-120 running
>Symantec's

My test results were from running on an Atari 800 (just kidding <g>). My test
machine is a Pentium-133 with JDK 1.1.6 and the JIT enabled. Building the DOM
is a slow process, but there are intermediate forms I am investigating which
cut DOM loading time down drastically.
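
For reference, here is roughly what I take parser-level Name caching to
mean; a minimal sketch, not your actual code: every element and attribute
name goes through one table, so repeated names share a single String
instead of each node holding its own copy.

    import java.util.Hashtable;

    // Sketch of a parser-level name cache: each element/attribute name
    // is looked up in a table so repeated names share one String
    // instance instead of allocating a new one per node.
    public class NameCache {
        private final Hashtable names = new Hashtable();

        public String intern(String name) {
            String cached = (String) names.get(name);
            if (cached == null) {
                names.put(name, name);
                cached = name;
            }
            return cached;
        }
    }

With a document like ot.xml, where the same handful of names repeats
thousands of times, that alone would account for a good part of the memory
savings you describe.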

>JIT for JDK 1.2 b4 to build the entire DOM tree. For spitting out the DOM
>tree (and normalizing all the Text nodes) it takes about 15-20 seconds of
>which 5 seconds is spent normalizing text nodes and most of the rest of this
>time is actually spent in a brute force search and replace method that scans
>all character data and attribute values and replaces any occurrences of
>entity values with entity names. This can be very expensive but I know no
>other way around it.

Why are you normalizing text nodes before writing them out? Also, blindly
replacing entity values with entity names is error-prone: any text that just
happens to contain an entity's replacement value will get turned back into a
reference.
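
One alternative is to escape only the markup-significant characters as you
write each text node, rather than scanning for entity replacement text
afterwards. A rough sketch (only the predefined XML entities, nothing from
the DTD):

    // Escape character data at write time instead of searching for
    // entity replacement text after the fact. General entities declared
    // in the DTD would need their own handling.
    public static String escapeCharData(String text) {
        StringBuffer out = new StringBuffer(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '"':  out.append("&quot;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

That way text which merely contains an entity's replacement value is left
alone, and the expensive whole-document search and replace goes away.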

Don Park
Docuverse