xerces-c-dev mailing list archives

From "Jon Smirl" <jonsm...@mediaone.net>
Subject Re: DOM Performance
Date Thu, 01 Feb 2001 07:27:37 GMT
From: "Dean Roddey" <droddey@charmedquark.com>
> The Java parser/DOM does this, and I personally think its more complexity
> than its worth, and its worse when you really do end up touching most to all
> of the document. It can create very good looking benchmarks, but I'm not
> convinced its a real world win overall. And of course it would require
> rewriting the entire parser system.

Most real-world XSL stylesheets:
a) do not transcode; the input and output charsets are the same.
b) do not look at the contents of the text nodes. They manipulate the
elements and tags a lot, but few do anything with the text nodes except copy them.

I don't have any data to back this up, but I suspect a lot of DOM programs
have the same characteristics.

It's not obvious to me that lazy transcoding is significantly worse even if
you end up touching most of the document. The transcode on demand strategy
could allow you to control the amount of memory used for transcoded buffers
instead of forcing it all into memory at once. The smaller memory footprint
would allow DOM manipulations of large documents without paging.
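
To make the idea concrete, here is a minimal sketch of a transcode-on-demand
text node. It is not Xerces code: the LazyText class, its member names, and the
tiny UTF-8 decoder are all hypothetical, and the decoder assumes well-formed
input. The node keeps the raw bytes from the parser, transcodes to UTF-16 only
when someone asks for the characters, and can drop the transcoded buffer again
to cap memory use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical text node (not part of Xerces): stores the raw bytes it was
// parsed from and transcodes to UTF-16 only on first access.
class LazyText {
public:
    explicit LazyText(std::string rawUtf8) : raw_(std::move(rawUtf8)) {}

    // Copy-through path: a stylesheet that only copies text nodes can write
    // these bytes straight to the output without any transcoding.
    const std::string& rawBytes() const { return raw_; }

    // Inspecting path: transcode on first call, then reuse the cached buffer.
    const std::u16string& chars() {
        if (!decoded_) {
            cache_ = decodeUtf8(raw_);
            decoded_ = true;
            ++transcodes_;
        }
        return cache_;
    }

    // Release the transcoded buffer to bound memory; the raw bytes remain,
    // so chars() can rebuild it later.
    void dropCache() {
        cache_.clear();
        cache_.shrink_to_fit();
        decoded_ = false;
    }

    bool isTranscoded() const { return decoded_; }
    std::size_t transcodes() const { return transcodes_; }

private:
    // Minimal UTF-8 to UTF-16 decoder. Assumes well-formed input; a real
    // transcoder would validate and handle other encodings.
    static std::u16string decodeUtf8(const std::string& in) {
        std::u16string out;
        std::size_t i = 0;
        while (i < in.size()) {
            unsigned char b = static_cast<unsigned char>(in[i]);
            char32_t cp;
            std::size_t len;
            if (b < 0x80)      { cp = b;        len = 1; }
            else if (b < 0xE0) { cp = b & 0x1F; len = 2; }
            else if (b < 0xF0) { cp = b & 0x0F; len = 3; }
            else               { cp = b & 0x07; len = 4; }
            for (std::size_t k = 1; k < len && i + k < in.size(); ++k)
                cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
            i += len;
            if (cp < 0x10000) {
                out.push_back(static_cast<char16_t>(cp));
            } else {
                // Encode as a surrogate pair.
                cp -= 0x10000;
                out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
                out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
            }
        }
        return out;
    }

    std::string raw_;
    std::u16string cache_;
    bool decoded_ = false;
    std::size_t transcodes_ = 0;
};
```

A node that is only copied never pays for a transcode, and a caller that
walks a large document can call dropCache() on nodes it has finished with,
which is the memory control described above.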

In my own case I'm dynamically generating small pages (<30k) and I want more
speed anywhere I can get it. It's not just the time spent in the buffer
copies (three copies and two transcodes); the OS also spends a lot of time
allocating and tracking the memory used for the buffers.

You can't always draw parallels between Java implementations and C,
especially when lots of strings are involved. Xalan's DOM changes showed us
that.

Jon Smirl
jonsmirl@mediaone.net


