xerces-c-dev mailing list archives

From Charlie Hart <ch...@nc.rr.com>
Subject Re: DOM Performance
Date Thu, 01 Feb 2001 15:01:20 GMT
Is there any documentation for these new changes? I checked out the latest
source but didn't see anything. thanks...charlie.

Jon Smirl wrote:

> The C++ version of Xalan has recently (i.e., the version in CVS) changed from
> using the Xerces DOM to a newly written implementation. The new DOM
> implementation is completely C++ oriented and it uses standard strings.
> Right now it is meant for internal use by Xalan, not as an externally
> accessible DOM, but it might be a good starting point. Xalan DOM trees work
> in a multithreaded environment, but each DOM tree only allows
> single-threaded access.
>
> Removing the string and synchronization overhead of the Xerces DOM resulted
> in performance gains of about 30% in Xalan transforms. I expect this gain to
> be much higher on SMP machines but it hasn't been measured yet.
>
> An alternative DOM implementation that I would find very interesting is a
> DOM that does lazy charset transcoding. In other words, the source document
> is read completely into memory (via memory-mapped files if you have them).
> The parser then parses the document and builds a DOM using pointers to the
> source buffer plus a length. Then, only if the text node is accessed from
> the DOM is the string copied and converted to Unicode.
>
> This DOM implementation could have a big performance advantage in a system
> like Xalan. 99% of the time DOM strings are copied straight from the source
> document to the output stream without the need for transcoding, since most of
> the time the source and output documents are in the same charset. Note that
> this lazy transcoding system does not stop transcoding from happening; it
> just avoids it when possible. One nice side effect is a greatly reduced
> memory footprint, allowed by memory mapping the input documents.
>
> Jon Smirl
> jonsmirl@mediaone.net
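
The lazy transcoding idea in the quoted message could be sketched roughly as follows. This is only an illustration, not code from Xerces or Xalan: the class LazyTextNode and the helper writeSameCharset are hypothetical names, and the sketch assumes the source document is ISO-8859-1 so that each byte maps directly to the Unicode code point with the same value. The node keeps a pointer into the source buffer plus a length, and builds a UTF-16 copy only when the text is first requested.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical text node for illustration; not part of Xerces or Xalan.
// It stores a (pointer, length) slice of the source buffer and only
// transcodes to UTF-16 when the text is first requested.
class LazyTextNode {
public:
    LazyTextNode(const char* start, std::size_t length)
        : start_(start), length_(length), transcoded_(false) {
        assert(start != nullptr || length == 0);
    }

    // Raw bytes in the source encoding: no copy, no transcoding.
    const char* rawData() const { return start_; }
    std::size_t rawLength() const { return length_; }

    // UTF-16 text, built on first access and cached afterwards.
    // Assumption for this sketch: the source is ISO-8859-1, where every
    // byte maps to the Unicode code point with the same value.
    const std::u16string& unicode() {
        if (!transcoded_) {
            cache_.reserve(length_);
            for (std::size_t i = 0; i < length_; ++i) {
                cache_.push_back(static_cast<char16_t>(
                    static_cast<unsigned char>(start_[i])));
            }
            transcoded_ = true;
        }
        return cache_;
    }

    bool isTranscoded() const { return transcoded_; }

private:
    const char* start_;     // points into the source buffer (not owned)
    std::size_t length_;    // number of bytes in the slice
    bool transcoded_;       // true once cache_ has been filled
    std::u16string cache_;  // UTF-16 copy, empty until first access
};

// When the output uses the same charset as the source, the bytes are
// appended as they are and the node is never transcoded.
inline void writeSameCharset(const LazyTextNode& node, std::string& out) {
    out.append(node.rawData(), node.rawLength());
}
```

The source buffer (for example a memory-mapped file) has to outlive every node that points into it. The cache is filled without any locking, which only works under the one-thread-per-tree restriction described in the quoted message.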

