xerces-c-dev mailing list archives

From David_N_Bert...@lotus.com
Subject Re: DOM Performance
Date Thu, 01 Feb 2001 16:11:47 GMT

On 02/01/01, Miroslaw Dobrzanski-Neumann wrote:
> On Wed, Jan 31, 2001 at 11:20:01AM -0800, Andy Heninger wrote:
> > DOM Performance Problems
> > ========================
> >
> > Here are some thoughts on what might be done:
> >
> > o  Associate all of the storage for a DOM document with the document
> >    node object.  Applications would get document from the parser,
> >    access it, and then explicitly delete it when done.  All Nodes and
> >    strings would remain until the document was deleted.
> For a long time we (my company) had similar memory problems. So we
> analysed how the memory is used in our code. It turned out that we needed
> a kind of write once + random read optimization and that the memory was
> bound to some higher level object. Our solution was to use an allocator
> bound to the object. It preallocates a big block and distributes it
> according to the requests. There is reuse inside of the block. The whole
> block will be freed when the object dies. The implementation of such an
> allocator is very simple and

This is something like what the new source tree implementation in Xalan-C++
does.  The document contains an allocator for each type of node in the
tree.  The allocators request memory in chunks large enough to store n
objects of that type, where n can be set for individual node types within
any document instance.

Since the XSLT source tree is not mutable, the allocators do not need any
overhead to track free space within blocks -- they create nodes on demand,
but never need to destroy them.  When the document is destroyed, the
allocators are destroyed, at which point they run the destructors for all
of the objects that they created, then delete any blocks that were
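
A minimal sketch of the per-type allocator idea described above, for
illustration only. It is not the Xalan-C++ code; the class name
BlockAllocator and its members are hypothetical. It requests memory in
blocks of n objects, creates objects on demand, never frees a single
object, and on destruction runs the destructors and then releases the
blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical per-type block allocator (not the Xalan-C++ class).
// Objects are created on demand and are only destroyed all at once,
// so no free-space bookkeeping is needed inside a block.
template <typename T>
class BlockAllocator {
public:
    // objectsPerBlock is the "n" for this node type; 0 is treated as 1.
    explicit BlockAllocator(std::size_t objectsPerBlock)
        : perBlock_(objectsPerBlock == 0 ? 1 : objectsPerBlock) {}

    BlockAllocator(const BlockAllocator&) = delete;
    BlockAllocator& operator=(const BlockAllocator&) = delete;

    ~BlockAllocator() {
        // First run the destructor of every object that was created...
        for (std::size_t b = 0; b < blocks_.size(); ++b) {
            const std::size_t used =
                (b + 1 == blocks_.size()) ? usedInLast_ : perBlock_;
            T* objects = static_cast<T*>(blocks_[b]);
            for (std::size_t i = 0; i < used; ++i) objects[i].~T();
        }
        // ...then release the blocks themselves.
        for (void* block : blocks_) ::operator delete(block);
    }

    // Construct a T in the next free slot, adding a block when needed.
    template <typename... Args>
    T* create(Args&&... args) {
        if (blocks_.empty() || usedInLast_ == perBlock_) {
            void* raw = ::operator new(perBlock_ * sizeof(T));
            try {
                blocks_.push_back(raw);
            } catch (...) {
                ::operator delete(raw);
                throw;
            }
            usedInLast_ = 0;
        }
        T* slot = static_cast<T*>(blocks_.back()) + usedInLast_;
        ::new (static_cast<void*>(slot)) T(std::forward<Args>(args)...);
        ++usedInLast_;  // counted only after construction succeeded
        return slot;
    }

    std::size_t blockCount() const { return blocks_.size(); }

private:
    std::size_t perBlock_;
    std::size_t usedInLast_ = 0;   // objects constructed in the last block
    std::vector<void*> blocks_;    // raw storage, perBlock_ objects each
};
```

A document would hold one such allocator per node type, each with its own
block size. The sketch assumes T needs no more than the default alignment
of operator new.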

Right now, the allocators get memory from the global heap, but there is no
reason why they could not get that memory from another heap.  As an
experiment, I'm going to tweak the allocators so they can use a specific
heap, in this case, one that belongs to the document.  I'm not so sure this
is going to be any big improvement, but I do want to experiment with it.
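
One way such an experiment could look, sketched with the C++17
std::pmr facilities purely as an illustration. This is an assumption
about the shape of the idea, not the planned Xalan-C++ change; TextNode,
HeapBoundAllocator and Document are hypothetical names.

```cpp
#include <cstddef>
#include <memory_resource>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical node type, kept trivial for the example.
struct TextNode {
    explicit TextNode(int v) : value(v) {}
    int value;
};

// Hypothetical allocator that takes its memory from a caller-supplied
// heap (a std::pmr::memory_resource) rather than the global heap.
template <typename T>
class HeapBoundAllocator {
    // The sketch never runs destructors, so restrict it to types
    // for which that is harmless.
    static_assert(std::is_trivially_destructible_v<T>,
                  "sketch only handles trivially destructible types");
public:
    explicit HeapBoundAllocator(std::pmr::memory_resource* heap)
        : heap_(heap) {}

    template <typename... Args>
    T* create(Args&&... args) {
        void* raw = heap_->allocate(sizeof(T), alignof(T));
        return ::new (raw) T(std::forward<Args>(args)...);
    }

private:
    std::pmr::memory_resource* heap_;  // not owned
};

// Hypothetical document that owns the heap its nodes live in.
class Document {
public:
    Document() : heap_(), textNodes_(&heap_) {}

    TextNode* createText(int v) { return textNodes_.create(v); }

private:
    // Declared before the allocator so it is constructed first and
    // destroyed last; all node memory is released with it.
    std::pmr::monotonic_buffer_resource heap_;
    HeapBoundAllocator<TextNode> textNodes_;
};
```

Nodes never give memory back individually here; everything is released
in one step when the document and its heap are destroyed.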

Another interesting memory saver for a non-mutable DOM is the pooling of
strings.  This turned out to be extremely effective in Xalan-C++.
Parse/tree creation times are greatly reduced for large documents, since
many tag names, attribute names, attribute values, and text nodes are

> This solution could also be applicable for DOM. In most cases the DOM
> is built in a straightforward process with nearly no deletion or
> modifications. Also write once, random read, free when the last node dies

I agree 100%.  In those cases where the tree is not changed very much, the
savings in allocation overhead would be greater than the amount of memory
wasted for a few nodes and strings that aren't deleted when removed from
the tree.

> --
> Send bugreports, fixes, enhancements, t-shirts, money, beer & pizza to
> ===================================
> Miroslaw Dobrzanski-Neumann

