xerces-c-dev mailing list archives

From "Andy Heninger" <an...@jtcsv.com>
Subject Re: DOM Performance
Date Thu, 01 Feb 2001 17:12:03 GMT
Dave Bertoni wrote:

> Another interesting memory saver for a non-mutable DOM is the pooling of
> strings.  This turned out to be extremely effective in Xalan-C++.
> Parse/tree creation times are greatly reduced for large documents, since
> many tag names, attribute names, attribute values, and text nodes are
> identical.

String pooling for element and attribute names is already in the
xerces-c DOM.  It probably makes sense for attribute values as well, at
least for most documents, and for whitespace-only element content.
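
To make the idea concrete, here is a minimal sketch of a string pool in
standard C++.  It is not the xerces-c or Xalan-C++ code; the class name
StringPool and the method intern() are hypothetical, chosen only for
illustration.  Equal strings are stored once and every caller gets a
pointer to the same copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical illustration only; not the xerces-c or Xalan-C++ pool.
class StringPool {
public:
    // Returns a pointer to the single pooled copy of `s`.
    // Equal strings always yield the same pointer.
    const std::string* intern(const std::string& s) {
        // insert() leaves the set unchanged if an equal string is present
        // and returns an iterator to the existing element either way.
        auto result = table_.insert(s);
        const std::string* pooled = &*result.first;
        assert(*pooled == s);
        return pooled;
    }

    // Number of distinct strings currently held.
    std::size_t size() const { return table_.size(); }

private:
    // Node-based container: element addresses survive rehashing.
    std::unordered_set<std::string> table_;
};
```

With a pool like this, a document that repeats the same attribute value
thousands of times holds one copy of it, at the cost of one hash lookup
per string.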

A dynamic approach might work for document content: try pooling for the
first few hundred or so strings.  If a significant number of matches are
occurring, keep doing it; if not, give up pooling, to avoid the overhead
of hashing every string and creating a potentially huge string table.
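
A rough sketch of that dynamic approach, in standard C++, might look
like the following.  Everything here is an assumption made for
illustration: the class name AdaptiveStringPool, the probeCount and
minHitRate parameters, and the decision to check the hit rate exactly
once at the end of the probe window.  Nothing like it exists in
xerces-c as far as this message is concerned.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_set>

// Hypothetical illustration only; names and thresholds are assumptions.
class AdaptiveStringPool {
public:
    // probeCount: how many strings to pool before deciding.
    // minHitRate: fraction of probe strings that must be duplicates
    //             for pooling to continue.
    AdaptiveStringPool(std::size_t probeCount, double minHitRate)
        : probeCount_(probeCount), minHitRate_(minHitRate) {
        assert(probeCount > 0);
    }

    // Stores `s` and returns a pointer to the stored copy.
    const std::string* store(const std::string& s) {
        if (!pooling_) {
            // Pooling was abandoned: no hashing, just keep a plain copy.
            // deque::push_back does not move existing elements.
            unpooled_.push_back(s);
            return &unpooled_.back();
        }
        auto result = table_.insert(s);
        if (!result.second) ++hits_;   // an equal string was already pooled
        ++seen_;
        // Decide once, when the probe window is full.
        if (seen_ == probeCount_ &&
            static_cast<double>(hits_) / seen_ < minHitRate_) {
            pooling_ = false;
        }
        return &*result.first;
    }

    bool pooling() const { return pooling_; }

private:
    std::size_t probeCount_;
    double minHitRate_;
    std::size_t seen_ = 0;
    std::size_t hits_ = 0;
    bool pooling_ = true;
    std::unordered_set<std::string> table_;  // strings pooled so far
    std::deque<std::string> unpooled_;       // copies made after giving up
};
```

Strings pooled during the probe window stay valid after pooling is
switched off; later strings are simply copied without a lookup.  A real
implementation would probably want to tune the window and threshold
against actual documents.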

Andy Heninger
IBM XML Technology Group, Cupertino, CA
