xerces-c-dev mailing list archives

From "Jon Smirl" <jonsm...@mediaone.net>
Subject Re: DOM Performance
Date Thu, 01 Feb 2001 16:21:43 GMT
From: <David_N_Bertoni@lotus.com>
> Another interesting memory saver for a non-mutable DOM is the pooling of
> strings.  This turned out to be extremely effective in Xalan-C++.
> Parse/tree creation times are greatly reduced for large documents, since
> many tag names, attribute names, attribute values, and text nodes are
> identical.

Excepting the text nodes, isn't this similar to having a schema structure in
memory that describes all of the tags, attributes, default attribute values,
etc? You're just discovering the schema as you build the DOM tree.

I haven't looked at how Xalan is doing this, but instead of pooling strings,
what do you think about building a schema description in memory that
describes the tag structure of the document? The schema would be built as
you discover its structure while building the DOM tree.

For example, a schema node would consist of a pointer to a tag name,
pointers to attribute names that exist on the tag, and possibly pointers to
common attribute values. When you build a node in the DOM tree, the DOM tree
node would point to the schema node and contain an array of pointers
corresponding to the attributes and values.
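
A rough C++ sketch of the layout described above may make it more concrete. Every name here (SchemaNode, ElementNode, DerivedSchema, slotFor, poolValue) is made up for illustration; none of it is Xerces-C or Xalan-C++ API, and it assumes that attribute slots are assigned in the order they are first seen on a tag.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical: one SchemaNode per distinct tag name, discovered while parsing.
struct SchemaNode {
    std::string tagName;                 // stored once, shared by every element
    std::vector<std::string> attrNames;  // attribute names seen on this tag

    // Return the slot for an attribute name, adding it on first sight.
    std::size_t slotFor(const std::string& name) {
        for (std::size_t i = 0; i < attrNames.size(); ++i)
            if (attrNames[i] == name) return i;
        attrNames.push_back(name);
        return attrNames.size() - 1;
    }
};

// Hypothetical DOM element: a pointer to its schema node plus an array of
// value pointers, index-aligned with schema->attrNames (nullptr = absent).
struct ElementNode {
    SchemaNode* schema;
    std::vector<const std::string*> values;

    void setAttribute(const std::string& name, const std::string* pooledValue) {
        std::size_t slot = schema->slotFor(name);
        if (values.size() <= slot) values.resize(slot + 1, nullptr);
        values[slot] = pooledValue;
    }

    const std::string* getAttribute(const std::string& name) const {
        const std::vector<std::string>& names = schema->attrNames;
        for (std::size_t i = 0; i < names.size() && i < values.size(); ++i)
            if (names[i] == name) return values[i];
        return nullptr;
    }
};

// Hypothetical owner of the derived schema and of the shared attribute values.
class DerivedSchema {
public:
    SchemaNode* nodeFor(const std::string& tag) {
        SchemaNode& node = nodes_[tag];      // std::map keeps addresses stable
        if (node.tagName.empty()) node.tagName = tag;
        return &node;
    }
    const std::string* poolValue(const std::string& value) {
        return &*values_.insert(value).first;  // one copy per distinct value
    }
    std::size_t tagCount() const { return nodes_.size(); }

private:
    std::map<std::string, SchemaNode> nodes_;
    std::set<std::string> values_;
};
```

With this shape, two <item lang="en"> elements share one SchemaNode and one copy of "en"; each element only carries a schema pointer and a small array of value pointers.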

Taking this further, the derived schema can be used to speed up XSLT
operations by attaching templates to the schema node or using the schema as
a way to find all occurrences of a tag in the document.
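
As a sketch of that last idea, assuming each schema node keeps a list of the elements that use it and an optional callback standing in for an XSLT template. The names (Document, findAll, attachTemplate, apply) are hypothetical, and this is not how Xalan actually matches templates.

```cpp
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <vector>

struct Element;

// Hypothetical schema node that doubles as an index and a template hook.
struct SchemaNode {
    std::string tagName;
    std::vector<const Element*> occurrences;            // in document order
    std::function<std::string(const Element&)> templ;   // stand-in for a template
};

struct Element {
    const SchemaNode* schema;
    std::string text;
};

class Document {
public:
    const Element* addElement(const std::string& tag, const std::string& text) {
        SchemaNode& node = schemaFor(tag);
        elements_.push_back(Element{&node, text});  // deque: addresses stay valid
        node.occurrences.push_back(&elements_.back());
        return &elements_.back();
    }

    // Find every element with this tag without walking the tree.
    const std::vector<const Element*>& findAll(const std::string& tag) const {
        static const std::vector<const Element*> none;
        std::map<std::string, SchemaNode>::const_iterator it = schema_.find(tag);
        return it == schema_.end() ? none : it->second.occurrences;
    }

    // Attach a template once per tag instead of matching it per element.
    void attachTemplate(const std::string& tag,
                        std::function<std::string(const Element&)> fn) {
        schemaFor(tag).templ = fn;
    }

    // Apply the attached template, or fall back to the element's text.
    std::string apply(const Element& e) const {
        return e.schema->templ ? e.schema->templ(e) : e.text;
    }

private:
    SchemaNode& schemaFor(const std::string& tag) {
        SchemaNode& node = schema_[tag];
        if (node.tagName.empty()) node.tagName = tag;
        return node;
    }

    std::map<std::string, SchemaNode> schema_;
    std::deque<Element> elements_;
};
```

Here findAll("item") is a single map lookup, and the template lookup for an element is one pointer dereference through its schema node.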

Jon Smirl
