jackrabbit-dev mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Memory use of item states
Date Fri, 16 May 2008 13:44:54 GMT
Hi,

I recently did some quick benchmarking and testing related to the
memory requirements of large transient spaces. My main test case was:

    Node root = ...;
    for (int i = 0; i < 100; i++) {
        Node nodeI = root.addNode("node" + i);
        for (int j = 0; j < 1000; j++) {
            Node nodeJ = nodeI.addNode("node" + j);
            for (int k = 0; k < 10; k++) {
                nodeJ.setProperty("prop" + k, "value");
            }
        }
    }

This created a transient space of about 100k nodes and 1M properties
(all sharing the same value instance, so only the size of the internal
item structures was counted).

Creating just the transient nodes (including the mandatory
jcr:primaryType) required 87MB of memory and the properties added
154MB for a total size of 241MB. Thus a simple transient node costs
about 900 bytes and a simple transient property about 160 bytes.
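
The per-item figures above follow from dividing the measured totals by the
item counts of the test loop (100 + 100 * 1000 = 100,100 nodes and
100 * 1000 * 10 = 1,000,000 properties). A minimal sketch of that
arithmetic, assuming that "MB" means 2^20 bytes; the class and method names
are hypothetical, and usedHeap() is only one common way to take such a
measurement, since the message does not say how the numbers were obtained:

```java
// Hypothetical helper, not part of Jackrabbit.
final class TransientSpaceCost {

    // Assumption: the reported "MB" figures are multiples of 2^20 bytes.
    static final long MB = 1024L * 1024L;

    // Approximate heap in use. gc() is only a hint to the JVM, so the
    // result is an estimate; call this before and after building the
    // transient space and take the difference.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Average cost in bytes of one item, rounded down.
    static long bytesPerItem(long megabytes, long itemCount) {
        return megabytes * MB / itemCount;
    }
}
```

With the numbers from this message, bytesPerItem(87, 100100) gives 911 and
bytesPerItem(154, 1000000) gives 161, in line with the rounded figures of
about 900 and 160 bytes.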

I then tried a number of simple memory optimizations (with saved
memory in MB and as a percentage of the 241MB):

* Remove memoized hashCodes from item IDs: 8MB ~ 3%
* Make NodeState.propertyNames a sorted list instead of a HashSet: 18MB ~ 7%
* Use a single InternalValue object instead of an array for
single-valued properties: 17MB ~ 7%
* Use the underlying value objects directly instead of the
InternalValue wrappers in PropertyState: 18MB ~ 7%
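
To illustrate the second item, here is a rough sketch of a sorted name list
that could stand in for a HashSet. It is not the change that was actually
tested: the class name is hypothetical, and String is used where Jackrabbit
has its own name type. The saving comes from storing one reference per name
in a backing array instead of one hash table entry object per name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for NodeState.propertyNames.
final class SortedNameList {

    // Kept in ascending order at all times.
    private final ArrayList<String> names = new ArrayList<String>();

    // Inserts the name at its sorted position; returns false if present.
    boolean add(String name) {
        int pos = Collections.binarySearch(names, name);
        if (pos >= 0) {
            return false;
        }
        // binarySearch returns -(insertion point) - 1 for a missing key.
        names.add(-pos - 1, name);
        return true;
    }

    // O(log n) lookup instead of the O(1) lookup of a HashSet.
    boolean contains(String name) {
        return Collections.binarySearch(names, name) >= 0;
    }

    // Removes the name if present; returns false otherwise.
    boolean remove(String name) {
        int pos = Collections.binarySearch(names, name);
        if (pos < 0) {
            return false;
        }
        names.remove(pos);
        return true;
    }

    // Read-only view in sorted order.
    List<String> asList() {
        return Collections.unmodifiableList(names);
    }
}
```

The trade-off is O(n) insertion and removal, which should matter little for
nodes with only a handful of properties.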

All of these improvements were achieved with reasonably localized and
risk-free changes. We might want to look at implementing them properly
at some point.

Finally, I combined the above with a rather major change that broke
all sorts of things but resulted in some pretty impressive memory
savings:

* Use a properties Map in NodeState instead of individually addressed
and managed PropertyStates: 106MB ~ 43%
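
As a very rough illustration of this idea (not the code that was tested),
a node state could hold its property values directly in a map keyed by
property name, so that a property no longer needs its own state object and
ID. All names below are hypothetical, and String and Object stand in for
Jackrabbit's own name and value types:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the real NodeState is considerably richer.
final class CompactNodeState {

    // Property name -> value, owned by the node itself.
    private final Map<String, Object> properties =
            new HashMap<String, Object>();

    // Adds or overwrites a property value.
    void setProperty(String name, Object value) {
        properties.put(name, value);
    }

    // Returns the value, or null if there is no such property.
    Object getProperty(String name) {
        return properties.get(name);
    }

    boolean hasProperty(String name) {
        return properties.containsKey(name);
    }

    int getPropertyCount() {
        return properties.size();
    }
}
```

In a layout like this a property cannot be addressed or tracked on its own,
which is presumably part of why the change broke so many things.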

BR,

Jukka Zitting
