jackrabbit-dev mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] Commented: (JCR-2218) NodeEntryImpl.getWorkspaceId() very inefficient
Date Fri, 24 Jul 2009 12:13:14 GMT

    [ https://issues.apache.org/jira/browse/JCR-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735039#action_12735039 ]

Michael Dürig commented on JCR-2218:

I just checked the effect of an alternative approach: call site caching of IdFactory calls.

public NodeId getId() throws InvalidItemStateException, RepositoryException {
    IdFactory idFactory = getIdFactory();
    PathFactory pathFactory = getPathFactory();
    IdCache idCache = getIdCache();

    if (uniqueID != null) {
        // Entry is identified by its unique ID alone
        NodeId nodeId = idCache.get(uniqueID);
        if (nodeId == null) {
            nodeId = idFactory.createNodeId(uniqueID);
            idCache.put(uniqueID, nodeId);
        }
        return nodeId;
    } else if (parent == null) { // root
        NodeId nodeId = idCache.get("ROOT");
        if (nodeId == null) {
            nodeId = idFactory.createNodeId((String) null, pathFactory.getRootPath());
            idCache.put("ROOT", nodeId);
        }
        return nodeId;
    } else {
        // Entry is identified by its parent's id plus a relative path (name and index)
        NodeId parentId = parent.getId();
        Name name = getName();
        int index = getIndex();
        NodeId nodeId = idCache.get(parentId, name, index);
        if (nodeId == null) {
            Path path = pathFactory.create(name, index);
            nodeId = idFactory.createNodeId(parentId, path);
            idCache.put(parentId, name, index, nodeId);
        }
        return nodeId;
    }
}

My profiling shows that there is not much to be gained from this. This is in line with
an earlier observation that looking up ItemIds in a hash map costs about the same
as creating new ItemIds. The main cost comes from the equals and hashCode methods
of the various classes involved in comparing ItemIds.
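
For illustration only, here is a minimal sketch of what an IdCache like the one used above might
look like. The class is hypothetical and not part of Jackrabbit; the type parameters I (id type)
and N (name type) stand in for NodeId and Name so that the sketch compiles on its own, and the
ChildKey helper is likewise an assumption. It shows why the lookup is not free: every
get(parentId, name, index) allocates a composite key and calls hashCode (and, on a hit, equals)
on the parent id and the name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical call-site cache (an assumption, not a Jackrabbit class). */
final class IdCache<I, N> {

    /** Composite key for ids addressed by (parentId, name, index). */
    private static final class ChildKey {
        private final Object parentId;
        private final Object name;
        private final int index;

        ChildKey(Object parentId, Object name, int index) {
            this.parentId = parentId;
            this.name = name;
            this.index = index;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof ChildKey)) {
                return false;
            }
            ChildKey other = (ChildKey) o;
            // Delegates to equals() of the parent id and the name
            return index == other.index
                    && parentId.equals(other.parentId)
                    && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            // Delegates to hashCode() of the parent id and the name
            return Objects.hash(parentId, name, index);
        }
    }

    private final Map<String, I> byUniqueId = new HashMap<>();
    private final Map<ChildKey, I> byParentAndName = new HashMap<>();

    I get(String uniqueId) {
        return byUniqueId.get(uniqueId);
    }

    void put(String uniqueId, I id) {
        byUniqueId.put(uniqueId, id);
    }

    I get(I parentId, N name, int index) {
        return byParentAndName.get(new ChildKey(parentId, name, index));
    }

    void put(I parentId, N name, int index, I id) {
        byParentAndName.put(new ChildKey(parentId, name, index), id);
    }
}
```

If the parent id's own hashCode and equals walk a parent chain or compare paths, a cache hit does
work comparable to building the id from scratch, which would be consistent with the profiling
result above.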

> NodeEntryImpl.getWorkspaceId() very inefficient 
> ------------------------------------------------
>                 Key: JCR-2218
>                 URL: https://issues.apache.org/jira/browse/JCR-2218
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-jcr2spi
>            Reporter: Michael Dürig
> NodeEntryImpl.getWorkspaceId() calculates its path on each call by calling itself recursively.
> Further, each call to getWorkspaceId() results in various calls to the path and item factories,
> which might be somewhat expensive by themselves.
> In my test scenario I have a RepositoryService.getItemInfos() call returning ~1000 items.
> Processing these items results in about 2700000 (!) calls to getWorkspaceId(). Profiler data
> shows that 98% of the time to process the 1000 items is spent in getWorkspaceId() and related

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
