jackrabbit-dev mailing list archives

From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] Commented: (JCR-2218) NodeEntryImpl.getWorkspaceId() very inefficient
Date Thu, 16 Jul 2009 14:14:15 GMT

    [ https://issues.apache.org/jira/browse/JCR-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731953#action_12731953 ]

Michael Dürig commented on JCR-2218:

To fix this, I propose caching the NodeId and/or path per NodeEntry. On certain operations
(such as move; further operations to be identified) the cache needs to be invalidated. To avoid
having to invalidate the cache of each entry in the subtree rooted at a specific item, I
propose deferring cache validity checks as long as possible (i.e. until getWorkspaceId()
is called). The cache for an entry is valid if neither the entry itself nor any of its
ancestors is marked as invalid. If the cache for an entry is determined to be invalid, its path
is recalculated, thereby clearing any invalid cache markers on the path to the root. Note that
when the marker of an entry is cleared, all child entries of that entry need to be marked (with
the exception of the child entry whose path is being recalculated).
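
The scheme above could be sketched roughly as follows. This is only an illustration, not the
jcr2spi code: CachedEntry, move() and getPath() are hypothetical names, a plain string path
stands in for the NodeId, and carrying the "stale" state down the path during recalculation
is one possible reading of the marker-clearing rule.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for a NodeEntry; not the real NodeEntryImpl. */
final class CachedEntry {
    private final List<CachedEntry> children = new ArrayList<>();
    private CachedEntry parent;
    private String name;
    private String cachedPath;  // null until first computed
    private boolean invalid;    // lazy invalidation marker

    CachedEntry(CachedEntry parent, String name) {
        this.parent = parent;
        this.name = name;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    /** Move marks only this entry; the subtree below it is not visited. */
    void move(CachedEntry newParent, String newName) {
        if (parent != null) {
            parent.children.remove(this);
        }
        parent = newParent;
        name = newName;
        newParent.children.add(this);
        invalid = true;
    }

    /** Validity is checked only here, i.e. as late as possible. */
    String getPath() {
        if (cachedPath == null || markedOnPathToRoot()) {
            recalculate();
        }
        return cachedPath;
    }

    /** The cache is valid only if neither this entry nor any ancestor is marked. */
    private boolean markedOnPathToRoot() {
        for (CachedEntry e = this; e != null; e = e.parent) {
            if (e.invalid) {
                return true;
            }
        }
        return false;
    }

    /** Walks from the root down to this entry, refreshing caches and moving markers. */
    private void recalculate() {
        Deque<CachedEntry> chain = new ArrayDeque<>();
        for (CachedEntry e = this; e != null; e = e.parent) {
            chain.push(e); // the root ends up at the head
        }
        boolean stale = false; // true once a marked entry has been passed
        while (!chain.isEmpty()) {
            CachedEntry e = chain.pop();
            CachedEntry next = chain.peek(); // next entry on the path, or null
            if (e.invalid || stale) {
                stale = true;
                e.invalid = false;
                e.cachedPath = null;
                // Push the marker to all children except the one being recalculated.
                for (CachedEntry c : e.children) {
                    if (c != next) {
                        c.invalid = true;
                    }
                }
            }
            if (e.cachedPath == null) {
                e.cachedPath = e.buildPath(); // parent cache is fresh at this point
            }
        }
    }

    private String buildPath() {
        if (parent == null) {
            return "/";
        }
        return parent.parent == null ? "/" + name : parent.cachedPath + "/" + name;
    }
}
```

In this sketch getPath() still walks to the root on every call, but it only tests boolean
flags there; the comparatively expensive path construction runs only for entries whose cache
is actually stale.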

> NodeEntryImpl.getWorkspaceId() very inefficient 
> ------------------------------------------------
>                 Key: JCR-2218
>                 URL: https://issues.apache.org/jira/browse/JCR-2218
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-jcr2spi
>            Reporter: Michael Dürig
> NodeEntryImpl.getWorkspaceId() calculates its path on each call by calling itself recursively.
Furthermore, each call to getWorkspaceId() results in various calls to the path and item factories,
which might be somewhat expensive by themselves.
> In my test scenario I have a RepositoryService.getItemInfos() call returning ~1000 items.
Processing these items results in about 2700000 (!) calls to getWorkspaceId(). Profiler data
shows that 98% of the time to process the 1000 items is spent in getWorkspaceId() and related

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
