jackrabbit-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sergiy Shyrkov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-3572) Optimization in NodeImpl.getNodes(String) and getNodes(String[])
Date Thu, 18 Apr 2013 08:07:19 GMT

    [ https://issues.apache.org/jira/browse/JCR-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13634973#comment-13634973
] 

Sergiy Shyrkov commented on JCR-3572:
-------------------------------------

Do you mean the patch against trunk or against 2.2 branch?
                
> Optimization in NodeImpl.getNodes(String) and getNodes(String[])
> ----------------------------------------------------------------
>
>                 Key: JCR-3572
>                 URL: https://issues.apache.org/jira/browse/JCR-3572
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>    Affects Versions: 2.2.13, 2.4.3, 2.5.3, 2.6
>            Reporter: Sergiy Shyrkov
>            Priority: Minor
>
> We were doing some refactoring/optimization of our code (we are using Jackrabbit 2.2.x
code) and replacing in some places traversal over child nodes (node.getNodes()) with node.getNodes(namePattern)
as it suppose to speed up the things by only returning children, matching particular name
pattern (in our case those our internationalization nodes, starting with "translation-").
> Profiling the code after we noticed that node.getNodes(namePattern) is actually slower
than simple traversal over all children checking their name match.
> We can down to the org.apache.jackrabbit.util.ChildrenCollectorFilter which is used in
the NodeImpl.getNodes(String) and getNodes(String[])
> Even if the instance of the filter is created with collectProperties set to false the
parent (javax.jcr.util.TraversingItemVisitor.visit(Node)) still reads the properties of the
nodes and iterates over them which seems useless.
> Digging deeper it seems there is an optimization which can be done to getNodes(String)
and getNodes(String[]) earlier.
> Those methods could call the itemMgr.getChildNodes(), like the getNodes() does, passing
additionally namePattern / nameGlobs.
> The ItemManager.ChildNodeEntry() already iterates on ChildNodeEntry instances, which
have Name, so the filtering can be done here, i.e. much earlier, than in the LazyItemIterator.prefetchNext().
> The adjusted code for org.apache.jackrabbit.core.ItemManager.getChildNodes(), we've tested
on our side looks as follows:
>     synchronized NodeIterator getChildNodes(NodeId parentId)
>             throws ItemNotFoundException, AccessDeniedException, RepositoryException
{
>         return getChildNodesInternal(parentId, null, null);
>     }
>     synchronized NodeIterator getChildNodes(NodeId parentId, String namePattern, String[]
nameGlobs)
>             throws ItemNotFoundException, AccessDeniedException, RepositoryException
{
>         return getChildNodesInternal(parentId, namePattern, nameGlobs);
>     }
>     private NodeIterator getChildNodesInternal(NodeId parentId, String namePattern, String[]
nameGlobs)
>             throws ItemNotFoundException, AccessDeniedException, RepositoryException
{
>         sanityCheck();
>         ItemData data = getItemData(parentId);
>         if (!data.isNode()) {
>             String msg = "can't list child nodes of property " + parentId;
>             log.debug(msg);
>             throw new RepositoryException(msg);
>         }
>         ArrayList<ItemId> childIds = new ArrayList<ItemId>();
>         Iterator<ChildNodeEntry> iter = ((NodeState) data.getState()).getChildNodeEntries().iterator();
>         while (iter.hasNext()) {
>             ChildNodeEntry entry = iter.next();
>             // delay check for read-access until item is being built
>             // thus avoid duplicate check
>             
>             // check for name
>             if (namePattern == null && nameGlobs == null ||
>                     namePattern != null
>                     && ChildrenCollectorFilter.matches(sessionContext.getJCRName(entry.getName()),
namePattern)
>                     || nameGlobs != null
>                     && ChildrenCollectorFilter.matches(sessionContext.getJCRName(entry.getName()),
nameGlobs)) {
>                 childIds.add(entry.getId());
>             }
>         }
>         return new LazyItemIterator(sessionContext, childIds, parentId);
>     }
> and in the NodeImpl we call ItemManager.getChildNodes() as follows:
>     public NodeIterator getNodes(String namePattern) throws RepositoryException {
>         return getNodesInternal(namePattern, null);
>     }
>     public NodeIterator getNodes(final String[] nameGlobs)
>             throws RepositoryException {
>         return getNodesInternal(null, nameGlobs);
>     }
>     private NodeIterator getNodesInternal(final String namePattern, final String[] nameGlobs)
throws RepositoryException {
>         return perform(new SessionOperation<NodeIterator>() {
>             public NodeIterator perform(SessionContext context)
>                     throws RepositoryException {
>                 try {
>                     return itemMgr.getChildNodes((NodeId) id, namePattern, nameGlobs);
>                 } catch (ItemNotFoundException e) {
>                     throw new RepositoryException(
>                             "Failed to list child nodes of " + NodeImpl.this, e);
>                 } catch (AccessDeniedException e) {
>                     throw new RepositoryException(
>                             "Failed to list child nodes of " + NodeImpl.this, e);
>                 }
>             }
>             public String toString() {
>                 return "node.getNodes()";
>             }
>         });
>     }
> Could you please provide your feedback if the optimization makes sense and looks consistent
or if there are any issues with those changes?
> I could provide the patch of suggested changes. I just need to know against which branch
of Jackrabbit code it should be done.
> Thank you in advance!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message