jackrabbit-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lukas Eder (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (JCR-3557) Inefficient deletion of ChangeLog objects by AbstractBundlePersistenceManager
Date Thu, 04 Apr 2013 09:41:16 GMT

     [ https://issues.apache.org/jira/browse/JCR-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Lukas Eder updated JCR-3557:
----------------------------

    Attachment: Yourkit_Screenshot_SQL.png
                Yourkit_Screenshot_Stacktrace.png
    
> Inefficient deletion of ChangeLog objects by AbstractBundlePersistenceManager
> -----------------------------------------------------------------------------
>
>                 Key: JCR-3557
>                 URL: https://issues.apache.org/jira/browse/JCR-3557
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Lukas Eder
>            Priority: Minor
>         Attachments: Yourkit_Screenshot_SQL.png, Yourkit_Screenshot_Stacktrace.png
>
>
> Deleting a tree of nodes in Jackrabbit results in a prohibitive amount of SQL queries
being executed when using the BundleDbPersistenceManager. Please consider the attached Screenshots
from Yourkit Profiler, which illustrate how saving the deletion of a single node through the
JCR API results in 34 SQL DELETE statements for my test case:
> - Yourkit_Screenshot_Stacktrace.png: The stack trace from a single deletion. Please disregard
the absolute values of the time column. It is biased by the profiling session.
> - Yourkit_Screenshot_SQL.png: An extract of the 34 SQL statements issued by the selected
stack subtree
> While other operations are probably not very optimal either, this one is a very low hanging
fruit, in my opinion. Consider the following piece of code in AbstractBundlePersistenceManager.storeInternal():
> {code}
> for (ItemState state : changeLog.deletedStates()) {
>     if (state.isNode()) {
>         NodePropBundle bundle = getBundle((NodeId) state.getId());
>         if (bundle == null) {
>             throw new NoSuchItemStateException(state.getId().toString());
>         }
>         deleteBundle(bundle);
>         deleted.add(state.getId());
>     }
> }
> {code}
> Now, instead of iterating over deletedStates and loading / deleting them one by one,
they should be bulk-loaded / bulk-deleted as such:
> {code}
> List<NodeId> nodeIds = new ArrayList<NodeId>();
> for (ItemState state : changeLog.deletedStates()) {
>     if (state.isNode()) {
>         nodeIds.add((NodeId) state.getId());
>     }
> }
> List<NodePropBundle> bundles = new ArrayList<NodePropBundle>();
> bundles.addAll(getBundles(nodeIds)); // Can throw NoSuchItemStateException
> deleteBundles(bundles);
> deleted.addAll(nodeIds);
> {code}
> For backwards-compatibility and convenience, AbstractBundlePersistenceManager would provide
trivial default implementations for getBundles() and deleteBundles():
> {code}
> protected List<NodePropBundle> getBundles(Collection<? extends NodeId> nodeIds)
throws ItemStateException {
>     List<NodePropBundle> result = new ArrayList<NodePropBundle>();
>     for (NodeId nodeId : nodeIds) {
>         result.add(getBundle(nodeId));
>     }
>     return result;
> }
> protected void deleteBundles(Collection<? extends NodePropBundle> bundles) throws
ItemStateException {
>     for (NodePropBundle bundle : bundles) {
>         deleteBundle(bundle);
>     }
> }
> {code}
> But the BundleDbPersistenceManager could override this default behaviour by executing
something like
> {code}
> -- getBundles()
> select NODE_ID, BUNDLE_DATA from BUNDLE where NODE_ID in (?, ?, ...)
> -- deleteBundles()
> delete from BUNDLE where NODE_ID in (?, ?, ...)
> {code}
> The same also applies for deleting from the REFS table.
> If this seems like a viable improvement and if I didn't overlook something subtle where
one-by-one selection / deletion of bundle data is important, I could go ahead and provide
a patch for this issue. Once such an improvement is in place, other bundle persistence managers
might take advantage of the new algorithm as well. Also, we could review adding / updating
bundles in another JIRA ticket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message