jackrabbit-dev mailing list archives

From "Marcel Reutegger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-3537) Large number of SQL queries when adding nodes with version history
Date Thu, 04 Apr 2013 15:51:19 GMT

    [ https://issues.apache.org/jira/browse/JCR-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622435#comment-13622435 ]

Marcel Reutegger commented on JCR-3537:

Thanks for the test and the instructions. This is very helpful.

The child nodes in question are part of the version storage content structure. E.g. a version
history node may have a path like this: {{/jcr:system/jcr:versionStorage/40/ed/38/40ed38c9-cb13-41dc-8602-0249e2c1b4c2}}
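
To make the layout concrete, here is a minimal sketch of how such a path could be derived from a version history id. It assumes, based only on the example path above, that the three intermediate node names are the first three two-character hex pairs of the id. The class and method names are hypothetical and are not Jackrabbit API.

```java
// Illustrative sketch only; not Jackrabbit's implementation.
final class VersionStoragePathSketch {
    // Root of the version storage, as it appears in the example path.
    static final String ROOT = "/jcr:system/jcr:versionStorage";

    // Assumption: each of the three intermediate levels is named after
    // one two-character hex pair taken from the start of the id.
    static String historyPath(String uuid) {
        StringBuilder path = new StringBuilder(ROOT);
        for (int level = 0; level < 3; level++) {
            path.append('/').append(uuid, level * 2, level * 2 + 2);
        }
        return path.append('/').append(uuid).toString();
    }
}
```

Under that assumption, calling {{historyPath}} with the id from the example yields the example path, with intermediate nodes {{40}}, {{ed}} and {{38}}.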

Now, whenever a new versionable node is created, a new version history is created, and a new
intermediate node may have to be created as well. Todd already identified this in {{InternalVersionManagerBase#getParentNode}}.
With the current implementation and its recursive behavior, all siblings of the added
node are also checked for modifications. This may lead to up to 255 additional calls until
all intermediate nodes are present (from '00' to 'ff' on the three levels).
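
The cost of the cascade can be illustrated with a toy model. This is only a sketch: {{ToyNode}} and {{StoreCascadeSketch}} are hypothetical classes, not Jackrabbit code, and they simply count how many nodes are touched when a parent is stored recursively versus when only the parent and the newly added child are stored.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; these are not Jackrabbit classes.
final class ToyNode {
    final List<ToyNode> children = new ArrayList<>();

    // Recursive store: touches this node, then every child below it.
    // Returns the number of nodes touched.
    int storeRecursively() {
        int touched = 1;
        for (ToyNode child : children) {
            touched += child.storeRecursively();
        }
        return touched;
    }

    // Targeted store: touches only this node and leaves children alone.
    int storeSelfOnly() {
        return 1;
    }
}

final class StoreCascadeSketch {
    // Parent with existing siblings plus one new child, stored recursively.
    static int touchedByRecursiveStore(int existingSiblings) {
        ToyNode parent = new ToyNode();
        for (int i = 0; i < existingSiblings; i++) {
            parent.children.add(new ToyNode());
        }
        parent.children.add(new ToyNode()); // the newly added node
        return parent.storeRecursively();
    }

    // Same setup, but only the parent and the new child are stored.
    static int touchedByTargetedStore(int existingSiblings) {
        ToyNode parent = new ToyNode();
        for (int i = 0; i < existingSiblings; i++) {
            parent.children.add(new ToyNode());
        }
        ToyNode added = new ToyNode();
        parent.children.add(added);
        return parent.storeSelfOnly() + added.storeSelfOnly();
    }
}
```

In this model, a parent with 255 existing siblings touches 257 nodes on a recursive store and 2 on a targeted one, a difference of 255, which corresponds to the worst case described above.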

The provided patch looks good and IMO addresses the problem correctly. There may be other
cases where the recursive call is unnecessary, but we can address those later.
> Large number of SQL queries when adding nodes with version history
> ------------------------------------------------------------------
>                 Key: JCR-3537
>                 URL: https://issues.apache.org/jira/browse/JCR-3537
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: versioning
>    Affects Versions: 2.5
>         Environment: Windows 2008, tomcat application server, SQL Server 2008 database
>            Reporter: Todd Pagni
>              Labels: newbie, patch
>         Attachments: jackrabbit-core-2.5.0-version-history.patch, jackrabbit-debug.rar
> We are adding a large number of documents to a Jackrabbit 2.5 database repository. We
are using the bundle.MSSqlPersistenceManager and are seeing a large number of SQL queries
(300+) when adding a single folder, file, and file content. This appears to create a significant
performance bottleneck when adding documents once the repository size exceeds 300k documents/nodes.
The repository structure is a hierarchy with fewer than 1000k child nodes per parent. The
following is an example structure of the repo, with (New child folder) representing the
new content being added.
> -- Root node
>     -- Parent node
>         -- New child folder (mix:versionable,mix:lockable)
>             -- new file (mix:versionable,mix:lockable)
>                 -- new document content
>         -- Existing child folder
>     -- Parent node
> The vast majority of the 200-300+ queries that execute when adding a node look like the following:
> exec sp_execute 2,0x5740D9A36F2E4032BFF0BA652D89FFB8
> exec sp_execute 2,0xBBFE059BF7E44947A8B0858F3CE33DB8
> exec sp_execute 2,0xC2AD22DBE1DB43A083BCA1B2C94E07CC
> The majority of the queries that are executed appear to be related to versioning. When
a node is added, the version history for the node is stored/saved and the parent node is saved, which
ultimately cascades and saves all children of the parent, so adding a child node saves the
parent and all other children.
> We have created a patch for jackrabbit-core 2.5.0 that prevents the cascade from storing all
other child nodes when saving/storing the version history of a new node. This cuts the number
of queries that are executed in half. Does anyone see a problem with this technique? All
unit tests are still passing.

