jackrabbit-dev mailing list archives

From "Lars Trieloff (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-642) Support flat content hierarchies
Date Thu, 30 Jul 2009 16:31:14 GMT

    [ https://issues.apache.org/jira/browse/JCR-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737163#action_12737163 ]

Lars Trieloff commented on JCR-642:

>> I think that limiting children to 100k nodes is artificial. 

I did not say that - because I know that this limitation is organic.

> The good thing about that limitation is that it forces you to think about a good content
> model, i.e. one that is also browsable by a human.
> Nevertheless in some cases it might be good if Jackrabbit would scale better with flat

Yes, especially if aspects of your content model are outside your control and introducing
a fake hierarchy would only make things more complicated at the application level.

As far as I can see, the other source of NodeStates is the ItemStateManager, which is
created by the RepositoryImpl with a reference to the PersistenceManager, so no API breaks
are necessary.

Making a lazy list writable is certainly hard, but not impossible. In the end, the following
things can happen:
- a node is being removed - this node has to be fetched before it can be removed
- a node is being added at the end of the list - easy
- a node is being added relative to another node - this other node has to be fetched beforehand
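
The three cases above can be sketched in Java as follows. This is only a minimal
illustration, not Jackrabbit code: LazyChildList, EntryLoader and all method names are
hypothetical, the loader stands in for whatever persistence-backed lookup would be used,
and iteration over the merged list is left out entirely. The fetch counter is only there
to show which operations need to touch the stored list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: none of these types are Jackrabbit classes.
final class LazyChildList {

    /** Stand-in for a persistence-backed lookup of one child entry by name. */
    interface EntryLoader {
        String loadId(String name); // returns null if no such child is stored
    }

    private final EntryLoader loader;
    private final Map<String, String> fetched = new HashMap<>();    // stored entries loaded so far
    private final Set<String> removed = new HashSet<>();            // stored entries marked removed
    private final Map<String, String> pendingIds = new HashMap<>(); // new entries: name -> id
    private final List<String> appended = new ArrayList<>();        // new entries at the end
    private final Map<String, List<String>> insertedBefore = new HashMap<>(); // anchor -> new names
    private int fetchCount = 0;

    LazyChildList(EntryLoader loader) {
        this.loader = loader;
    }

    /** Loads a single stored entry, at most once per name. */
    private String fetch(String name) {
        if (!fetched.containsKey(name)) {
            fetchCount++;
            fetched.put(name, loader.loadId(name));
        }
        return fetched.get(name);
    }

    /** Case 1: removal. A stored entry has to be fetched before it can be removed. */
    boolean remove(String name) {
        if (pendingIds.remove(name) != null) {
            // The entry was added in this session and never stored: nothing to fetch.
            appended.remove(name);
            insertedBefore.values().forEach(names -> names.remove(name));
            return true;
        }
        if (removed.contains(name) || fetch(name) == null) {
            return false;
        }
        removed.add(name);
        return true;
    }

    /** Case 2: append. Nothing is fetched; the caller is assumed to pass a unique name. */
    void addLast(String name, String id) {
        pendingIds.put(name, Objects.requireNonNull(id));
        appended.add(name);
    }

    /** Case 3: relative insert. Only the (stored) anchor entry is fetched beforehand. */
    void addBefore(String name, String id, String anchorName) {
        if (removed.contains(anchorName) || fetch(anchorName) == null) {
            throw new IllegalArgumentException("no such child: " + anchorName);
        }
        pendingIds.put(name, Objects.requireNonNull(id));
        insertedBefore.computeIfAbsent(anchorName, k -> new ArrayList<>()).add(name);
    }

    /** Number of single-entry lookups made against the stored list so far. */
    int fetchCount() {
        return fetchCount;
    }
}
```

In this sketch an append never touches the stored list, while a removal or a relative
insert costs exactly one single-entry lookup. That is the property that would let the
list stay lazy even when it is written to.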

> Support flat content hierarchies
> --------------------------------
>                 Key: JCR-642
>                 URL: https://issues.apache.org/jira/browse/JCR-642
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Jukka Zitting
> The current best practice with Jackrabbit is to avoid flat content structures due to
> performance concerns.
> These concerns are caused by the fact that the NodeState implementation requires the
> list of child node names and identifiers to be available at all times.  In fact many (all?)
> current persistence managers implement this requirement by storing and loading this list as
> a part of the serialized node state. When this list grows, the performance and memory overhead
> of managing the list grows as well. As a side note, this also creates potential consistency
> issues since the parent/child links are stored both within the child list of the parent node
> and as the parent link of the child node.
> To solve this issue, I believe we need to break the tight bonding between the node state
> and the list of child nodes. This will likely require major refactoring of the Jackrabbit
> core, including breaking the NodeState and PersistenceManager interfaces, so I don't expect
> a solution in the near future. However, we should start thinking about how to best do this, and
> at least be concerned about building in any more assumptions about the list of child nodes
> always being readily available.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
