jackrabbit-dev mailing list archives

From "Alexander Klimetschek (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-642) Support flat content hierarchies
Date Thu, 30 Jul 2009 16:07:14 GMT

    [ https://issues.apache.org/jira/browse/JCR-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737154#action_12737154 ]

Alexander Klimetschek commented on JCR-642:

> I think that limiting children to 100k nodes is artificial.

The good thing about that limitation is that it forces you to think about a good content model,
i.e. one that is also browsable by a human. Nevertheless, in some cases it would be good if
Jackrabbit scaled better with flat hierarchies.

> If a PersistenceManager would return a LazyNodeState that would itself contain a LazyChildNodeEntries

This has to work in both directions: when a long list of child node entries is first created,
that happens in the transient space, before the PM gets the node in order to store it.
Already at that early point things can get slow and memory-consuming (although I have no
measurements to present). The implementation of the child node entries must therefore
work in both directions, i.e. a LazyChildNodeEntries that fetches the entries step by step
is probably hard to make writeable. Last but not least, it also depends on how the list is
actually used inside Jackrabbit.
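
To make the "both directions" point concrete, here is a minimal sketch of what such a
list could look like. Everything in it is an assumption for illustration: the class name
LazyChildEntriesSketch, the page loader callback and the page size are hypothetical and are
not part of Jackrabbit's NodeState or PersistenceManager interfaces. Reads pull persisted
entries in one page at a time; writes only go to an in-memory, append-only list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical sketch only; this is not Jackrabbit's child node entries API. */
final class LazyChildEntriesSketch {
    private final int persistedCount;                 // entries already in the store
    private final int pageSize;                       // entries fetched per load
    private final IntFunction<List<String>> pageLoader; // page index -> child names
    private final Map<Integer, List<String>> loadedPages = new HashMap<>();
    private final List<String> transientAdds = new ArrayList<>();
    int pageLoads = 0;                                // how often the store was hit

    LazyChildEntriesSketch(int persistedCount, int pageSize,
                           IntFunction<List<String>> pageLoader) {
        this.persistedCount = persistedCount;
        this.pageSize = pageSize;
        this.pageLoader = pageLoader;
    }

    /** Read direction: only the page containing the index is fetched. */
    String get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (index >= persistedCount) {
            return transientAdds.get(index - persistedCount);
        }
        int page = index / pageSize;
        List<String> names = loadedPages.get(page);
        if (names == null) {
            names = pageLoader.apply(page);
            loadedPages.put(page, names);
            pageLoads++;
        }
        return names.get(index % pageSize);
    }

    /** Write direction: new entries are kept in memory until saved. */
    void add(String name) {
        transientAdds.add(name);
    }

    int size() {
        return persistedCount + transientAdds.size();
    }

    /** What a save would have to hand over to the persistence layer. */
    List<String> pendingAdds() {
        return new ArrayList<>(transientAdds);
    }

    public static void main(String[] args) {
        // In-memory stand-in for a store holding 250 children, 100 per page.
        LazyChildEntriesSketch entries = new LazyChildEntriesSketch(250, 100, page -> {
            List<String> names = new ArrayList<>();
            for (int i = page * 100; i < Math.min(250, (page + 1) * 100); i++) {
                names.add("child" + i);
            }
            return names;
        });
        System.out.println(entries.get(205) + ", page loads: " + entries.pageLoads);
        entries.add("newChild");
        System.out.println(entries.get(250) + ", size: " + entries.size());
    }
}
```

Note what the sketch leaves out: removals, reordering, same-name siblings and lookup by
name. Each of those would need the overlay to be merged with pages that may not be loaded
yet. The transient additions also still sit fully in memory until they are saved, so
creating a very long list in one session stays expensive.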

> Support flat content hierarchies
> --------------------------------
>                 Key: JCR-642
>                 URL: https://issues.apache.org/jira/browse/JCR-642
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Jukka Zitting
> The current best practice with Jackrabbit is to avoid flat content structures due to
> performance concerns.
> These concerns are caused by the fact that the NodeState implementation requires the
> list of child node names and identifiers to be available at all times. In fact many (all?)
> current persistence managers implement this requirement by storing and loading this list as
> a part of the serialized node state. When this list grows, the performance and memory overhead
> of managing the list grows as well. As a side note, this also creates potential consistency
> issues since the parent/child links are stored both within the child list of the parent node
> and as the parent link of the child node.
> To solve this issue, I believe we need to break the tight bonding between the node state
> and the list of child nodes. This will likely require major refactoring of the Jackrabbit
> core, including breaking the NodeState and PersistenceManager interfaces, so I don't expect
> a solution in near future. However, we should start thinking about how to best do this, and
> at least be concerned about building in any more assumptions about the list of child nodes
> always being readily available.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
