jackrabbit-users mailing list archives

From "Cris Daniluk" <cris.dani...@gmail.com>
Subject Re: workspace / repository scalability
Date Wed, 23 May 2007 15:11:47 GMT
> > When you say adequate hierarchical structure, does this imply that we should
> > try to keep our tree "bushy"? Really, because we rely on the external search
> > engine for location, we only query the database directly on sequential ID.
> > Should a partitioning strategy be used? If so, what sort of depth might we
> > aim for?
> I see... I think it is important to mention that Jackrabbit is currently not
> optimized for long lists of child nodes, so I would recommend staying away,
> if possible, from more than a couple of hundred child nodes.
> As guidance for hierarchy I usually use something like:
> "if I wouldn't do it in a filesystem, I don't do it in a content repository"
> (assuming that I view a node as a file or folder).
> So, assuming your sequential hex id is something like "123abc", I would
> recommend partitioning the node structure as follows:
> /12/3a/bc, which leaves you with at most 256 child nodes per node.

So some form of hashing sounds desirable. If the hashed nodes were mapped to a
separate, unhashed node structure or workspace that presented a logical view,
would the performance impact still apply?
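Just to make sure I follow the partitioning you describe, a quick sketch (class
and method names are mine, not anything from Jackrabbit) of splitting a
sequential hex id into two-character path segments:

```java
// Hypothetical sketch: partition a sequential hex id into two-character
// path segments, e.g. "123abc" -> "/12/3a/bc". Each segment has two hex
// digits, so any node has at most 16^2 = 256 children.
public class PathPartitioner {

    static String partition(String hexId) {
        StringBuilder path = new StringBuilder();
        for (int i = 0; i < hexId.length(); i += 2) {
            // append "/" plus the next (up to) two characters of the id
            path.append('/').append(hexId, i, Math.min(i + 2, hexId.length()));
        }
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(partition("123abc")); // prints /12/3a/bc
    }
}
```

So an odd-length id like "abc" would simply end in a shorter segment ("/ab/c").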

> Personally, I like to use the Derby persistence manager with external
> fs-based blobs (the standard setup). With this setup I do "hot backups"
> by simply backing up the full repository folder in the filesystem.

GBs of data in Derby of course makes me nervous, on a probably irrational
level. The fs-based blobs approach is interesting, though, and may make DR
replication easier. You mentioned some testing... it would be great for us
to do similar testing with mock data reflecting our environment. If the
tests you performed included any special harnesses, configuration, etc.,
would it be possible to see them?

Thanks for all your help!

