jackrabbit-users mailing list archives

From FolDeRol <folde...@gmail.com>
Subject Re: Scalability
Date Fri, 13 Apr 2007 08:59:41 GMT

Sounds optimistic. Some follow-up questions...

What are BundlePersistenceManagers? I could not find any in the source code
of JR 1.2.3.

It seems to me that synchronizing repository changes over a network file
system is neither reliable nor fast enough. Do you think this approach could
be extended to reuse the cluster-wide object replication features of
JBoss/JGroups, and how difficult might that be? We would probably
contribute to implementing this feature if it is feasible.

Another question: in which cases does Jackrabbit decide that the indices are
inconsistent and should be rebuilt from the persistence storage? I have not
noticed this operation being performed every time the server starts. Is it
performed on the whole data set, or can it cover a specified set of items?
Once, my database was purged but the local indices remained intact, and I
kept seeing warning messages on the console that nodes with particular ids
were not found. These messages continued to appear even after a restart of
the server, until I deleted the indices too.
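
To make the situation concrete, here is a minimal sketch in plain Java of what I assume was happening: the index still references ids that the database no longer holds. It does not use any Jackrabbit API; the class name, method name, and ids are all hypothetical and only illustrate the set difference involved.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Jackrabbit.
public class StaleIndexCheck {

    /** Returns the ids the index still references but the store no longer holds. */
    public static Set<String> findStaleIds(Set<String> indexedIds, Set<String> storedIds) {
        Set<String> stale = new TreeSet<>(indexedIds); // sorted copy, inputs stay untouched
        stale.removeAll(storedIds);                    // keep only ids missing from the store
        return stale;
    }

    public static void main(String[] args) {
        // Assumed state after the purge: the index knows three nodes, the database only one.
        Set<String> indexedIds = new TreeSet<>(Arrays.asList("node-1", "node-2", "node-3"));
        Set<String> storedIds = new TreeSet<>(Arrays.asList("node-2"));

        // Prints: Stale index entries: [node-1, node-3]
        System.out.println("Stale index entries: " + findStaleIds(indexedIds, storedIds));
    }
}
```

If the result is non-empty, the index and the persistent storage disagree, which is the condition I would have expected to trigger a rebuild.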


On 4/13/07, David Nuescheler <david.nuescheler@gmail.com> wrote:
> Hi,
> The good news first ;) :
> Jackrabbit is designed to cluster a number of nodes backed by
> a single RDBMS.
> Please find more information on how to configure this here:
> http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200611.mbox/%3C66c10f230611060808w3c8c33danc50c8035e4541e61@mail.gmail.com%3E
> I would also like to comment on your observations:
> (1)  How the data is stored in the Database largely depends
> on the persistence manager used. The
> BundlePersistenceManagers (which are the ones that I
> would recommend for a bigger DB backed installation), store
> the representation of a Node and its properties in a compressed
> binary format in the database.
> (2) To satisfy the requirements of a content repository as specified
> by JCR, I think it is not possible to use just the database index
> anyway. In particular for features like inheritance, fulltext or searching
> unstructured information in a fine grained fashion.
> This is why Jackrabbit (just like any other repository implementation
> that I am aware of) keeps an additional index.
> This additional index is synched through clustering and
> does not need to be backed-up, since it can be rebuilt from
> the information in the database in a recovery scenario.
> So a Jackrabbit instance can be cloned or restored entirely
> by just restoring the Database and supplying the repository.xml.
> regards,
> david
> On 4/13/07, FolDeRol <folderol@gmail.com> wrote:
> > Dear team,
> >
> > Could anybody clarify for me the situation with Jackrabbit's scalability?
> >
> > We are considering Jackrabbit as a back-end for a large application with
> > a high level of data flow in a clustered environment. When I started
> > evaluating Jackrabbit, having read that it could employ an RDBMS as a
> > persistence layer, I thought that we could set up a number of cluster
> > nodes using Model 2 of deployment, which would share the same logical
> > instance (probably clustered) of the database and thus be scalable. I
> > could not find any details on this, and decided to study the database
> > schema and trace calls to estimate the performance.
> >
> > I was surprised by what I found. The data is stored in the RDBMS as
> > serialized Java objects, and query operations are not handled by the
> > RDBMS at all but rather directly by the Jackrabbit engine, on indices
> > stored on the file system. Now I'm seriously alarmed that Jackrabbit
> > might be an inappropriate solution for our goal.
> >
> > Could someone please confirm or deny my assumptions?
> >
> > Regards
> >
