jackrabbit-users mailing list archives

From "David Nuescheler" <david.nuesche...@gmail.com>
Subject Re: Scalability
Date Fri, 13 Apr 2007 08:26:39 GMT
Hi,

The good news first ;) :
Jackrabbit is designed to cluster a number of nodes backed by
a single RDBMS.
Please find more information on how to configure this here:
http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200611.mbox/%3C66c10f230611060808w3c8c33danc50c8035e4541e61@mail.gmail.com%3E

I would also like to comment on your observations:

(1) How the data is stored in the database largely depends
on the persistence manager used. The
BundlePersistenceManagers (which are the ones that I
would recommend for a bigger DB-backed installation) store
the representation of a node and its properties in a compressed
binary format in the database.
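
To make the "bundle" idea concrete, here is a rough sketch in Python.
It is only a toy model and not Jackrabbit's actual code or on-disk format:
the function names, the use of JSON and zlib, and the dict standing in for
a database table are all assumptions made for illustration. The point is
just that a node and all of its properties travel together as one opaque
compressed value, so the database stores and returns it but cannot query
inside it.

```python
import json
import zlib


def pack_bundle(node_id, properties):
    """Serialize a node and all of its properties into one compressed blob.

    Hypothetical helper: JSON + zlib stand in for the real binary format.
    """
    record = {"id": node_id, "properties": properties}
    raw = json.dumps(record, sort_keys=True).encode("utf-8")
    return zlib.compress(raw)


def unpack_bundle(blob):
    """Reverse of pack_bundle: decompress and decode one bundle."""
    record = json.loads(zlib.decompress(blob).decode("utf-8"))
    return record["id"], record["properties"]


# A dict stands in for the database table: one row per node,
# keyed by node id, with the bundle as an opaque binary value.
table = {}
blob = pack_bundle(
    "node-1",
    {"jcr:primaryType": "nt:unstructured", "title": "Hello"},
)
table["node-1"] = blob
```

Reading a node back is a single lookup by id followed by unpack_bundle(),
which is why this layout suits reads by id but leaves searching to a
separate index.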

(2) To satisfy the requirements of a content repository as specified
by JCR, I think it is not possible to use just the database index
anyway, in particular for features like inheritance, fulltext search or
searching unstructured information in a fine-grained fashion.
This is why Jackrabbit (just like any other repository implementation
that I am aware of) keeps an additional index.
This additional index is kept in sync through clustering and
does not need to be backed up, since it can be rebuilt from
the information in the database in a recovery scenario.
So a Jackrabbit instance can be cloned or restored entirely
by just restoring the database and supplying the repository.xml.
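
The reason the index does not need a backup can be shown with a small
Python sketch. Again this is a toy model under stated assumptions, not how
Jackrabbit builds its index: build_index() and search() are hypothetical
names, and a plain dict stands in for the database. Since the index is
derived entirely from the stored content, throwing it away and rebuilding
it gives the same search results.

```python
import re
from collections import defaultdict


def build_index(database):
    """Build a word -> set of node ids index from the stored content."""
    index = defaultdict(set)
    for node_id, properties in database.items():
        for value in properties.values():
            for word in re.findall(r"\w+", str(value).lower()):
                index[word].add(node_id)
    return index


def search(index, word):
    """Return the ids of all nodes containing the word, sorted."""
    return sorted(index.get(word.lower(), set()))


# The database is the only authoritative copy of the content.
database = {
    "node-1": {"title": "Scalability notes",
               "body": "Cluster nodes share one database"},
    "node-2": {"title": "Backup plan",
               "body": "Restore the database and rebuild"},
}

index = build_index(database)
before = search(index, "database")

index = None                   # simulate losing the index files
index = build_index(database)  # recovery: rebuild from the database
after = search(index, "database")
```

Here before and after hold the same list of node ids, which is the
property that makes a database restore sufficient for recovery.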

regards,
david

On 4/13/07, FolDeRol <folderol@gmail.com> wrote:
> Dear team,
>
> Could anybody clarify the situation with Jackrabbit's scalability for me?
>
> We are considering Jackrabbit as a back-end for a large application with
> a high level of data flow in a clustered environment. When I started
> evaluating Jackrabbit, having read that it could employ an RDBMS as a
> persistence layer, I thought that we could set up a number of cluster nodes
> using deployment Model 2, all sharing the same logical (probably clustered)
> database instance, and thus be scalable. I could not find
> any details on this, so I decided to study the database schema and trace
> JDBC calls to estimate the performance.
>
> I was surprised by what I found. The data is stored in the
> RDBMS as serialized Java objects, and query operations are not handled by
> the RDBMS at all but rather directly by the Jackrabbit engine, using indices
> stored on the file system. Now I'm seriously alarmed that Jackrabbit might
> be an inappropriate solution for our goal.
>
> Could someone please confirm or deny my assumptions?
>
> Regards
>
