jackrabbit-users mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: Storing indexes in the database
Date Sat, 09 Jun 2007 08:12:45 GMT
Hi,

On 6/6/07, woolly <p.barr@lbs-ltd.com> wrote:
> I currently have everything mapped to the database using the repository.xml
> below. Is it possible to also store the indexes in the database? Is that a
> good idea?

Unfortunately it is not possible at the moment. Even though the
Jackrabbit configuration format does allow you to specify a
<FileSystem/> within the <SearchIndex/> configuration entry, the
current org.apache.jackrabbit.core.query.lucene.SearchIndex class
ignores such configuration and always uses the local file system for
storing the search index.

The main reason for always using the local file system is performance.
Jackrabbit uses Lucene as the query engine, and Lucene accesses its
segment files using a random access pattern. Typical databases do not
support efficient random access of blob values, which essentially
prevents any decent search performance with a database backend.
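
As a rough illustration of the access-pattern difference described above,
here is a minimal sketch in plain Java. It is not Jackrabbit or Lucene code:
the class and method names are made up for this example, and the
InputStream is only a stand-in for a blob value that can be read from
front to back. Reaching an offset in a local file is a single seek, while
the forward-only stream has to consume everything before the offset.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical illustration class; not part of Jackrabbit or Lucene.
final class RandomAccessSketch {

    // Local file: jump straight to the offset and read only the bytes needed.
    public static byte[] readFromFile(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            byte[] buf = new byte[length];
            raf.readFully(buf);
            return buf;
        }
    }

    // Forward-only stream (assumed stand-in for a blob exposed as an InputStream):
    // every byte before the offset is read and discarded first.
    public static byte[] readFromStream(InputStream in, long offset, int length) throws IOException {
        byte[] discard = new byte[8192];
        long remaining = offset;
        while (remaining > 0) {
            int n = in.read(discard, 0, (int) Math.min(discard.length, remaining));
            if (n < 0) throw new EOFException("stream ended before offset");
            remaining -= n;
        }
        byte[] buf = new byte[length];
        int filled = 0;
        while (filled < length) {
            int n = in.read(buf, filled, length - filled);
            if (n < 0) throw new EOFException("stream ended before length");
            filled += n;
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1 << 20];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        Path file = Files.createTempFile("segment", ".bin");
        try {
            Files.write(file, data);
            // Same 16 bytes either way; only the work needed to reach them differs.
            byte[] viaSeek = readFromFile(file, 900_000, 16);
            byte[] viaStream = readFromStream(new ByteArrayInputStream(data), 900_000, 16);
            System.out.println(Arrays.equals(viaSeek, viaStream));
        } finally {
            Files.delete(file);
        }
    }
}
```

With many small reads scattered across large segment files, the cost of
the second approach grows with the offset of each read, which is the
performance problem referred to above.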

> I'd like to have a "clean" application that doesn't create files
> on the file system at startup.

This is a common theme we are hearing from many users, so I think it's
worth repeating and perhaps also raising in the issue tracker.
However, whether we should actually support that "feature" is a tricky
question.

Architecturally, Jackrabbit occupies the same layer as an RDBMS. A
content repository would clearly be a backend component in a typical
n-tier deployment scenario. This suggests that ideally Jackrabbit
shouldn't even be relying on any external databases, and should
instead handle all storage, both item persistence and search indexes,
locally within the specified repository home directory. This is in
fact the scenario that the original Jackrabbit persistence layer was
designed for, and interestingly we are currently seeing some advanced
development ideas that are going back to a similar design.

However, at the moment the only way to achieve proper ACID features in
Jackrabbit is to use either an embedded or a remote RDBMS for
persistence. Also, we currently do not have a high-performance remoting
layer, and native Jackrabbit backup tools are still severely lacking.
All these issues make remote database persistence for all Jackrabbit
content very desirable and I can well understand why many people are
asking for this.

BR,

Jukka Zitting
