jackrabbit-dev mailing list archives

From "Bertrand Delacretaz" <bdelacre...@apache.org>
Subject Re: Jackrabbit Scalability / Performance
Date Sat, 28 Apr 2007 11:07:23 GMT
On 4/28/07, Christoph Kiehl <christoph@sulu3000.de> wrote:

> ...Our current solution is to shutdown the
> repository for a short time start the rdbms backup and copy the index files.
> When index file copying is finished we startup the repository again...

Note that the Lucene-based Solr indexer
(http://lucene.apache.org/solr/) has a clever way of allowing online
backups of Lucene indexes, without having to stop anything (or for a
very short time only).

In short, it works like this:

- Solr can be configured to launch a "snapshotter" script at a point in
  time when it's not writing anything to the index.

- The script takes a snapshot of the index files using hard links, which
  is very quick on Unix-like systems (won't work on Windows AFAIK).

- Solr waits until the script is done (a few milliseconds I guess) and
  then resumes indexing.

- Another, asynchronous backup script can then copy the snapshot
  anywhere, from the hard-linked files, without disturbing Solr.
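
To make the hard-link idea concrete, here is a minimal Python sketch of the two scripts. It is not Solr's actual snapshotter, and the function names take_snapshot and copy_snapshot are hypothetical. It assumes the index is a flat directory of files, that the snapshot directory is on the same filesystem as the index (hard links cannot cross filesystems), and that the caller only invokes take_snapshot while nothing is writing to the index.

```python
import os
import shutil


def take_snapshot(index_dir, snapshot_dir):
    """Hard-link every regular file in index_dir into a new snapshot_dir.

    No file data is copied, so this is quick. The caller must make sure
    nothing is writing to the index while this runs.
    """
    os.makedirs(snapshot_dir)  # raises if the snapshot already exists
    for name in sorted(os.listdir(index_dir)):
        src = os.path.join(index_dir, name)
        if os.path.isfile(src):
            # Same filesystem required; both names point at the same data.
            os.link(src, os.path.join(snapshot_dir, name))
    return snapshot_dir


def copy_snapshot(snapshot_dir, backup_dir):
    """Copy the snapshot elsewhere; can run later without pausing indexing."""
    shutil.copytree(snapshot_dir, backup_dir)
    return backup_dir
```

The snapshot stays usable only because the indexer writes new files and deletes old ones rather than modifying existing files in place. A deleted index file remains reachable through its snapshot link until the snapshot itself is removed.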

This won't help for the RDBMS part, but implementing something similar
might help for online backups of index files.

See http://wiki.apache.org/solr/CollectionDistribution for more
details - the main goal described there is index replication, but it
obviously works for backups as well.

