jackrabbit-dev mailing list archives

From Ian Boston <...@tfd.co.uk>
Subject Re: [jr3] Search index in content
Date Mon, 22 Feb 2010 22:14:37 GMT

On 17 Feb 2010, at 15:47, Alexander Klimetschek wrote:

> Thus a broken search index must not break repository startup or it
> must be possible to delete it with a tool w/o requiring a full
> repository start.

Over in an earlier version of Sakai we have a distributed Lucene index, where the index is
produced by all nodes in the cluster indexing items and distributing the updates to segments.
On one hand this distributes the load well, giving a single index over the whole cluster. However,
there is increased latency compared to the JR implementation and, critically, when the index
gets corrupted, it's often hard to recover. Snapshots are taken in real time, but you have to
know how far back to go and then reindex. In 24 months of running JR 1.4 and this search index
we have had to recover the distributed search index several times, but have not had to recover
the JR index once, although it has become corrupted almost as many times (once or twice max).

The distributed index uses local file storage and segment update shipping. We did try various
types of off-machine storage but found them all to be way too slow. IMVHO seek performance
is critical to Lucene performance.

Ian

