jackrabbit-users mailing list archives

From Ard Schrijvers <a.schrijv...@onehippo.com>
Subject Re: Adding new nodes to a cluster
Date Tue, 14 Sep 2010 11:14:57 GMT
On Tue, Sep 14, 2010 at 11:55 AM, Ian Boston <ieb@tfd.co.uk> wrote:
> On 14 Sep 2010, at 19:41, Ard Schrijvers wrote:
>>> 1. Take a complete snapshot (disk image) of one of the live nodes -
>>> including the Lucene index
>> Although I am not too familiar with the clustered setup (others at my
>> company are), I know that this is not possible unfortunately. The
>> problem is that the most recent Lucene index is an in-memory one. You
>> cannot get correct snapshots from the index. It is something I'd love
>> to get improved in some time in Jackrabbit.
> OMG, Ard, no, really, did that come in with Jackrabbit 2 or was it also the case in JR1.x?

This was already part of JR1.x. Back in the days when re-opening an index
reader in Lucene was not possible, there was a very good reason for
doing so. That's why (I wasn't involved yet, so I am guessing a little,
but I am quite confident) Jackrabbit originally built a very nice,
state-of-the-art multi-index solution, with merging between indexes just
like segments. This, combined with read-only index readers, in-memory
bitsets keeping track of deleted Lucene docs, etc., was I think
ahead of Lucene. However, Lucene now handles pretty much everything
Jackrabbit needed to solve, internally within a single Lucene index. I did
some testing, and I see no reason anymore to keep a multi-index. Our needs
are very comparable to Hibernate's; however, they had the luck that they
started with Lucene when this was already in place.
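
To make the snapshot problem concrete, here is a toy sketch in plain Java.
It is not Jackrabbit code: the class and method names (ToyMultiIndex,
flushThreshold, diskSnapshot) are made up for illustration, and plain lists
stand in for real Lucene indexes. It only shows why a copy of the on-disk
indexes can miss the newest documents while they still sit in the
in-memory index.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not Jackrabbit or Lucene classes.
class ToyMultiIndex {
    // Stand-in for the persistent (on-disk) indexes.
    private final List<List<String>> persistentIndexes = new ArrayList<>();
    // Stand-in for the most recent, in-memory index.
    private List<String> volatileIndex = new ArrayList<>();
    private final int flushThreshold;

    ToyMultiIndex(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // New documents always land in the in-memory index first.
    void add(String doc) {
        volatileIndex.add(doc);
        if (volatileIndex.size() >= flushThreshold) {
            flush();
        }
    }

    // Turn the in-memory index into a new persistent index.
    void flush() {
        if (volatileIndex.isEmpty()) {
            return;
        }
        persistentIndexes.add(volatileIndex);
        volatileIndex = new ArrayList<>();
    }

    // What a disk image would capture: persistent indexes only.
    List<String> diskSnapshot() {
        List<String> out = new ArrayList<>();
        for (List<String> index : persistentIndexes) {
            out.addAll(index);
        }
        return out;
    }

    // What a live query sees: persistent indexes plus the in-memory one.
    List<String> search() {
        List<String> out = diskSnapshot();
        out.addAll(volatileIndex);
        return out;
    }
}
```

In this model, adding three documents with a threshold of two leaves the
third one visible to search() but absent from diskSnapshot() until flush()
is called.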

The current Jackrabbit Lucene architecture also doesn't fit something like
the Infinispan LuceneDirectory (it needs a reopen() on every call), which
would be a very nice thing to be able to use. Anyway, I have tons of
ideas; all I need is some (much) time :-(((

> Is there any way to get it pushed to disk when a snapshot is required?

I don't think so. Lucene has a snapshot policy these days that you can
make use of. I didn't look at it, but it would be one of the things I'd
certainly like to have (just like rebuilding an entire index in the
background to replace an existing one when it is finished).
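
The background rebuild idea can be sketched roughly like this. Again, this
is only an illustration under my own assumptions, not something Jackrabbit
offers: SwappableIndex and rebuildInBackground are hypothetical names, and
an immutable list stands in for an index. Readers keep using the old index
until the new one is completely built, and then the reference is swapped
in one step.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Toy model only: hypothetical names, not Jackrabbit or Lucene classes.
class SwappableIndex {
    // The index that readers currently see.
    private final AtomicReference<List<String>> live;

    SwappableIndex(List<String> initial) {
        this.live = new AtomicReference<>(List.copyOf(initial));
    }

    // Readers always get a complete, immutable index.
    List<String> current() {
        return live.get();
    }

    // Build a replacement off the calling thread; the old index stays
    // visible until the rebuilder has returned the finished replacement.
    CompletableFuture<Void> rebuildInBackground(Supplier<List<String>> rebuilder) {
        return CompletableFuture.runAsync(() -> {
            List<String> rebuilt = List.copyOf(rebuilder.get());
            live.set(rebuilt); // single atomic swap
        });
    }
}
```

A caller would start the rebuild, keep serving queries from current(), and
optionally call join() on the returned future to wait for the swap.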

> We (also) have a requirement to run in a cluster and we can't a) hold a journal going back to 0, that would soon be impractical and b) can't take nodes offline to snapshot
> Well we could do (b) but it would be very bad politically.

I would have to ask how our sysadmins do it... Unfortunately I do not
know this, but I know it should be easier than it currently is.

Regards Ard

> Ian
