jackrabbit-users mailing list archives

From Ian Boston <...@tfd.co.uk>
Subject Re: Adding new nodes to a cluster
Date Tue, 14 Sep 2010 10:04:53 GMT

On 14 Sep 2010, at 20:01, Vidar Ramdal wrote:

> On Tue, Sep 14, 2010 at 11:48 AM, Ian Boston <ieb@tfd.co.uk> wrote:
>> 
>> On 14 Sep 2010, at 19:27, Vidar Ramdal wrote:
>> 
>>> We're setting up a clustered Jackrabbit application.
>>> The application has high traffic, so we're concerned that the Journal
>>> table will be very large. This, in turn, will make setting up new
>>> nodes a time-consuming task, when the new node starts replaying the
>>> journal to get up to date.
>>> 
>>> At [1], the concept of the janitor is described, which cleans the
>>> journal table at certain intervals. However, the list of caveats
>>> states that "If the janitor is enabled then you lose the possibility
>>> to easily add cluster nodes. (It is still possible but takes detailed
>>> knowledge of Jackrabbit.)"
>>> 
>>> What detailed knowledge does this take? Can anyone give me some hints
>>> of what we need to look into?
>> 
>> Sure,
>> you need to take a good copy of the local state of a node and, for good
>> measure, extract the journal log number for that state from the journal
>> revision file. (Have a look at the ClusterNode impl to locate it and the
>> format; IIRC it's 2 binary longs.)
>> 
>> Getting a good local state means one that is consistent with itself and
>> wasn't written to between the start and the end of the snapshot operation.
>> If you have high write traffic you almost certainly won't be able to
>> snapshot the local state live, and will have to take a node offline before
>> taking a snapshot. If it's high-read, low-write, you might be able to use
>> repeated rsync operations to get a good snapshot.
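
As a rough illustration of the repeated-rsync idea, a sketch that shells out to rsync in a loop (the paths are placeholders and the fixed pass count is a simplification; in practice you would parse the --stats output and stop once a pass transfers no files):

    import java.io.IOException;

    // Sketch: repeat rsync passes so each pass copies only what changed
    // during the previous one. On a low-write repository the passes should
    // converge quickly; this does not by itself guarantee consistency.
    public class SnapshotLoop {
        public static void main(String[] args)
                throws IOException, InterruptedException {
            for (int pass = 0; pass < 10; pass++) {
                Process p = new ProcessBuilder(
                        "rsync", "-a", "--delete", "--stats",
                        "/repo/source/", "/repo/snapshot/")
                        .inheritIO().start();
                if (p.waitFor() != 0) {
                    throw new IOException("rsync failed on pass " + pass);
                }
            }
        }
    }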
> 
> Ian, thanks for your answer.
> 
> So a "good copy of the local state" should be possible by shutting
> down the source node before taking a snapshot. That is fine with us,
> at least from the third node onwards, as long as we can leave one node online.

yes

> 
>>> Also, we're not 100% sure we know what happens when a new node is
>>> added.
>> 
>> If there is no snapshot to start from, it will replay all journal records
>> from record 0 to build the search index and anything else. If there is a
>> snapshot, it will read the record number of the snapshot and replay from
>> that point forwards.
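
The flip side of reading the record number is seeding the new node's revision file with it, so replay starts at the snapshot's record number rather than record 0. A sketch under the same assumptions about location and format as above (a hypothetical helper, not something Jackrabbit ships):

    import java.io.DataOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    // Hypothetical helper: write the record number captured alongside the
    // snapshot into the new node's revision file (args[0] = file path,
    // args[1] = record number). Format assumed; verify against ClusterNode.
    public class SeedRevision {
        public static void main(String[] args) throws IOException {
            try (DataOutputStream out = new DataOutputStream(
                    new FileOutputStream(args[0]))) {
                out.writeLong(Long.parseLong(args[1]));
            }
        }
    }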
>> 
>> Before using a snapshot you must make certain that all references to the
>> server name are correct in the snapshot (look in repository.xml after
>> startup).
> 
> Yes, I know the cluster node ID is set in repository.xml - but is that the
> only place the ID is held?

AFAIK, yes.
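
If memory serves, the id can also be supplied through a system property, which saves editing repository.xml on every cloned image. A minimal sketch ("node3" is a placeholder; verify the property name against your Jackrabbit version):

    // Set the cluster node id before the repository is started.
    public class SetClusterId {
        public static void main(String[] args) {
            System.setProperty(
                    "org.apache.jackrabbit.core.cluster.node_id", "node3");
            // ... then bootstrap the repository as usual.
        }
    }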

> 
>>> We understand that the journal needs to be replayed so that the
>>> Lucene index can be updated. But is the Lucene index the only thing
>>> that needs modification when a new node is started?
>> 
>> AFAIK, yes.
>> 
>>> If so, should this procedure work:
>>> 1. Take a complete snapshot (disk image) of one of the live nodes -
>>> including the Lucene index
>>> 2. Use the disk image to set up a new node
>>> 3. Assign a new, unique cluster node ID to the new node
>> 
>> yes (I didn't need to write all that I did above :) )
>> 
>>> 
>>> However, when trying this procedure, we still found that the new
>>> node replayed the entire journal.
>> 
>> hmm, did you get the local journal record number with the snapshot?
> 
> Will have to double check that.
> 
> 
> -- 
> Vidar S. Ramdal <vidar@idium.no> - http://www.idium.no
> Sommerrogata 13-15, N-0255 Oslo, Norway
> + 47 22 00 84 00 / +47 22 00 84 76
> Quando omni flunkus moritatus!

