jackrabbit-dev mailing list archives

From Christian Stocker <christian.stoc...@liip.ch>
Subject Re: Add more options to make Jackrabbit more failsafe and/or scale-out
Date Mon, 02 May 2011 12:48:51 GMT


On 02.05.11 14:43, Bart van der Schans wrote:
> On Mon, May 2, 2011 at 1:39 PM, Christian Stocker
> <christian.stocker@liip.ch> wrote:
>> Hi all
>>
>> My favourite topic again. Building a fail-safe and/or scalable
>> jackrabbit setup.
>>
>> We wanted to make our setup resistant to a datacenter failure, e.g. if
>> one DC goes down, we can still serve pages from a backup Jackrabbit
>> instance. We use MySQL as the persistence store; that's not a given,
>> but I guess the problems are the same everywhere.
>>
>> With a traditional setup, if the main DC goes down, your store goes
>> down with it, and the Jackrabbit instance in the other DC can't access
>> it anymore either. That's why we thought about replicating the MySQL
>> DB to the 2nd DC and just reading from there (we can make sure that
>> nothing writes to the backup Jackrabbit instance). This works fine.
>> Since we can already point the cluster journal "store" to a different
>> place than the PM, we point the journal store to the central one in
>> the 1st DC and read the PM data from the MySQL slave in the 2nd DC. A
>> read-only Jackrabbit only has to write to the journal table and
>> nowhere else AFAIK, so that works well even with a replicating MySQL.
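>>
>> For illustration, the split looks roughly like this (hostnames and
>> credentials are placeholders, not our real setup). The cluster journal
>> in repository.xml points at the central master in the 1st DC:
>>
>>   <Cluster id="dc2-readonly" syncDelay="2000">
>>     <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
>>       <param name="driver" value="com.mysql.jdbc.Driver"/>
>>       <param name="url" value="jdbc:mysql://mysql-master.dc1:3306/jackrabbit"/>
>>       <param name="user" value="jcr"/>
>>       <param name="password" value="secret"/>
>>       <param name="schemaObjectPrefix" value="JOURNAL_"/>
>>     </Journal>
>>   </Cluster>
>>
>> while the PM in workspace.xml reads from the local slave in the 2nd DC:
>>
>>   <PersistenceManager
>>       class="org.apache.jackrabbit.core.persistence.pool.MySqlPersistenceManager">
>>     <param name="driver" value="com.mysql.jdbc.Driver"/>
>>     <param name="url" value="jdbc:mysql://mysql-slave.dc2:3306/jackrabbit"/>
>>     <param name="user" value="jcr"/>
>>     <param name="password" value="secret"/>
>>     <param name="schemaObjectPrefix" value="pm_ws_"/>
>>   </PersistenceManager>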
>>
>> All fine and good, and even if the master MySQL goes down, the
>> Jackrabbit instance in the 2nd DC serves its nodes as if nothing
>> happened.
>>
>> The one problem that is left is the replication lag between the master
>> and the slave MySQL (there is one, even if they sit right beside each
>> other). What can happen is that a writing Jackrabbit writes a new node
>> and the journal entry, and then the backup Jackrabbit reads the
>> journal entry (from the MySQL master), but the actual content hasn't
>> arrived in the MySQL slave yet (where the backup Jackrabbit reads its
>> PM data from). This can easily be tested by stopping the MySQL
>> replication.
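>>
>> To make the race concrete (the step labels are just for illustration):
>>
>>   t0  writer (DC1) inserts the node bundle       -> master
>>   t1  writer (DC1) appends journal revision N    -> master
>>   t2  reader (DC2) polls journal, sees N         <- master
>>   t3  reader (DC2) fetches the bundle            <- slave: not there yet
>>   t4  replication ships t0/t1 to the slave       -> too late for t3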
>>
>> The solution I came up with was to read the journal entries from the
>> MySQL slave as well (but still write the LOCAL_REVISION to the
>> master). With this we can make sure the Jackrabbit in the 2nd DC only
>> reads entries that are already in its MySQL slave. A patch which makes
>> this work is here:
>>
>> https://gist.github.com/951467
>>
>> The only thing I had to change was to run the "selectRevisionsStmtSQL"
>> query against the slave instead of the master; the rest can still go
>> to the master.
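>>
>> Stripped of the Jackrabbit plumbing, the idea boils down to this (a
>> plain-JDBC sketch, not the actual patch; the table and column names
>> follow the default journal schema, but treat them as assumptions):
>>
>>   import java.sql.*;
>>
>>   public class SlaveAwareJournal {
>>       private final Connection master; // central DB in the 1st DC
>>       private final Connection slave;  // local replica in the 2nd DC
>>
>>       public SlaveAwareJournal(Connection master, Connection slave) {
>>           this.master = master;
>>           this.slave = slave;
>>       }
>>
>>       // Only hand out revisions the local slave has already
>>       // replicated, so we never see a journal entry whose node data
>>       // is still missing from the slave.
>>       public ResultSet selectRevisions(long after) throws SQLException {
>>           PreparedStatement stmt = slave.prepareStatement(
>>               "SELECT REVISION_ID, JOURNAL_ID, REVISION_DATA "
>>               + "FROM JOURNAL WHERE REVISION_ID > ? ORDER BY REVISION_ID");
>>           stmt.setLong(1, after);
>>           return stmt.executeQuery();
>>       }
>>
>>       // The consumed revision still goes to the central master, so
>>       // the janitor sees it.
>>       public void setLocalRevision(String journalId, long revision)
>>               throws SQLException {
>>           PreparedStatement stmt = master.prepareStatement(
>>               "UPDATE LOCAL_REVISIONS SET REVISION_ID = ? "
>>               + "WHERE JOURNAL_ID = ?");
>>           try {
>>               stmt.setLong(1, revision);
>>               stmt.setString(2, journalId);
>>               stmt.executeUpdate();
>>           } finally {
>>               stmt.close();
>>           }
>>       }
>>   }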
>>
>> What do you think of this approach? Would this be worth adding to
>> Jackrabbit? Any input on what I could improve in the patch?
>>
>> Besides the fail-over scenario, you can also do scaling easily with
>> that approach: you can serve your "read-only" webpages from a totally
>> different DC without much traffic between the DCs (it's basically just
>> the MySQL replication traffic). That's also why I didn't want the
>> backup Jackrabbit to read from the master and only switch to the
>> replicating slave when things fail (which would be a solution, too, of
>> course).
>>
>> any input is appreciated
> 
> I've played many times with the idea of creating some kind of
> SlaveNode next to the ClusterNode which only needs read access to the
> database (slave). I don't think the local revision of the slave is of
> much use to the master, so it could be kept on disk locally with the
> slave.
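>
> Something along these lines would be enough on the slave side (just a
> sketch of the idea, nothing Jackrabbit-specific):
>
>   import java.io.*;
>
>   // Keep the slave's own revision counter in a local file instead of
>   // the central LOCAL_REVISIONS table.
>   class LocalRevisionFile {
>       private final File file;
>
>       LocalRevisionFile(File file) { this.file = file; }
>
>       long read() throws IOException {
>           if (!file.exists()) return 0L;
>           BufferedReader r = new BufferedReader(new FileReader(file));
>           try {
>               return Long.parseLong(r.readLine().trim());
>           } finally {
>               r.close();
>           }
>       }
>
>       void write(long revision) throws IOException {
>           Writer w = new FileWriter(file);
>           try {
>               w.write(Long.toString(revision));
>           } finally {
>               w.close();
>           }
>       }
>   }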

AFAICT, the janitor needs to know where all the cluster instances are,
so it can safely delete everything that isn't needed anymore. That's
why the local revision needs to be stored in a central place.

chregu

-- 
Liip AG  //  Feldstrasse 133 //  CH-8004 Zurich
Tel +41 43 500 39 81 // Mobile +41 76 561 88 60
www.liip.ch // blog.liip.ch // GnuPG 0x0748D5FE

