jackrabbit-dev mailing list archives

From Bart van der Schans <b.vandersch...@onehippo.com>
Subject Re: Add more options to make Jackrabbit more failsafe and/or scale-out
Date Wed, 11 May 2011 10:01:29 GMT
Hi Christian,

Nice blog post!

I've chosen a slightly different approach, because with our customers a
write connection back from the slave node is usually not allowed by
firewall rules. Connections going "out" from an intranet to a DMZ are
allowed, but connections going back from the DMZ to the intranet are
not.

I've created a DatabaseSlaveJournal that reads the journal from the
database, but keeps its own local state on the filesystem, like the
FileSystemJournal does. The diff can be viewed at:

https://github.com/schans/jackrabbit/commit/e2e3842321a62b57f7749dcf12c23e21f00c1474
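
To give an idea of the local state handling, here is a minimal sketch of
just that part (class name and file layout are mine for illustration;
the real code is in the diff above): the journal records still come from
the database, while the instance's own revision counter lives in a plain
local file.

    // Sketch of keeping the local revision on disk instead of in the
    // LOCAL_REVISIONS table -- illustrative only, not the code from the diff.
    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class LocalRevisionFile {

        private final File file;

        public LocalRevisionFile(File file) {
            this.file = file;
        }

        /** Returns the last locally processed journal revision, or 0 if none yet. */
        public synchronized long get() throws IOException {
            if (!file.exists() || file.length() < 8) {
                return 0L;
            }
            RandomAccessFile raf = new RandomAccessFile(file, "r");
            try {
                return raf.readLong();
            } finally {
                raf.close();
            }
        }

        /** Persists the last locally processed journal revision. */
        public synchronized void set(long revision) throws IOException {
            RandomAccessFile raf = new RandomAccessFile(file, "rw");
            try {
                raf.seek(0);
                raf.writeLong(revision);
                // force to disk so a crash can't lose the marker
                raf.getChannel().force(false);
            } finally {
                raf.close();
            }
        }
    }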

Everything seems to be working just fine. There is one minor issue at
shutdown though: LockManagerImpl.close() calls save(), which calls
DatabaseFileSystem$2.close(), which tries to do an update to the
database:

ERROR [org.apache.jackrabbit.core.util.db.ConnectionHelper$RetryManager.doTry():462]
Failed to execute SQL (stacktrace on DEBUG
 log level)
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: UPDATE
command denied to user 'user'@'localhost' for table 'DEFAULT_FSENTRY'
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
        at com.mysql.jdbc.Util.getInstance(Util.java:382)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3593)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3525)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1986)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2140)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2626)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2111)
        at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1362)
        at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
        at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
        at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper.execute(ConnectionHelper.java:438)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper.reallyExec(ConnectionHelper.java:284)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper$1.call(ConnectionHelper.java:267)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper$1.call(ConnectionHelper.java:263)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper$RetryManager.doTry(ConnectionHelper.java:458)
        at org.apache.jackrabbit.core.util.db.ConnectionHelper.exec(ConnectionHelper.java:263)
        at org.apache.jackrabbit.core.fs.db.DatabaseFileSystem$2.close(DatabaseFileSystem.java:732)
        at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:301)
        at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:130)
        at java.io.OutputStreamWriter.close(OutputStreamWriter.java:216)
        at java.io.BufferedWriter.close(BufferedWriter.java:248)
        at org.apache.commons.io.IOUtils.closeQuietly(IOUtils.java:160)
        at org.apache.jackrabbit.core.lock.LockManagerImpl.save(LockManagerImpl.java:356)
        at org.apache.jackrabbit.core.lock.LockManagerImpl.close(LockManagerImpl.java:218)
        at org.apache.jackrabbit.core.RepositoryImpl$WorkspaceInfo.doDispose(RepositoryImpl.java:2222)
        at org.apache.jackrabbit.core.RepositoryImpl$WorkspaceInfo.dispose(RepositoryImpl.java:2157)
        at org.apache.jackrabbit.core.RepositoryImpl.doShutdown(RepositoryImpl.java:1114)
        at org.apache.jackrabbit.core.RepositoryImpl.shutdown(RepositoryImpl.java:1065)

Does anybody have a pointer as to why it's trying to write? It seems to
shut down just fine anyway. It doesn't make much sense to me to create
a ReadOnlyDatabaseFileSystem just to avoid this shutdown issue.
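
To illustrate what the bottom of that trace is doing: DatabaseFileSystem
seems to hand out an output stream that buffers locally and only pushes
the bytes to the database when the stream is closed. A simplified sketch
of that write-on-close pattern (my own illustration, not the actual
Jackrabbit code; the SQL and names are made up to match the error
above):

    // Illustration of the write-on-close pattern seen in the stack trace;
    // not the actual DatabaseFileSystem code.
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    class DbBackedOutputStream extends OutputStream {

        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private final Connection con;
        private final String path;

        DbBackedOutputStream(Connection con, String path) {
            this.con = con;
            this.path = path;
        }

        public void write(int b) {
            buffer.write(b); // writes only touch the local buffer
        }

        public void close() throws IOException {
            // the database is only hit here, so even a consumer that merely
            // "saves" its state on shutdown ends up issuing an UPDATE
            try {
                PreparedStatement stmt = con.prepareStatement(
                        "update DEFAULT_FSENTRY set FSENTRY_DATA = ? where FSENTRY_PATH = ?");
                try {
                    stmt.setBytes(1, buffer.toByteArray());
                    stmt.setString(2, path);
                    stmt.executeUpdate();
                } finally {
                    stmt.close();
                }
            } catch (SQLException e) {
                throw new IOException("failed to persist " + path, e);
            }
        }
    }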

I think the solution I created is a bit narrower, as it only aims to
allow running JR independently on a database slave node. I think
Christian's idea can be expanded to separate all write database
operations from read operations in JR, which would allow for scaling
out in heavily read-dependent environments with multiple db slaves
while keeping a single db master that handles all writes.
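
To sketch what I mean by that separation: assuming two (or more)
javax.sql.DataSource instances, one for the master and one per slave,
something like the following could route connections. All names here
are mine and purely illustrative; JR's ConnectionHelper would need to
learn the read/write distinction before anything like this could plug
in.

    // Hypothetical router: writes go to the master, reads are spread
    // round-robin over the slaves.
    import java.sql.Connection;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    class ReadWriteRouter {

        private final DataSource master;
        private final DataSource[] slaves;
        private int next = 0;

        ReadWriteRouter(DataSource master, DataSource... slaves) {
            this.master = master;
            this.slaves = slaves;
        }

        /** All writes (journal appends, LOCAL_REVISION updates) hit the master. */
        Connection getWriteConnection() throws SQLException {
            return master.getConnection();
        }

        /** Reads go round-robin over the slaves, falling back to the master. */
        synchronized Connection getReadConnection() throws SQLException {
            if (slaves.length == 0) {
                return master.getConnection();
            }
            int i = next;
            next = (next + 1) % slaves.length;
            Connection con = slaves[i].getConnection();
            con.setReadOnly(true); // let the driver optimize/enforce read-only use
            return con;
        }
    }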

I'm really interested in how other devs dealt with database slaves and
their opinion about the solutions provided by Christian and me.

Regards,
Bart

On Wed, May 4, 2011 at 3:25 PM, Christian Stocker
<christian.stocker@liip.ch> wrote:
> Hi All
>
> I wrote a broader blog post on the topic at
>
> http://blog.liip.ch/archive/2011/05/04/how-to-make-jackrabbit-globally-distributable-fail-safe-and-scalable-in-one-go.html
>
> How should I proceed if I'd like to get that into Jackrabbit? Just open
> a ticket, add the patch and hopefully someone takes it from there? Or do
> you think it doesn't have a chance to get into the jackrabbit-core?
>
> greetings
>
> chregu
>
> On 02.05.11 14:48, Christian Stocker wrote:
>>
>>
>> On 02.05.11 14:43, Bart van der Schans wrote:
>>> On Mon, May 2, 2011 at 1:39 PM, Christian Stocker
>>> <christian.stocker@liip.ch> wrote:
>>>> Hi all
>>>>
>>>> My favourite topic again. Building a fail-safe and/or scalable
>>>> jackrabbit setup.
>>>>
>>>> We wanted to make our setup resistant to a datacenter failure, e.g. if one
>>>> DC goes down, we can still serve pages from a backup jackrabbit
>>>> instance. We use MySQL as the persistent store; that's not a given, but I
>>>> guess the problems are the same everywhere.
>>>>
>>>> With a traditional setup, if the main DC goes down, your store goes down
>>>> with it and the jackrabbit instance in the other DC can't access it
>>>> anymore either. That's why we thought about replicating the MySQL DB to
>>>> the 2nd DC and just reading from there (we can make sure that nothing
>>>> writes to the backup jackrabbit instance). This works fine. As we can
>>>> already point the cluster journal "store" to another place than the PM,
>>>> we just point the journal store to the central one in the 1st DC and
>>>> read the data from the PM in the MySQL slave in the 2nd DC. A read-only
>>>> jackrabbit only has to write to the journal table and nowhere else
>>>> AFAIK, so that works well even with replicating MySQLs.
>>>>
>>>> All fine and good and even if the master MySQL goes down the Jackrabbit
>>>> instance in the 2nd DC serves its nodes as nothing happened.
>>>>
>>>> The one problem that is left is that there's a replication lag
>>>> between the master and the slave MySQL (there is one, even if they sit
>>>> right beside each other). What can happen is that a writing
>>>> jackrabbit writes a new node and the journal entry, and then the backup
>>>> jackrabbit reads from the journal (on the mysql master) while the actual
>>>> content hasn't arrived yet in the mysql slave (where the backup jackrabbit
>>>> reads its PM data from). This can easily be tested by stopping the
>>>> mysql replication.
>>>>
>>>> The solution I came up with was to read the journal entries from
>>>> the MySQL slave as well (but still write the LOCAL_REVISION to the
>>>> master). With this we can make sure the jackrabbit in the 2nd DC only
>>>> reads entries that are already in its mysql slave. A patch which makes
>>>> this work is here:
>>>>
>>>> https://gist.github.com/951467
>>>>
>>>> The only thing I had to change was to read the "selectRevisionsStmtSQL"
>>>> from the slave instead of the master; the rest can still go to the master.
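>>>>
>>>> To make the shape of that change concrete, it's roughly the following
>>>> (a sketch only, not the actual patch -- table and column names just
>>>> approximate JR's defaults; the real thing is in the gist above):
>>>>
>>>>     // Sketch of the idea behind the patch: journal records are selected
>>>>     // via a connection to the MySQL slave, while the LOCAL_REVISION
>>>>     // bookkeeping keeps going to the master.
>>>>     import java.sql.Connection;
>>>>     import java.sql.PreparedStatement;
>>>>     import java.sql.ResultSet;
>>>>     import java.sql.SQLException;
>>>>
>>>>     class SlaveReadingJournal {
>>>>
>>>>         private final Connection master; // writes: LOCAL_REVISION
>>>>         private final Connection slave;  // reads: journal records
>>>>
>>>>         SlaveReadingJournal(Connection master, Connection slave) {
>>>>             this.master = master;
>>>>             this.slave = slave;
>>>>         }
>>>>
>>>>         /** Selecting on the slave guarantees we never see an entry whose
>>>>          *  content hasn't replicated yet. Caller closes the ResultSet. */
>>>>         ResultSet selectRevisions(long after) throws SQLException {
>>>>             PreparedStatement stmt = slave.prepareStatement(
>>>>                     "select REVISION_ID, JOURNAL_ID, REVISION_DATA from JOURNAL"
>>>>                     + " where REVISION_ID > ? order by REVISION_ID");
>>>>             stmt.setLong(1, after);
>>>>             return stmt.executeQuery();
>>>>         }
>>>>
>>>>         /** The instance's own revision marker still goes to the master. */
>>>>         void updateLocalRevision(String journalId, long revision) throws SQLException {
>>>>             PreparedStatement stmt = master.prepareStatement(
>>>>                     "update LOCAL_REVISIONS set REVISION_ID = ? where JOURNAL_ID = ?");
>>>>             try {
>>>>                 stmt.setLong(1, revision);
>>>>                 stmt.setString(2, journalId);
>>>>                 stmt.executeUpdate();
>>>>             } finally {
>>>>                 stmt.close();
>>>>             }
>>>>         }
>>>>     }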
>>>>
>>>> What do you think of this approach? Would this be worth adding to
>>>> jackrabbit? Any input for the patch what I could improve?
>>>>
>>>> Besides the fail-over scenario, you can also easily scale with that
>>>> approach, so you can serve your "read-only" webpages from a totally
>>>> different DC without having too much traffic between the DCs (it's
>>>> basically just the MySQL replication traffic). That's why I didn't want
>>>> the backup jackrabbit to read from the master and only switch to the
>>>> replicating slave when things fail (which would be a solution, too, of
>>>> course).
>>>>
>>>> any input is appreciated
>>>
>>> I've played many times with the idea of creating some kind of SlaveNode
>>> next to the ClusterNode which only needs read access to the database
>>> (slave). I don't think the local revision of the slave is of much use
>>> to the master, so it could be kept on disk locally at the slave.
>>
>> AFAICT, the janitor needs to know where all the cluster instances are
>> to safely delete everything that isn't needed anymore. That's why it
>> needs to be stored in a central place.
>>
>> chregu
>>
>



-- 
Hippo
----------------------------------------------------------------------------------------------
Europe  •  Amsterdam  Oosteinde 11  •  1017 WT Amsterdam  •  +31 (0)20 522 4466
USA  • San Francisco  755 Baywood Drive  •  Petaluma CA. 94954 •  +1
(877) 414 4776
Canada    •   Montréal  5369 Boulevard St-Laurent #430 •  Montréal QC
H2T 1S5  •  +1 (707) 658-4535
----------------------------------------------------------------------------------------------
www.onehippo.com  •  www.onehippo.org  •  info@onehippo.com
----------------------------------------------------------------------------------------------
