activemq-dev mailing list archives

From Hiram Chirino <hi...@hiramchirino.com>
Subject Re: does master sync to disk on successful replication?
Date Fri, 17 May 2013 14:33:26 GMT
Hi Christian,

Ok. I've implemented being able to control how the syncs are done.
Take a peek at the doco for the sync property at:
https://cwiki.apache.org/confluence/display/ACTIVEMQ/Replicated+LevelDB+Store#ReplicatedLevelDBStore-ReplicatedLevelDBStoreProperties

Let me know what you think.
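
For anyone following along, the syncTo idea from the thread quoted below can be modelled roughly like this. It is only a sketch of the proposal as written in the quoted mail, not the code that was committed: the class name SyncToSketch, the SyncTarget enum and the two helper methods are made up for illustration, and the property that ended up in the doco linked above may use different names and values.

```java
import java.util.EnumSet;
import java.util.Set;

public class SyncToSketch {
    // Hypothetical enum mirroring the three <type> values proposed in the thread.
    public enum SyncTarget { DISK, REPLICA, REPLICA_DISK }

    // Parses a space-separated syncTo value such as "disk replicaDisk".
    public static Set<SyncTarget> parse(String value) {
        EnumSet<SyncTarget> targets = EnumSet.noneOf(SyncTarget.class);
        for (String token : value.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input yields an empty set
            }
            switch (token) {
                case "disk":
                    targets.add(SyncTarget.DISK);
                    break;
                case "replica":
                    targets.add(SyncTarget.REPLICA);
                    break;
                case "replicaDisk":
                    targets.add(SyncTarget.REPLICA_DISK);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown syncTo type: " + token);
            }
        }
        return targets;
    }

    // True when the master itself should force its log file to disk.
    public static boolean needsLocalForce(Set<SyncTarget> targets) {
        return targets.contains(SyncTarget.DISK);
    }

    // True when the master should wait for replica acks of either kind.
    public static boolean needsReplicaAck(Set<SyncTarget> targets) {
        return targets.contains(SyncTarget.REPLICA)
            || targets.contains(SyncTarget.REPLICA_DISK);
    }

    public static void main(String[] args) {
        System.out.println(parse("replica"));          // [REPLICA]
        System.out.println(parse("disk replicaDisk")); // [DISK, REPLICA_DISK]
    }
}
```

Keeping the local force and the replica wait as two separate questions is the point of the sketch: it avoids tying replication to the single sync flag, which is the coupling Christian points out below.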

On Thu, May 9, 2013 at 10:05 AM, Christian Posta
<christian.posta@gmail.com> wrote:
> All,
> I chatted with Hiram about how syncs on replicated LevelDB work... didn't
> mean for it to be private :) I'm forwarding the email thread...
>
> See the discussion below and add any comments/thoughts as desired..
>
> Thanks,
> Christian
>
> ---------- Forwarded message ----------
> From: Hiram Chirino <chirino@gmail.com>
> Date: Thu, May 9, 2013 at 6:31 AM
> Subject: Re: does master sync to disk on successful replication?
> To: Christian Posta <christian.posta@gmail.com>
>
>
> Yeah, I think you're right.. might be better off with something like
> syncTo="<type>",
> where <type> can be a space-separated list of:
>  * disk - Sync to the local disk
>  * replica - Sync to the remote replica's memory
>  * replicaDisk - Sync to the remote replica's disk.
>
> And we just default that to replica.
>
> On Thu, May 9, 2013 at 9:16 AM, Christian Posta
> <christian.posta@gmail.com> wrote:
>> But I think we need sync to be true for the replication as it stands right
>> now? If the sync option is true then we hit this line in the client's store
>> method, which is the hook into the replication:
>>
>>         if( syncNeeded && sync ) {
>>           appender.force
>>         }
>>
>> If we change it to false, then replication won't be kicked off. We could
>> remove the && sync, but then persistent messages would be sync'd even if
>> sync==false... probably don't want that.
>>
>> *might* need another setting "forceReplicationSyncToDisk" or something...
>> or.. move the replication out of the appender.force method... in ActiveMQ
>> 5.x you have the following in DataFileAppender, which delegates to a
>> replicator:
>>
>>                 ReplicationTarget replicationTarget = journal.getReplicationTarget();
>>                 if( replicationTarget!=null ) {
>>                     replicationTarget.replicate(wb.writes.getHead().location, sequence, forceToDisk);
>>                 }
>>
>>
>> On Thu, May 9, 2013 at 6:02 AM, Hiram Chirino <chirino@gmail.com> wrote:
>>>
>>> Yeah... perhaps we keep using the sync config option, just change the
>>> default to false in the replicated scenario.
>>>
>>> Very hard to verify proper operation of fsync.
>>>
>>> Best way I've found is by comparing the performance of writes followed by
>>> fsync and writes not followed by fsync.  Then looking at the
>>> numbers, comparing them to the hardware being used and seeing if it
>>> makes sense.  On a spinning disk without a battery-backed write cache,
>>> you should not get more than 100-300 writes per second with fsync.  But
>>> once you start looking at SSDs or battery-backed write cache hardware,
>>> that assumption goes out the window.
>>>
>>>
>>> On Thu, May 9, 2013 at 8:48 AM, Christian Posta
>>> <christian.posta@gmail.com> wrote:
>>> > Your thoughts above make sense. Maybe we can add the option and leave it
>>> > disabled for now?
>>> > I can write a test for it and do it. As fsync vs fflush are quite OS
>>> > dependent, do you know of a good way to write tests to verify fsync?
>>> > Just read the contents from the file?
>>> >
>>> >
>>> > On Wed, May 8, 2013 at 7:02 PM, Hiram Chirino <chirino@gmail.com> wrote:
>>> >>
>>> >> Nope, you're not missing anything.  Instead of disk syncing, we are
>>> >> doing replica syncing.  If the master dies and loses some of its
>>> >> recent log entries, it's not a big deal since we can recover from the
>>> >> log file of the slave.
>>> >>
>>> >> The only time you could possibly lose data is in the unlikely event
>>> >> that the master and the slave machines die at the same time.  But if
>>> >> that is likely to happen, you really don't have a very HA deployment.
>>> >>
>>> >> But if folks do think that's a possibility, then perhaps we should add
>>> >> an option to really disk sync.
>>> >>
>>> >> On Wed, May 8, 2013 at 6:06 PM, Christian Posta
>>> >> <christian.posta@gmail.com> wrote:
>>> >> > Hey,
>>> >> >
>>> >> > Might be some trickery that I'm missing... but in the replication
>>> >> > sequence, when the master writes to its log, it also tries to tell its
>>> >> > slaves about the write (in the overridden log appender in
>>> >> > MasterLevelDBClient, the overridden methods force and flush). Looks like
>>> >> > we tell the slaves about our updates in flush by calling
>>> >> > store.replicate_wal, and then we wait for acks in force by calling
>>> >> > store.wal_sync_to(position).... What I'm missing is that when file sync
>>> >> > is required, the master doesn't do it. The force method in the original
>>> >> > LogAppender does the call to channel.force()... but it might be missing
>>> >> > in the overridden log appender. Do you see the same? Maybe I'm missing
>>> >> > something...
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > Christian Posta
>>> >> > http://www.christianposta.com/blog
>>> >> > twitter: @christianposta
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Regards,
>>> >> Hiram
>>> >>
>>> >> Blog: http://hiramchirino.com
>>> >>
>>> >> Open Source SOA
>>> >> http://fusesource.com/
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Christian Posta
>>> > http://www.christianposta.com/blog
>>> > twitter: @christianposta
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Hiram
>>>
>>> Blog: http://hiramchirino.com
>>>
>>> Open Source SOA
>>> http://fusesource.com/
>>
>>
>>
>>
>> --
>> Christian Posta
>> http://www.christianposta.com/blog
>> twitter: @christianposta
>
>
>
> --
> Regards,
> Hiram
>
> Blog: http://hiramchirino.com
>
> Open Source SOA
> http://fusesource.com/
>
>
>
> --
> *Christian Posta*
> http://www.christianposta.com/blog
> twitter: @christianposta
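
On the question of verifying fsync raised in the quoted thread: the comparison described there (writes followed by fsync versus writes without it) can be sketched with plain java.nio. This is a rough illustration only, not a test from the ActiveMQ code base; the class name FsyncRateSketch, the 512-byte record size and the write count of 200 are arbitrary assumptions. FileChannel.force(false) asks the OS to flush file data to the device, and whether the hardware really honours that is exactly what the resulting numbers have to be judged against.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FsyncRateSketch {
    // Writes `writes` fixed-size records to `file` and returns writes per second.
    // When forceEachWrite is true, every record is followed by channel.force(false).
    public static double writesPerSecond(Path file, int writes, boolean forceEachWrite)
            throws IOException {
        ByteBuffer record = ByteBuffer.allocate(512); // arbitrary record size
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            long start = System.nanoTime();
            for (int i = 0; i < writes; i++) {
                record.clear(); // rewind so the whole buffer is written again
                while (record.hasRemaining()) {
                    channel.write(record);
                }
                if (forceEachWrite) {
                    channel.force(false); // flush data (not metadata) to the device
                }
            }
            long elapsed = Math.max(1L, System.nanoTime() - start);
            return writes * 1_000_000_000.0 / elapsed;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fsync-rate", ".dat");
        try {
            double synced = writesPerSecond(file, 200, true);
            double unsynced = writesPerSecond(file, 200, false);
            System.out.printf("with force:    %.0f writes/s%n", synced);
            System.out.printf("without force: %.0f writes/s%n", unsynced);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The numbers only mean something next to the hardware: on a spinning disk without a write cache, a "with force" rate far above the 100-300 writes per second mentioned in the thread would suggest the sync is not really reaching the platter.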



-- 
Hiram Chirino

Engineering | Red Hat, Inc.

hchirino@redhat.com | fusesource.com | redhat.com

skype: hiramchirino | twitter: @hiramchirino

blog: Hiram Chirino's Bit Mojo
