qpid-users mailing list archives

From Praveen M <lefthandma...@gmail.com>
Subject Re: Qpid Java Broker High Availability solution?
Date Fri, 20 Jan 2012 16:34:23 GMT
Ah. okie, got it :) I was wondering if you were using some replication
software that augments BDB that I wasn't aware of.

A SAN explains your architecture. Thanks a lot for writing back :)

On Fri, Jan 20, 2012 at 8:29 AM, Rob Godfrey <rob.j.godfrey@gmail.com> wrote:

> On 20 January 2012 17:13, Praveen M <lefthandmagic@gmail.com> wrote:
>
> > Hi Rob,
> >
> > Thanks for writing. Please see inline.
> >
> > On Fri, Jan 20, 2012 at 1:35 AM, Rob Godfrey <rob.j.godfrey@gmail.com>
> > wrote:
> >
> > > Hi Praveen,
> > >
> > > On 14 January 2012 02:47, Praveen M <lefthandmagic@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > >   Are there any Java broker high availability/clustering solutions
> > > > that are currently present? I tried googling around and didn't find
> > > > anything to my luck.
> > > >
> > > > Can you please suggest an HA strategy that you've used working with
> > > > the Qpid Java Broker?
> > > >
> > > >
> > > So where I work we have two separate strategies for "HA" and disaster
> > > recovery.
> > >
> > > For HA we use synchronous replication of the BDB store, with external
> > > software monitoring the availability of the primary broker machine.  If
> > > the primary broker machine goes down, the external software starts up the
> > > secondary broker machine, which points to the synchronously replicated
> > > instance of the store... it can also handle reassignment of the IP
> > > address / DNS name.
> > >
> >
> > *Is there a reason that you use external software to monitor the
> > availability of the primary broker machine?*
> > *Shouldn't the connection failover model be sufficient for this? Or
> > does the failover model have any limitations?*
> >
> >
> The JMS clients fail over automatically; the architectural design was not
> driven by limits in the failover model. However, the HA solution is not
> focused solely on Qpid and aims to provide a service which is as seamless
> as possible to end user applications.
>
>
> > *Also, you mention synchronous replication of BDB. Can you please write a
> > bit about how you go about doing this? I think with syncCommit false, sync
> > replication could be something that could work for us too without
> > really jeopardizing the enqueue latencies.*
> >
> >
> >
> The synchronous replication in our case is done at the "hardware" level.
> The storage attached to the machines provides this replication.
>
>
> > > For DR we take regular snapshots of the BDB store files and ship these
> > > using an FTP-like mechanism to a DR site.  Clearly with this solution
> > > you run the risk of loss as you only have a snapshot from a known point in
> > > time, not from the very moment the system went down.
> > >
> > *Ah yes, this runs the risk of losing messages. Did you not consider a
> > synchronous replication in this case too?*
> >
>
> DR sites are necessarily far enough away from primary sites to make
> synchronous replication (at least at the storage level) impractical.
>
>
> > *Or is it because of the distance of the DR site that could contribute to
> > high latency round trips. Just curious.*
> >
> >
> Exactly.
>
> In general the message broker forms only one part of an application. In a
> DR scenario many different components with their own stores will have to be
> restarted.  At this point the application design needs to be able to
> recover - most importantly, applications need to tolerate duplicates caused
> by replaying from a point earlier in time than the point at which failure
> occurred.
>
>
> > In our model our transaction store which contains a copy of the message
> > will be DR'ed.
> >
> >
> > > > I found a Message Federation design proposal document, but I'm
> > > > guessing it's not implemented yet (Please correct me if I'm wrong).
> > > >
> > > >
> > > There is an alpha/beta implementation of Message Federation in the Java
> > > Broker, which follows the same design as that in the C++ broker and
> > > uses the same toolset to create routes.  This code is broken in the most
> > > recent releases of the Java Broker, but should work "better" from trunk...
> > > however I'm not going to give any guarantees on its suitability for a
> > > production system right now (I hope to be doing some serious
> > > testing/fixing over the next couple of months).
> > >
> > >
> > > > I plan to spin off two brokers on two different machines and use a
> > > > failover connection model to route messages to one if the other goes
> > > > down. This works well for message enqueues.
> > > > But still, I'd run the risk of not being able to process the messages
> > > > in the broker that just went down (until it's back up). It will be nice
> > > > to know if someone had solved a similar problem by other
> > > > strategies/solutions available with the broker.
> > > >
> > > > Also, has someone tried replicating the database used for
> > > > the persistent store to solve this problem (BDB/Derby ?)
> > > >
> > > >
> > > As above, we use replication, but managed by hardware/external
> > > software.
> > > I've not yet tried using BDB's own HA solutions to provide replication.
> > >
> > *Well, is the replication too driven by external software? I'm
> > curious about how you go about doing synchronous
> > replication with BDB (as this is the route that we might want to take).
> > Any tips here will be useful.*
> >
>
> As above the replication I describe is at the storage level. Essentially
> we're talking about facilities offered by certain Storage Area Network
> products :-)
>
>
> > *If you are allowed to talk about the hardware/external software piece
> > I'd love to hear more about your HA architecture. (I do understand
> > sometimes NDAs might stop you. If so, it's okie).*
> >
> >
> >
> We use standard commercial High Availability Cluster software for this
> purpose. I'm not really at liberty to say which of these products we use -
> but I imagine that all are equally functional in this area.
>
> Cheers,
> Rob
>



-- 
-Praveen
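
Rob's point that applications must tolerate duplicates after a replay from
an older snapshot can be illustrated with a minimal sketch. Nothing here is
Qpid API: the class name DeduplicatingHandler, the on_message method and the
in-memory set of seen IDs are all hypothetical, and a real application would
probably keep the seen IDs in its own durable store alongside its other state.

```python
class DeduplicatingHandler:
    """Wraps a handler so that each message ID is processed at most once.

    Hypothetical illustration only; not part of any Qpid client library.
    """

    def __init__(self, handler, seen_ids=None):
        self._handler = handler
        # Assumption: an in-memory set stands in for a durable store of IDs.
        self._seen_ids = set() if seen_ids is None else seen_ids

    def on_message(self, message_id, body):
        """Process the message unless its ID was already handled.

        Returns True if the handler ran, False if the message was a duplicate.
        """
        if message_id in self._seen_ids:
            return False
        self._handler(body)
        # Record the ID only after the handler has completed successfully.
        self._seen_ids.add(message_id)
        return True


processed = []
consumer = DeduplicatingHandler(processed.append)

# Original deliveries, then "m1" replayed as it might be after a DR restore.
for message_id, body in [("m1", "a"), ("m2", "b"), ("m1", "a"), ("m3", "c")]:
    consumer.on_message(message_id, body)
```

The sketch assumes every message carries a stable unique ID that survives
the replay; whether that holds depends on how the producing application
assigns IDs.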
