accumulo-user mailing list archives

From Adam Fuchs <>
Subject Re: Accumulo Between Two Centers (DR - disaster recovery)
Date Wed, 26 Sep 2012 14:43:09 GMT
Another way to say this is that cross-data center replication for Accumulo
is left to a layer on top of Accumulo (or to the application space).
Cassandra supports a mode in which the replication factor is larger than
the write quorum, so a write is acknowledged before every replica has it.
The write propagates to the remaining replicas eventually, and reads in the
meantime may return stale versions of the data. This increases availability
at the cost of consistency, which is important when dealing with links that
are less reliable or higher latency (but does nothing special for
lower-bandwidth links). Cassandra, running in this mode, leaves dealing
with eventual consistency to the application space, which might be only
slightly less challenging than implementing a cross-data center replication
scheme.
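
To make the quorum trade-off concrete, here is a rough sketch of the arithmetic. It is a toy model only, not Cassandra's API: the names `quorums_overlap` and `worst_case_read`, and the parameters `n` (replication factor), `w` (write quorum), and `r` (read quorum), are made up for illustration. The one fact it relies on is the standard quorum rule that a read is guaranteed to see the latest acknowledged write only when `w + r > n`.

```python
def quorums_overlap(n, w, r):
    """Return True when every read quorum must share a replica with
    every write quorum, i.e. when w + r > n."""
    return w + r > n


def worst_case_read(n, w, r):
    """Model one acknowledged write and the least lucky read after it.

    Replicas hold a version number: 0 is the old value, 1 is the new one.
    The write is acknowledged once w replicas have version 1. The read
    then contacts the r replicas least likely to have the write and
    returns the newest version it sees among them.
    """
    replicas = [0] * n
    for i in range(w):          # the write reaches only its quorum so far
        replicas[i] = 1
    contacted = replicas[n - r:]  # the read hits the opposite end
    return max(contacted)


if __name__ == "__main__":
    for n, w, r in [(3, 1, 1), (3, 2, 2), (3, 1, 3)]:
        version = worst_case_read(n, w, r)
        label = "fresh" if version == 1 else "stale"
        print(n, w, r, quorums_overlap(n, w, r), label)
```

With three replicas and a write and read quorum of one each, the worst-case read returns the old version. That is the availability-for-consistency trade described above, and the application is the one that has to cope with it.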


On Wed, Sep 26, 2012 at 9:46 AM, Eric Newton <> wrote:

> I think you're talking about 2 different things.
> Accumulo is architected to run on fast connections.  If you add one
> slowly connected computer, generally speaking, it will make everything
> run slowly.
> Replication is typically used to send copies from one data center to
> another, so that each has a local copy.  Typically, the trick is to
> accept extra latency in updates to the copies to compensate for the
> relatively slow connections between data centers.
> Accumulo does not presently support replication.  See ACCUMULO-378.
> -Eric
> On Wed, Sep 26, 2012 at 8:08 AM, Christopher Tubbs <>
> wrote:
> > I believe Accumulo can work across data centers, if the underlying DFS
> > spans data centers. I also believe the latency tolerance is
> > configurable, and matters for servers holding locks in Zookeeper and
> > heartbeat messages to the Master. I'm not sure what the defaults for
> > these are, though.
> >
> > On Wed, Sep 26, 2012 at 8:00 AM, David Medinets
> > <> wrote:
> >> I recall a conversation in which people were pointed to Cassandra for
> >> its ability to replicate between data centers. I have forgotten what
> >> Accumulo offers on this topic. And does latency matter? If latency
> >> matters, what is the highest acceptable latency?
