accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Data Replication
Date Thu, 13 Oct 2016 16:06:06 GMT
The Accumulo (Data Center) Replication feature is for having multiple 
active Accumulo clusters all containing the same data.

HDFS provides replication as a means for durability of the data it is 
storing. The files that Accumulo creates on one HDFS instance are 
replicated by HDFS. This does not help if your entire cluster becomes 
unavailable. That is the problem the Accumulo data center replication 
feature solves.

While both can be called "replication", they serve very different purposes.

Yamini Joshi wrote:
> Hello
>
> I was going through some Accumulo docs and found out about replication.
> To enable replication, one needs to make some config settings as
> described in
> https://github.com/apache/accumulo/blob/master/docs/src/main/asciidoc/chapters/replication.txt.
> I cannot seem to grasp the difference between this replication conf and
> the replication on HDFS level. What exactly is the use case for
> replication? Are the replicated instances visible to the clients?
>
> Best regards,
> Yamini Joshi
