accumulo-user mailing list archives

From Yamini Joshi <yamini.1...@gmail.com>
Subject Re: Data Replication
Date Sun, 16 Oct 2016 15:32:49 GMT
In other words, what helps in load balancing? HDFS replication or Data
center replication?

Best regards,
Yamini Joshi

On Sat, Oct 15, 2016 at 10:44 PM, Yamini Joshi <yamini.1691@gmail.com>
wrote:

> So HDFS is for durability while replication is for availability? I'm
> assuming that the client is unaware of the replicated instance and queries
> the DB with no knowledge of which instance/table will return the result.
>
> Best regards,
> Yamini Joshi
>
> On Thu, Oct 13, 2016 at 11:46 AM, Josh Elser <josh.elser@gmail.com> wrote:
>
>> I'm not familiar with MongoDB. Perhaps someone else can confirm this for
>> you.
>>
>> Yamini Joshi wrote:
>>
>>> So, can I say that if I have a table split across nodes (i.e. num
>>> tablets > 1) and HDFS replication in my system, it is sort of equivalent
>>> to a sharded and replicated mongo architecture?
>>>
>>> Best regards,
>>> Yamini Joshi
>>>
>>> On Thu, Oct 13, 2016 at 11:06 AM, Josh Elser <josh.elser@gmail.com> wrote:
>>>
>>>     The Accumulo (Data Center) Replication feature is for having
>>>     multiple active Accumulo clusters all containing the same data.
>>>
>>>     HDFS provides replication as a means for durability of the data it
>>>     is storing. The files that Accumulo creates on one HDFS instance are
>>>     replicated by HDFS. This does not help if your entire cluster becomes
>>>     unavailable. That is what the data center replication Accumulo
>>>     feature solves.
>>>
>>>     While both can be called "replication", they serve very different
>>>     purposes.
>>>
>>>
>>>     Yamini Joshi wrote:
>>>
>>>         Hello
>>>
>>>         I was going through some Accumulo docs and found out about
>>>         replication.
>>>         To enable replication, one needs to make some config settings as
>>>         described in
>>>         https://github.com/apache/accumulo/blob/master/docs/src/main/asciidoc/chapters/replication.txt.
>>>         I cannot seem to grasp the difference between this replication
>>>         conf and
>>>         the replication on HDFS level. What exactly is the use case for
>>>         replication? Are the replicated instances visible to the clients?
>>>
>>>         Best regards,
>>>         Yamini Joshi
>>>
>>>
>>>
>
