hadoop-hdfs-user mailing list archives

From Rahul Bhattacharjee <rahul.rec....@gmail.com>
Subject Re: Multidata center support
Date Wed, 04 Sep 2013 04:34:49 GMT
Under-replicated blocks are also consistent from a consumer's point of view.
Care to explain how weak consistency relates to Hadoop?

Thanks,
Rahul


On Wed, Sep 4, 2013 at 9:56 AM, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com
> wrote:

> Adam's response makes more sense to me: replicate the generated data
> offline from one cluster to another across data centers.
>
> I am not sure whether a configurable block placement policy is supported
> in Hadoop. If it is, then along with rack awareness you should be able to
> achieve the same.
>
> I could not follow your question related to weak consistency.
>
> Thanks,
> Rahul
>
>
>
> On Wed, Sep 4, 2013 at 2:20 AM, Baskar Duraikannu <
> baskar.duraikannu@outlook.com> wrote:
>
>> Rahul
>> Are you talking about the rack-awareness script?
>>
>> I did go through rack awareness. Here are the problems with rack
>> awareness with respect to my (given) "business requirement":
>>
>> 1. By default, Hadoop places two copies on the same rack and one copy on
>> some other rack. This would work as long as we have two data centers. If
>> the business wants three data centers, the data would not be spread
>> across all of them. Separately, there is a question of whether this is the
>> right thing to do at all. The business has promised me that they will buy
>> enough bandwidth that the data centers will be only a few milliseconds
>> apart (in latency).
>>
>> 2. I believe Hadoop automatically re-replicates data if one or more nodes
>> are down. Suppose one of two data centers goes down. There will be a
>> massive data flow to create additional copies. When I say data center
>> support, I mean I should be able to configure Hadoop to:
>>      a) Maintain one copy per data center
>>      b) Not create additional copies if any data center goes down
>>
>> The requirements I am describing would essentially move Hadoop from a
>> strongly consistent model to a weak/eventually consistent one. Since this
>> changes the fundamental architecture, it will probably break all sorts of
>> things. It might never be possible in Hadoop.
>>
>> Thoughts?
>>
>> Sadak
>> Is there a way to implement the above requirements via Federation?
>>
>> Thanks
>> Baskar
>>
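To make point 1 above concrete, here is a small simulation of three-replica placement. It is a simplified model for illustration, not HDFS code: the function `place_replicas` and the rack names `/dc1`, `/dc2`, `/dc3` are hypothetical, each data center is modeled as a single rack, and the ordering (first replica on the writer's rack, the other two together on one other rack) is my understanding of the default policy. Whichever rack ends up holding two copies, the outcome matches the description above: two racks are used.

```python
import random

def place_replicas(writer_rack, racks, rng):
    """Simplified model of default 3-replica placement (not HDFS code).

    Replica 1 goes to the writer's rack, replica 2 to a different rack,
    and replica 3 to the same rack as replica 2.
    """
    remote_choices = [r for r in racks if r != writer_rack]
    remote_rack = rng.choice(remote_choices)
    return [writer_rack, remote_rack, remote_rack]

# Hypothetical layout: each data center is modeled as one rack.
racks = ["/dc1", "/dc2", "/dc3"]
rng = random.Random(0)

for _ in range(5):
    placement = place_replicas("/dc1", racks, rng)
    print(placement, "-> distinct data centers:", len(set(placement)))
```

In this model every block lands in exactly two data centers, never three, which is the limitation described in point 1.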
>>
>> ------------------------------
>> Date: Sun, 1 Sep 2013 00:20:04 +0530
>>
>> Subject: Re: Multidata center support
>> From: visioner.sadak@gmail.com
>> To: user@hadoop.apache.org
>>
>>
>> What do you think, friends? I think Hadoop clusters can run across
>> multiple data centers using Federation.
>>
>>
>> On Sat, Aug 31, 2013 at 8:39 PM, Visioner Sadak <visioner.sadak@gmail.com
>> > wrote:
>>
>> The only problem, I guess, is that Hadoop won't be able to duplicate data
>> from one data center to another, but I guess I can identify data nodes or
>> namenodes from another data center. Correct me if I am wrong.
>>
>>
>> On Sat, Aug 31, 2013 at 7:00 PM, Visioner Sadak <visioner.sadak@gmail.com
>> > wrote:
>>
>> Let's say that you have some machines in Europe and some in the US. I
>> think you just need their IPs and to configure them in your cluster setup.
>> It will work...
>>
>>
>> On Sat, Aug 31, 2013 at 7:52 AM, Jun Ping Du <jdu@vmware.com> wrote:
>>
>> Hi,
>>     Although you can set a datacenter layer in your network topology, it is
>> never used in Hadoop because replica placement and task scheduling do not
>> support it. There is some work to add layers other than rack and node under
>> HADOOP-8848, but it may not suit your case. I agree with Adam that a cluster
>> spanning multiple data centers does not seem to make sense, even for the DR
>> case. Do you have other use cases for such a deployment?
>>
>> Thanks,
>>
>> Junping
>>
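For reference, rack awareness is usually configured through a topology script (set via `net.topology.script.file.name` in Hadoop 2, `topology.script.file.name` in Hadoop 1). Hadoop calls the script with one or more addresses as arguments and reads back one network path per argument. Below is a minimal sketch of such a script with a datacenter layer in the path. The subnets, paths, and the `resolve` helper are hypothetical, and as noted above, the default placement policy treats each full path simply as a rack, so the extra layer by itself does not give per-data-center placement.

```python
#!/usr/bin/env python3
import ipaddress
import sys

# Hypothetical subnet-to-location table; replace with your own network layout.
LOCATIONS = {
    ipaddress.ip_network("10.1.1.0/24"): "/dc1/rack1",
    ipaddress.ip_network("10.1.2.0/24"): "/dc1/rack2",
    ipaddress.ip_network("10.2.1.0/24"): "/dc2/rack1",
}
# Fallback path, kept at the same depth as the entries above.
DEFAULT_LOCATION = "/default/rack"

def resolve(host):
    """Return the topology path for one IP address, or the default path."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames (or malformed input) are not resolved in this sketch.
        return DEFAULT_LOCATION
    for network, location in LOCATIONS.items():
        if address in network:
            return location
    return DEFAULT_LOCATION

if __name__ == "__main__":
    # Print one path per argument, in the same order as the arguments.
    for arg in sys.argv[1:]:
        print(resolve(arg))
```

Running the script with `10.1.2.15 10.2.1.7` as arguments would print `/dc1/rack2` and `/dc2/rack1` on separate lines.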
>> ------------------------------
>> *From: *"Adam Muise" <amuise@hortonworks.com>
>> *To: *user@hadoop.apache.org
>> *Sent: *Friday, August 30, 2013 6:26:54 PM
>> *Subject: *Re: Multidata center support
>>
>>
>> Nothing has changed. DR best practice is still one (or more) clusters per
>> site and replication is handled via distributed copy or some variation of
>> it. A cluster spanning multiple data centers is a poor idea right now.
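The "distributed copy" mentioned above is the `hadoop distcp` tool. As a rough sketch of how a periodic cross-site copy could be scripted, the snippet below builds and prints a distcp command line. The NameNode hostnames, port, and paths are hypothetical, and the helper names `build_distcp_command` and `replicate` are only illustrative; `-update` is the distcp option that copies only files missing or different at the target.

```python
import subprocess

def build_distcp_command(source_uri, target_uri, update=True):
    """Build the argument list for a `hadoop distcp` run."""
    command = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or different at the target.
        command.append("-update")
    command += [source_uri, target_uri]
    return command

def replicate(source_uri, target_uri):
    """Run distcp and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_distcp_command(source_uri, target_uri), check=True)

# Hypothetical NameNode addresses and paths for two per-site clusters.
command = build_distcp_command(
    "hdfs://nn-dc1.example.com:8020/data/events",
    "hdfs://nn-dc2.example.com:8020/data/events",
)
print(" ".join(command))
```

Calling `replicate(...)` from a scheduler (cron, Oozie, or similar) on one of the clusters would be one way to keep the second site reasonably current without stretching a single cluster across data centers.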
>>
>>
>>
>>
>> On Fri, Aug 30, 2013 at 12:35 AM, Rahul Bhattacharjee <
>> rahul.rec.dgp@gmail.com> wrote:
>>
>> My take on this.
>>
>> Why does Hadoop have to know about data centers at all? I think it can be
>> installed across multiple data centers; however, topology configuration
>> would be required to tell it which node belongs to which data center and
>> switch for block placement.
>>
>> Thanks,
>> Rahul
>>
>>
>> On Fri, Aug 30, 2013 at 12:42 AM, Baskar Duraikannu <
>> baskar.duraikannu@outlook.com> wrote:
>>
>> We need to set up Hadoop across data centers. Does Hadoop support a
>> multi data center configuration? I searched through the archives and found
>> that Hadoop did not support multi data center configurations some time back.
>> I just wanted to see whether the situation has changed.
>>
>> Please help.
>>
>>
>>
>>
>>
>> --
>> *Adam Muise*
>> Solution Engineer
>> *Hortonworks*
>> amuise@hortonworks.com
>> 416-417-4037
>>
>>
>>
>>
>>
>>
>
