hadoop-hdfs-user mailing list archives

From Baskar Duraikannu <baskar.duraika...@outlook.com>
Subject RE: Multidata center support
Date Tue, 03 Sep 2013 20:50:49 GMT
Rahul,
Are you talking about the rack-awareness script?
I did go through rack awareness. Here are the problems with rack awareness with respect to my
(given) "business requirement":
1. By default, Hadoop places two copies on the same rack and one copy on some other rack. This
would work as long as we have two data centers. If the business wants three data centers,
the data would not be spread across all of them. Separately, there is a question of whether
this is the right thing to do at all. I have been promised by the business that they will buy
enough bandwidth that the data centers will be a few milliseconds apart (in latency).
2. I believe Hadoop automatically re-replicates data if one or more nodes are down. Assume one
of two data centers goes down: there will be a massive data flow to create additional
copies. When I say data center support, I mean I should be able to configure Hadoop to:
     a) Maintain one copy per data center
     b) If any data center goes down, not create additional copies
The requirements above would essentially move Hadoop from a strongly consistent model to a
weak/eventually consistent one. Since this changes the fundamental architecture, it will
probably break all sorts of things... It might never be possible in Hadoop.

Sadak,
Is there a way to implement the above requirement via Federation?
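
To make point 1 above concrete, here is a rough Python simulation of the placement rule as it is described in this message (first copy local, the other two together on one other rack). It is only a simplified model, not HDFS's actual placement code. The cluster layout, node names, and the choice to treat each data center as a single "rack" are all assumptions made for illustration.

```python
import random

# Hypothetical cluster: each data center is modelled as one "rack".
CLUSTER = {
    "/dc1": ["dc1-node1", "dc1-node2", "dc1-node3"],
    "/dc2": ["dc2-node1", "dc2-node2", "dc2-node3"],
    "/dc3": ["dc3-node1", "dc3-node2", "dc3-node3"],
}


def place_replicas(writer_rack, writer_node, rng):
    """Simplified model of rack-aware placement for replication factor 3."""
    # First copy: on the node that is writing the block.
    first = (writer_rack, writer_node)
    # Second and third copies: two different nodes on one other rack.
    remote_rack = rng.choice(sorted(r for r in CLUSTER if r != writer_rack))
    second_node, third_node = rng.sample(CLUSTER[remote_rack], 2)
    return [first, (remote_rack, second_node), (remote_rack, third_node)]


def racks_used(placement):
    """Return the set of racks (here: data centers) that hold a copy."""
    return {rack for rack, _ in placement}


if __name__ == "__main__":
    rng = random.Random(0)
    placement = place_replicas("/dc1", "dc1-node1", rng)
    for rack, node in placement:
        print(rack, node)
    print("data centers holding a copy:", len(racks_used(placement)), "of", len(CLUSTER))
```

Under this model, a block always ends up in exactly two of the three data centers, which is the gap described in point 1.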

Date: Sun, 1 Sep 2013 00:20:04 +0530
Subject: Re: Multidata center support
From: visioner.sadak@gmail.com
To: user@hadoop.apache.org

What do you think, friends? I think Hadoop clusters can run across multiple data centers using
Federation.

On Sat, Aug 31, 2013 at 8:39 PM, Visioner Sadak <visioner.sadak@gmail.com> wrote:

The only problem, I guess, is that Hadoop won't be able to duplicate data from one data center
to another, but I guess I can identify data nodes or namenodes from another data center.
Correct me if I am wrong.

On Sat, Aug 31, 2013 at 7:00 PM, Visioner Sadak <visioner.sadak@gmail.com> wrote:

Let's say that you have some machines in Europe and some in the US. I think you just need the
IPs and to configure them in your cluster setup; it will work...

On Sat, Aug 31, 2013 at 7:52 AM, Jun Ping Du <jdu@vmware.com> wrote:

Hi,
Although you can set a data center layer in your network topology, it is never used in Hadoop,
because replica placement and task scheduling do not support it. There is some work to add
layers other than rack and node under HADOOP-8848, but it may not suit your case. I agree with
Adam that a cluster spanning multiple data centers does not seem to make sense, even for the
DR case. Do you have other use cases for such a deployment?

From: "Adam Muise" <amuise@hortonworks.com>

To: user@hadoop.apache.org
Sent: Friday, August 30, 2013 6:26:54 PM
Subject: Re: Multidata center support

Nothing has changed. DR best practice is still one (or more) clusters per site and replication
is handled via distributed copy or some variation of it. A cluster spanning multiple data
centers is a poor idea right now.
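
As a rough illustration of the "distributed copy" approach, the Python sketch below builds a `hadoop distcp` command line that copies a directory from one site's cluster to another. The NameNode host names, port, and path are hypothetical placeholders, and the helper function itself is only an example. `-update` is a standard DistCp option that skips files already present and unchanged at the destination.

```python
import shlex


def distcp_command(src_uri, dst_uri, update=True):
    """Build the argument list for a one-way DistCp run between two clusters."""
    cmd = ["hadoop", "distcp"]
    if update:
        # Copy only files that are missing or differ at the destination.
        cmd.append("-update")
    cmd += [src_uri, dst_uri]
    return cmd


if __name__ == "__main__":
    # Hypothetical NameNode addresses for two sites; replace with real ones.
    cmd = distcp_command(
        "hdfs://nn-site-a:8020/data/events",
        "hdfs://nn-site-b:8020/data/events",
    )
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run the copy, the list could be passed to `subprocess.run(cmd, check=True)` on a machine with the Hadoop client installed, typically from a scheduler so the second site is refreshed periodically.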

On Fri, Aug 30, 2013 at 12:35 AM, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com> wrote:

My take on this.

Why does Hadoop have to know about data centers at all? I think it can be installed across
multiple data centers; however, topology configuration would be required to tell it which
node belongs to which data center and switch, for block placement.
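
The topology configuration mentioned here is usually supplied as a script that Hadoop calls with one or more IP addresses or host names and that prints one network path per argument. Below is a minimal Python sketch of such a script. The IP prefixes and the `/dc-.../rack...` names are made up for illustration, and the script would be registered through the topology script property (`net.topology.script.file.name` in Hadoop 2.x, `topology.script.file.name` in 1.x). As noted elsewhere in this thread, the stock placement policy acts on the rack level only, so the extra data center layer does not by itself give per-data-center placement.

```python
#!/usr/bin/env python3
import sys

# Hypothetical mapping from IP prefix to a /datacenter/rack path.
TOPOLOGY = {
    "10.1.1.": "/dc-us/rack1",
    "10.1.2.": "/dc-us/rack2",
    "10.2.1.": "/dc-eu/rack1",
}

# Fallback path; kept at the same depth as the entries above so that
# every node sits at the same level of the topology tree.
DEFAULT = "/default-dc/default-rack"


def resolve(host):
    """Return the network path for one IP address or host name."""
    for prefix, path in TOPOLOGY.items():
        if host.startswith(prefix):
            return path
    return DEFAULT


if __name__ == "__main__":
    # Print one path per argument, in the same order as the arguments.
    print("\n".join(resolve(h) for h in sys.argv[1:]))
```

Invoked as `./topology.py 10.1.1.15 10.2.1.7` (file name assumed), the script would print the path for each address on its own line.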


On Fri, Aug 30, 2013 at 12:42 AM, Baskar Duraikannu <baskar.duraikannu@outlook.com> wrote:

We have a need to set up Hadoop across data centers. Does Hadoop support a multi data center
configuration? I searched through the archives and found that Hadoop did not support multi
data center configuration some time back. I just wanted to see whether the situation has
changed.

Please help.


Adam Muise
Solution Engineer
Hortonworks


