Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of rajkrrsingh@gmail.com
 designates 209.85.216.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAHD=qRTY3uyPh75epYY5y+ViN8x328NG+Hr7VUCOQmRjsMSsag@mail.gmail.com>
References: 
 <CAHD=qRTY3uyPh75epYY5y+ViN8x328NG+Hr7VUCOQmRjsMSsag@mail.gmail.com>
From: Raj K Singh <rajkrrsingh@gmail.com>
Date: Tue, 20 May 2014 13:09:00 +0530
Message-ID: 
 <CADcmSaq2zudShZQLo5Fs9h6EDhsoOXr-hAw5Ywsv2QG-uyOu7w@mail.gmail.com>
Subject: Re: Rack awareness and pipeline write
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=001a11c16db4ee95ef04f9cffad5

--001a11c16db4ee95ef04f9cffad5
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

It's fine to place two replicas on nodes in the local rack first and then
the third replica on a node in a different rack if the replication factor
is 3. Now consider a replication factor of 2: if both replicas are placed
on the same rack, you can lose all of your replicas when that rack goes
down. Putting the second replica on a remote rack, as the default policy
does, avoids that.
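
To make the comparison concrete, here is a rough sketch in Python. It is
only a toy model, not HDFS code: the function names are made up for
illustration, racks are plain strings, and it assumes that a lower
replication factor simply uses the first targets of the same placement
order.

```python
# Illustrative model only; not HDFS code. Each placement lists the rack
# of replica 1, 2, 3 in pipeline order.

def cross_rack_hops(writer_rack, replica_racks):
    """Count pipeline hops (writer -> r1 -> r2 -> ...) that cross racks."""
    path = [writer_rack] + list(replica_racks)
    return sum(1 for src, dst in zip(path, path[1:]) if src != dst)


def survives_any_rack_failure(replica_racks):
    """True if losing any single rack still leaves at least one replica."""
    return len(set(replica_racks)) > 1


def truncate(replica_racks, replication):
    """Assumption: a lower replication factor uses the first targets."""
    return list(replica_racks)[:replication]


hdfs_order = ["A", "B", "B"]   # local, remote, same remote rack
alternative = ["A", "A", "B"]  # two local, one remote

for name, order in [("hdfs", hdfs_order), ("alternative", alternative)]:
    for replication in (3, 2):
        racks = truncate(order, replication)
        print(name, replication, racks,
              "cross-rack hops:", cross_rack_hops("A", racks),
              "survives rack loss:", survives_any_rack_failure(racks))
```

With a writer on rack A, both orders cross racks once at replication 3,
but at replication 2 the alternative order keeps both replicas on rack A.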
Hope this clears up your doubt.

::::::::::::::::::::::::::::::::::::::::
Raj K Singh
http://in.linkedin.com/in/rajkrrsingh
http://www.rajkrrsingh.blogspot.com
Mobile  Tel: +91 (0)9899821370


On Sun, May 11, 2014 at 8:25 AM, jianan hu <hujianan@gmail.com> wrote:

> Hi everyone,
>
> See HDFS documents, It says "For the common case, when the replication
> factor is three, HDFS’s placement policy is to put one replica on one node
> in the local rack, another on a node in a different (remote) rack, and the
> last on a different node in the same remote rack."
>
> Assume there are two racks A and B. According to rack awareness, the first
> block is put in rack A, and the other two replicated blocks will be
> pushed into rack B.
>
> However, why not store the first and second replicas in the local rack
> (A), and the last in a different remote rack (B)? Both two scenarios have
> same network traffic. What's the disadvantage of it?
>
> Thanks.
>
> Best Regards,
> Jianan
>

--001a11c16db4ee95ef04f9cffad5--