From: Dejan Menges
Date: Wed, 10 Jun 2015 11:22:15 +0000
Subject: When is DataNode 'bad'?
To: user@hadoop.apache.org

Hi,

From time to time I see some reduces failing with this:

Error: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.

I don't see any issues in HDFS during this period (for example, for the specific node on which this happened, I checked the logs, and the only thing happening at that exact point was a pipeline recovery).

So I'm not quite sure how there can be no more good datanodes in a cluster of 15 nodes with a replication factor of three?

Also, regarding http://blog.cloudera.com/blog/2015/03/understanding-hdfs-recovery-processes-part-2/ - there is a parameter called dfs.client.block.write.replace-datanode-on-failure.best-effort which I currently cannot find. From which Hadoop version can this parameter be used, and how much sense does it make to use it to avoid issues like the one above?
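In case it helps to be concrete, this is roughly what I had in mind on the client side, based on the error message and the blog post above. It is only a sketch: the best-effort property is exactly the one I cannot find in our current version, so I am assuming both the name and the boolean value here.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class ReplaceDatanodePolicySketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Keep the default replacement policy from the error message.
            conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "DEFAULT");
            // Assumed property from the Cloudera blog post: if a replacement
            // datanode cannot be found, continue writing to the remaining
            // datanodes instead of failing the write.
            conf.setBoolean("dfs.client.block.write.replace-datanode-on-failure.best-effort", true);
            FileSystem fs = FileSystem.get(conf);
            System.out.println("Using filesystem: " + fs.getUri());
        }
    }

Is that the intended way to use it, or does it belong in the cluster-wide hdfs-site.xml instead?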
We're on Hadoop 2.4 (Hortonworks HDP 2.1) and are currently preparing an upgrade to HDP 2.2, so I'm not sure whether this is a known issue or something I'm just not getting.

Thanks a lot,
Dejan