Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of nitinpawar432@gmail.com
 designates 209.85.128.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAGparvVdAfFnSwMxwLi6eQSbYgPKBXZ_Z8q8U7teAnBxAT3kMg@mail.gmail.com>
References: 
 <CAOTDg_RaZCut64Ay1AmguEd8Uih44uThBOQB9gBMSEenY-cfeg@mail.gmail.com>
	<CADdVdVGNWkh1xXjegGga0StfDciOjY5ZDyEaBvttXZeo7TbVXA@mail.gmail.com>
	<5D3AA85E9BCE45D0B204586E397BAE6E@gmail.com>
	<CAOcnVr3E1qm9Ke7cUZjvhU+9CuBtW6HXgKi9RudUMZMU=YAfow@mail.gmail.com>
	<CAGparvVdAfFnSwMxwLi6eQSbYgPKBXZ_Z8q8U7teAnBxAT3kMg@mail.gmail.com>
Date: Wed, 30 Jan 2013 22:09:49 +0530
Message-ID: 
 <CAORpBsjE1HGK_g8orPFCMwL7i=ZnjP_429m42CWnS0-0WaOUtA@mail.gmail.com>
Subject: Re: what will happen when HDFS restarts but with some dead nodes
From: Nitin Pawar <nitinpawar432@gmail.com>
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=bcaec54eed56a06c8d04d484281a

--bcaec54eed56a06c8d04d484281a
Content-Type: text/plain; charset=ISO-8859-1

The following are the configs it looks at. Unless an admin forces it out
of safemode, it respects the values below:

dfs.namenode.safemode.threshold-pct (default: 0.999f)
    Specifies the percentage of blocks that should satisfy the minimal
    replication requirement defined by dfs.namenode.replication.min.
    Values less than or equal to 0 mean not to wait for any particular
    percentage of blocks before exiting safemode. Values greater than 1
    will make safe mode permanent.

dfs.namenode.safemode.min.datanodes (default: 0)
    Specifies the number of datanodes that must be considered alive
    before the name node exits safemode. Values less than or equal to 0
    mean not to take the number of live datanodes into account when
    deciding whether to remain in safe mode during startup. Values
    greater than the number of datanodes in the cluster will make safe
    mode permanent.

dfs.namenode.safemode.extension (default: 30000)
    Determines extension of safe mode in milliseconds after the
    threshold level is reached.
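
To make the interaction of these three settings concrete, here is a rough
sketch in Python. It is only a simplified model of the descriptions above,
not the actual NameNode code; the names SafemodeConfig, thresholds_met and
can_leave are made up for illustration, and the block counts in the usage
part are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SafemodeConfig:
    # Defaults taken from the property descriptions above.
    threshold_pct: float = 0.999  # dfs.namenode.safemode.threshold-pct
    min_datanodes: int = 0        # dfs.namenode.safemode.min.datanodes
    extension_ms: int = 30000     # dfs.namenode.safemode.extension


def thresholds_met(cfg, total_blocks, safe_blocks, live_datanodes):
    """Simplified model: are the block and datanode thresholds satisfied?

    safe_blocks is the number of blocks that already have at least the
    minimal number of replicas reported.
    """
    if cfg.threshold_pct <= 0:
        # Values <= 0 mean: do not wait for any percentage of blocks.
        blocks_ok = True
    else:
        # A threshold > 1 can never be met once there is at least one block,
        # which is why it makes safe mode permanent.
        blocks_ok = safe_blocks >= cfg.threshold_pct * total_blocks

    if cfg.min_datanodes <= 0:
        # Values <= 0 mean: ignore the number of live datanodes.
        nodes_ok = True
    else:
        nodes_ok = live_datanodes >= cfg.min_datanodes

    return blocks_ok and nodes_ok


def can_leave(cfg, total_blocks, safe_blocks, live_datanodes, ms_since_met):
    """Thresholds must hold and the extension period must have elapsed."""
    return (
        thresholds_met(cfg, total_blocks, safe_blocks, live_datanodes)
        and ms_since_met >= cfg.extension_ms
    )


if __name__ == "__main__":
    cfg = SafemodeConfig()
    # Hypothetical small cluster: 2 of 1000 blocks have no replica reported.
    print(thresholds_met(cfg, 1000, 998, 3))
    # Hypothetical larger cluster: 2 of 10000 blocks have no replica reported.
    print(thresholds_met(cfg, 10000, 9998, 3))
```

With the default threshold, 998 safe blocks out of 1000 stay below 0.999, so
in this model the name node would remain in safemode until an admin steps in,
while 9998 out of 10000 clears the threshold and only the extension period
remains.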


On Wed, Jan 30, 2013 at 10:06 PM, Chen He <airbots@gmail.com> wrote:

> Hi Harsh
>
> I have a question. How does the namenode get out of safemode when data
> blocks are lost? Only through the administrator? In my experience, the NN
> (0.21) stayed in safemode for several days before I manually turned
> safemode off. There were 2 blocks lost.
>
> Chen
>
>
> On Wed, Jan 30, 2013 at 10:27 AM, Harsh J <harsh@cloudera.com> wrote:
>
>> The NN does recalculate the new replication work needed due to unavailable
>> replicas ("under-replication") when it starts and receives all block
>> reports, but it executes this work only after leaving safemode. While in
>> safemode, no mutations are allowed across the HDFS services.
>>
>> On Wed, Jan 30, 2013 at 8:34 AM, Nan Zhu <zhunansjtu@gmail.com> wrote:
>> > Hi, all
>> >
>> > I'm wondering: if HDFS is stopped and some of the machines in the
>> > cluster are moved, some of the block replicas are definitely lost
>> > along with the moved machines.
>> >
>> > When I restart the system, will the namenode recalculate the data
>> > distribution?
>> >
>> > Best,
>> >
>> > --
>> > Nan Zhu
>> > School of Computer Science,
>> > McGill University
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>


-- 
Nitin Pawar

--bcaec54eed56a06c8d04d484281a--