Subject: Re: Replacing a hard drive on a slave
From: Mark Kerzner
To: user@hadoop.apache.org
Date: Wed, 28 Nov 2012 07:22:14 -0800

What happens if I stop the datanode, miss the 10-minute-30-second deadline,
and restart the datanode, say, 30 minutes later? Will Hadoop re-use the data
on this datanode and balance it with the rest of HDFS? And what happens to
the blocks that belong to files that have been updated in the meantime?

Mark
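A quick note on where the 10 minutes 30 seconds comes from: it is not a
single setting, but is derived from two HDFS heartbeat properties. A rough
sketch of the arithmetic, assuming the stock defaults (property names vary
slightly across Hadoop versions, so check your hdfs-site.xml):

    # The NameNode marks a DataNode dead only after roughly
    #   2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval
    # With the default values of 300000 ms (5 minutes) and 3 seconds:
    echo $(( 2 * 300 + 10 * 3 ))    # 630 seconds = 10 minutes 30 seconds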
On Wed, Nov 28, 2012 at 6:51 AM, Stephen Fritz wrote:

> HDFS will not start re-replicating blocks from a dead DN for 10 minutes
> 30 seconds by default.
>
> Right now there isn't a good way to replace a disk out from under a
> running datanode, so the best way is:
> - Stop the DN
> - Replace the disk
> - Restart the DN
>
> On Wed, Nov 28, 2012 at 9:14 AM, Mark Kerzner wrote:
>
>> Hi,
>>
>> Can I remove one hard drive from a slave but tell Hadoop not to
>> replicate the missing blocks for a few minutes, because I will put it
>> back? Or will this not work at all, and will Hadoop go ahead and
>> replicate because blocks are missing, even if only for a short time?
>>
>> Thank you. Sincerely,
>> Mark
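For anyone following the steps Stephen lists above, here is a minimal
sketch of the stop/replace/restart sequence on the slave. It assumes a
Hadoop 1.x tarball layout with hadoop-daemon.sh available on the node and
an example data directory on the failed disk; adjust paths, users, and
service commands to your own install.

    # 1. Stop the DataNode on the slave (finish well inside the 10 min 30 s
    #    deadline if you want to avoid re-replication kicking in)
    $HADOOP_HOME/bin/hadoop-daemon.sh stop datanode

    # 2. Swap the disk, then recreate the directory that dfs.data.dir points
    #    at on the new disk (the path and owner below are only examples)
    mkdir -p /data/disk1/dfs/data
    chown -R hdfs:hadoop /data/disk1/dfs/data

    # 3. Restart the DataNode; it re-registers with the NameNode and reports
    #    the blocks it still holds on its remaining disks
    $HADOOP_HOME/bin/hadoop-daemon.sh start datanode

(dfs.data.dir is the Hadoop 1.x name of the property; newer releases call
it dfs.datanode.data.dir.)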
