hadoop-mapreduce-user mailing list archives

From Mirko Kämpf <mirko.kae...@gmail.com>
Subject Re: Decommissioning a data node and problems bringing it back online
Date Thu, 24 Jul 2014 16:37:10 GMT
After you add the nodes back to your cluster, you can run the balancer tool,
but it will not place exactly the same blocks on them as before.
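
For reference, a minimal invocation looks like this (just a sketch; the 10%
threshold is an assumed example, not a recommendation for your cluster):

  # Redistribute blocks until every DataNode's utilization is within
  # 10 percentage points of the cluster average.
  hadoop balancer -threshold 10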

Cheers,
Mirko



2014-07-24 17:34 GMT+01:00 andrew touchet <adt027@latech.edu>:

> Thanks for the reply,
>
> I am using Hadoop-0.20. We installed from Apache, not Cloudera, if that
> makes a difference.
>
> Currently I really need to know how to get the data that was replicated
> during decommissioning back onto my two data nodes.
>
>
>
>
>
> On Thursday, July 24, 2014, Stanley Shi <sshi@gopivotal.com> wrote:
>
>> which distribution are you using?
>>
>> Regards,
>> *Stanley Shi,*
>>
>>
>>
>> On Thu, Jul 24, 2014 at 4:38 AM, andrew touchet <adt027@latech.edu>
>> wrote:
>>
>>> I should have added this in my first email, but I do get this message in
>>> the data node's log file:
>>>
>>> '2014-07-12 19:39:58,027 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks
>>> got processed in 1 msecs'
>>>
>>>
>>>
>>> On Wed, Jul 23, 2014 at 3:18 PM, andrew touchet <adt027@latech.edu>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am decommissioning data nodes for an OS upgrade on an HPC cluster.
>>>> Currently, users can run jobs that use data stored on /hdfs. They are able
>>>> to access all datanodes/compute nodes except the one being decommissioned.
>>>>
>>>> Is this safe to do? Will edited files affect the decommissioning node?
>>>>
>>>> I've been adding the nodes to /usr/lib/hadoop-0.20/conf/hosts_exclude
>>>> and running 'hadoop dfsadmin -refreshNodes' on the namenode. Then I wait
>>>> for the log files to report completion. After the upgrade, I remove the
>>>> node from hosts_exclude and start Hadoop again on the datanode.
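>>>>
>>>> Roughly, the sequence looks like this (a sketch; dfs.hosts.exclude is the
>>>> standard HDFS setting for the exclude file, and the hostname is a made-up
>>>> example):
>>>>
>>>>   # hdfs-site.xml must already point dfs.hosts.exclude at the exclude file
>>>>   echo "datanode05.example.edu" >> /usr/lib/hadoop-0.20/conf/hosts_exclude
>>>>   hadoop dfsadmin -refreshNodes    # begin decommissioning
>>>>   hadoop dfsadmin -report          # wait until the node shows "Decommissioned"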
>>>>
>>>> Also: under the namenode web interface, I just noticed that the node I
>>>> previously decommissioned now shows 0 for Configured Capacity, Used, and
>>>> Remaining, and is reported as 100% used.
>>>>
>>>> I used the same /etc/sysconfig/hadoop file from before the upgrade,
>>>> removed the node from hosts_exclude, and ran '-refreshNodes' afterwards.
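>>>>
>>>> The recommissioning side, in the same spirit (a sketch; hadoop-daemon.sh
>>>> is the stock start/stop script shipped with Hadoop 0.20, its path and the
>>>> hostname here are assumed examples):
>>>>
>>>>   # remove the host from the exclude file, then tell the namenode
>>>>   sed -i '/datanode05.example.edu/d' /usr/lib/hadoop-0.20/conf/hosts_exclude
>>>>   hadoop dfsadmin -refreshNodes
>>>>
>>>>   # on the datanode itself, restart the DataNode daemon
>>>>   /usr/lib/hadoop-0.20/bin/hadoop-daemon.sh start datanode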
>>>>
>>>> What steps have I missed in the decommissioning process or while
>>>> bringing the data node back online?
>>>>
>>>>
>>>>
>>>>
>>>
>>
