hadoop-common-user mailing list archives

From Sudhir Vallamkondu <Sudhir.Vallamko...@icrossing.com>
Subject Re: common-user Digest 8 Dec 2010 00:07:01 -0000 Issue 1611
Date Wed, 08 Dec 2010 03:54:37 GMT
There is a proper decommissioning process to remove dead nodes. See the FAQ
link here: 
http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F
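
As a rough sketch of that process (the exclude-file path and hostname below
are examples, not prescribed by the FAQ): point dfs.hosts.exclude at an
exclude file in hdfs-site.xml, list the nodes to decommission, and ask the
namenode to re-read it:

    <!-- in hdfs-site.xml; the path is an example -->
    <property>
      <name>dfs.hosts.exclude</name>
      <value>/etc/hadoop/conf/excludes</value>
    </property>

    # add the node to decommission (example hostname), then refresh
    echo "datanode03.example.com" >> /etc/hadoop/conf/excludes
    $HADOOP_HOME/bin/hadoop dfsadmin -refreshNodes

The namenode then re-replicates that node's blocks elsewhere and marks it
"Decommissioned" in dfsadmin -report, after which it can be shut down safely.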

In fact, $HADOOP_HOME/conf/slaves is not used by the namenode to keep
track of datanodes/tasktrackers. It is merely used by the hadoop stop/start
scripts to know on which nodes to start the datanode/tasktracker services.
There is similar confusion around the $HADOOP_HOME/conf/masters file: despite
the name, it lists the machine where the secondary namenode runs, not the
namenode/jobtracker.
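
To illustrate (hostnames invented), both files are just newline-separated
lists of hosts that the start/stop scripts ssh into:

    $ cat $HADOOP_HOME/conf/slaves     # where start-all.sh launches the
    node01.example.com                 # datanode/tasktracker daemons
    node02.example.com

    $ cat $HADOOP_HOME/conf/masters    # where the secondary namenode runs
    snn.example.com

The namenode itself only learns about datanodes when they connect and
register, which is why editing conf/slaves alone does not remove a node
from the report.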

With regards to not all java/hadoop processes getting killed, this may be
happening because hadoop loses track of its pid files. By default the pid
files are created in the /tmp directory; if they get deleted (for example by
periodic /tmp cleanup), the stop/start scripts cannot detect the running
hadoop processes. I suggest changing the pid file location to a persistent
directory such as /var/hadoop/. The $HADOOP_HOME/conf/hadoop-env.sh file has
details on configuring the pid location.
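
For example (the directory is only a suggestion), in
$HADOOP_HOME/conf/hadoop-env.sh on every node:

    # keep pid files out of /tmp so tmp cleanup cannot remove them
    export HADOOP_PID_DIR=/var/hadoop/pids

    # the directory must exist and be writable by the daemon user
    mkdir -p /var/hadoop/pids

Restart the daemons after the change so the new pid files are written to
the new location.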

- Sudhir


On 12/7/10 5:07 PM, "common-user-digest-help@hadoop.apache.org"
<common-user-digest-help@hadoop.apache.org> wrote:

> From: Tali K <ncherryus@hotmail.com>
> Date: Tue, 7 Dec 2010 10:40:16 -0800
> To: <core-user@hadoop.apache.org>
> Subject: Help: 1) Hadoop processes are still running after we stopped
> hadoop. 2) How to exclude a dead node?
> 
> 
> 1) When I stopped hadoop, we checked all the nodes and found that 2 or 3
> java/hadoop processes were still running on each node. So we went to each
> node and did a 'killall java' - in some cases I had to do 'killall -9 java'.
> My question: why is this happening, and what would you recommend to make
> sure that no hadoop processes are left running after I stop hadoop with
> stop-all.sh?
>  
> 2) Also, we have a dead node. We removed this node from
> $HADOOP_HOME/conf/slaves. This file is supposed to tell the namenode
> which machines are supposed to be datanodes/tasktrackers.
> We started hadoop again, and were surprised to still see the dead node in
> the hadoop 'report' ("$HADOOP_HOME/bin/hadoop dfsadmin -report|less").
> Only after blocking the dead node and restarting hadoop did it stop
> showing up in the report.
> Any recommendations on how to deal with dead nodes?




