From: "Hargraves, Alyssa"
To: "core-user@hadoop.apache.org"
Date: Thu, 29 Jan 2009 17:45:57 -0500
Subject: RE: decommissioned node showing up ad dead node in web based interface to namenode (dfshealth.jsp)

Bill-

I believe once the node is decommissioned you'll also have to run
bin/hadoop-daemon.sh start datanode and bin/hadoop-daemon.sh start
tasktracker (both run on the slave node, not master) to revive the dead
node. Just removing it from exclude and refreshing doesn't work for me
either, but with those two additional commands it does.

- Alyssa
________________________________________
From: Bill Au [bill.w.au@gmail.com]
Sent: Thursday, January 29, 2009 5:40 PM
To: core-user@hadoop.apache.org
Subject: Re: decommissioned node showing up ad dead node in web based interface to namenode (dfshealth.jsp)

Not sure why but this does not work for me. I am running 0.18.2. I ran
hadoop dfsadmin -refreshNodes after removing the decommissioned node from
the exclude file. It still shows up as a dead node. I also removed it from
the slaves file and ran the refresh nodes command again. It still shows up
as a dead node after that.

I am going to upgrade to 0.19.0 to see if it makes any difference.

Bill

On Tue, Jan 27, 2009 at 7:01 PM, paul wrote:

> Once the nodes are listed as dead, if you still have the host names in your
> conf/exclude file, remove the entries and then run hadoop dfsadmin
> -refreshNodes.
>
>
> This works for us on our cluster.
>
>
> -paul
>
>
> On Tue, Jan 27, 2009 at 5:08 PM, Bill Au wrote:
>
> > I was able to decommission a datanode successfully without having to stop
> > my cluster. But I noticed that after a node has been decommissioned, it
> > shows up as a dead node in the web based interface to the namenode (i.e.
> > dfshealth.jsp). My cluster is relatively small and losing a datanode will
> > have a performance impact. So I have a need to monitor the health of my
> > cluster and take steps to revive any dead datanode in a timely fashion. So
> > is there any way to altogether "get rid of" any decommissioned datanode
> > from the web interface of the namenode? Or is there a better way to monitor
> > the health of the cluster?
> >
> > Bill
> >
>
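
For reference, the revival steps described in this thread can be sketched roughly as follows. This is only a sketch under assumptions: the helper name remove_from_exclude and the host name node07.example.com are hypothetical, the exclude file is assumed to hold one host name per line, and the Hadoop commands are shown as comments because they must be run on the right machine (the refresh against the namenode, the daemon starts on the slave node itself).

```shell
#!/bin/sh
# Sketch only. remove_from_exclude and the host name used below are
# hypothetical; the exclude file is assumed to list one host per line.

# Remove one host name from an exclude file (exact whole-line match).
remove_from_exclude() {
    host="$1"
    file="$2"
    tmp="$file.tmp.$$"
    # grep -v exits non-zero when no lines are left over; that is not
    # an error for our purposes, so ignore the exit status.
    grep -v -x -F -- "$host" "$file" > "$tmp" || true
    mv "$tmp" "$file"
}

# Step 1 (master): take the node out of the exclude file and refresh.
#   remove_from_exclude node07.example.com conf/exclude
#   bin/hadoop dfsadmin -refreshNodes
#
# Step 2 (on the slave node itself, per Alyssa's message):
#   bin/hadoop-daemon.sh start datanode
#   bin/hadoop-daemon.sh start tasktracker
```

Whether step 2 is needed may depend on the Hadoop version; paul reports that step 1 alone works on his cluster, while Bill (0.18.2) and Alyssa report that it does not.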