hadoop-common-user mailing list archives

From Espen Amble Kolstad <es...@trank.no>
Subject Re: Decommission in hadoop-0.12.2
Date Tue, 27 Mar 2007 08:22:36 GMT
On Tuesday 27 March 2007 10:03:41 Andrzej Bialecki wrote:
> Espen Amble Kolstad wrote:
> > On Tuesday 27 March 2007 09:27:58 Andrzej Bialecki wrote:
> >> Espen Amble Kolstad wrote:
> >>> Hi,
> >>>
> >>> I'm trying to decommission a node with hadoop-0.12.2.
> >>> I use the property dfs.hosts.exclude, since the command hadoop
> >>> dfsadmin -decommission seems to be gone.
> >>> I then start the cluster with an empty exclude-file, add the name of
> >>> the node to decommission and run hadoop dfsadmin -refreshNodes.
> >>> The log then says:
> >>> 2007-03-27 08:42:59,168 INFO  fs.FSNamesystem - Start Decommissioning
> >>> node 81.93.168.215:50010
> >>>
> >>> But nothing happens.
> >>> I've left it in this state over night, but still nothing.
> >>>
> >>> Am I missing something?
> >>
> >> What does dfsadmin -report say about this node? It takes time to
> >> ensure that all blocks are replicated from this node to other nodes.
> >
> > Hi,
> >
> > dfsadmin -report:
> >
> > Name: 81.93.168.215:50010
> > State          : Decommission in progress
> > Total raw bytes: 1438871724032 (1.30 TB)
> > Used raw bytes: 270070137404 (0.24 TB)
> > % used: 18.76%
> > Last contact: Tue Mar 27 09:42:26 CEST 2007
> >
> > In the web interface (dfshealth.jsp) no change can be seen in the % used
> > or the number of blocks on any of the nodes.
>
> You may want to check the datanode logs for any reported exceptions.
> Also, these things take time - I believe the datanodes synchronize
> their block information piecewise, so that they don't overwhelm the
> namenode. It certainly takes some time in my case, even though the
> disk size per node that I use is much smaller.
>
> Regarding the number of blocks - if all blocks are already present on
> other datanodes in at least one copy, then no new block copies need to be
> created. I'm not sure when the namenode decides that these blocks
> should get additional replicas: during the decommissioning or after it's
> complete ...
>
> It would be nice to have a progress meter on the decommissioning
> process, though.

Hi,

I have replication set to 1 for the whole hdfs, so there should not be any 
other replicas.
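
For reference, this is roughly how I have it set up - the file path and the
hostname below are placeholders rather than the exact values from my cluster:

  <!-- hadoop-site.xml (sketch) -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/path/to/conf/dfs.exclude</value>
  </property>

  # add the node to the exclude file, then tell the namenode to re-read it
  echo "datanode-to-retire.example.com" >> /path/to/conf/dfs.exclude
  bin/hadoop dfsadmin -refreshNodes
  bin/hadoop dfsadmin -report
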
I can't find any errors in my logs, and the namenode log looks like this (at
INFO level):
2007-03-27 08:42:59,168 INFO  fs.FSNamesystem - Start Decommissioning node 
81.93.168.215:50010
2007-03-27 09:04:48,831 INFO  fs.FSNamesystem - Roll Edit Log
2007-03-27 09:04:49,500 INFO  fs.FSNamesystem - Roll FSImage
2007-03-27 10:04:50,221 INFO  fs.FSNamesystem - Roll Edit Log
2007-03-27 10:04:50,360 INFO  fs.FSNamesystem - Roll FSImage
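
For now I am just polling the report from the shell to watch this node's
entry - a rough sketch (it assumes GNU grep for the -A option; the IP is the
node being decommissioned):

  # print the decommissioning node's section of the report every minute
  while true; do
    bin/hadoop dfsadmin -report | grep -A 5 "81.93.168.215"
    sleep 60
  done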

- Espen
