hadoop-general mailing list archives

From Zhanlei Ma <...@vmware.com>
Subject How to Recommission?
Date Thu, 01 Apr 2010 03:12:56 GMT
How do I recommission or decommission DataNode(s) in Hadoop?

Decommission (removing some DataNodes):
On a large cluster, removing one or two data-nodes will not lead to any data loss, because
the name-node will re-replicate their blocks once it detects that the nodes are dead. When a
large number of nodes are removed or die at once, the probability of losing data is higher.

Hadoop offers the decommission feature to retire a set of existing data-nodes. The nodes to
be retired should be listed in the exclude file, and the name of that file should be given in
the configuration parameter dfs.hosts.exclude. The parameter must already have been set when
the name-node was started; the file itself may be zero length at that point. Each entry in the
file must use the full hostname, ip, or ip:port format. Then the shell command

bin/hadoop dfsadmin -refreshNodes

should be called, which forces the name-node to re-read the exclude file and start the
decommission process.
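The steps above can be sketched as a small shell helper. This is only a rough sketch under stated assumptions: the exclude file path `conf/dfs.exclude`, the `HADOOP_CMD` variable, and the function names `add_to_exclude` and `refresh_nodes` are all hypothetical and chosen for illustration; only `dfsadmin -refreshNodes` comes from the text above. The file it edits must be the same one that dfs.hosts.exclude points to.

```shell
#!/bin/sh
# Hypothetical defaults: adjust to wherever dfs.hosts.exclude points on your cluster.
EXCLUDE_FILE="${EXCLUDE_FILE:-conf/dfs.exclude}"
HADOOP_CMD="${HADOOP_CMD:-bin/hadoop}"

# Add one host (full hostname, ip, or ip:port) to the exclude file.
# The entry is appended only if an identical line is not already present.
add_to_exclude() {
    host="$1"
    touch "$EXCLUDE_FILE"
    grep -qxF "$host" "$EXCLUDE_FILE" || printf '%s\n' "$host" >> "$EXCLUDE_FILE"
}

# Ask the name-node to re-read the exclude file and start decommissioning.
refresh_nodes() {
    "$HADOOP_CMD" dfsadmin -refreshNodes
}
```

A typical use would be `add_to_exclude datanode7.example.com` followed by `refresh_nodes`, run on a machine that has the name-node's configuration (the hostname here is made up).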

Decommission does not happen instantly, since it may require replicating a large number of
blocks and we do not want the cluster to be overwhelmed by this one job. The decommission
progress can be monitored on the name-node Web UI. Until all blocks are replicated, the node
will be in the "Decommission In Progress" state. When decommission is done, the state will
change to "Decommissioned". The nodes can be removed once decommission has finished.
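Besides the Web UI, the state can probably also be checked from the command line. The sketch below assumes that `bin/hadoop dfsadmin -report` prints, for each data-node, a `Name: host:port` line followed by a `Decommission Status : ...` line; that output layout is an assumption and may differ between Hadoop versions, and the function name `decommission_status` is hypothetical.

```shell
#!/bin/sh
# Read a dfsadmin report on stdin and print the decommission status
# of the data-node whose "Name:" entry is $1 or starts with "$1:".
# Assumes (not guaranteed) that each node is reported as:
#   Name: <ip>:<port>
#   Decommission Status : <state>
decommission_status() {
    awk -v host="$1" '
        /^Name:/ { current = $2 }
        /^Decommission Status/ {
            if (current == host || index(current, host ":") == 1) {
                sub(/^Decommission Status *: */, "")
                print
            }
        }
    '
}
```

It would be used as `bin/hadoop dfsadmin -report | decommission_status 10.0.0.5`, where the address is a placeholder for one of the nodes being retired.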

But how do I recommission a node? I would appreciate your help.
