hadoop-common-user mailing list archives

From 麦树荣 <shurong....@qunar.com>
Subject Why replication of Under-Replicated blocks in decommissioned datanodes is so slow
Date Tue, 07 Apr 2015 03:18:31 GMT
version: hadoop-2.2.0

Our HDFS cluster had 13 nodes, and we wanted to decommission 7 of them. We tried two methods,
as follows:

Method 1:
First, we set the dfs.hosts.exclude parameter and successfully decommissioned 7 nodes, which
left many under-replicated blocks that needed to be replicated. However, after about 20 hours
the replication still had not finished. We observed that the replication speed was very
slow.

Method 2:
Later, we gave up on that method and instead stopped the datanodes one at a time. We stopped
one datanode, waited until the under-replicated blocks of that node had finished replicating,
and then stopped the next datanode, until all 7 nodes were stopped. This took about 12 hours,
and the replication was obviously much faster than with method 1.

We expected method 1 to be faster than method 2, but in fact method 2 was much faster than
method 1. Why?
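
To compare the two methods with numbers rather than impressions, the replication speed could be
measured by sampling the under-replicated block count at two points in time. The following is
only a minimal Python sketch, not something we ran on the cluster: the helper names are
hypothetical, and it assumes the output of "hdfs dfsadmin -report" contains a line of the form
"Under replicated blocks: N" (the exact wording may differ between versions).

```python
import re
import subprocess

# Assumption: the report contains a line such as "Under replicated blocks: 1234"
# (the pattern also accepts the hyphenated "Under-replicated blocks:" spelling).
_PATTERN = re.compile(r"under[- ]replicated blocks:\s*(\d+)", re.IGNORECASE)


def parse_under_replicated(report_text):
    """Return the under-replicated block count found in the text, or None."""
    match = _PATTERN.search(report_text)
    return int(match.group(1)) if match else None


def replication_rate(count_before, count_after, elapsed_seconds):
    """Return blocks re-replicated per hour between two samples."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (count_before - count_after) * 3600.0 / elapsed_seconds


def sample_cluster():
    """Run `hdfs dfsadmin -report` and return the current count (hypothetical helper)."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    )
    return parse_under_replicated(result.stdout)
```

Calling sample_cluster() twice, some minutes apart, and passing both counts and the elapsed
time to replication_rate() would give a blocks-per-hour figure for each method.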