hadoop-common-user mailing list archives

From Colin Kincaid Williams <disc...@uw.edu>
Subject decommissioning disks on a data node
Date Fri, 17 Oct 2014 00:03:31 GMT
We have been seeing some of the disks on our cluster developing bad blocks and
then failing. We are using Dell PERC H700 disk controllers that create
"virtual devices".

Our hosting manager uses a Dell utility that reports "virtual device bad
blocks". He has suggested that we use the Dell tool to remove the "virtual
device bad blocks" and then re-format the device.

I'm wondering if we can remove the disks in question from hdfs-site.xml and
restart the datanode, so that we don't re-replicate the Hadoop blocks on the
other disks. Then we would go ahead and work on the troubled disk while the
datanode remained up. Finally, we would restart the datanode again after
re-adding the freshly formatted (possibly new) disk. This way the data on the
remaining disks doesn't get re-replicated.
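For illustration, a minimal sketch of what that hdfs-site.xml change might look
like, assuming a Hadoop 2.x setup where the data directories are listed under
dfs.datanode.data.dir (dfs.data.dir on older releases), and using hypothetical
mount points for the disks:

    <property>
      <name>dfs.datanode.data.dir</name>
      <!-- /data/disk3/dfs removed from the list while that disk is being
           cleared and re-formatted; re-add it and restart the datanode
           once the disk is back in service -->
      <value>/data/disk1/dfs,/data/disk2/dfs,/data/disk4/dfs</value>
    </property>

The expectation would be that, after the restart, only the blocks that lived on
the removed disk show up as under-replicated and get copied from other nodes,
while the blocks on the remaining disks stay where they are.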

I don't know too much about the Hadoop block system. Will this work? Is it an
acceptable strategy for disk maintenance?
