hadoop-common-user mailing list archives

From Michael Thomas <tho...@hep.caltech.edu>
Subject fixing misreplicated blocks
Date Mon, 13 Jul 2009 15:34:30 GMT
Is there a way to get HDFS (Hadoop 0.19.1) to automatically shuffle
around blocks to fix misreplicated blocks due to the redefinition of
which datanodes are in which racks?

When we first set up our HDFS cluster, we had 64 nodes in a single rack.
 Since we had only one rack, we left topology.script.file.name empty.
Now we have 3 racks and have set topology.script.file.name to our own
script defining which node is in which rack.  However, this now means
that every file gets reported as having misreplicated blocks with
'hadoop fsck /'.  I've tried restarting the namenode and running the
balancer, but neither one fixed any of the misreplicated blocks when I
checked again the next morning.  We have 100TB of data in HDFS, so
fixing this manually is not an option.  :)

--Mike
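
[Illustrative note, not part of the original message.] For readers
unfamiliar with topology.script.file.name: Hadoop invokes the configured
script with one or more datanode addresses as arguments and expects one
rack path per argument on standard output. The sketch below shows roughly
what such a script might look like. The addresses and rack names are
hypothetical placeholders, not the poster's actual script or layout.

```python
#!/usr/bin/env python
"""Minimal sketch of a rack topology script for topology.script.file.name.

Hadoop passes one or more datanode IPs/hostnames as command-line arguments
and reads back one rack path per argument, separated by whitespace.
"""
import sys

# Rack assigned to any node that is not listed below.
DEFAULT_RACK = "/default-rack"

# Hypothetical host-to-rack table; replace with the real cluster layout.
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.11": "/rack2",
    "10.0.3.11": "/rack3",
}


def resolve(host):
    """Return the rack path for one host, or the default rack if unknown."""
    return RACK_BY_HOST.get(host, DEFAULT_RACK)


def main(argv):
    # Emit exactly one rack path per argument, in the same order.
    print(" ".join(resolve(host) for host in argv))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Running the script by hand with a few addresses (for example
`./topology.py 10.0.1.11 10.0.9.9`) is a quick way to confirm that every
node resolves to the intended rack before pointing the namenode at it.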
