hadoop-user mailing list archives

From sam liu <samliuhad...@gmail.com>
Subject Both hadoop fsck and dfsadmin can not detect missing replica in time?
Date Thu, 13 Nov 2014 05:43:16 GMT
Hi Experts,

In my HDFS, there is a file named /tmp/test.txt that consists of 1 block with 2
replicas. The block ID is blk_1073742304_1480, and the 2 replicas reside on
datanode1 and datanode2.

Today I manually removed the block file on datanode2:
./current/BP-1640683473-9.181.64.230-1415757100604/current/finalized/subdir52/blk_1073742304.
After that, reading /tmp/test.txt from datanode2 failed with an exception:
"IOException: Got error for OP_READ_BLOCK...". That makes sense, since I had
already removed one replica from datanode2.

However, both 'hadoop fsck /tmp/test.txt -files -blocks -locations' and
'hadoop dfsadmin -report' say HDFS is healthy and no replica is missing.
Even after waiting several minutes (I assume the datanode sends heartbeats to
the namenode to report its recent status), the fsck/dfsadmin tools still did
not detect the missing replica. Why?
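
One possible explanation can be sketched as a toy model. This is not HDFS code:
the class and method names below are hypothetical, and the model rests on the
assumption that a heartbeat carries only liveness and capacity information,
while the list of replicas a datanode holds reaches the namenode through a
separate, much less frequent block report (the interval is configurable, for
example via dfs.blockreport.intervalMsec). Under that assumption, the
namenode's replica map, which is what fsck and dfsadmin read, stays stale
until the next block report.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Hypothetical stand-in for a datanode; not the real HDFS class."""
    name: str
    blocks_on_disk: set = field(default_factory=set)

    def heartbeat(self):
        # Assumption: a heartbeat reports liveness only, with no block list.
        return {"node": self.name, "alive": True}

    def block_report(self):
        # Assumption: only a block report lists the replicas actually present.
        return {"node": self.name, "blocks": set(self.blocks_on_disk)}


@dataclass
class NameNode:
    """Hypothetical stand-in for the namenode's replica bookkeeping."""
    # block id -> names of datanodes believed to hold a replica
    locations: dict = field(default_factory=dict)

    def on_heartbeat(self, hb):
        # Liveness bookkeeping only; the replica map is left untouched.
        pass

    def on_block_report(self, report):
        node = report["node"]
        # Forget what this node held before, then record what it reports now.
        for holders in self.locations.values():
            holders.discard(node)
        for block in report["blocks"]:
            self.locations.setdefault(block, set()).add(node)

    def replica_count(self, block):
        return len(self.locations.get(block, set()))


BLOCK = "blk_1073742304"
dn1 = DataNode("datanode1", {BLOCK})
dn2 = DataNode("datanode2", {BLOCK})
nn = NameNode()
nn.on_block_report(dn1.block_report())
nn.on_block_report(dn2.block_report())

# The block file is removed by hand on datanode2.
dn2.blocks_on_disk.discard(BLOCK)

# Heartbeats keep arriving, but the namenode's view does not change.
nn.on_heartbeat(dn2.heartbeat())
count_after_heartbeat = nn.replica_count(BLOCK)

# Only the next block report corrects the replica map.
nn.on_block_report(dn2.block_report())
count_after_report = nn.replica_count(BLOCK)

print(count_after_heartbeat, count_after_report)
```

In this model the replica count stays at 2 after the heartbeat and drops to 1
only after the next block report. A real datanode may additionally need its
periodic directory scan (dfs.datanode.directoryscan.interval) to notice that
a file was deleted behind its back before the report reflects the change, so
the actual delay depends on how both intervals are configured.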

Thanks!
