hadoop-common-user mailing list archives

From Serge Blazhievsky <hadoop...@gmail.com>
Subject Re: Both hadoop fsck and dfsadmin can not detect missing replica in time?
Date Fri, 14 Nov 2014 14:32:08 GMT
It might take some time for Hadoop to realize that blocks are missing. If
you restart the cluster, does it detect the missing blocks?
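A minimal sketch of that restart check, assuming a Hadoop 2.x installation with the standard daemon scripts on the PATH (the file path is the one from the question; run the daemon commands on the affected datanode):

```shell
# On datanode2: restart the datanode daemon. A datanode sends a
# full block report to the namenode when it re-registers, so the
# deleted replica should be noticed at startup.
hadoop-daemon.sh stop datanode
hadoop-daemon.sh start datanode

# From any node: fsck reads the namenode's metadata, so after the
# fresh block report it should flag the missing replica.
hadoop fsck /tmp/test.txt -files -blocks -locations
```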

On Thu, Nov 13, 2014 at 9:55 PM, sam liu <samliuhadoop@gmail.com> wrote:

> I manually removed the block replica file on the datanode. The removed file
> path is '${dfs.datanode.data.dir}/current/BP-1640683473-9.181.
> 64.230-1415757100604/current/finalized/subdir52/blk_1073742304'.
>
> 2014-11-14 11:15 GMT+08:00 daemeon reiydelle <daemeonr@gmail.com>:
>
>> Exactly HOW did you manually remove the block?
>>
>> sent from my mobile
>> Daemeon C.M. Reiydelle
>> USA 415.501.0198
>> London +44.0.20.8144.9872
>> On Nov 12, 2014 9:45 PM, "sam liu" <samliuhadoop@gmail.com> wrote:
>>
>>> Hi Experts,
>>>
>>> In my hdfs, there is a file named /tmp/test.txt consisting of 1 block
>>> with 2 replicas. The block id is blk_1073742304_1480 and the 2 replicas
>>> reside on datanode1 and datanode2.
>>>
>>> Today I manually removed the block file on datanode2:
>>> ./current/BP-1640683473-9.181.64.230-1415757100604/current/finalized/subdir52/blk_1073742304.
>>> After that, I failed to read the hdfs file /tmp/test.txt from datanode2 and
>>> encountered an exception: "IOException: Got error for OP_READ_BLOCK...". That
>>> makes sense, as I had already removed one replica from datanode2.
>>>
>>> However, both 'hadoop fsck /tmp/test.txt -files -blocks -locations' and
>>> 'hadoop dfsadmin -report' say hdfs is healthy and no replica is missing.
>>> Even after waiting several minutes (I assumed the datanode would send
>>> heartbeats to the namenode to report its recent status), the fsck/dfsadmin
>>> tools still did not detect the missing replica. Why?
>>>
>>> Thanks!
>>>
>>
>
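The delay described above is consistent with how the namenode learns about replicas: datanode heartbeats (every 3 seconds by default) carry liveness and capacity information but no block list, so a replica file deleted behind the datanode's back is only noticed at the next full block report, which by default is sent every 6 hours (dfs.blockreport.intervalMsec = 21600000). A sketch of shortening that interval for a test cluster, using the Hadoop 2.x property name (the 10-minute value is illustrative, not a recommendation for production):

```xml
<!-- hdfs-site.xml (test clusters only): send full block reports
     every 10 minutes instead of the default 6 hours -->
<property>
  <name>dfs.blockreport.intervalMsec</name>
  <value>600000</value>
</property>
```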
