hadoop-hdfs-dev mailing list archives

From "Li Junjun (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-4318) validateBlockMetadata reduces the success rate of block recovery
Date Sun, 16 Dec 2012 15:08:12 GMT
Li Junjun created HDFS-4318:
-------------------------------

             Summary: validateBlockMetadata reduces the success rate of block recovery
                 Key: HDFS-4318
                 URL: https://issues.apache.org/jira/browse/HDFS-4318
             Project: Hadoop HDFS
          Issue Type: Wish
          Components: datanode
    Affects Versions: 1.0.1
            Reporter: Li Junjun
            Priority: Minor


When recovering a block, logs like the following appear:

"java.io.IOException: Block blk_3272028001529756059_11883841 length is 20480 does not match block file length 21376"

When a datanode performs block recovery, it calls validateBlockMetadata (in FSDataset.startBlockRecovery), which checks that the on-disk block file length matches the block's numBytes.
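A minimal sketch of the strict check described above (simplified and illustrative only; the class and method names here are assumptions, not the actual FSDataset code, which also validates metadata beyond the length):

```java
// Illustrative sketch of the strict length validation described above;
// NOT the real FSDataset.validateBlockMetadata implementation.
public class StrictBlockValidator {

    /**
     * Strict rule: recovery fails unless the on-disk block file length
     * exactly equals the block's recorded numBytes.
     */
    public static void validate(long blockFileLength, long numBytes)
            throws java.io.IOException {
        if (blockFileLength != numBytes) {
            throw new java.io.IOException("Block length is " + numBytes
                + " does not match block file length " + blockFileLength);
        }
    }
}
```

Under this rule, the logged case above (numBytes 20480, file length 21376) makes the whole replica unusable for recovery even though the first 20480 bytes are intact.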

So let us look at how the block's numBytes is updated on the datanode. When writing a block in BlockReceiver.receivePacket, the sequence is write -> flush -> setVisibleLength. That means it is normal and reasonable for the file length to be greater than the block's numBytes when the write or flush throws an exception.
In startBlockRecovery (and possibly other situations, to be checked), we only need to guarantee that the file length is never less than the block's numBytes.


I suggest changing validateBlockMetadata, because it reduces the success rate of block recovery.
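The suggested relaxation could look like the following sketch (assumed names, not an actual patch): tolerate extra on-disk bytes and reject only a file that is shorter than numBytes.

```java
// Illustrative sketch of the suggested relaxed check. Extra on-disk bytes
// (file length > numBytes) are tolerated, because an interrupted
// write -> flush -> setVisibleLength sequence can legitimately leave them;
// only a file shorter than numBytes is treated as unrecoverable.
public class RelaxedBlockValidator {

    /** Returns true when the block file is usable for recovery. */
    public static boolean isRecoverable(long blockFileLength, long numBytes) {
        return blockFileLength >= numBytes;
    }
}
```

With this rule, the replica from the logged example (file length 21376, numBytes 20480) can still participate in recovery, truncated to the visible length.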

When you have a pipeline a -> b -> c, and a hits a network error while b hits an error in write -> flush, we can only count on c!



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
