hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3605) Missing Block in following sceanrio.
Date Fri, 06 Jul 2012 19:27:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408238#comment-13408238 ]

Uma Maheswara Rao G commented on HDFS-3605:

Great catch, Brahma.

I have one question: why do we keep all the blocks that have the same block ID but
different genstamps (due to append recovery, etc.)? I think we should keep only the most
recently reported block. In most cases this block will have the highest genstamp.

Other part is:
if (changeMade) {
        // The state or gen-stamp of the block has changed. So, we may be
        // able to process some messages from datanodes that we previously
        // were unable to process.
Was this done in updateBlocks to optimize the processing of queued blocks?
Because of this, could a block be marked as corrupt if a block with an older genstamp is queued?
What if we keep only the most recently reported genstamp block in postPonedDNMessages and process
the queue only after the edits are completely loaded?
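
To make the suggestion concrete, here is a minimal sketch of a queue that keeps only the highest-genstamp report per block and is drained once, after the edits are loaded. It is not the actual BlockManager code: LatestGenStampQueue, ReportedBlock and drainAfterEditsLoaded are hypothetical names used only for illustration, and keying the queue by (datanode, block ID) is an assumption so that reports from different datanodes do not overwrite each other.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical names, not the HDFS BlockManager code. */
public class LatestGenStampQueue {

    /** Hypothetical stand-in for one postponed datanode block report. */
    public static final class ReportedBlock {
        public final String datanode;
        public final long blockId;
        public final long genStamp;

        public ReportedBlock(String datanode, long blockId, long genStamp) {
            this.datanode = datanode;
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
    }

    // One slot per (datanode, blockId); a report with a lower genstamp never
    // replaces a queued report with a higher one.
    private final Map<String, ReportedBlock> latest = new HashMap<>();

    private static String key(String datanode, long blockId) {
        return datanode + "/" + blockId;
    }

    /** Queue a report, keeping only the highest genstamp seen so far. */
    public void enqueue(ReportedBlock incoming) {
        latest.merge(key(incoming.datanode, incoming.blockId), incoming,
            (queued, fresh) -> fresh.genStamp >= queued.genStamp ? fresh : queued);
    }

    /** Return the report currently queued for this datanode and block, or null. */
    public ReportedBlock peek(String datanode, long blockId) {
        return latest.get(key(datanode, blockId));
    }

    public int size() {
        return latest.size();
    }

    /** Assumed to be called once, after all edits have been loaded. */
    public List<ReportedBlock> drainAfterEditsLoaded() {
        List<ReportedBlock> drained = new ArrayList<>(latest.values());
        latest.clear();
        return drained;
    }

    public static void main(String[] args) {
        LatestGenStampQueue queue = new LatestGenStampQueue();
        queue.enqueue(new ReportedBlock("dn1", 1L, 1001L));
        queue.enqueue(new ReportedBlock("dn1", 1L, 1003L));
        queue.enqueue(new ReportedBlock("dn1", 1L, 1002L)); // stale genstamp, dropped
        System.out.println(queue.peek("dn1", 1L).genStamp);
    }
}
```

With this shape, an older-genstamp report that is still queued can no longer be replayed against a newer block state, which is the case that looks like it leads to the corrupt marking.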
> Missing Block in following sceanrio.
> ------------------------------------
>                 Key: HDFS-3605
>                 URL: https://issues.apache.org/jira/browse/HDFS-3605
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 2.0.0-alpha, 2.0.1-alpha
>            Reporter: Brahma Reddy Battula
> Open a file for append.
> Write data and sync.
> After the next log roll and edit log tailing in the standby NN, close the append stream.
> Call append multiple times on the same file before the next edit log roll.
> Now abruptly kill the current active namenode.
> The block is then reported missing.
> This may be because all the latest blocks were queued in the standby namenode.
> During failover, the first OP_CLOSE processed the pending queue and marked the block as corrupt.
