hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5134) FSNamesystem#commitBlockSynchronization adds under-construction block locations to blocksMap
Date Sun, 08 Feb 2009 05:52:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671560#action_12671560 ]

dhruba borthakur commented on HADOOP-5134:
------------------------------------------

At first I thought that the fix was simple: commitBlockSync should not update the blocksMap.
But this could cause a subtle problem: the above fix could cause a call to
"append" to fail when it should not. Let me try to explain.

Suppose a writer is writing data to a file and dies before closing the file. A new writer
starts and invokes "append" to try to append to this file. This "append" call triggers lease
recovery, and thereby causes the Primary Datanode to invoke commitBlockSync. Meanwhile, the
append call fails (as expected) with "AlreadyBeingCreatedException". The commitBlockSync call
removes the lease but does not update the block locations of the last block. The datanode(s) that
have the last block send blockReceived to the NN, but before those reports reach the NN, the NN
starts processing another call to "append". This append will now find that the file is no longer
under construction but that the last block of the file does not have any block locations
associated with it. This means that the "append" call will fail. This is not the expected behaviour.
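
To make the ordering concrete, here is a minimal sketch of the race in Java. It is a toy model, not Hadoop code: ToyNameNode, secondAppendSucceeds and the block/datanode names are hypothetical, and the real FSNamesystem logic is considerably more involved. It only assumes what is described above: commitBlockSync releases the lease without touching blocksMap, and append needs at least one location for the last block.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class AppendRaceSketch {
    /** Toy stand-in for the NameNode state discussed above; not Hadoop code. */
    static class ToyNameNode {
        boolean underConstruction = true; // the dead writer's lease is still held
        final Map<String, Set<String>> blocksMap = new HashMap<>();

        // The "simple fix": release the lease but leave blocksMap untouched.
        void commitBlockSync(String lastBlock) {
            underConstruction = false;
        }

        // A datanode reports that it holds a replica of the block.
        void blockReceived(String block, String datanode) {
            blocksMap.computeIfAbsent(block, k -> new HashSet<>()).add(datanode);
        }

        // Returns true if append can proceed, false if it is rejected.
        boolean append(String lastBlock) {
            if (underConstruction) {
                return false; // models the expected AlreadyBeingCreatedException
            }
            Set<String> locations = blocksMap.get(lastBlock);
            return locations != null && !locations.isEmpty();
        }
    }

    /** Replays the scenario with the two possible arrival orders at the NN. */
    static boolean secondAppendSucceeds(boolean blockReceivedArrivesFirst) {
        ToyNameNode nn = new ToyNameNode();
        String lastBlock = "blk_1";
        nn.append(lastBlock);          // first append: rejected, triggers lease recovery
        nn.commitBlockSync(lastBlock); // lease removed, locations not updated
        if (blockReceivedArrivesFirst) {
            nn.blockReceived(lastBlock, "dn1");
        }
        boolean ok = nn.append(lastBlock); // second append
        if (!blockReceivedArrivesFirst) {
            nn.blockReceived(lastBlock, "dn1"); // report arrives too late
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("blockReceived first: " + secondAppendSucceeds(true));
        System.out.println("append first:        " + secondAppendSucceeds(false));
    }
}
```

In this model the second append succeeds only when blockReceived is processed before it; when the append is processed first, it sees a closed file whose last block has no locations and is rejected.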

> FSNamesystem#commitBlockSynchronization adds under-construction block locations to blocksMap
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5134
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5134
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.2
>            Reporter: Hairong Kuang
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.18.4
>
>
> From my understanding of sync/append design, an under construction block should not have
> any block locations associated with it in the blocksMap. So an under construction block will
> not be managed by ReplicationMonitor.
> However, if there is an error in the write pipeline, a lease recovery will trigger a
> call, commitBlockSynchronization, to NN. This call will add the successfully-recovered datanodes
> to blocksMap. This seems to violate the design. It should update the targets of the last block
> at INode instead.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

