hadoop-hdfs-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas
Date Tue, 03 Dec 2013 11:34:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837598#comment-13837598 ]

Hudson commented on HDFS-5557:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-0.23-Build #809 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/809/])
svn merge -c 1547173 merging from trunk to branch-0.23 to fix: HDFS-5557. Write pipeline recovery for the last packet in the block may cause rejection of valid replicas. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547181)
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java


> Write pipeline recovery for the last packet in the block may cause rejection of valid replicas
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5557
>                 URL: https://issues.apache.org/jira/browse/HDFS-5557
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.23.9, 2.4.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>             Fix For: 3.0.0, 2.4.0, 0.23.10
>
>         Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch
>
>
> When a block is reported from a data node while the block is under construction (i.e.,
> not committed or completed), BlockManager calls BlockInfoUnderConstruction.addReplicaIfNotPresent()
> to update the reported replica state. But BlockManager calls it with the stored block,
> not the reported block. As a result, the recorded replica's gen stamp is that of the
> BlockInfoUnderConstruction itself, not the one from the reported replica.
> When pipeline recovery is done for the last packet of a block, the incremental block
> reports with the new gen stamp may arrive before the client calls updatePipeline(). If this
> happens, these replicas are incorrectly recorded with the old gen stamp and are removed
> later. The result is a close or addAdditionalBlock failure.
> If the last block is completed but the penultimate block is not because of this issue,
> the file won't be closed. If this file is not cleared and the client goes away, the lease
> manager will try to recover the lease/block, at which point it will crash. I will file a separate
> jira for this shortly.
> The worst case is rejecting all good replicas and accepting a bad one. In this case, the
> block will get completed, but the data cannot be read until the next full block report containing
> one of the valid replicas is received.
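
The ordering problem described above can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions, not the HDFS code or the committed patch: the classes Block and UnderConstructionBlock, the methods addReplica, updateGenStampAndPrune and survivingReplicas, and the gen stamp values 1001 and 1002 are all hypothetical stand-ins. In particular, the model prunes mismatching replicas at the moment the gen stamp is updated, whereas the description only says they are removed "later".

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of the gen stamp mix-up; not the real HDFS classes. */
public class GenStampSketch {

    /** A block identity plus its generation stamp. */
    static final class Block {
        final long id;
        long genStamp;

        Block(long id, long genStamp) {
            this.id = id;
            this.genStamp = genStamp;
        }
    }

    /** Hypothetical stand-in for a block that is still under construction. */
    static final class UnderConstructionBlock {
        final Block stored;
        final List<Long> replicaGenStamps = new ArrayList<>();

        UnderConstructionBlock(Block stored) {
            this.stored = stored;
        }

        /** Records a replica with the gen stamp of whichever block the caller passes in. */
        void addReplica(Block b) {
            replicaGenStamps.add(b.genStamp);
        }

        /**
         * Assumed model of the later steps: the stored block takes the new gen stamp,
         * and replicas recorded with any other gen stamp are dropped as stale.
         */
        void updateGenStampAndPrune(long newGenStamp) {
            stored.genStamp = newGenStamp;
            replicaGenStamps.removeIf(gs -> gs != newGenStamp);
        }
    }

    /**
     * Replays the race: the block report carrying the new gen stamp arrives
     * before the stored block's gen stamp is updated.
     *
     * @param passReportedBlock true to record the reported block's gen stamp,
     *                          false to record the stored block's gen stamp
     * @return number of replicas still tracked after the update
     */
    public static int survivingReplicas(boolean passReportedBlock) {
        Block stored = new Block(1L, 1001L);    // old gen stamp, before recovery
        UnderConstructionBlock uc = new UnderConstructionBlock(stored);

        Block reported = new Block(1L, 1002L);  // replica already has the new gen stamp
        uc.addReplica(passReportedBlock ? reported : stored);

        uc.updateGenStampAndPrune(1002L);       // the update arrives after the report
        return uc.replicaGenStamps.size();
    }

    public static void main(String[] args) {
        System.out.println("stored block passed:   " + survivingReplicas(false));
        System.out.println("reported block passed: " + survivingReplicas(true));
    }
}
```

In this model, passing the stored block leaves zero tracked replicas, because the valid replica was recorded with the old gen stamp and then pruned. Passing the reported block keeps the one valid replica.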



--
This message was sent by Atlassian JIRA
(v6.1#6144)
