hadoop-hdfs-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data
Date Thu, 25 May 2017 23:04:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025525#comment-16025525 ]

Hudson commented on HDFS-11817:

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11785 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11785/])
HDFS-11817. A faulty node can cause a lease leak and NPE on accessing (kihwal: rev 2b5ad48762587abbcd8bdb50d0ae98f8080d926c)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirTruncateOp.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockUnderConstructionFeature.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestBlockUnderConstruction.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java

> A faulty node can cause a lease leak and NPE on accessing data
> --------------------------------------------------------------
>                 Key: HDFS-11817
>                 URL: https://issues.apache.org/jira/browse/HDFS-11817
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>             Fix For: 3.0.0-alpha3, 2.8.2
>         Attachments: HDFS-11817.branch-2.patch, hdfs-11817_supplement.txt, HDFS-11817.v2.branch-2.8.patch, HDFS-11817.v2.branch-2.patch, HDFS-11817.v2.trunk.patch
>
> When the namenode performs a lease recovery for a failed write, {{commitBlockSynchronization()}} fails if none of the new targets has sent a received-IBR. At this point the data is inaccessible, because the namenode throws a {{NullPointerException}} on {{getBlockLocations()}}.
> The namenode retries the lease recovery in about an hour. If the nodes are faulty (usually when there is only one new target), they may not have sent a block report by then. In that case, lease recovery throws an {{AlreadyBeingCreatedException}}, which causes {{LeaseManager}} to simply remove the lease without finalizing the inode.
> This leaves the lease state inconsistent: the inode stays under construction, but no further lease recovery is attempted, and a manual lease recovery is not allowed either.
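
The sequence described above can be sketched as a toy model. This is only an illustration of the reported failure mode, not the HDFS code or the committed fix: {{ToyNamenode}}, {{ToyInode}}, {{recoverLease}} and {{isLeaked}} are hypothetical names, and {{IllegalStateException}} stands in for {{AlreadyBeingCreatedException}}.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the reported lease leak. All names here are hypothetical;
// nothing in this sketch is taken from the real HDFS classes.
class LeaseLeakSketch {

    /** Stand-in for a file inode that starts out under construction. */
    static final class ToyInode {
        boolean underConstruction = true;
        boolean recoveryStarted = false;
    }

    /** Stand-in for the namenode's file table and lease table. */
    static final class ToyNamenode {
        final Map<String, ToyInode> files = new HashMap<>();
        final Map<String, String> leases = new HashMap<>(); // path -> lease holder

        void create(String path, String holder) {
            files.put(path, new ToyInode());
            leases.put(path, holder);
        }

        /**
         * One recovery pass. targetReported models whether any new target
         * has reported the recovered block to the namenode.
         */
        void recoverLease(String path, boolean targetReported) {
            if (!leases.containsKey(path)) {
                return; // no lease left, so no recovery is ever attempted again
            }
            ToyInode inode = files.get(path);
            try {
                if (inode.recoveryStarted && !targetReported) {
                    // Stand-in for AlreadyBeingCreatedException on the retry.
                    throw new IllegalStateException("recovery already in progress for " + path);
                }
                inode.recoveryStarted = true;
                if (targetReported) {
                    inode.underConstruction = false; // finalize the inode
                    leases.remove(path);             // and release the lease
                }
            } catch (IllegalStateException e) {
                // The problematic step: the lease is dropped, the inode is not finalized.
                leases.remove(path);
            }
        }

        /** Inconsistent state: still under construction, but no lease tracks it. */
        boolean isLeaked(String path) {
            ToyInode inode = files.get(path);
            return inode != null && inode.underConstruction && !leases.containsKey(path);
        }
    }

    public static void main(String[] args) {
        ToyNamenode nn = new ToyNamenode();
        nn.create("/f", "client-1");
        nn.recoverLease("/f", false); // first attempt: no target has reported
        nn.recoverLease("/f", false); // retry: still no report, lease is dropped
        nn.recoverLease("/f", true);  // a late report no longer helps
        System.out.println("leaked=" + nn.isLeaked("/f")); // prints "leaked=true"
    }
}
```

In this model, once the retry drops the lease, a later report from the target cannot finalize the file, because nothing tracks it any more. That matches the inconsistent state described in the issue.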

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org