hadoop-hdfs-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8676) Delayed rolling upgrade finalization can cause heartbeat expiration and write failures
Date Tue, 13 Oct 2015 16:51:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955259#comment-14955259 ]

Hudson commented on HDFS-8676:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #531 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/531/])
HDFS-8676. Delayed rolling upgrade finalization can cause heartbeat (kihwal: rev 5b43db47a313decccdcca8f45c5708aab46396df)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java


> Delayed rolling upgrade finalization can cause heartbeat expiration and write failures
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-8676
>                 URL: https://issues.apache.org/jira/browse/HDFS-8676
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Walter Su
>            Priority: Critical
>         Attachments: HDFS-8676.01.patch, HDFS-8676.02.patch
>
>
> In large, busy clusters where the deletion rate is also high, many blocks can pile up
> in the datanode trash directories until an upgrade is finalized. When the upgrade is
> finally finalized, the trash is deleted synchronously in the service actor thread's
> context. This blocks the heartbeat and can cause heartbeat expiration.
> We have seen a namenode lose hundreds of nodes after a delayed upgrade finalization.
> The deletion of trash directories should be made asynchronous.
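
The idea in the description, moving trash deletion off the thread that sends heartbeats, can be illustrated with a minimal sketch. This is not the HDFS-8676 patch and it uses no Hadoop classes: `AsyncTrashCleaner`, `clearTrashAsync`, and `deleteRecursively` are hypothetical names chosen for illustration, and the sketch assumes the trash is a plain directory tree on the local file system. The caller hands the directories to a single background thread and returns immediately, so a heartbeat loop running on the caller's thread would not wait for the deletion.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the DataNode implementation.
public class AsyncTrashCleaner {

    // One daemon worker thread, so deletions are serialized and never
    // keep the JVM alive on their own.
    private final ExecutorService deleter =
        Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "trash-cleaner");
            t.setDaemon(true);
            return t;
        });

    // Queues the deletion and returns at once. The caller (for example a
    // heartbeat loop) is not blocked while the directories are removed.
    public Future<?> clearTrashAsync(List<Path> trashDirs) {
        return deleter.submit(() -> {
            for (Path dir : trashDirs) {
                try {
                    deleteRecursively(dir);
                } catch (IOException e) {
                    // Log and continue so one bad directory does not
                    // stop the remaining ones from being cleaned.
                    System.err.println("Failed to delete " + dir + ": " + e);
                }
            }
        });
    }

    // Deletes files first, then each directory once it is empty.
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    // Stops accepting new work; already queued deletions still run.
    public void shutdown() {
        deleter.shutdown();
    }
}
```

A caller that needs to know when the cleanup has finished can call `get()` on the returned `Future`; a caller on a latency-sensitive thread would simply ignore it. A single worker thread is used here to bound the extra disk load, which is a design choice of this sketch rather than something stated in the issue.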



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
