Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Mon, 12 Sep 2016 17:25:21 +0000 (UTC)
From: "Kihwal Lee (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.13004364.1473701057000.549776.1473701121459@Atlassian.JIRA>
In-Reply-To: <JIRA.13004364.1473701057000@Atlassian.JIRA>
References: <JIRA.13004364.1473701057000@Atlassian.JIRA> <JIRA.13004364.1473701057026@arcas>
Subject: [jira] [Assigned] (HDFS-10857) Rolling upgrade can make data
 unavailable when the cluster has many failed volumes
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Mon, 12 Sep 2016 17:25:24 -0000


     [ https://issues.apache.org/jira/browse/HDFS-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee reassigned HDFS-10857:
---------------------------------

    Assignee: Kihwal Lee

> Rolling upgrade can make data unavailable when the cluster has many failed volumes
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-10857
>                 URL: https://issues.apache.org/jira/browse/HDFS-10857
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>
> When the marker file or trash directory is created or removed during heartbeat response processing, an {{IOException}} is thrown if the operation is attempted on a failed volume. This stops processing of the remaining storage directories and of any DNA commands that were part of the heartbeat response.
> While this is happening, the block token key update does not take place, so all read and write requests start to fail until the upgrade is finalized and the DN receives a new key. A single failed volume is enough to trigger this. If there are three such nodes in the cluster, it is very likely that some blocks cannot be read. Unlike the common missing-block scenarios, the NN is unaware of the problem, although the effect is the same.
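
The failure mode described above can be illustrated with a small, self-contained Java sketch. This is not the HDFS DataNode code and not the fix for this issue: the names HeartbeatSketch, Volume, Result, processFragile and processTolerant are hypothetical, and the "key update" is reduced to a boolean flag. It only shows how one exception from a failed volume can abort every later step when the whole response is handled as a single unit, and how catching the error per volume would let the remaining steps run.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of heartbeat response handling; not actual Hadoop code.
class HeartbeatSketch {

    /** Stand-in for one storage directory; a failed volume rejects file operations. */
    static final class Volume {
        final String name;
        final boolean failed;

        Volume(String name, boolean failed) {
            this.name = name;
            this.failed = failed;
        }

        /** Models creating or removing the marker file or trash directory. */
        void updateMarker() throws IOException {
            if (failed) {
                throw new IOException("cannot update marker on failed volume " + name);
            }
        }
    }

    /** What happened while handling one simulated heartbeat response. */
    static final class Result {
        int volumesProcessed = 0;
        boolean keyUpdated = false; // stands in for the block token key update
        final List<String> failures = new ArrayList<>();
    }

    /** One try block around everything: the first IOException skips all later work. */
    static Result processFragile(List<Volume> volumes) {
        Result r = new Result();
        try {
            for (Volume v : volumes) {
                v.updateMarker();
                r.volumesProcessed++;
            }
            r.keyUpdated = true; // never reached if any volume throws
        } catch (IOException e) {
            r.failures.add(e.getMessage());
        }
        return r;
    }

    /** Assumed alternative: isolate each volume so later steps still run. */
    static Result processTolerant(List<Volume> volumes) {
        Result r = new Result();
        for (Volume v : volumes) {
            try {
                v.updateMarker();
                r.volumesProcessed++;
            } catch (IOException e) {
                r.failures.add(e.getMessage()); // record the error and continue
            }
        }
        r.keyUpdated = true;
        return r;
    }

    public static void main(String[] args) {
        List<Volume> volumes = Arrays.asList(
                new Volume("d1", false), new Volume("d2", true), new Volume("d3", false));
        Result fragile = processFragile(volumes);
        Result tolerant = processTolerant(volumes);
        System.out.println("fragile:  processed=" + fragile.volumesProcessed
                + " keyUpdated=" + fragile.keyUpdated);
        System.out.println("tolerant: processed=" + tolerant.volumesProcessed
                + " keyUpdated=" + tolerant.keyUpdated);
    }
}
```

In the fragile variant, the volume after the failed one is never visited and the key-update flag stays false, which mirrors the symptom in the description. Whether the real fix isolates errors per volume in this way is not stated in this message.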

