Date: Thu, 18 Dec 2014 23:45:14 +0000 (UTC)
From: "Arpit Agarwal (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-7443) Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are present in the same volume

    [ https://issues.apache.org/jira/browse/HDFS-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252565#comment-14252565 ]

Arpit Agarwal commented on HDFS-7443:
-------------------------------------

bq. because the code would get a lot more complex. Because we do the hardlinks in parallel, we would have to somehow accumulate the duplicates and deal with them at the end, once all worker threads had been joined.

We wouldn't need all that.
A length check on src and dst when we hit the exception should suffice, right? Depending on the result, we either discard src or overwrite dst. Anyway, I think your patch is fine to go as it is.

> Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are present in the same volume
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7443
>                 URL: https://issues.apache.org/jira/browse/HDFS-7443
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Kihwal Lee
>            Assignee: Colin Patrick McCabe
>            Priority: Blocker
>         Attachments: HDFS-7443.001.patch
>
>
> When we did an upgrade from 2.5 to 2.6 in a medium-size cluster, about 4% of datanodes did not come up. They tried the data file layout upgrade for BLOCKID_BASED_LAYOUT introduced in HDFS-6482, but failed.
>
> All failures were caused by {{NativeIO.link()}} throwing an IOException saying {{EEXIST}}. The datanodes didn't die right away, but the upgrade was soon retried when the block pool initialization was retried whenever {{BPServiceActor}} was registering with the namenode. After many retries, the datanodes terminated. This would leave {{previous.tmp}} and {{current}} with no {{VERSION}} file in the block pool slice storage directory.
>
> Although {{previous.tmp}} contained the old {{VERSION}} file, the content was in the new layout and the subdirs were all newly created ones. This shouldn't have happened, because the upgrade-recovery logic in {{Storage}} removes {{current}} and renames {{previous.tmp}} to {{current}} before retrying. All successfully upgraded volumes had the old state preserved in their {{previous}} directory.
>
> In summary, there were two observed issues:
> - Upgrade failure with {{link()}} failing with {{EEXIST}}
> - {{previous.tmp}} contained not the content of the original {{current}}, but a half-upgraded one.
>
> We did not see this in smaller-scale test clusters.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
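The length-check idea in the comment above can be sketched as follows. This is a minimal, hypothetical illustration, not code from the HDFS-7443 patch: the class and method names (`DuplicateBlockResolver`, `keepSource`) are invented for the example, and it assumes the policy of keeping the longer of two duplicate block files on the theory that the shorter one is a truncated or stale replica.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class DuplicateBlockResolver {

    /**
     * Decide which of two duplicate block files to keep when a hardlink
     * during layout upgrade fails with EEXIST. Returns true if src should
     * overwrite dst, false if src should be discarded.
     * (Hypothetical helper; the real patch may resolve duplicates differently.)
     */
    static boolean keepSource(File src, File dst) {
        // Prefer the larger file; on a tie, keep the existing destination.
        return src.length() > dst.length();
    }

    public static void main(String[] args) throws IOException {
        // Simulate two duplicate copies of the same block with different lengths.
        File src = File.createTempFile("blk_1234_src", ".data");
        File dst = File.createTempFile("blk_1234_dst", ".data");
        Files.write(src.toPath(), new byte[] {1, 2, 3, 4});
        Files.write(dst.toPath(), new byte[] {1, 2});

        // src is longer, so the existing dst would be overwritten.
        System.out.println(keepSource(src, dst));

        src.delete();
        dst.delete();
    }
}
```

The point of the sketch is that resolution can happen inline in the worker thread at the moment the exception is hit, which is why the comment argues no accumulate-and-join pass over the parallel hardlink workers would be needed.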