Date: Mon, 28 Oct 2013 21:26:36 +0000 (UTC)
From: "Kihwal Lee (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-5438) Flaws in block report processing can cause data loss

     [ https://issues.apache.org/jira/browse/HDFS-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-5438:
-----------------------------

    Description:
The incremental block reports from data nodes and block commits are asynchronous. This becomes troublesome when the gen stamp for a block is changed during a write pipeline recovery.
* If an incremental block report from a node is delayed but the NN already had enough replicas, a report with the old gen stamp may be received after block completion. This replica will be correctly marked corrupt. But if the node had participated in the pipeline recovery, a new (delayed) report with the correct gen stamp will come soon. However, this report won't have any effect on the corrupt state of the replica.
* If block reports are received while the block is still under construction (i.e. the client's call to commit the block has not yet reached the NN), they are blindly accepted regardless of the gen stamp. If a failed node reports in with the old gen stamp while pipeline recovery is ongoing, it will be accepted and counted as valid during commit of the block.

Due to the above two problems, correct replicas can be marked corrupt and corrupt replicas can be accepted during commit. So far we have observed two cases in production:
* The client hangs forever trying to close a file. All replicas are marked corrupt.
* After the successful close of a file, reads fail. Corrupt replicas are accepted during commit and valid replicas are marked corrupt afterward.
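To make the two failure paths above concrete, here is a minimal, hypothetical Java sketch of the behaviour being described. The names (BlockReportSketch, Replica, expectedGenStamp, and so on) are illustrative only and do not correspond to the actual NameNode/BlockManager code; the sketch only shows where a gen-stamp check is missing and where a corrupt marking is never cleared.

{code:java}
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical, simplified model of the two report-handling paths described
 * in this issue. Names are illustrative and do not match the real
 * org.apache.hadoop.hdfs.server.blockmanagement classes.
 */
public class BlockReportSketch {

  static class Replica {
    final String datanode;
    final long genStamp;
    boolean corrupt;

    Replica(String datanode, long genStamp) {
      this.datanode = datanode;
      this.genStamp = genStamp;
    }
  }

  // Expected gen stamp for the block; bumped by pipeline recovery.
  long expectedGenStamp;
  final Map<String, Replica> replicas = new HashMap<>();

  /**
   * Flaw 1: after the block is complete, a delayed report with the old gen
   * stamp marks the replica corrupt, but a later report from the same node
   * carrying the recovered gen stamp never clears that state.
   */
  void reportAfterCompletion(String dn, long reportedGenStamp) {
    Replica r = replicas.computeIfAbsent(dn, d -> new Replica(d, reportedGenStamp));
    if (reportedGenStamp != expectedGenStamp) {
      r.corrupt = true;   // right for a truly stale replica...
    }
    // ...but nothing here resets r.corrupt when the node re-reports with
    // the correct gen stamp, so a good replica stays marked corrupt.
  }

  /**
   * Flaw 2: while the block is still under construction, reports are
   * accepted without any gen-stamp comparison, so a stale replica left
   * behind by a failed pipeline node is counted as valid at commit time.
   */
  void reportWhileUnderConstruction(String dn, long reportedGenStamp) {
    replicas.put(dn, new Replica(dn, reportedGenStamp));   // no gen-stamp check
  }

  long validReplicasAtCommit() {
    return replicas.values().stream().filter(r -> !r.corrupt).count();
  }
}
{code}

Per the description, any real fix has to do two things this sketch omits: compare reported gen stamps even while the block is still under construction, and clear the corrupt marking once a recovered node re-reports with the expected gen stamp.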
  was:
The incremental block reports from data nodes and block commits are asynchronous. This becomes troublesome when the gen stamp for a block is changed during a write pipeline recovery.
* If an incremental block report is delayed from a node but NN had enough replicas already, a report with the old gen stamp may be received after block completion. This replica will be correctly marked corrupt. But if the node had participated in the pipeline recovery, a new (delayed) report with the correct gen stamp will come soon. However, this report won't have any effect on the corrupt state of the replica.
* If block reports are received while the block is still under construction (i.e. client's call to make block committed has not been received by NN), they are blindly accepted regardless of the gen stamp. If a failed node reports in with the old gen stamp while pipeline recovery is on-going, it will be accepted and counted as valid during commit of the block.

Due to the above two problems, correct replicas can be marked corrupt and corrupt replicas can be accepted during commit. So far we have observed two cases in production.
* The client hangs forever to close a file. All replicas are marked corrupt.
* After the successful close of a file, read fails. Corrupt replicas are accepted them during commit and valid replicas are marked corrupt afterward.


> Flaws in block report processing can cause data loss
> ----------------------------------------------------
>
>                 Key: HDFS-5438
>                 URL: https://issues.apache.org/jira/browse/HDFS-5438
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 0.23.9, 2.2.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: HDFS-5438.trunk.patch
>
>
> The incremental block reports from data nodes and block commits are asynchronous. This becomes troublesome when the gen stamp for a block is changed during a write pipeline recovery.
> * If an incremental block report from a node is delayed but the NN already had enough replicas, a report with the old gen stamp may be received after block completion. This replica will be correctly marked corrupt. But if the node had participated in the pipeline recovery, a new (delayed) report with the correct gen stamp will come soon. However, this report won't have any effect on the corrupt state of the replica.
> * If block reports are received while the block is still under construction (i.e. the client's call to commit the block has not yet reached the NN), they are blindly accepted regardless of the gen stamp. If a failed node reports in with the old gen stamp while pipeline recovery is ongoing, it will be accepted and counted as valid during commit of the block.
> Due to the above two problems, correct replicas can be marked corrupt and corrupt replicas can be accepted during commit. So far we have observed two cases in production:
> * The client hangs forever trying to close a file. All replicas are marked corrupt.
> * After the successful close of a file, reads fail. Corrupt replicas are accepted during commit and valid replicas are marked corrupt afterward.

--
This message was sent by Atlassian JIRA
(v6.1#6144)