Date: Tue, 7 Oct 2014 22:06:34 +0000 (UTC)
From: "Hadoop QA (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-7203) Concurrent appending to the same file can cause data corruption

    [ https://issues.apache.org/jira/browse/HDFS-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162651#comment-14162651 ]

Hadoop QA commented on HDFS-7203:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12673416/HDFS-7203.patch
  against trunk revision 9196db9.

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test file.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The test build failed in hadoop-hdfs-project/hadoop-hdfs.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8344//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8344//console

This message is automatically generated.

> Concurrent appending to the same file can cause data corruption
> ---------------------------------------------------------------
>
>          Key: HDFS-7203
>          URL: https://issues.apache.org/jira/browse/HDFS-7203
>      Project: Hadoop HDFS
>   Issue Type: Bug
>     Reporter: Kihwal Lee
>     Assignee: Kihwal Lee
>     Priority: Blocker
>  Attachments: HDFS-7203.patch
>
>
> When multiple threads call append against the same file, the file can become corrupted. The root of the problem is that a stale file stat may be used for append in {{DFSClient}}. If the file size changes between {{getFileStatus()}} and {{namenode.append()}}, {{DataStreamer}} will miscalculate how to align data to the checksum boundary and break the assumption made by the data nodes.
> When this happens, the datanode may not write the last checksum. On the next append attempt, the datanode won't be able to reposition for the partial chunk, since the last checksum is missing. The append will fail after running out of data nodes to copy the partial block to.
> However, if more threads try to append, this leads to a more serious situation. Within a few minutes, a lease recovery and block recovery will happen. The block recovery truncates the block to the ack'ed size in order to keep only the portion of data that is checksum-verified. The problem is that, during the last successful append, the last data node verified the checksum and ack'ed before writing the data and the wrong metadata to disk, and all data nodes in the pipeline wrote the same wrong metadata. So the ack'ed size includes the corrupt portion of the data.
> Since block recovery does not perform any checksum verification, the file sizes are adjusted, and after {{commitBlockSynchronization()}} another thread will be allowed to append to the corrupt file. This latent corruption may not be detected for a very long time.
> The first failing {{append()}} would have created a partial copy of the block in the temporary directory of every data node in the cluster. After this failure, the block is likely under-replicated, so the file will be scheduled for replication after being closed. Before HDFS-6948, replication did not work until a node was added or restarted, because the temporary file was present on all data nodes; as a result, the corruption could not be detected by replication. After HDFS-6948, the corruption will be detected after the file is closed by lease recovery or a subsequent append-close.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
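
To make the quoted description concrete: the trigger is simply several threads of one client appending to the same file in a loop. Below is a minimal illustrative sketch, not the attached HDFS-7203.patch or its test. The class name, the path, the thread and iteration counts, and the {{partialChunkBytes()}} helper are hypothetical; only the public {{FileSystem}}/{{Path}} API calls are real Hadoop APIs, and a reachable HDFS with append enabled is assumed. The 17-byte writes keep the file length off the default 512-byte checksum-chunk boundary, which is exactly where a stale {{getFileStatus()}} length can mislead {{DataStreamer}}.

{code:java}
// Illustrative sketch only -- not the attached patch or its test.
// Assumes fs.defaultFS in the Configuration points at a reachable HDFS.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ConcurrentAppendSketch {

  // Hypothetical helper, not an HDFS API: with the default bytesPerChecksum of
  // 512, the last partial chunk of a file holds (length % 512) bytes, so a
  // stale length yields the wrong partial-chunk offset.
  static long partialChunkBytes(long fileLength, int bytesPerChecksum) {
    return fileLength % bytesPerChecksum;  // e.g. 1041 % 512 = 17, 1024 % 512 = 0
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    final FileSystem fs = FileSystem.get(conf);
    final Path file = new Path("/tmp/hdfs7203-append-race");  // hypothetical path
    fs.create(file).close();                                  // start with an empty file

    final byte[] data = new byte[17];  // deliberately not a multiple of 512
    ExecutorService pool = Executors.newFixedThreadPool(4);
    for (int i = 0; i < 4; i++) {
      pool.submit(() -> {
        for (int j = 0; j < 200; j++) {
          // The race window: between this thread's getFileStatus() (inside
          // append()) and its namenode.append() call, another thread may have
          // appended and closed, so the length used for checksum alignment
          // can be stale.
          try (FSDataOutputStream out = fs.append(file)) {
            out.write(data);
          } catch (Exception e) {
            // Lease contention between the threads is expected; corruption,
            // not this exception, is the symptom being described.
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.MINUTES);
    fs.close();
  }
}
{code}

Whether corruption actually manifests depends on timing; the sketch only shows the usage pattern that repeatedly leaves a partial checksum chunk behind between appends, which is the precondition for the misalignment described in the report.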