X-Original-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org
Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 799EAD9CE;
    Sat, 1 Dec 2012 17:22:00 +0000 (UTC)
Received: (qmail 13570 invoked by uid 500); 1 Dec 2012 17:22:00 -0000
Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org
Received: (qmail 13390 invoked by uid 500); 1 Dec 2012 17:22:00 -0000
Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Delivered-To: mailing list hdfs-issues@hadoop.apache.org
Received: (qmail 13336 invoked by uid 99); 1 Dec 2012 17:21:59 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 01 Dec 2012 17:21:59 +0000
Date: Sat, 1 Dec 2012 17:21:58 +0000 (UTC)
From: "Hadoop QA (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <1784801744.49271.1354382518977.JavaMail.jiratomcat@arcas>
In-Reply-To: <64203695.18862.1346367007675.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HDFS-3875) Issue handling checksum errors in write pipeline
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HDFS-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508006#comment-13508006 ]

Hadoop QA commented on HDFS-3875:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12555626/hdfs-3875.trunk.with.test.patch.txt
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3585//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/3585//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3585//console

This message is automatically generated.

> Issue handling checksum errors in write pipeline
> ------------------------------------------------
>
>                 Key: HDFS-3875
>                 URL: https://issues.apache.org/jira/browse/HDFS-3875
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs-client
>    Affects Versions: 2.0.2-alpha
>            Reporter: Todd Lipcon
>            Assignee: Kihwal Lee
>            Priority: Blocker
>         Attachments: hdfs-3875.branch-0.23.no.test.patch.txt, hdfs-3875.branch-0.23.with.test.patch.txt, hdfs-3875.trunk.no.test.patch.txt, hdfs-3875.trunk.with.test.patch.txt, hdfs-3875-wip.patch
>
>
> We saw this issue with one block in a large test cluster. The client is storing the data with replication level 2, and we saw the following:
> - the second node in the pipeline detects a checksum error on the data it received from the first node. We don't know if the client sent a bad checksum, or if it got corrupted between node 1 and node 2 in the pipeline.
> - this caused the second node to get kicked out of the pipeline, since it threw an exception. The pipeline started up again with only one replica (the first node in the pipeline)
> - this replica was later determined to be corrupt by the block scanner, and unrecoverable since it is the only replica

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira