Date: Wed, 28 Mar 2012 18:33:29 +0000 (UTC)
From: "Todd Lipcon (Commented) (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <1431575019.29467.1332959609465.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-1218) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization

[
https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240611#comment-13240611 ]

Todd Lipcon commented on HDFS-1218:
-----------------------------------

Hi Uma. The idea was to exclude the restarted node for length calculation. It looks like you're right that we aren't putting them in syncList at all, whereas we could put them in syncList in the case that they have length >= the calculated minlength.

However, it's still the case that DN3 might be shorter than the good replicas, and not included in recovery. In that case, it should be deleted when it reports the block with the too-low GS later. I guess the real issue is that we don't include all RBW blocks in block reports in the 1.0 implementation, so it sticks around forever?

> 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
> ---------------------------------------------------------------------------------------------------------
>
> Key: HDFS-1218
> URL: https://issues.apache.org/jira/browse/HDFS-1218
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.20-append
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Critical
> Fix For: 0.20.205.0
>
> Attachments: HDFS-1218.20s.2.patch, hdfs-1281.txt
>
>
> When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted.
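
For readers following the thread, the selection rule discussed in the comment can be sketched roughly as below. This is only an illustration and not the HDFS code or the attached patch: the `Replica` type, its fields, and `choose_sync_list` are hypothetical names, and falling back to all replicas when every datanode has restarted is an assumption drawn from the issue description rather than something the comment states.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Replica:
    """Hypothetical stand-in for one datanode's copy of the block."""
    datanode: str                # illustrative label such as "DN1"
    length: int                  # bytes this replica holds
    recovered_on_startup: bool   # True if the datanode restarted and recovered it


def choose_sync_list(replicas: List[Replica]) -> Tuple[int, List[Replica]]:
    """Return (min_length, sync_list) for a block synchronization round."""
    if not replicas:
        return 0, []

    # Replicas recovered on startup do not take part in the length calculation
    # while at least one replica comes from a datanode that did not restart.
    not_restarted = [r for r in replicas if not r.recovered_on_startup]

    # Assumption: if every datanode restarted, all replicas take part.
    voters = not_restarted if not_restarted else replicas
    min_length = min(r.length for r in voters)

    # A restarted replica may still join the sync list when it is at least as
    # long as the calculated minimum; a shorter one is left out of recovery.
    sync_list = [r for r in replicas if r.length >= min_length]
    return min_length, sync_list


if __name__ == "__main__":
    replicas = [
        Replica("DN1", 1024, False),
        Replica("DN2", 1024, False),
        Replica("DN3", 512, True),   # truncated after a power loss
    ]
    min_length, sync_list = choose_sync_list(replicas)
    print(min_length, [r.datanode for r in sync_list])
```

In this sketch a truncated DN3 cannot shorten the block and is simply left out of the sync list. Removing its stale replica afterwards is the separate block-report question raised in the comment, which the sketch does not model.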