Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Wed, 28 Mar 2012 18:33:29 +0000 (UTC)
From: "Todd Lipcon (Commented) (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: 
 <1431575019.29467.1332959609465.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-1218) 20 append: Blocks recovered on
 startup should be treated with lower priority during block synchronization
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240611#comment-13240611 ] 

Todd Lipcon commented on HDFS-1218:
-----------------------------------

Hi Uma. The idea was to exclude the restarted node from the length calculation. It looks like you're right that we aren't putting restarted nodes in syncList at all, whereas we could include them in syncList when their length is >= the calculated minlength.

However, DN3 might still be shorter than the good replicas, and so not be included in recovery. In that case, it should be deleted later, when it reports the block with the too-low GS. I guess the real issue is that the 1.0 implementation doesn't include all RBW blocks in block reports, so the stale replica sticks around forever?
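
To make the syncList rule discussed above concrete, here is a minimal sketch. It is not the code from the attached patch: the class SyncListSketch, the Replica type, and the recoveredOnStartup flag are hypothetical names chosen for illustration. It assumes the minimum length is taken only over replicas from DNs that did not restart, and that it falls back to all replicas when every DN restarted.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; all names here are hypothetical, not HDFS APIs. */
final class SyncListSketch {

    /** Minimal stand-in for one replica considered during block synchronization. */
    static final class Replica {
        public final String datanode;
        public final long length;
        /** True if the DN restarted and recovered this replica from disk. */
        public final boolean recoveredOnStartup;

        Replica(String datanode, long length, boolean recoveredOnStartup) {
            this.datanode = datanode;
            this.length = length;
            this.recoveredOnStartup = recoveredOnStartup;
        }
    }

    /**
     * Minimum length over replicas from DNs that did not restart.
     * Assumption: if every DN restarted, fall back to the minimum over all replicas.
     * Returns Long.MAX_VALUE for an empty input.
     */
    static long minLength(List<Replica> replicas) {
        long min = Long.MAX_VALUE;
        boolean sawNonRestarted = false;
        for (Replica r : replicas) {
            if (!r.recoveredOnStartup) {
                sawNonRestarted = true;
                min = Math.min(min, r.length);
            }
        }
        if (!sawNonRestarted) {
            for (Replica r : replicas) {
                min = Math.min(min, r.length);
            }
        }
        return min;
    }

    /**
     * Replicas that take part in synchronization: every replica at least as
     * long as minLength, which admits a restarted node only when it is not
     * shorter than the non-restarted replicas.
     */
    static List<Replica> buildSyncList(List<Replica> replicas) {
        long min = minLength(replicas);
        List<Replica> syncList = new ArrayList<>();
        for (Replica r : replicas) {
            if (r.length >= min) {
                syncList.add(r);
            }
        }
        return syncList;
    }
}
```

With two non-restarted replicas of length 100 and a restarted DN3 of length 60, this sketch computes a minimum length of 100 and leaves DN3 out of syncList. If DN3 instead had length 120, it would be included without affecting the minimum.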
                
> 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1218
>                 URL: https://issues.apache.org/jira/browse/HDFS-1218
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20-append
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.20.205.0
>
>         Attachments: HDFS-1218.20s.2.patch, hdfs-1281.txt
>
>
> When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira