hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3368) Missing blocks due to bad DataNodes coming up and down.
Date Fri, 04 May 2012 08:06:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268200#comment-13268200 ]

Konstantin Shvachko commented on HDFS-3368:

I propose adjusting {{BlockPlacementPolicyDefault.chooseReplicaToDelete()}} to look first
at the oldest heartbeat time, and to fall back to free space only when all heartbeats are
within the heartbeat interval.
With such a policy, in the scenario above, the replicas chosen for deletion are most likely
to be assigned to do1, do2, do3, but will never actually be deleted, because those old nodes
have already died. The NN will automatically remove replicas from the live ones 10 minutes
later or so.
Also, when only one or two DNs malfunction in a similar scenario, this will reduce unnecessary
deletions and replications.
No change in policy will be seen in the regular case, when all nodes function properly.
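
A minimal sketch of the proposed ordering, to make the two-step rule concrete. This is not the actual {{BlockPlacementPolicyDefault}} code: {{ReplicaNode}}, {{ReplicaDeletionChooser}} and their fields are hypothetical stand-ins, and "least free space" is assumed as the free-space criterion.

```java
import java.util.List;

/** Hypothetical, simplified view of a DataNode holding a replica (not an HDFS class). */
final class ReplicaNode {
    final String name;
    final long lastHeartbeatMs; // time the NN last heard from this node
    final long remainingBytes;  // free space last reported by this node

    ReplicaNode(String name, long lastHeartbeatMs, long remainingBytes) {
        this.name = name;
        this.lastHeartbeatMs = lastHeartbeatMs;
        this.remainingBytes = remainingBytes;
    }
}

/** Hypothetical helper illustrating the proposed selection order. */
final class ReplicaDeletionChooser {
    /**
     * Picks the node whose replica should be deleted.
     * Step 1: if the oldest heartbeat is older than the heartbeat interval,
     *         pick that node.
     * Step 2: otherwise all heartbeats are fresh, so pick the node with the
     *         least free space (assumed criterion).
     * Returns null for an empty candidate list.
     */
    static ReplicaNode choose(List<ReplicaNode> candidates,
                              long nowMs, long heartbeatIntervalMs) {
        ReplicaNode oldest = null;
        ReplicaNode leastFree = null;
        for (ReplicaNode n : candidates) {
            if (oldest == null || n.lastHeartbeatMs < oldest.lastHeartbeatMs) {
                oldest = n;
            }
            if (leastFree == null || n.remainingBytes < leastFree.remainingBytes) {
                leastFree = n;
            }
        }
        if (oldest == null) {
            return null;
        }
        boolean oldestIsStale = nowMs - oldest.lastHeartbeatMs > heartbeatIntervalMs;
        return oldestIsStale ? oldest : leastFree;
    }
}
```

In this sketch a node that has missed a heartbeat is always preferred for deletion over a live node, regardless of free space; when every node is heartbeating on time, the choice depends on free space alone.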
> Missing blocks due to bad DataNodes coming up and down.
> -------------------------------------------------------
>                 Key: HDFS-3368
>                 URL: https://issues.apache.org/jira/browse/HDFS-3368
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
> All replicas of a block can be removed if bad DataNodes come up and down during cluster
> restart resulting in data loss.


