hadoop-hdfs-issues mailing list archives

From "Srikanth Upputuri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7082) When replication factor equals number of data nodes, corrupt replica will never get substituted with good replica
Date Thu, 18 Sep 2014 05:55:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138555#comment-14138555
] 

Srikanth Upputuri commented on HDFS-7082:
-----------------------------------------

Currently, if the condition below in BlockManager#markBlockAsCorrupt is true, we go ahead and
invalidate the corrupt replica; for the scenario in question, however, it evaluates to false.
{code}
    boolean hasMoreCorruptReplicas = minReplicationSatisfied &&
        (numberOfReplicas.liveReplicas() + numberOfReplicas.corruptReplicas()) >
        bc.getBlockReplication();
{code}
I propose changing this to:
{code}
    boolean hasMoreCorruptReplicas = minReplicationSatisfied &&
        (numberOfReplicas.liveReplicas() + numberOfReplicas.corruptReplicas()) >=
        bc.getBlockReplication();
{code}
This solves the current problem while retaining almost all of the existing behavior.
We now allow 'total replicas' to drop to 'replication factor - 1', but no further. This
effectively vacates a slot on exactly one datanode and lets re-replication proceed, thereby
resolving the reported problem.

Example scenarios: 

1. DN1, DN2, DN3, replication factor = 3, DN3 replica is corrupt. 
The corrupt replica is invalidated and deleted. A new live replica will then be written to DN3.

2. DN1, DN2, DN3, replication factor = 3, DN2 and DN3 replicas are corrupt. 
DN3 sends a block report. The corrupt replica on DN3 is invalidated and deleted. DN2 then sends
a block report. The corrupt replica on DN2 is not invalidated because the current 'total replicas'
< 'replication factor'. A new live replica will eventually be written to DN3; on a subsequent
block report from DN2, its corrupt replica will then be deleted.
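The scenarios above can be sketched as a standalone condition check. This is a minimal illustration, not the actual HDFS code: the helper method and its parameters are hypothetical stand-ins for the values BlockManager#markBlockAsCorrupt reads from NumberReplicas and the block collection.

```java
// Minimal sketch of the proposed '>=' condition; the method name and
// parameters are hypothetical, mirroring the snippet quoted above.
public class CorruptReplicaCheck {

    // True when it is safe to invalidate one corrupt replica, i.e. when
    // live + corrupt replicas meet or exceed the replication factor.
    static boolean hasMoreCorruptReplicas(int liveReplicas,
                                          int corruptReplicas,
                                          int replicationFactor,
                                          boolean minReplicationSatisfied) {
        return minReplicationSatisfied
            && (liveReplicas + corruptReplicas) >= replicationFactor;
    }

    public static void main(String[] args) {
        // Scenario 1: DN1, DN2 live, DN3 corrupt, replication factor 3.
        // Old '>' condition: 2 + 1 > 3 is false, so the corrupt replica
        // is kept forever; new '>=' condition: 2 + 1 >= 3 is true, so it
        // is invalidated and DN3's slot is freed for re-replication.
        System.out.println(hasMoreCorruptReplicas(2, 1, 3, true));  // true

        // Scenario 2: DN1 live, DN2 corrupt, after DN3's corrupt replica
        // was already invalidated. Total replicas (2) < replication
        // factor (3), so DN2's corrupt replica is retained until a new
        // live replica lands on DN3.
        System.out.println(hasMoreCorruptReplicas(1, 1, 3, true));  // false
    }
}
```

Note that the `>=` only ever frees one slot at a time: once the first corrupt replica is invalidated, the sum drops below the replication factor and further invalidations wait for re-replication, which preserves the existing safety behavior.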


> When replication factor equals number of data nodes, corrupt replica will never get substituted
with good replica
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7082
>                 URL: https://issues.apache.org/jira/browse/HDFS-7082
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Srikanth Upputuri
>            Assignee: Srikanth Upputuri
>            Priority: Minor
>
> BlockManager will not invalidate a corrupt replica if this brings down the total number
of replicas below replication factor (except if the corrupt replica has a wrong genstamp).
On clusters where the replication factor = total data nodes, a new replica can not be created
from a live replica as all the available datanodes already have a replica each. Because of
this, the corrupt replicas will never be substituted with good replicas, so will never get
deleted. Sooner or later all replicas may get corrupt and there will be no live replicas in
the cluster for this block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
