hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6681) TestRBWBlockInvalidation#testBlockInvalidationWhenRBWReplicaMissedInDN is flaky and sometimes gets stuck in infinite loops
Date Mon, 12 Jan 2015 22:51:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274330#comment-14274330
] 

Ming Ma commented on HDFS-6681:
-------------------------------

Thanks, Ratandeep! I agree with your detailed analysis. Regarding your description "One scenario in which this loop will never break is when the Namenode tries to schedule a new replica on the same node on which we actually corrupted the block.": {{BlockManager}}'s {{markBlockAsCorrupt}} function asks the DN to invalidate the corrupt block after the NN receives the block report from that DN. So if the replication is scheduled after that point, the loop should still break. We could fix the test code to enforce the correct order of events, but that is more complicated.
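Independent of the event ordering, each of these loops could be made safe by bounding the wait, so a stuck loop fails with a timeout instead of hanging the build. A minimal, self-contained sketch of that pattern ({{BoundedWait}} and its signature are hypothetical names for illustration; Hadoop's own {{GenericTestUtils.waitFor}} offers the same shape):

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: poll a condition until it holds or a deadline
// passes, instead of spinning forever in a while-loop.
public class BoundedWait {
    public static void waitFor(BooleanSupplier condition,
                               long checkEveryMillis,
                               long timeoutMillis)
            throws InterruptedException, TimeoutException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new TimeoutException(
                        "condition not met within " + timeoutMillis + " ms");
            }
            Thread.sleep(checkEveryMillis);
        }
    }
}
```

A test loop rewritten this way surfaces the race as a clear timeout failure rather than an infinite loop.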

For the first loop, does the change just reduce the chance of the race? If the test runs on a slow machine, maybe 3 is still not enough. We could delay the start of the 3rd DN by moving {{cluster.startDataNodes(conf, 1, true, null, null, null);}} after the loop check; but that doesn't prevent the replication from finishing quickly on the same DN that holds the old corrupted block.
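Another option is to widen the timing windows through configuration instead of reordering events. A hedged sketch, assuming the {{DFSConfigKeys}} constants below are set before the mini cluster starts (the values are illustrative, not a verified fix):

```java
// Illustrative test configuration; values are examples only.
Configuration conf = new HdfsConfiguration();
// Slow down DN heartbeats so loop 1 can observe the single-replica state.
conf.setLong(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 300);
// Shorten the 5-minute pending-replication timeout so loop 2 can recover
// when the replication target rejects the copy.
conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_PENDING_TIMEOUT_SEC_KEY, 5);
```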

> TestRBWBlockInvalidation#testBlockInvalidationWhenRBWReplicaMissedInDN is flaky and sometimes
gets stuck in infinite loops
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6681
>                 URL: https://issues.apache.org/jira/browse/HDFS-6681
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.1
>         Environment: Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
> Linux [hostname] 2.6.32-279.14.1.el6.x86_64 #1 SMP Mon Oct 15 13:44:51 EDT 2012 x86_64
x86_64 x86_64 GNU/Linux
>            Reporter: Ratandeep Ratti
>            Assignee: Ratandeep Ratti
>         Attachments: HDFS-6681.patch
>
>
> This testcase has 3 infinite loops which break only when certain conditions are satisfied.
> The 1st loop checks that there is a single live replica. It assumes this will become true, since it has just corrupted a block on one of the datanodes (the testcase uses a replication factor of 2). One scenario in which this loop will never break is if the Namenode invalidates the corrupt replica, schedules a replication command, and the new copied replica is added, all before this testcase has a chance to check the live-replica count.
> The 2nd loop checks that there are 2 live replicas. It assumes this will become true (in some time), since the first loop has broken, implying there is a single replica, and it is now only a matter of time before the Namenode schedules a replication command to copy a replica to another datanode. One scenario in which this loop will never break is when the Namenode tries to schedule a new replica on the same node on which we actually corrupted the block. That destination datanode will not copy the block, complaining that it already has the (corrupted) replica in the create state. The result is that the Namenode has scheduled a copy to a datanode and the block sits in the Namenode's pending replication queue, but it will never be removed from that queue because the Namenode will never receive a report from the datanodes that the block was added.
> Note: The block can be transferred from the 'pending replication' queue to the 'needed replication' queue once the pending timeout (5 minutes) expires. The Namenode then actively tries to schedule a replication for blocks in the 'needed replication' queue. This can cause the 2nd loop to break, but it takes more than 5 minutes for this process to kick in.
> The 3rd loop checks that there are no corrupt replicas. I don't see a scenario in which this loop can go on forever, since once the live replica count goes back to normal (2), the corrupted block will be removed.
> I guess increasing the heartbeat interval, so that the testcase has enough time to check the condition in loop 1 before a datanode reports a successful copy, should help avoid the race condition in loop 1. Regarding loop 2, I guess we can reduce the timeout after which the block is transferred from the pending replication queue to the needed replication queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
