hadoop-hdfs-issues mailing list archives

From "Ajay Kumar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12578) TestDeadDatanode#testNonDFSUsedONDeadNodeReReg failing in branch-2.7
Date Fri, 06 Oct 2017 19:25:03 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195078#comment-16195078 ]

Ajay Kumar commented on HDFS-12578:
-----------------------------------

[~xiaochen], thanks for checking this. It seems we are only narrowly escaping the test failure in 2.8.
Here is what I found:
* {{HeartbeatManager#shouldAbortHeartbeatCheck}} returns true in 2.7 but not in 2.8.
* The logs below show that the time elapsed on {{heartbeatStopWatch}} in 2.7 is almost always >1, while in 2.8 it is mostly 0 or 1. (I ran it multiple times and always got the same result.)
In 2.7:
{code}
2017-10-06 12:19:13,425 INFO  blockmanagement.HeartbeatManager (HeartbeatManager.java:shouldAbortHeartbeatCheck(262)) - shouldAbortHeartbeatCheck: true elapsed:1087 heartbeatRecheckInterval:1
{code}
In 2.8.2:
{code}
2017-10-06 12:20:28,376 INFO  blockmanagement.HeartbeatManager (HeartbeatManager.java:shouldAbortHeartbeatCheck(287)) - shouldAbortHeartbeatCheck: false elapsed:0 heartbeatRecheckInterval:1
2017-10-06 12:20:28,376 INFO  blockmanagement.HeartbeatManager (HeartbeatManager.java:shouldAbortHeartbeatCheck(287)) - shouldAbortHeartbeatCheck: false elapsed:0 heartbeatRecheckInterval:1
2017-10-06 12:20:28,378 INFO  blockmanagement.HeartbeatManager (HeartbeatManager.java:shouldAbortHeartbeatCheck(287)) - shouldAbortHeartbeatCheck: false elapsed:1 heartbeatRecheckInterval:1
{code}
Temporary log statement that produced these lines:
{code}
  @VisibleForTesting
  boolean shouldAbortHeartbeatCheck(long offset) {
    long elapsed = heartbeatStopWatch.now(TimeUnit.MILLISECONDS);
    // Temporary debug logging; only calls made with a zero offset are logged.
    if (offset == 0) {
      LOG.info("shouldAbortHeartbeatCheck: "
          + (elapsed + offset > heartbeatRecheckInterval)
          + " elapsed:" + elapsed
          + " heartbeatRecheckInterval:" + heartbeatRecheckInterval);
    }
    return elapsed + offset > heartbeatRecheckInterval;
  }
{code}
* Even when the test case is run in 2.8 with the {{StopWatch}} changes from 2.7, the test passes.
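
To make the comparison concrete: the check reduces to {{elapsed + offset > heartbeatRecheckInterval}}. The following is only a minimal standalone sketch, not HDFS code; the class {{HeartbeatAbortSketch}} and the method {{shouldAbort}} are hypothetical names introduced for illustration. It replays the three elapsed values seen in the logs against an interval of 1 and an offset of 0.

```java
public class HeartbeatAbortSketch {

  // Hypothetical stand-in for the comparison in shouldAbortHeartbeatCheck;
  // all values are assumed to be in milliseconds, as in the logged output.
  static boolean shouldAbort(long elapsedMs, long offsetMs, long recheckIntervalMs) {
    return elapsedMs + offsetMs > recheckIntervalMs;
  }

  public static void main(String[] args) {
    long recheckIntervalMs = 1;        // heartbeatRecheckInterval from the logs
    long[] elapsedSamples = {1087, 0, 1}; // elapsed values from the 2.7 and 2.8.2 logs

    for (long elapsed : elapsedSamples) {
      System.out.println("elapsed:" + elapsed
          + " shouldAbort:" + shouldAbort(elapsed, 0, recheckIntervalMs));
    }
    // prints:
    // elapsed:1087 shouldAbort:true
    // elapsed:0 shouldAbort:false
    // elapsed:1 shouldAbort:false
  }
}
```

Because the comparison is strict, an elapsed value of exactly 1 still does not abort, which matches the 2.8.2 log lines; anything above 1 does, which matches the 2.7 line.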


> TestDeadDatanode#testNonDFSUsedONDeadNodeReReg failing in branch-2.7
> --------------------------------------------------------------------
>
>                 Key: HDFS-12578
>                 URL: https://issues.apache.org/jira/browse/HDFS-12578
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>            Reporter: Xiao Chen
>            Assignee: Ajay Kumar
>            Priority: Blocker
>         Attachments: HDFS-12578-branch-2.7.001.patch
>
>
> It appears {{TestDeadDatanode#testNonDFSUsedONDeadNodeReReg}} is consistently failing in branch-2.7. We should investigate and fix it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


