hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8163) Using monotonicNow for block report scheduling causes test failures on recently restarted systems
Date Fri, 17 Apr 2015 21:54:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500757#comment-14500757 ]

Arpit Agarwal commented on HDFS-8163:
-------------------------------------

While working on more tests I found some more issues with the timestamp usage. The [System.nanoTime|https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime()]
docs state that it can return a negative value and that the value is subject to numerical overflow.
So two values should never be compared directly ({{t1 < t0}}); instead their difference should be tested ({{t1 - t0 < 0}}), which handles overflow.
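
To illustrate the difference between the two comparisons, here is a minimal sketch. The class and method names ({{NanoTimeCompare}}, {{isBefore}}) are hypothetical and are not part of the patch; the timestamps are hand-picked to force a wrap-around rather than read from {{System.nanoTime}}.

```java
// Hypothetical illustration only; not code from HDFS-8163.
final class NanoTimeCompare {

    /**
     * Returns true if timestamp a was taken before timestamp b.
     * Tests the sign of the difference instead of comparing the raw
     * values, so it still works when the counter wraps around.
     */
    static boolean isBefore(long a, long b) {
        return a - b < 0;
    }

    public static void main(String[] args) {
        // Simulate a counter that wraps between two readings.
        long t0 = Long.MAX_VALUE - 10;
        long t1 = t0 + 20; // overflows to a large negative value

        // Direct comparison claims t1 is earlier than t0, which is wrong.
        System.out.println(t1 < t0);          // prints true

        // Difference-based comparison gives the correct ordering.
        System.out.println(isBefore(t0, t1)); // prints true
        System.out.println(isBefore(t1, t0)); // prints false

        // The elapsed time is still correct despite the wrap.
        System.out.println(t1 - t0);          // prints 20
    }
}
```

The subtraction is only meaningful while the two readings are less than roughly 292 years apart, which is the limit the {{System.nanoTime}} docs give.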

My guess is that negative values/overflow are unlikely on the platforms we care about but
we should be handling them correctly anyway. I plan to split out the timestamp handling logic
of BPServiceActor into a separate utility class for clarity and ease of unit testing. Will
post an updated patch later today.

> Using monotonicNow for block report scheduling causes test failures on recently restarted systems
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8163
>                 URL: https://issues.apache.org/jira/browse/HDFS-8163
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.1
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>            Priority: Blocker
>         Attachments: HDFS-8163.01.patch, HDFS-8163.02.patch
>
>
> {{BPServiceActor#blockReport}} has the following check:
> {code}
>   List<DatanodeCommand> blockReport() throws IOException {
>     // send block report if timer has expired.
>     final long startTime = monotonicNow();
>     if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
>       return null;
>     }
> {code}
> Many tests trigger an immediate block report via {{BPServiceActor#triggerBlockReportForTests}} which sets {{lastBlockReport = 0}}. However if the machine was restarted recently then startTime may be less than {{dnConf.blockReportInterval}} and the block report is not sent.
> {{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed since an arbitrary origin. The time should be used only for comparison with other values returned by {{System#nanoTime}}.
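
The failure described in the issue can be reproduced in isolation with a small sketch. Everything here is an assumption made for illustration: the class {{BlockReportTimerSketch}}, the 6 hour interval standing in for {{dnConf.blockReportInterval}}, and the {{isDue}} alternative, which is one possible way to schedule by difference and is not taken from the attached patches.

```java
// Hypothetical illustration only; not code from HDFS-8163.
final class BlockReportTimerSketch {

    // Assumed stand-in for dnConf.blockReportInterval, in milliseconds.
    static final long INTERVAL_MS = 6L * 60 * 60 * 1000;

    /** Mirrors the quoted check: due only once the interval has elapsed. */
    static boolean isDueOriginal(long now, long lastBlockReport) {
        return now - lastBlockReport > INTERVAL_MS;
    }

    /**
     * Assumed alternative: keep the next due time and test the sign of
     * the difference, so no magic value such as 0 is ever needed.
     */
    static boolean isDue(long now, long nextBlockReportTime) {
        return nextBlockReportTime - now <= 0;
    }

    public static void main(String[] args) {
        // Monotonic clock reading shortly after its origin, as on a
        // recently restarted machine (5 minutes, in milliseconds).
        long now = 5L * 60 * 1000;

        // Setting lastBlockReport = 0 is meant to force a report, but
        // now - 0 is still smaller than the interval.
        System.out.println(isDueOriginal(now, 0L)); // prints false

        // Scheduling "immediately" by setting the due time to now works
        // regardless of how small or negative the clock reading is.
        System.out.println(isDue(now, now));        // prints true
        System.out.println(isDue(-1000L, -1000L));  // prints true
    }
}
```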



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
