hadoop-mapreduce-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5940) Avoid negative elapsed time in JHS/MRAM web UI and services
Date Thu, 03 Jul 2014 15:58:25 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051610#comment-14051610 ]

Zhijie Shen commented on MAPREDUCE-5940:
----------------------------------------

Thanks for the review, Junping and Devaraj.

bq. If System.currentTimeMillis() < started, then we can return -1 or 0 instead

IMHO, Times#elapsed is meant to compute the delta between two timestamps: started and finished.
Given System.currentTimeMillis() < started <= finished, that should still be a valid case.
To make sure the elapsed time is always non-negative, we need to check that started <=
finished, and return -1 if not.
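
To illustrate the check described above, here is a minimal sketch. It is not the actual patch: the class name ElapsedSketch, the two-argument signature, and the use of java.util.logging are assumptions made only for this example. It shows the delta being computed purely from started and finished, so the current wall-clock time is never consulted.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for Times#elapsed; names and signature are assumed.
final class ElapsedSketch {
  private static final Logger LOG =
      Logger.getLogger(ElapsedSketch.class.getName());

  /**
   * Returns finished - started when started <= finished, otherwise -1.
   * The current time plays no role, so a local clock that is behind
   * "started" does not affect the result.
   */
  static long elapsed(long started, long finished) {
    if (started <= finished) {
      return finished - started;
    }
    // Likely cause is unsynchronized clocks on different hosts.
    LOG.warning("Finished time " + finished + " is earlier than started time "
        + started + "; clocks may not be synchronized");
    return -1;
  }
}
```

With this sketch, elapsed(100, 250) yields 150, while elapsed(250, 100) logs a warning and yields -1, which callers can then treat as "unknown".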

bq. (and log a warn that clock not getting synchronized)
bq. Adding a warning/info message before making it as 0 would help to diagnose/find out the
issues if any.
bq. Also adding a test in TestTimes.java could be a good idea.

Sounds like a good idea. I will address it in the new patch.

In addition, I will add a code comment to explicitly document the behavior of Times#elapsed.

> Avoid negative elapsed time in JHS/MRAM web UI and services
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5940
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5940
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver, mr-am, webapps
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: MAPREDUCE-5940.1.patch, MAPREDUCE-5940.2.patch
>
>
> Recently we observed a rare bug in which the elapsed time of a reducer shows up as negative
> on the JHS web UI and via the REST APIs. While the real cause of this bug seems to be unsynchronized
> clocks on different hosts, the web frontend should have masked the negative values. However, in the
> current code, *org.apache.hadoop.mapreduce.v2.app.webapp.dao.** only checks whether the elapsed
> time is -1 or not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)