hadoop-mapreduce-issues mailing list archives

From "zhihai xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6433) launchTime may be negative
Date Mon, 20 Jul 2015 08:41:04 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633176#comment-14633176 ]

zhihai xu commented on MAPREDUCE-6433:
--------------------------------------

Thanks for reporting this issue, [~aw]! I can work on it.
I looked at the code, and there is only one way a negative launchTime can be generated.
The job launch time comes from the JobInitedEvent created in JobImpl.java, where the start time is set as follows:
{code}
      JobStartEvent jse = (JobStartEvent) event;
      if (jse.getRecoveredJobStartTime() != 0) {
        job.startTime = jse.getRecoveredJobStartTime();
      } else {
        job.startTime = job.clock.getTime();
      }
{code}
So the only possibility is that jse.getRecoveredJobStartTime() returns a negative value.
The JobStartEvent is created in MRAppMaster.java, and {{getRecoveredJobStartTime}} depends on {{MRAppMaster#recoveredJobStartTime}}.
{{recoveredJobStartTime}} is initialized to 0, but it is changed by the following code
in {{MRAppMaster#parsePreviousJobHistory}}:
{code}
    recoveredJobStartTime = jobInfo.getLaunchTime();
{code}
This means jobInfo.getLaunchTime() may return a negative value.
jobInfo is an instance of {{JobHistoryParser#JobInfo}}. Its launchTime is changed only by handleJobInitedEvent;
otherwise it keeps its initial value of -1:
{code}
      submitTime = launchTime = finishTime = -1;
{code}
So if the previous JobHistory file has no JOB_INITED event, the issue may occur.
Based on this, I think that if the first job attempt fails to initialize the job (JOB_INIT), the second
job attempt may hit this issue.
I will create a test case to reproduce it.
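
The chain above can be sketched in a small standalone model. This is not the Hadoop code and not the planned test case: the class {{LaunchTimeSketch}} and its two methods are hypothetical names made up for illustration. It only mirrors the -1 default from {{JobHistoryParser#JobInfo}} and the {{!= 0}} check from JobImpl.java, to show that -1 passes the check and ends up as the start time.

```java
public class LaunchTimeSketch {
    // Mirrors the -1 default that JobInfo assigns to launchTime.
    static final long UNSET = -1L;

    // Hypothetical stand-in for parsing the previous attempt's history file:
    // launchTime is overwritten only if a JOB_INITED event was seen.
    static long parseLaunchTime(boolean sawJobInited, long launchTimeInEvent) {
        long launchTime = UNSET;
        if (sawJobInited) {
            launchTime = launchTimeInEvent;
        }
        return launchTime;
    }

    // Same branch shape as the JobImpl snippet: any non-zero recovered
    // value is trusted, including -1.
    static long resolveStartTime(long recoveredJobStartTime, long clockNow) {
        if (recoveredJobStartTime != 0) {
            return recoveredJobStartTime;
        }
        return clockNow;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();

        // First attempt never wrote JOB_INITED, so the recovered value is -1.
        long recovered = parseLaunchTime(false, 0L);
        System.out.println(resolveStartTime(recovered, now)); // prints -1

        // No recovery at all: the field keeps its initial 0 and the clock is used.
        System.out.println(resolveStartTime(0L, now) == now); // prints true
    }
}
```

In this model, the negative value appears only when a previous history file was parsed but contained no JOB_INITED event, which matches the scenario described above.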

> launchTime may be negative
> --------------------------
>
>                 Key: MAPREDUCE-6433
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6433
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>    Affects Versions: 2.4.1
>            Reporter: Allen Wittenauer
>            Assignee: zhihai xu
>
> Under extremely rare conditions (.0017% in our sample size), launchTime in the jhist files may be set to -1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
