hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5746) Job diagnostics can implicate wrong task for a failed job
Date Fri, 07 Feb 2014 20:52:19 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895016#comment-13895016 ]

Jason Lowe commented on MAPREDUCE-5746:
---------------------------------------

Looks like this is fallout from MAPREDUCE-5317.  When a job fails, it can now linger for a
bit while it waits for all of its tasks to complete.  Other task failure events can be written
to the job history file during that window, and the history server job parser currently
reports the last task failed event as the reason the job failed.  It should report the first
one rather than the last one.
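
To illustrate the difference, here is a minimal sketch in Python. It is only a model of the
behavior described above, not the actual history server parser code: the TaskFailedEvent
class, both function names, and the task ID "task_that_failed_4_times" are hypothetical
and exist only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TaskFailedEvent:
    # Hypothetical stand-in for a task failed event read from the job history file.
    task_id: str
    failed_attempts: int


def blamed_task_last(events: Iterable[TaskFailedEvent]) -> Optional[TaskFailedEvent]:
    """Model of the current behavior: each failure event overwrites the previous one."""
    blamed = None
    for event in events:
        blamed = event
    return blamed


def blamed_task_first(events: Iterable[TaskFailedEvent]) -> Optional[TaskFailedEvent]:
    """Model of the proposed behavior: keep the first failure event, ignore later ones."""
    blamed = None
    for event in events:
        if blamed is None:
            blamed = event
    return blamed


# Events in the order they would appear in the history file.
events = [
    # Hypothetical ID for the task that exhausted its attempts and failed the job.
    TaskFailedEvent("task_that_failed_4_times", 4),
    # A task that failed afterwards, while the job was waiting for tasks to finish.
    TaskFailedEvent("task_1383802699973_515536_m_027135", 1),
]

print("last: ", blamed_task_last(events).task_id)
print("first:", blamed_task_first(events).task_id)
```

With the events in this order, the "last" variant names the task that failed once, which
matches the misleading diagnostic in the report, while the "first" variant names the task
that failed 4 times.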

> Job diagnostics can implicate wrong task for a failed job
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-5746
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5746
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>    Affects Versions: 0.23.10, 2.1.1-beta
>            Reporter: Jason Lowe
>
> We've seen a number of cases where the history server is showing the wrong task as the
> reason a job failed.  For example, "Task task_1383802699973_515536_m_027135 failed 1 times"
> when some other task had failed 4 times and was the real reason the job failed.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
