hadoop-mapreduce-issues mailing list archives

From "Thomas Graves (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4152) map task left hanging after AM dies trying to connect to RM
Date Fri, 13 Apr 2012 15:30:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253460#comment-13253460 ]

Thomas Graves commented on MAPREDUCE-4152:
------------------------------------------

The Job did not kill off the map task that it had running before exiting.  In JobImpl, when
it moves from RUNNING to ERROR, all it does is send the JobUnsuccessfulCompletion event.
I would think it would at least try to kill any tasks it has.
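
To illustrate the suggestion, here is a minimal sketch of a job that asks its running
tasks to stop before it reports failure. It is a toy model, not the real JobImpl: the
class JobSketch, its methods, and the event strings are all hypothetical names made up
for this example, and the real transition code may need to do more than this.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: every name here is hypothetical and is NOT the real JobImpl API.
class JobSketch {
    enum State { RUNNING, ERROR }

    private State state = State.RUNNING;
    private final Set<String> runningTasks = new LinkedHashSet<>();
    private final List<String> events = new ArrayList<>();

    void addRunningTask(String taskId) {
        runningTasks.add(taskId);
    }

    // Called when the job hits an unrecoverable error, e.g. the AM gives up on the RM.
    void onInternalError() {
        if (state != State.RUNNING) {
            return; // only the RUNNING -> ERROR transition is modelled here
        }
        // Suggested extra step: request a kill for every task that is still running.
        for (String taskId : runningTasks) {
            events.add("KILL " + taskId);
        }
        runningTasks.clear();
        // The step the comment describes as the only one taken today.
        events.add("JOB_UNSUCCESSFUL_COMPLETION");
        state = State.ERROR;
    }

    State getState() {
        return state;
    }

    List<String> getEvents() {
        return events;
    }

    public static void main(String[] args) {
        JobSketch job = new JobSketch();
        job.addRunningTask("map_task_0");
        job.onInternalError();
        System.out.println(job.getState() + " " + job.getEvents());
    }
}
```

The point of the ordering is that the kill requests go out before the completion event,
so no task is left without an owner once the job has declared itself finished.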

There might also be a separate issue with the NM as to why it didn't kill the task.  The NM
was also not able to connect to the RM, and I saw one of its threads restart.  I'm guessing
that when the thread restarted it lost that container, but I need to investigate that further.


                
> map task left hanging after AM dies trying to connect to RM
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-4152
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4152
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.2
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>
> We had an instance where the RM went down for more than an hour.  The application master exited with "Could not contact RM after 360000 milliseconds"
> 2012-04-11 10:43:36,040 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1333003059741_15999Job Transitioned from RUNNING to ERROR

