hadoop-mapreduce-issues mailing list archives

From "Ramgopal N (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3347) Resource manager is not respawning MRAppMaster process if it goes down in the middle of job execution and the job is getting failed.
Date Sat, 05 Nov 2011 04:58:51 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144582#comment-13144582 ]

Ramgopal N commented on MAPREDUCE-3347:
---------------------------------------

Hi Vinod,
Setting yarn.resourcemanager.am.max-retries in yarn-site.xml makes the RM retry the
ApplicationMaster the specified number of times before failing the job. Thanks.
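
For illustration only, a minimal sketch of how that property might appear in
yarn-site.xml. The value 2 is an assumed example, not taken from this comment;
pick a retry count that suits your cluster.

```xml
<!-- yarn-site.xml (fragment); the value below is a hypothetical example -->
<configuration>
  <property>
    <name>yarn.resourcemanager.am.max-retries</name>
    <!-- Assumed value: how many times the RM tries the ApplicationMaster
         before it fails the job -->
    <value>2</value>
  </property>
</configuration>
```

The ResourceManager typically needs a restart to pick up changes to yarn-site.xml.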



                
> Resource manager is not respawning MRAppMaster process if it goes down in the middle of job execution and the job is getting failed.
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3347
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3347
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Ramgopal N
>
> ApplicationMaster service should recover the job if the MRAppMaster process goes down in
> the middle of job execution. If not, the MRAppMaster process becomes a single point of failure
> for the job and loses the advantage of the MRV1 framework.


        
