hadoop-yarn-dev mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (YARN-929) 2 MRAppMaster running parallely for same Job Id
Date Tue, 16 Jul 2013 14:06:49 GMT

     [ https://issues.apache.org/jira/browse/YARN-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe resolved YARN-929.
-----------------------------

    Resolution: Duplicate

This is an issue with the MRAppMaster, currently tracked by MAPREDUCE-5396.
                
> 2 MRAppMaster running parallely for same Job Id
> -----------------------------------------------
>
>                 Key: YARN-929
>                 URL: https://issues.apache.org/jira/browse/YARN-929
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.0.5-alpha
>            Reporter: rohithsharma
>
> Configuration:
>     yarn.resourcemanager.am.max-retries = 3
> Scenario:
>     The NodeManager is killed forcefully, i.e. using kill -9 NM_PID.
>     After node expiry, the RM killed all the containers running on this NodeManager.
>     But the MRAppMaster JVM is still running.
>     The RM spawns the 2nd-attempt MRAppMaster since AM retry is configured as 3. At this point,
>     there are 2 MRAppMasters running in parallel for the same job ID.
> Problem: with 2 MRAppMasters running, the 1st-attempt AppMaster deletes the job information
> from HDFS, which causes a FileNotFoundException for the 2nd-attempt MRAppMaster.
>      
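
The failure sequence in the quoted report can be illustrated with a small local simulation. This is only a sketch under stated assumptions, not MRAppMaster code: a temporary local directory stands in for the job's data on HDFS, the file name job.xml and the function name simulate_overlapping_attempts are hypothetical, and Python's FileNotFoundError stands in for the Java FileNotFoundException mentioned in the report.

```python
import shutil
import tempfile
from pathlib import Path


def simulate_overlapping_attempts() -> str:
    """Return the outcome seen by the second attempt (illustrative only)."""
    # Hypothetical stand-in for the job information stored on HDFS.
    staging_dir = Path(tempfile.mkdtemp(prefix="job_staging_"))
    job_file = staging_dir / "job.xml"  # hypothetical file name
    job_file.write_text("<configuration/>")

    # Attempt 1 is still alive (its JVM survived the NodeManager kill),
    # so it eventually deletes the job information.
    shutil.rmtree(staging_dir)

    # Attempt 2, started after node expiry, now tries to read the same data.
    try:
        job_file.read_text()
    except FileNotFoundError:
        return "attempt 2 failed: job file missing"
    return "attempt 2 read job file"


if __name__ == "__main__":
    print(simulate_overlapping_attempts())
```

The point of the sketch is only the ordering: once two attempts share the same job data and the older one is not actually stopped, its deletion can pull the files out from under the newer attempt.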

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira