hadoop-mapreduce-dev mailing list archives

From "Eric Payne (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-3286) Unit tests for MAPREDUCE-3186 - User jobs are getting hanged if the Resource manager process goes down and comes up while job is getting executed.
Date Thu, 27 Oct 2011 19:40:32 GMT
Unit tests for MAPREDUCE-3186 - User jobs are getting hanged if the Resource manager process
goes down and comes up while job is getting executed.
--------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: MAPREDUCE-3286
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3286
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2
    Affects Versions: 0.23.0
         Environment: linux
            Reporter: Eric Payne
            Assignee: Eric Payne
            Priority: Blocker


If the ResourceManager is restarted while a job is executing, the job hangs.
The UI continues to show the job as running.
The RM log reports the error "ERROR org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
AppAttemptId doesnt exist in cache appattempt_1318579738195_0004_000001".
On the console, the MRAppMaster and RunJar processes are not killed.
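
The scenario above can be modeled in a few lines to show the kind of behavior a unit test might assert. This is only a toy sketch, not Hadoop code: ToyResourceManager, ToyAppMaster and their methods are hypothetical names invented for illustration, and the assumption that the application master should stop once the RM no longer recognizes its attempt is one plausible expectation, not necessarily the fix adopted in MAPREDUCE-3186.

```java
import java.util.HashSet;
import java.util.Set;

public class RmRestartSketch {

    /** Hypothetical stand-in for the RM's in-memory cache of application attempts. */
    static class ToyResourceManager {
        private final Set<String> knownAttempts = new HashSet<>();

        void register(String attemptId) {
            knownAttempts.add(attemptId);
        }

        /** Simulates a restart that loses all in-memory state. */
        void restart() {
            knownAttempts.clear();
        }

        /** Returns false when the attempt is not in the cache. */
        boolean heartbeat(String attemptId) {
            return knownAttempts.contains(attemptId);
        }
    }

    /** Hypothetical stand-in for an application master that polls the RM. */
    static class ToyAppMaster {
        private final String attemptId;
        private boolean running = true;

        ToyAppMaster(String attemptId) {
            this.attemptId = attemptId;
        }

        /** One heartbeat round: stop if the RM no longer knows this attempt. */
        void heartbeatOnce(ToyResourceManager rm) {
            if (running && !rm.heartbeat(attemptId)) {
                running = false; // assumed behavior: shut down instead of hanging
            }
        }

        boolean isRunning() {
            return running;
        }
    }

    /** Runs the restart scenario and reports whether the AM is still running afterwards. */
    static boolean amStillRunningAfterRestart() {
        String attemptId = "appattempt_1318579738195_0004_000001"; // ID taken from the RM log line
        ToyResourceManager rm = new ToyResourceManager();
        ToyAppMaster am = new ToyAppMaster(attemptId);

        rm.register(attemptId);
        am.heartbeatOnce(rm); // RM knows the attempt, AM keeps running
        rm.restart();         // RM comes back with an empty cache
        am.heartbeatOnce(rm); // attempt is unknown, AM should stop
        return am.isRunning();
    }

    public static void main(String[] args) {
        System.out.println("AM still running after RM restart: " + amStillRunningAfterRestart());
    }
}
```

A real test would exercise the actual MRAppMaster and ResourceManager classes; the sketch only pins down the expected outcome, namely that the job ends rather than hanging after the RM loses its state.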

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
