hadoop-mapreduce-issues mailing list archives

From "Chelsey Chang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-5406) Task Tracker exiting with JVM manager inconsistent state
Date Fri, 19 Jul 2013 21:06:48 GMT
Chelsey Chang created MAPREDUCE-5406:
----------------------------------------

             Summary: Task Tracker exiting with JVM manager inconsistent state
                 Key: MAPREDUCE-5406
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5406
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Chelsey Chang
            Assignee: Chelsey Chang


Looks like we are reaching a JVM manager inconsistent state, which causes the TT to crash:
{code}
2013-06-09 06:41:11,250 FATAL org.apache.hadoop.mapred.JvmManager: Inconsistent state!!! JVM Manager reached an unstable state while reaping a JVM for task: attempt_201306080400_104812_m_000001_0
Number of active JVMs:8
  JVMId jvm_201306080400_104517_m_1331138312 #Tasks ran: 0 Currently busy? true Currently running: attempt_201306080400_104517_m_000001_0
  JVMId jvm_201306080400_104641_m_-1631395161 #Tasks ran: 0 Currently busy? true Currently running: attempt_201306080400_104641_m_000000_0
  JVMId jvm_201306080400_104494_m_-1702464703 #Tasks ran: 0 Currently busy? true Currently running: attempt_201306080400_104494_m_000000_0
  JVMId jvm_201306080400_104784_m_1407576088 #Tasks ran: 0 Currently busy? true Currently running: attempt_201306080400_104784_m_000000_0
  JVMId jvm_201306080400_104530_m_186665365 #Tasks ran: 0 Currently busy? true Currently running: attempt_201306080400_104530_m_000000_0
  JVMId jvm_201306080400_104589_m_-1080246077 #Tasks ran: 0 Currently busy? true Currently running: attempt_201306080400_104589_m_000000_0
  JVMId jvm_201306080400_104674_m_830017814 #Tasks ran: 0 Currently busy? true Currently running: attempt_201306080400_104674_m_000000_0
  JVMId jvm_201306080400_104719_m_-226910128 #Tasks ran: 0 Currently busy? true Currently running: attempt_201306080400_104719_m_000000_0. Aborting.
2013-06-09 06:41:11,250 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG: 
{code}
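
For triage, a dump like the one above can be summarized with a small script. This is only a rough sketch, not part of Hadoop: the names `summarize` and `job_of` are made up for illustration, and the regular expressions assume the entry layout seen in this particular log. Whitespace is matched loosely so that it also works on copies where the mail archive has re-wrapped the lines.

```python
import re

# One per-JVM entry of the dump. The field layout is assumed from the log
# excerpt in this report, not taken from the Hadoop source.
JVM_ENTRY = re.compile(
    r"JVMId\s+(?P<jvm>jvm_[\w-]+)\s+"
    r"#Tasks\s+ran:\s*(?P<ran>\d+)\s+"
    r"Currently\s+busy\?\s*(?P<busy>true|false)\s+"
    r"Currently\s+running:\s*(?P<attempt>[\w-]+)"
)
REAPED = re.compile(r"reaping\s+a\s+JVM\s+for\s+task:\s*(?P<attempt>[\w-]+)")
DECLARED = re.compile(r"Number\s+of\s+active\s+JVMs:\s*(\d+)")


def job_of(identifier):
    """Return the '<cluster start>_<job number>' part of a jvm_/attempt_ id."""
    parts = identifier.split("_")
    return "_".join(parts[1:3])


def summarize(dump):
    """Summarize a JvmManager 'Inconsistent state' dump given as one string."""
    reaped = REAPED.search(dump)
    declared = DECLARED.search(dump)
    entries = [m.groupdict() for m in JVM_ENTRY.finditer(dump)]
    reaped_attempt = reaped.group("attempt") if reaped else None
    listed_jobs = {job_of(e["jvm"]) for e in entries}
    return {
        "reaped_attempt": reaped_attempt,
        "declared_active": int(declared.group(1)) if declared else None,
        "listed": len(entries),
        "busy": sum(e["busy"] == "true" for e in entries),
        # True if any listed JVM belongs to the job of the attempt being reaped.
        "reaped_job_has_jvm": (
            reaped_attempt is not None and job_of(reaped_attempt) in listed_jobs
        ),
    }
```

Run against the dump above, this reports 8 listed JVMs, all of them busy, and none belonging to job 201306080400_104812, the job of the attempt that was being reaped. That is only an observation from this one log, not a confirmed cause.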

Although this causes the TT to crash, the error is rare and recoverable, so the priority
of this issue is not high.

However, this does look like a bug in the JVM manager state machine. I'm guessing there is
some race condition that we're hitting.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
