hadoop-common-dev mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4595) JVM Reuse triggers RuntimeException("Invalid state")
Date Thu, 06 Nov 2008 03:17:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645389#action_12645389 ]

Aaron Kimball commented on HADOOP-4595:
---------------------------------------

Two reducers are allowed per node, so maxFreeSlots is 2. The node was in the middle of
running tasks when this happened: numFreeSlots reached 1 even though two JVMs had already
been spawned and were marked as busy.

The attempt was never launched; the launch process itself triggered the exception.
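
To make the suspected mismatch concrete, here is a minimal sketch of the accounting described above. It is a toy model, not the actual TaskTracker or JvmManager code: the class SlotAccountingSketch and its methods are hypothetical, and driftFreeSlot() only stands in for whatever (still unknown) path let numFreeSlots reach 1 while both JVMs were busy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of slot vs. JVM accounting. Hypothetical; not Hadoop's implementation. */
public class SlotAccountingSketch {
    private final int maxSlots;   // e.g. 2 reducers allowed per node
    private int numFreeSlots;     // counter that gates task launches
    // JVM id -> true if the JVM is currently running a task
    private final Map<String, Boolean> busyByJvm = new LinkedHashMap<>();

    public SlotAccountingSketch(int maxSlots) {
        this.maxSlots = maxSlots;
        this.numFreeSlots = maxSlots;
    }

    /** Launch a task in a new JVM, reaping an idle JVM first if at the limit. */
    public void launch(String jvmId) {
        if (numFreeSlots <= 0) {
            throw new IllegalStateException("no free slot");
        }
        if (busyByJvm.size() >= maxSlots) {
            // At the JVM limit: an idle JVM must be found and removed.
            String idle = null;
            for (Map.Entry<String, Boolean> e : busyByJvm.entrySet()) {
                if (!e.getValue()) {
                    idle = e.getKey();
                    break;
                }
            }
            if (idle == null) {
                // A slot was counted as free, yet every JVM is busy.
                throw new RuntimeException("Inconsistent state: all "
                        + busyByJvm.size() + " JVMs are busy");
            }
            busyByJvm.remove(idle);
        }
        busyByJvm.put(jvmId, true);
        numFreeSlots--;
    }

    /** Normal completion: the JVM becomes idle and its slot is freed. */
    public void taskDone(String jvmId) {
        if (Boolean.TRUE.equals(busyByJvm.get(jvmId))) {
            busyByJvm.put(jvmId, false);
            numFreeSlots++;
        }
    }

    /** Stand-in for the bug: a slot is freed while its JVM stays busy. */
    public void driftFreeSlot() {
        numFreeSlots++;
    }

    /** Invariant: busy JVMs plus free slots should equal maxSlots. */
    public boolean isConsistent() {
        int busy = 0;
        for (boolean b : busyByJvm.values()) {
            if (b) {
                busy++;
            }
        }
        return busy + numFreeSlots == maxSlots;
    }
}
```

In this model, launching two tasks with maxSlots = 2 leaves the invariant intact. Calling driftFreeSlot() and then launching a third task reproduces the shape of the failure: the launch is admitted because a slot looks free, finds no idle JVM to reap, and throws before the attempt ever starts.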

> JVM Reuse triggers RuntimeException("Invalid state")
> ----------------------------------------------------
>
>                 Key: HADOOP-4595
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4595
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Aaron Kimball
>            Assignee: Devaraj Das
>
> A Reducer triggers the following exception:
> 08/11/05 08:58:50 INFO mapred.JobClient: Task Id : attempt_200811040110_0230_r_000008_1, Status : FAILED
> java.lang.RuntimeException: Inconsistent state!!! JVM Manager reached an unstable state while reaping a JVM for task: attempt_200811040110_0230_r_000008_1 Number of active JVMs:2
>  JVMId jvm_200811040110_0230_r_-735233075 #Tasks ran: 0 Currently busy? true Currently running: attempt_200811040110_0230_r_000012_0
>  JVMId jvm_200811040110_0230_r_-1716942642 #Tasks ran: 0 Currently busy? true Currently running: attempt_200811040110_0230_r_000040_0
>    at java.lang.Throwable.<init>(Throwable.java:67)
>    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:245)
>    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:113)
>    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:78)
>    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:410) 
> Other clues:
> In all three reduce task attempts where this was observed, the failing attempt was
> attempt _1. Attempt _0 had started and eventually switched to "SUCCEEDED", so I think
> this happens only on speculatively executed reduce task attempts. The reduce output
> (part-XXXXX) is lost when this attempt fails, even though the other (earlier) attempt
> succeeded.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.