hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4595) JVM Reuse triggers RuntimeException("Invalid state")
Date Tue, 11 Nov 2008 04:51:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Devaraj Das updated HADOOP-4595:
--------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this.

> JVM Reuse triggers RuntimeException("Invalid state")
> ----------------------------------------------------
>
>                 Key: HADOOP-4595
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4595
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Aaron Kimball
>            Assignee: Devaraj Das
>             Fix For: 0.19.0
>
>         Attachments: 4595.patch
>
>
> A Reducer triggers the following exception:
> 08/11/05 08:58:50 INFO mapred.JobClient: Task Id : attempt_200811040110_0230_r_000008_1,
Status : FAILED
> java.lang.RuntimeException: Inconsistent state!!! JVM Manager reached an unstable state
while reaping a JVM for task: attempt_200811040110_0230_r_000008_1 Number of active JVMs:2
>  JVMId jvm_200811040110_0230_r_-735233075 #Tasks ran: 0 Currently busy? true Currently
running: attempt_200811040110_0230_r_000012_0
>  JVMId jvm_200811040110_0230_r_-1716942642 #Tasks ran: 0 Currently busy? true Currently
running: attempt_200811040110_0230_r_000040_0
>    at java.lang.Throwable.<init>(Throwable.java:67)
>    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:245)
>    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:113)
>    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:78)
>    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:410) 
> Other clues:
> In the three reduce task attempts where this was observed, this was attempt _1. Attempt
_0 had started and eventually switches to "SUCCEEDED." So I think this is happening only on
speculatively-executed reduce task attempts. The reduce output (part-XXXXX) gets lost when
this attempt fails, even though the other (earlier) attempt succeeded.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message