hadoop-mapreduce-issues mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-805) Deadlock in Jobtracker
Date Tue, 11 Aug 2009 03:56:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741680#action_12741680 ]

Amar Kamat commented on MAPREDUCE-805:
--------------------------------------

We cannot test the deadlock itself, but we tested the code paths that changed, i.e. job-init, job-kill,
empty-job, job-with-no-setup-cleanup, and job with 0 maps/reduces:
||job-empty?||setup-cleanup-required?||killed in init?||result||
|yes|yes|yes|pass (job killed)|
|yes|yes|no|pass (setup-cleanup launched and job succeeded)|
|yes|no|yes|pass (job killed)|
|yes|no|no|pass (job marked succeeded in JobTracker.initJob())|
|no|yes|yes|pass (job killed after init)|
|no|yes|no|pass (job runs to completion)| 
|no|no|yes|pass (job killed after init)|
|no|no|no|pass (job runs to completion)|

Did I miss anything?
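
The eight rows above cover every combination of the three yes/no flags. As a rough cross-check, the matrix can be written as a small decision function and enumerated. This is only a sketch: the class `JobInitMatrix` and the method `expectedResult` are hypothetical names invented for illustration, and only the outcome strings come from the table.

```java
public class JobInitMatrix {
    /**
     * Hypothetical summary of the table: maps the three flags to the
     * observed outcome. Not part of Hadoop.
     */
    static String expectedResult(boolean jobEmpty,
                                 boolean setupCleanupRequired,
                                 boolean killedInInit) {
        if (killedInInit) {
            // Empty jobs are reported as killed; non-empty ones are killed after init.
            return jobEmpty ? "job killed" : "job killed after init";
        }
        if (!jobEmpty) {
            return "job runs to completion";
        }
        return setupCleanupRequired
                ? "setup-cleanup launched and job succeeded"
                : "job marked succeeded in JobTracker.initJob()";
    }

    public static void main(String[] args) {
        boolean[] flags = {true, false};
        // Enumerate all 2^3 combinations in the same order as the table.
        for (boolean empty : flags) {
            for (boolean setupCleanup : flags) {
                for (boolean killed : flags) {
                    System.out.printf("empty=%b setup-cleanup=%b killed=%b -> %s%n",
                            empty, setupCleanup, killed,
                            expectedResult(empty, setupCleanup, killed));
                }
            }
        }
    }
}
```

Enumerating the flags this way makes it easy to see that no combination is missing from the matrix; it says nothing about cases outside these three flags.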

> Deadlock in Jobtracker
> ----------------------
>
>                 Key: MAPREDUCE-805
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-805
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Michael Tamm
>         Attachments: MAPREDUCE-805-v1.1.patch, MAPREDUCE-805-v1.11-branch-0.20.patch, MAPREDUCE-805-v1.11.patch, MAPREDUCE-805-v1.2.patch, MAPREDUCE-805-v1.3.patch, MAPREDUCE-805-v1.6.patch, MAPREDUCE-805-v1.7.patch
>
>
> We are running a hadoop cluster (version 0.20.0) and have detected the following deadlock on our jobtracker:
> {code}
> "IPC Server handler 51 on 9001":
> 	at org.apache.hadoop.mapred.JobInProgress.getCounters(JobInProgress.java:943)
> 	- waiting to lock <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
> 	at org.apache.hadoop.mapred.JobTracker.getJobCounters(JobTracker.java:3102)
> 	- locked <0x00007f2b5f026000> (a org.apache.hadoop.mapred.JobTracker)
> 	at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
> "pool-1-thread-2":
> 	at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:2017)
> 	- waiting to lock <0x00007f2b5f026000> (a org.apache.hadoop.mapred.JobTracker)
> 	at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2483)
> 	- locked <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
> 	at org.apache.hadoop.mapred.JobInProgress.terminateJob(JobInProgress.java:2152)
> 	- locked <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
> 	at org.apache.hadoop.mapred.JobInProgress.terminate(JobInProgress.java:2169)
> 	- locked <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
> 	at org.apache.hadoop.mapred.JobInProgress.fail(JobInProgress.java:2245)
> 	- locked <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
> 	at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:86)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:619)
> {code}
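
The trace above shows a lock-order inversion: the IPC handler holds the JobTracker monitor and waits for the JobInProgress monitor, while the init thread holds the JobInProgress monitor and waits for the JobTracker monitor. One common remedy is to acquire the two locks in the same order on every path. The sketch below illustrates that idea only; `Tracker`, `Job`, `failJob`, and `runBothPaths` are hypothetical stand-ins, not the Hadoop classes, and this is not necessarily the approach taken in the attached patches.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {
    /** Hypothetical stand-in for the job object; not the real JobInProgress. */
    static final class Job {
        private int counters = 0;
        private boolean failed = false;

        synchronized int getCounters() { return counters; }
        synchronized void markFailed() { failed = true; }
        synchronized boolean isFailed() { return failed; }
    }

    /** Hypothetical stand-in for the tracker; not the real JobTracker. */
    static final class Tracker {
        private int finalizedJobs = 0;

        // Same order as the IPC handler in the trace: tracker lock, then job lock.
        synchronized int getJobCounters(Job job) {
            return job.getCounters();
        }

        // The failure path also takes the tracker lock FIRST and the job lock
        // second, instead of calling into the tracker while holding the job lock.
        synchronized void failJob(Job job) {
            synchronized (job) {
                job.markFailed();
                finalizedJobs++;
            }
        }

        synchronized int finalizedJobs() { return finalizedJobs; }
    }

    /** Runs both paths concurrently; returns true if both finish within the timeout. */
    static boolean runBothPaths(int iterations) {
        Tracker tracker = new Tracker();
        Job job = new Job();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> {
            for (int i = 0; i < iterations; i++) tracker.getJobCounters(job);
        });
        pool.execute(() -> {
            for (int i = 0; i < iterations; i++) tracker.failJob(job);
        });
        pool.shutdown();
        try {
            return pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("both paths completed: " + runBothPaths(100_000));
    }
}
```

Because both threads always take the tracker monitor before the job monitor, neither can hold one lock while waiting for the other in the opposite order, which is the cycle visible in the thread dump.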

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

