hadoop-mapreduce-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1070) Deadlock in FairSchedulerServlet
Date Wed, 07 Oct 2009 07:42:31 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-1070:
-----------------------------------

    Attachment: deadlock.png

See attached diagram displaying inconsistent lock order based on dynamic analysis.

Here's a stack trace from an instance of this deadlock that we saw in production:

{noformat}
Thread 60324 (1823988020@qtp0-4064):
  State: BLOCKED
  Blocked count: 52
  Waited count: 32
  Blocked on org.apache.hadoop.mapred.JobInProgress@5d2044dd
  Blocked by 113 (IPC Server handler 9 on 7277)

  Stack:
    org.apache.hadoop.mapred.JobInProgress.finishedMaps(JobInProgress.java:560)
    org.apache.hadoop.mapred.FairSchedulerServlet.showJobs(FairSchedulerServlet.java:235)
    org.apache.hadoop.mapred.FairSchedulerServlet.doGet(FairSchedulerServlet.java:136)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 ...
Thread 113 (IPC Server handler 9 on 7277):
  State: BLOCKED
  Blocked count: 540572
  Waited count: 2658131
  Blocked on org.apache.hadoop.mapred.FairScheduler@a12d500
  Blocked by 60324 (1823988020@qtp0-4064)
  Stack:
    org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:2069)
    org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2538)
    org.apache.hadoop.mapred.JobInProgress.jobComplete(JobInProgress.java:2181)
    org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:2125)
    org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:892)
    org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:3415)
    org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:2712)
    org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2507)
{noformat}

The solution is for the servlet to synchronize on the JobTracker before synchronizing on
jobs, so that both threads acquire their locks in a consistent order.
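
For illustration only, here is a minimal sketch of that lock-ordering idea. It is not the actual patch: the class LockOrderSketch and the three lock objects are hypothetical stand-ins for the JobTracker, FairScheduler and JobInProgress monitors, and the nesting in heartbeat() is an assumption based on the trace above. Because both paths take the tracker lock first, they can no longer interleave, even though their inner lock orders differ.

```java
class LockOrderSketch {
    // Hypothetical stand-ins for the JobTracker, FairScheduler and
    // JobInProgress monitors (assumptions, not the real Hadoop classes).
    private static final Object trackerLock = new Object();
    private static final Object schedulerLock = new Object();
    private static final Object jobLock = new Object();
    private static int finishedMaps = 0; // guarded by jobLock

    // Heartbeat-like path: tracker, then job, then scheduler.
    static void heartbeat() {
        synchronized (trackerLock) {
            synchronized (jobLock) {
                finishedMaps++;
                synchronized (schedulerLock) {
                    // e.g. tell the scheduler that the job changed state
                }
            }
        }
    }

    // Servlet-like path: scheduler, then job, but only after taking the
    // tracker lock, so it is serialized against heartbeat().
    static int showJobs() {
        synchronized (trackerLock) {
            synchronized (schedulerLock) {
                synchronized (jobLock) {
                    return finishedMaps;
                }
            }
        }
    }

    // Runs both paths concurrently and returns the final counter value.
    static int run(int iterations) {
        synchronized (jobLock) {
            finishedMaps = 0;
        }
        Thread ipc = new Thread(() -> {
            for (int i = 0; i < iterations; i++) heartbeat();
        });
        Thread web = new Thread(() -> {
            for (int i = 0; i < iterations; i++) showJobs();
        });
        ipc.start();
        web.start();
        try {
            ipc.join();
            web.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return showJobs();
    }

    public static void main(String[] args) {
        System.out.println(run(10000));
    }
}
```

Removing the outer synchronized (trackerLock) from showJobs() in this sketch reintroduces the inverted scheduler/job ordering, and the two threads can then block each other as in the trace.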

> Deadlock in FairSchedulerServlet
> --------------------------------
>
>                 Key: MAPREDUCE-1070
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1070
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: deadlock.png
>
>
> FairSchedulerServlet can cause a deadlock with the JobTracker

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
