incubator-mesos-dev mailing list archives

From "Benjamin Mahler (JIRA)" <>
Subject [jira] [Resolved] (MESOS-429) Hadoop MesosScheduler has a deadlock.
Date Mon, 22 Apr 2013 18:47:15 GMT


Benjamin Mahler resolved MESOS-429.

    Resolution: Fixed
> Hadoop MesosScheduler has a deadlock.
> -------------------------------------
>                 Key: MESOS-429
>                 URL:
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>            Priority: Blocker
> This was found with the help of Brenden Matthews.
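> JobTracker.heartbeat (synchronized) calls MesosScheduler.assignTasks (synchronized), while
> MesosScheduler.resourceOffers (synchronized) calls into JobTracker.getJobStatus (synchronized).
> The two threads therefore take the JobTracker and MesosScheduler locks in opposite order,
> and each blocks waiting for the lock the other holds.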
> Thread 24558: (state = BLOCKED)
>  - org.apache.hadoop.mapred.JobTracker.getJobStatus(java.util.Collection, boolean) @bci=0, line=4592 (Interpreted frame)
>  - org.apache.hadoop.mapred.JobTracker.jobsToComplete() @bci=11, line=4157 (Interpreted frame)
>  - org.apache.hadoop.mapred.MesosScheduler.resourceOffers(org.apache.mesos.SchedulerDriver, java.util.List) @bci=9, line=273 (Compiled frame)
> Thread 24575: (state = BLOCKED)
>  - org.apache.hadoop.mapred.MesosScheduler.assignTasks(org.apache.hadoop.mapreduce.server.jobtracker.TaskTracker) @bci=25, line=219 (Compiled frame)
>  - org.apache.hadoop.mapred.JobTracker.heartbeat(org.apache.hadoop.mapred.TaskTrackerStatus, boolean, boolean, boolean, short) @bci=507, line=2951 (Compiled frame)
> The simplest fix for now would be to unsynchronize the Scheduler interface implementations.
> As a result, when we have to modify the state of MesosScheduler inside those methods, we need
> to do so in a synchronized block. So long as we don't invoke the JobTracker methods from these
> synchronized blocks, we won't have a deadlock. We can clean this up later, if a cleaner abstraction
> is needed.
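
The fix described above can be sketched roughly as follows. This is a minimal illustration, not the actual Hadoop or Mesos code: SketchJobTracker, SketchScheduler, DeadlockFreeSketch, pendingSlots, and the simplified method signatures are all hypothetical stand-ins. The scheduler callbacks are no longer synchronized, scheduler state is only touched inside short synchronized blocks, and the JobTracker is never called while the scheduler lock is held.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for JobTracker: its methods remain synchronized.
class SketchJobTracker {
    private SketchScheduler scheduler;
    private final List<String> runningJobs = new ArrayList<>();

    synchronized void setScheduler(SketchScheduler s) { scheduler = s; }

    synchronized void submit(String job) { runningJobs.add(job); }

    // Holds the JobTracker lock and then calls into the scheduler,
    // mirroring heartbeat -> assignTasks in the report.
    synchronized int heartbeat(String tracker) {
        return scheduler.assignTasks(tracker);
    }

    synchronized List<String> jobsToComplete() {
        return new ArrayList<>(runningJobs);
    }
}

// Hypothetical stand-in for MesosScheduler after the fix.
class SketchScheduler {
    private final SketchJobTracker jobTracker;
    private int pendingSlots = 0; // guarded by "this"

    SketchScheduler(SketchJobTracker jobTracker) { this.jobTracker = jobTracker; }

    // Not synchronized at method level; only the state update is locked,
    // and no JobTracker method is called inside the block.
    int assignTasks(String tracker) {
        synchronized (this) {
            if (pendingSlots > 0) {
                pendingSlots--;
                return 1;
            }
            return 0;
        }
    }

    // Not synchronized at method level; the JobTracker is queried BEFORE
    // the scheduler lock is taken, so the two locks are never nested here.
    void resourceOffers(int offeredSlots) {
        int jobs = jobTracker.jobsToComplete().size();
        synchronized (this) {
            pendingSlots += Math.min(offeredSlots, jobs);
        }
    }

    synchronized int pendingSlots() { return pendingSlots; }
}

class DeadlockFreeSketch {
    // Runs both call paths concurrently; returns true if both threads
    // finish within the timeout.
    static boolean runConcurrently(int iterations) {
        SketchJobTracker jt = new SketchJobTracker();
        SketchScheduler sched = new SketchScheduler(jt);
        jt.setScheduler(sched);
        jt.submit("job-1");

        Thread offers = new Thread(() -> {
            for (int i = 0; i < iterations; i++) sched.resourceOffers(1);
        });
        Thread heartbeats = new Thread(() -> {
            for (int i = 0; i < iterations; i++) jt.heartbeat("tracker-1");
        });
        offers.setDaemon(true);
        heartbeats.setDaemon(true);
        offers.start();
        heartbeats.start();
        try {
            offers.join(10_000);
            heartbeats.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !offers.isAlive() && !heartbeats.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + runConcurrently(100_000));
    }
}
```

In this sketch the heartbeat path still nests the locks (JobTracker, then scheduler), but the resourceOffers path never holds the scheduler lock while waiting for the JobTracker lock, so the circular wait shown in the thread dump cannot form. Putting `synchronized` back on `resourceOffers` as a method modifier would reintroduce the opposite lock order.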

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators