hadoop-common-dev mailing list archives

From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5154) 4-way deadlock in FairShare scheduler
Date Wed, 18 Feb 2009 19:54:02 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674764#action_12674764 ]

Matei Zaharia commented on HADOOP-5154:
---------------------------------------

Hi Hemanth,

The reason showJobs doesn't lock the JobTracker is that it only looks at information in
the fair scheduler. I have the lock around the JT for the other two methods because update()
will require a lock on the JT once preemption is introduced, in order to safely call killTask()
on the JobTracker without causing a deadlock. If you only ever lock the JT by itself or the
FS by itself, you can't cause a deadlock; the problem is that the assignTasks method in the
JobTracker locks first the JT and then the FS, so any code that takes the FS lock first and
then the JT lock can deadlock against it. However, maybe a better solution is to drop the
lock on the JT in this patch and change the way killTask() is called so that it's not called
while holding the FairScheduler lock. This could be done by adding the tasks to kill to a
queue and having a second thread kill them. Since the preemption patch, HADOOP-4665, is not
going into release 0.20, I'm going to do that for this patch and fix up the patch for
HADOOP-4665 to ensure that you only need a lock on the FS when calling update().
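
To make the queue idea concrete, here is a minimal sketch, not the actual patch. FakeJobTracker, FakeFairScheduler, STOP and runDemo are hypothetical stand-ins invented for illustration; the real JobTracker and FairScheduler classes and their method signatures differ. The point it shows is that update() only enqueues task ids while holding the FS lock, and a separate thread takes the JT lock to kill them, so the FS-then-JT lock order never occurs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only: every class and name below is a hypothetical stand-in.
public class DeferredKillSketch {

    // Hypothetical marker that tells the killer thread to exit.
    static final String STOP = "__stop__";

    /** Stand-in for the JobTracker: killTask() needs the JT lock. */
    static class FakeJobTracker {
        private final List<String> killed = new ArrayList<>();

        synchronized void killTask(String taskId) {
            killed.add(taskId);
        }

        synchronized List<String> killedSnapshot() {
            return new ArrayList<>(killed);
        }
    }

    /** Stand-in for the fair scheduler: update() needs only the FS lock. */
    static class FakeFairScheduler {
        private final BlockingQueue<String> killQueue = new LinkedBlockingQueue<>();

        synchronized void update(List<String> tasksToKill) {
            // FS lock is held here, so only enqueue; never call into the JT.
            killQueue.addAll(tasksToKill);
        }

        /** Not synchronized: the killer thread never holds the FS lock. */
        String takeNextKill() throws InterruptedException {
            return killQueue.take();
        }
    }

    static List<String> runDemo(List<String> tasks) {
        final FakeJobTracker jt = new FakeJobTracker();
        final FakeFairScheduler fs = new FakeFairScheduler();

        // Second thread: drains the queue and takes only the JT lock.
        Thread killer = new Thread(() -> {
            try {
                while (true) {
                    String id = fs.takeNextKill();
                    if (STOP.equals(id)) {
                        return;
                    }
                    jt.killTask(id);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        killer.start();

        // Mimics the assignTasks lock order: JT first, then FS.
        synchronized (jt) {
            fs.update(tasks);
        }
        fs.update(Arrays.asList(STOP));

        try {
            killer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for killer thread", e);
        }
        return jt.killedSnapshot();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(Arrays.asList("task_1", "task_2")));
    }
}
```

With this shape, the only nested acquisition left is JT then FS, which matches assignTasks, so no cycle between the two locks is possible in the sketch.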

I can also turn off the refresh by default. How is it handled on the JobTracker's web UI?
For some reason I thought that page refreshed automatically.

> 4-way deadlock in FairShare scheduler
> -------------------------------------
>
>                 Key: HADOOP-5154
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5154
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/fair-share
>            Reporter: Vinod K V
>            Assignee: Matei Zaharia
>            Priority: Blocker
>             Fix For: 0.18.4, 0.20.0
>
>         Attachments: FairSchedulerDeadLock.txt, hadoop-5154-v0.patch, hadoop-5154-v1.patch, hadoop-5154-v2.patch
>
>
> This happened while trying to change the priority of a job from the scheduler servlet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

