hadoop-common-dev mailing list archives

From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4803) large pending jobs hog resources
Date Mon, 09 Feb 2009 04:12:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671723#action_12671723 ]

Matei Zaharia commented on HADOOP-4803:
---------------------------------------

Yes, exactly. The preemption patch (HADOOP-4665) already maintains the last time each pool was
at its min / fair share, so it won't be much work. One other benefit of this change is that
jobs will tend to reuse the same slot more often, leading to more JVM reuse. That could be a bad
thing if it led to poor locality, but HADOOP-4667 will ensure that a job keeps using a node
until it runs out of local blocks to read on that node, then waits and switches to a node
where, hopefully, it has more local data to process. This should give us the best of both JVM
reuse and data locality. (When I talked to Arun and Owen earlier about the use of deficits in
the fair scheduler, they were concerned that it might lead to less JVM reuse because jobs
would jump between slots more often.)
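
A rough sketch of how the "last time at share" timestamps could drive slot assignment, assuming
the idea is to hand a free slot to the runnable pool that has gone longest without being at its
share. The class, field, and method names below are hypothetical and are not taken from
HADOOP-4665 or the fair scheduler source.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hadoop.
final class StarvationOrder {

    /** Minimal stand-in for a scheduler pool. */
    static final class Pool {
        final String name;
        final long lastAtShareMillis; // last time the pool was at its min / fair share
        final int runnableTasks;      // tasks waiting for a slot

        Pool(String name, long lastAtShareMillis, int runnableTasks) {
            this.name = name;
            this.lastAtShareMillis = lastAtShareMillis;
            this.runnableTasks = runnableTasks;
        }
    }

    /**
     * Returns the pool with pending work whose "last at share" timestamp is
     * oldest, or null if no pool has runnable tasks.
     */
    static Pool pickMostStarved(List<Pool> pools) {
        return pools.stream()
                .filter(p -> p.runnableTasks > 0)
                .min(Comparator.comparingLong((Pool p) -> p.lastAtShareMillis))
                .orElse(null);
    }
}
```

Under this assumed ordering, a small job that has been below its share for a long time is chosen
ahead of a large job that was recently satisfied, regardless of how many tasks each has pending.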

> large pending jobs hog resources
> --------------------------------
>
>                 Key: HADOOP-4803
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4803
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/fair-share
>            Reporter: Joydeep Sen Sarma
>            Assignee: Matei Zaharia
>
> observing the cluster over the last day - one thing i noticed is that small jobs (single
> digit tasks) are not doing a good job competing against large jobs. what seems to happen is
> that:
> - large job comes along and needs to wait for a while for other large jobs.
> - slots are slowly transferred from one large job to another
> - small tasks keep waiting forever.
> is this an artifact of deficit based scheduling? it seems that long pending large jobs
> are out-scheduling small jobs

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

