hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5090) The capacity-scheduler should assign multiple tasks per heartbeat
Date Fri, 05 Jun 2009 07:06:07 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716526#action_12716526 ]

Arun C Murthy commented on HADOOP-5090:
---------------------------------------

I'd strongly urge *against* assigning multiple reduces per heartbeat. When I did that in HADOOP-3136
it caused _bad_ imbalances with reduces... e.g. consider 2 jobs - one with 'small' reduces
and the other with 'heavy' reduces. If we assign multiple reduces at once, a portion of the cluster
(tasktrackers) will run the 'small' reduces and the others will run the 'heavy' reduces, leading
to bad load imbalances across machines. That is why, with HADOOP-3136, we decided to assign only
1 reduce per heartbeat, to achieve better load balance.
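The imbalance described above can be illustrated with a toy simulation (hypothetical code, not part of Hadoop): two jobs whose reduces have very different costs, a FIFO queue that serves the 'small' job first, and trackers polled round-robin. Assigning all free slots in one heartbeat partitions the cluster by job; assigning one reduce per heartbeat interleaves them.

```python
# Toy model: 4 tasktrackers, 2 reduce slots each; Job A has 4 'small'
# reduces (cost 1), Job B has 4 'heavy' reduces (cost 10). The scheduler
# drains a FIFO queue, visiting trackers round-robin per heartbeat.
def simulate(one_per_heartbeat, trackers=4, slots=2):
    queue = [1] * 4 + [10] * 4      # small job's reduces queued first
    load = [0] * trackers           # total reduce cost assigned per tracker
    free = [slots] * trackers
    t = 0
    while queue:
        # Either fill every free slot at once, or hand out a single reduce.
        n = 1 if one_per_heartbeat else free[t]
        n = min(n, free[t], len(queue))
        for _ in range(n):
            load[t] += queue.pop(0)
            free[t] -= 1
        t = (t + 1) % trackers      # next heartbeat comes from next tracker
    return load

print(simulate(one_per_heartbeat=False))  # [2, 2, 20, 20] -- imbalanced
print(simulate(one_per_heartbeat=True))   # [11, 11, 11, 11] -- balanced
```

The batch strategy leaves two trackers loaded 10x heavier than the rest, while one-reduce-per-heartbeat spreads the heavy reduces evenly; the exact numbers depend on the toy costs chosen here.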

> The capacity-scheduler should assign multiple tasks per heartbeat
> -----------------------------------------------------------------
>
>                 Key: HADOOP-5090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5090
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Arun C Murthy
>            Assignee: Vinod K V
>            Priority: Critical
>         Attachments: HADOOP-5090-20090504.txt, HADOOP-5090-20090506.txt, HADOOP-5090-20090604.txt
>
>
> HADOOP-3136 changed the default o.a.h.mapred.JobQueueTaskScheduler to assign multiple
> tasks per TaskTracker heartbeat; the capacity-scheduler should do the same.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

