hadoop-common-dev mailing list archives

From "Vivek Ratan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3651) When assigning tasks to trackers, the job tracker should try to balance the number of tasks among the available trackers
Date Fri, 27 Jun 2008 08:33:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608688#action_12608688 ]

Vivek Ratan commented on HADOOP-3651:

In the current JobTracker (JT), the code for determining which task to hand a TaskTracker (TT)
uses the following logic: the JT first figures out the 'remaining load' per TT for maps/reduces
(which is the total number of map and reduce tasks that need to be run across all running jobs,
divided by the number of TTs). It then figures out the maximum number of map or reduce tasks
that should be run on the TT (which is the minimum of the TT's capacity and the 'remaining
load') - call this the 'max load'. Finally, if a TT can run something (i.e., if the number of
maps/reduces it is running is less than the 'max load'), the JT looks to give it a map task or
a reduce task.
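
To make the steps above concrete, here is a minimal sketch in Python. It is not the JT's actual code: the function names (max_load, can_take_task) and parameters are hypothetical, it treats maps and reduces one kind at a time, and rounding the 'remaining load' up to a whole task is an assumption of the sketch.

```python
import math


def max_load(tasks_to_run: int, num_trackers: int, tracker_capacity: int) -> int:
    """Most tasks of one kind (map or reduce) a tracker should run.

    tasks_to_run: tasks of that kind still to be run across all running jobs.
    tracker_capacity: the tracker's slot count for that kind of task.
    """
    if num_trackers <= 0:
        return 0
    # 'remaining load': tasks still to be run, averaged over all trackers.
    # Rounding up is an assumption of this sketch, not taken from the source.
    remaining_load = math.ceil(tasks_to_run / num_trackers)
    # 'max load': never more than the tracker's own capacity.
    return min(tracker_capacity, remaining_load)


def can_take_task(running_on_tracker: int, tasks_to_run: int,
                  num_trackers: int, tracker_capacity: int) -> bool:
    """True if the tracker is below its 'max load' and may be handed a task."""
    return running_on_tracker < max_load(tasks_to_run, num_trackers, tracker_capacity)


if __name__ == "__main__":
    # Numbers from the quoted issue: 200 trackers and 200 reduces to run,
    # assuming 2 reduce slots per tracker.
    print(max_load(200, 200, 2))
    print(can_take_task(1, 200, 200, 2))
```

Under these assumptions, a tracker that already runs one reduce is refused a second one while the 'remaining load' is 1. The sketch does not model how that value changes as tasks start and finish, nor the order in which trackers ask for work.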

As I had mentioned in a mail I sent to core-dev on 5/23, this logic can result in some TTs
not getting a task to run, even when there are tasks waiting to be run. It can also result
in a skewed distribution of tasks among TTs. Maybe something like that is happening here. I
don't know if it's possible to see the log files and determine what exactly happened.

The new Resource Manager will, I think, result in a better distribution. For one, a TT's request
is never rejected if there is a task to run. For another, the load will likely be spread out
more evenly.

> When assigning tasks to trackers, the job tracker should try to balance the number of tasks among the available trackers
> ------------------------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-3651
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3651
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
> I have encountered a number of situations like this:
> A job tracker has 200 task trackers, each with 2 mapper slots and 2 reducer slots.
> When a job with 200 or fewer reducers is submitted to the job tracker,
> one would normally expect each task tracker to run one reducer.
> Unfortunately, it seems that only about 1/3 of the trackers have one reducer, 1/3 of the
> trackers have no reducer, and 1/3 have 2 reducers!

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
