hadoop-common-dev mailing list archives

From "Yuri Pradkin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1245) value for mapred.tasktracker.tasks.maximum taken from two different sources
Date Sat, 06 Oct 2007 21:02:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532921 ]

Yuri Pradkin commented on HADOOP-1245:

I too came across this problem the hard way (our weaker nodes started thrashing because they
were running too many jobs).  In my case I don't even see it reverting to the desired behavior
after the first round of jobs is finished; it's always the jobtracker's config value.  Maybe
I'm using a different version of Hadoop? (It's 0.15 devel from svn.)

If anyone knows a way to run a different number of tasks on different boxes, please let me know,
because from what I see there is no way to do it, which makes our hadoop cluster as lame
as the lamest node in it.

> value for mapred.tasktracker.tasks.maximum taken from two different sources
> ---------------------------------------------------------------------------
>                 Key: HADOOP-1245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1245
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.3
>            Reporter: Michael Bieniosek
> I want to create a cluster with machines with different numbers of CPUs.  Consequently,
> each machine should have a different value for mapred.tasktracker.tasks.maximum, since my
> map tasks are CPU bound.
> However, hadoop uses BOTH the values for mapred.tasktracker.tasks.maximum on the jobtracker
> and the tasktracker.
> When a new job starts up, the jobtracker uses its (single) value for
> mapred.tasktracker.tasks.maximum to assign tasks.  This means that each tasktracker gets the
> same number of tasks, regardless of how I configured that particular machine.
> After the first task finishes on each tasktracker, the tasktracker will request new tasks
> from the jobtracker according to the tasktracker's value for mapred.tasktracker.tasks.maximum.
> So after the first round of map tasks is done, the cluster reverts to a mode that works well
> for heterogeneous clusters.
> The jobtracker should not consult its config for the value of mapred.tasktracker.tasks.maximum.
> It should assign tasks (or allow tasktrackers to request tasks) according to each tasktracker's
> value of mapred.tasktracker.tasks.maximum.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
