hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Assigned: (HADOOP-1245) value for mapred.tasktracker.tasks.maximum taken from jobtracker, not tasktracker
Date Wed, 10 Oct 2007 20:17:50 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting reassigned HADOOP-1245:

    Assignee: Michael Bieniosek

+1 This looks reasonable to me.

Note that this is potentially incompatible: previously folks could set the number of
tasks per node globally at the jobtracker, whereas now it is determined by the configuration
of each tasktracker node.  So we should probably either add a compatibility note (i.e., move
this to the INCOMPATIBLE section of CHANGES.txt) or perhaps have a configuration parameter
that enables the old behavior.  I think a compatibility note is probably sufficient.  Thoughts?
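
For reference, the "configuration parameter that enables the old behavior" alternative
could look roughly like the sketch below.  This is only an illustration, not what the
attached patch does: it uses java.util.Properties in place of Hadoop's configuration
classes, the switch name example.use.jobtracker.tasks.maximum is hypothetical (no such
Hadoop property exists), and the fallback of 2 is a placeholder.

```java
import java.util.Properties;

// Sketch only: resolves the per-node task cap with an optional legacy switch.
class EffectiveTaskCap {
    // Property name taken from the issue.
    static final String MAX_TASKS_KEY = "mapred.tasktracker.tasks.maximum";
    // HYPOTHETICAL switch name, invented for this example; not a real Hadoop property.
    static final String LEGACY_SWITCH_KEY = "example.use.jobtracker.tasks.maximum";

    static int effectiveCap(Properties jobtrackerConf, Properties tasktrackerConf) {
        // Legacy mode: the jobtracker's single value applies to every node.
        boolean legacy = Boolean.parseBoolean(
                jobtrackerConf.getProperty(LEGACY_SWITCH_KEY, "false"));
        Properties source = legacy ? jobtrackerConf : tasktrackerConf;
        // "2" is an illustrative fallback when the property is unset.
        return Integer.parseInt(source.getProperty(MAX_TASKS_KEY, "2"));
    }

    public static void main(String[] args) {
        Properties jobtracker = new Properties();
        jobtracker.setProperty(MAX_TASKS_KEY, "2");
        Properties tasktracker = new Properties();
        tasktracker.setProperty(MAX_TASKS_KEY, "8");

        System.out.println(effectiveCap(jobtracker, tasktracker)); // prints 8
        jobtracker.setProperty(LEGACY_SWITCH_KEY, "true");
        System.out.println(effectiveCap(jobtracker, tasktracker)); // prints 2
    }
}
```

With the switch unset, the tasktracker's own value wins, which matches the behavior
described in this issue; setting it restores the global jobtracker value.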

> value for mapred.tasktracker.tasks.maximum taken from jobtracker, not tasktracker
> ---------------------------------------------------------------------------------
>                 Key: HADOOP-1245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1245
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.3
>            Reporter: Michael Bieniosek
>            Assignee: Michael Bieniosek
>         Attachments: tasktracker-max-tasks-1245.patch
> I want to create a cluster with machines with different numbers of CPUs.  Consequently,
> each machine should have a different value for mapred.tasktracker.tasks.maximum, since my
> map tasks are CPU bound.
> When a new job starts up, the jobtracker uses its (single) value for mapred.tasktracker.tasks.maximum
> to assign tasks.  This means that each tasktracker gets the same number of tasks, regardless
> of how I configured that particular machine.
> The jobtracker should not consult its config for the value of mapred.tasktracker.tasks.maximum.
> It should assign tasks (or allow tasktrackers to request tasks) according to each tasktracker's
> value of mapred.tasktracker.tasks.maximum.
> Originally, I thought the behavior was slightly different, so this issue contained this
> After the first task finishes on each tasktracker, the tasktracker will request new tasks
> from the jobtracker according to the tasktracker's value for mapred.tasktracker.tasks.maximum.
> So after the first round of map tasks is done, the cluster reverts to a mode that works well
> for heterogeneous clusters.
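
To make the reported difference concrete, here is a small self-contained sketch of the
first round of task assignment on a two-node heterogeneous cluster.  It is a toy model,
not the jobtracker's actual scheduling code: the class, the method names, and the node
names are made up for illustration, and each node's cap stands in for its value of
mapred.tasktracker.tasks.maximum.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of first-round task assignment; not Hadoop's real scheduler.
class SlotAssignment {
    // Reported behavior: every tracker is capped by the jobtracker's single value.
    static Map<String, Integer> firstRoundGlobal(
            Map<String, Integer> trackerCaps, int jobtrackerCap, int pendingTasks) {
        Map<String, Integer> assigned = new LinkedHashMap<>();
        int remaining = pendingTasks;
        for (String tracker : trackerCaps.keySet()) {
            int n = Math.min(jobtrackerCap, remaining);
            assigned.put(tracker, n);
            remaining -= n;
        }
        return assigned;
    }

    // Requested behavior: each tracker is capped by its own configured value.
    static Map<String, Integer> firstRoundPerTracker(
            Map<String, Integer> trackerCaps, int pendingTasks) {
        Map<String, Integer> assigned = new LinkedHashMap<>();
        int remaining = pendingTasks;
        for (Map.Entry<String, Integer> e : trackerCaps.entrySet()) {
            int n = Math.min(e.getValue(), remaining);
            assigned.put(e.getKey(), n);
            remaining -= n;
        }
        return assigned;
    }

    public static void main(String[] args) {
        Map<String, Integer> caps = new LinkedHashMap<>();
        caps.put("node-2cpu", 2); // hypothetical small machine
        caps.put("node-8cpu", 8); // hypothetical large machine

        // Jobtracker's own value is 2; 20 map tasks are pending.
        System.out.println(firstRoundGlobal(caps, 2, 20));  // {node-2cpu=2, node-8cpu=2}
        System.out.println(firstRoundPerTracker(caps, 20)); // {node-2cpu=2, node-8cpu=8}
    }
}
```

In the global case the 8-CPU node starts with only two tasks, which is the
underutilization the issue describes for CPU-bound map tasks.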

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
