hadoop-common-user mailing list archives

From Gian Lorenzo Thione <thi...@powerset.com>
Subject Re: Per machine configuration of map.tasks
Date Thu, 10 Aug 2006 17:33:20 GMT
On the same type of issue, I have an additional suggestion or idea that we
may want to work on: has anybody dealt with the issue of having different
classes of nodes in a Hadoop cluster? Typically such nodes would be subject
to different constraints (memory, speed, disk, etc.), and some tasks may fail
on lower-class nodes while they could succeed on higher-class nodes.

Since we're running on a cluster with a high percentage of lower-class nodes
and a small percentage of higher-class nodes, I'm afraid that treating all
nodes as equal in the eyes of the JobTracker would increase the likelihood
that a task fails too many times simply because it was sent only to
lower-class nodes, each of which returned a failure.

The idea is that the JobTracker wouldn't classify a job as failed based
simply on the number of times a task fails, but would make sure that the task
was attempted, up to the maximum number of times allowed, on the highest
class of node available. So if a task is assigned to a high-class node and
fails, it would continue to be assigned to other high-class nodes until the
maximum number of attempts is reached, whereas if it was assigned to a
lower-class node it could:

A) be directly assigned to a higher-class node (climbing up through all
available classes on each failure), so that the actual maximum number of
attempts is (L - 1) + M, where L is the number of node classes and M is the
maximum number of attempts;

B) be assigned to another node in the same class until the maximum number of
attempts is reached, after which it is moved to a higher class, so that the
total number of possible attempts is L * M;

C) be assigned to another random node of equal or higher class. If the class
is the same, a counter is kept that maxes out at the maximum number of
attempts per task; if the class is already the highest, the job fails;
otherwise the task is forcefully moved to a higher class and the counter is
restarted. The number of attempts is less deterministic, but the worst case
still seems to be L * M. (A rough sketch of these options follows below.)
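To make the comparison concrete, here is a minimal sketch in Java (Hadoop's
own language) of what such a class-aware retry policy might look like. The
class and method names are made up for illustration, not existing JobTracker
code; as written it implements option (B), retrying within a class before
escalating, and (A) or (C) would only change the decision in nextClass().

public class ClassAwareRetryPolicy {

    private final int numClasses;   // L: node classes, 0 = lowest, numClasses - 1 = highest
    private final int maxAttempts;  // M: maximum attempts per class

    public ClassAwareRetryPolicy(int numClasses, int maxAttempts) {
        this.numClasses = numClasses;
        this.maxAttempts = maxAttempts;
    }

    /**
     * Given the class a task last ran on and how many times it has failed
     * in that class, decide where to run it next, or give up.
     * Worst case this allows L * M attempts, as described for option (B).
     */
    public int nextClass(int currentClass, int failuresInClass) {
        if (failuresInClass < maxAttempts) {
            return currentClass;          // retry within the same class
        }
        if (currentClass + 1 < numClasses) {
            return currentClass + 1;      // escalate to the next higher class
        }
        return -1;                        // highest class exhausted: fail the task
    }
}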

Ideas, suggestions, comments?

Lorenzo Thione

Powerset, Inc.


On 8/9/06 2:46 PM, "Owen O'Malley" <owen@yahoo-inc.com> wrote:

> 
> On Aug 8, 2006, at 11:07 PM, Gian Lorenzo Thione wrote:
> 
>> In my understanding, mapred.tasktracker.tasks.maximum is used to decide
>> how many tasks should be allocated simultaneously per tasktracker. My
>> problem is that I would like to set this parameter individually for each
>> tasktracker, each one telling the jobtracker how many tasks that node
>> can deal with simultaneously (my tasks are extremely CPU and memory
>> intensive), so the number would be a function of the number of CPUs,
>> the number of other processes running, the amount of memory, etc.
> 
> Your understanding of the current code is correct. Currently the job
> tracker assumes that the number is constant across the cluster.
> 
>> Is that something that Hadoop supports? Is that something that we could
>> implement and contribute back? Any interest in this functionality?
> 
> In my opinion, it is reasonable to let it vary between task trackers.
> The changes needed to support it would not be extensive. If you wrote
> such a patch it would be nice to commit it back.
> 
> Thanks,
>     Owen
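As a rough illustration of the per-tracker maximum discussed above: each
tasktracker could compute its own value for mapred.tasktracker.tasks.maximum
from local resources and report it to the jobtracker. The helper below is
purely hypothetical (the class name, method, and formula are not Hadoop
code); it only shows one way a per-node number could be derived from the CPU
count and an estimated per-task memory footprint.

public class LocalTaskSlots {

    /**
     * Pick a task maximum from the machine's resources: no more than one
     * task per CPU, and no more than fits in memory given an estimated
     * per-task footprint in MB, with a floor of one task.
     */
    public static int compute(long freeMemoryMb, long perTaskMemoryMb) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int byMemory = (int) Math.max(1, freeMemoryMb / perTaskMemoryMb);
        return Math.max(1, Math.min(cpus, byMemory));
    }

    public static void main(String[] args) {
        // e.g. a node with 4 GB free and roughly 512 MB per task
        System.out.println("tasks.maximum = " + compute(4096, 512));
    }
}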

