hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5170) Set max map/reduce tasks on a per-job basis, either per-node or cluster-wide
Date Thu, 02 Jul 2009 19:01:48 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726633#action_12726633

Arun C Murthy commented on HADOOP-5170:

bq. If the patch interacts poorly with the capacity scheduler, wouldn't it make more sense
to tell users of the capacity scheduler not to use this feature, or to improve MAPREDUCE-516
so that it can take into account task limits?

Like I said, this patch introduced 2 features: per-node limits and per-job limits. The per-job
cluster limit interacts poorly with the MAPREDUCE-516, not per-node limits.

bq. The situation described in the original task was that a task is very CPU-intensive. The
user wanted to limit the number of those tasks running to less than the number of cores. However,
there is no need to fill up all slots on the machine - the other slots could be used for less
CPU-intensive tasks. Until there is a model for all system resources in the capacity scheduler,
this patch lets users with that kind of problem achieve reasonable behavior.

We are missing the woods for the trees here. The user whose task is CPU-intensive is happy.
But, what about the other users whose tasks need CPU too? How do we keep their tasks from
starving on the same node? In particular there are no checks and balances on preventing multiple
CPU-intensive tasks from being scheduled on the same node. If we *knew* that the other tasks
were going to be IO intensive we could co-schedule them on this node, but we don't. This is
the reason why Owen and I continue to insist that per-node task limits are a poor substitute
for modelling resource usage and that a resource model is a *pre-requisite* for this feature.

This feature works well for clusters with single users, but not in shared clusters.

bq. One situation in which you want to limit tasks on the whole cluster is if you have a housekeeping
job with long tasks, e.g. a database import or export. If you don't have a limit on running
tasks, such a job can take up the whole cluster and hold it for a long time.

The short-term fix is to submit these jobs to a special queue with limited capacity, possibly
to queues with a hard upper-limit on their capacity: MAPREDUCE-532. 

> Set max map/reduce tasks on a per-job basis, either per-node or cluster-wide
> ----------------------------------------------------------------------------
>                 Key: HADOOP-5170
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5170
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Jonathan Gray
>            Assignee: Matei Zaharia
>             Fix For: 0.21.0
>         Attachments: HADOOP-5170-tasklimits-v3-0.18.3.patch, tasklimits-v2.patch, tasklimits-v3-0.19.patch,
tasklimits-v3.patch, tasklimits-v4-20.patch, tasklimits-v4.patch, tasklimits.patch
> There are a number of use cases for being able to do this.  The focus of this jira should
be on finding what would be the simplest to implement that would satisfy the most use cases.
> This could be implemented as either a per-node maximum or a cluster-wide maximum.  It
seems that for most uses, the former is preferable however either would fulfill the requirements
of this jira.
> Some of the reasons for allowing this feature (mine and from others on list):
> - I have some very large CPU-bound jobs.  I am forced to keep the max map/node limit
at 2 or 3 (on a 4 core node) so that I do not starve the Datanode and Regionserver.  I have
other jobs that are network latency bound and would like to be able to run high numbers of
them concurrently on each node.  Though I can thread some jobs, there are some use cases that
are difficult to thread (scanning from hbase) and there's significant complexity added to
the job rather than letting hadoop handle the concurrency.
> - Poor assignment of tasks to nodes creates some situations where you have multiple reducers
on a single node but other nodes that received none.  A limit of 1 reducer per node for that
job would prevent that from happening. (only works with per-node limit)
> - Poor mans MR job virtualization.  Since we can limit a jobs resources, this gives much
more control in allocating and dividing up resources of a large cluster.  (makes most sense
w/ cluster-wide limit)

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message