hadoop-mapreduce-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5583) Ability to limit running map and reduce tasks
Date Tue, 15 Oct 2013 18:00:50 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795443#comment-13795443 ]

Arun C Murthy commented on MAPREDUCE-5583:

Consider a cluster with 100,000 containers and 1,000 jobs, each with 100,000 tasks, where every
job specifies that it can only run 5 tasks at a time. You are now using only 5% of the cluster
and no one makes progress, leading to very poor utilization and a peanut-buttering effect.
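
The 5% figure follows directly from the numbers in the example. A minimal sketch of the arithmetic, assuming one container per running task and ignoring AM containers (the function names are illustrative only, not part of any Hadoop API):

```python
import math

# Numbers taken from the example in the comment.
CLUSTER_CONTAINERS = 100_000
JOBS = 1_000
TASKS_PER_JOB = 100_000
RUNNING_TASK_LIMIT = 5  # per-job cap on simultaneously running tasks


def cluster_utilization(containers, jobs, limit_per_job):
    """Fraction of containers in use when every job is capped at limit_per_job.

    Assumption: each running task occupies exactly one container.
    """
    running = min(containers, jobs * limit_per_job)
    return running / containers


def waves_per_job(tasks, limit_per_job):
    """Number of sequential task waves a job needs under the cap.

    Assumption: tasks are of equal length, so a job runs in full waves.
    """
    return math.ceil(tasks / limit_per_job)


utilization = cluster_utilization(CLUSTER_CONTAINERS, JOBS, RUNNING_TASK_LIMIT)
waves = waves_per_job(TASKS_PER_JOB, RUNNING_TASK_LIMIT)
print(f"{utilization:.0%} of the cluster in use, {waves} waves per job")
```

Under these assumptions, 1,000 jobs at 5 tasks each occupy 5,000 of the 100,000 containers, and each job needs 20,000 waves to finish.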

Admittedly it's a contrived example and yes, I agree a user can hack his own AM to do this
- but let's not make this trivial for normal users. Supporting it out of the box leads to
all sorts of bad side-effects.

Some form of admin control (e.g. queue with a max-cap) for a small number of use-cases where
you *actually* need this feature is much safer.

> Ability to limit running map and reduce tasks
> ---------------------------------------------
>                 Key: MAPREDUCE-5583
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5583
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.9, 2.1.1-beta
>            Reporter: Jason Lowe
> It would be nice if users could specify a limit to the number of map or reduce tasks
that are running simultaneously.  Occasionally users are performing operations in tasks that
can lead to DDoS scenarios if too many tasks run simultaneously (e.g.: accessing a database,
web service, etc.).  Having the ability to throttle the number of tasks simultaneously running
would provide users a way to mitigate issues with too many tasks on a large cluster attempting
to access a service at any one time.
> This is similar to the functionality requested by MAPREDUCE-224 and implemented by HADOOP-3412
but was dropped in mrv2.
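
As a generic illustration of the throttling requested in the description, here is a minimal sketch that caps how many tasks run at the same time using a semaphore. It is not the MapReduce AM code and not the HADOOP-3412 implementation; `run_throttled`, `max_running`, and `workers` are hypothetical names chosen for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_throttled(tasks, max_running, workers=8):
    """Run callables on a thread pool, never more than max_running at once.

    Returns (results in input order, peak number of simultaneously running tasks).
    """
    gate = threading.BoundedSemaphore(max_running)  # the concurrency cap
    lock = threading.Lock()                         # protects the counters
    state = {"running": 0, "peak": 0}

    def guarded(task):
        with gate:  # blocks while max_running tasks are already in flight
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            try:
                return task()  # e.g. a call to a database or web service
            finally:
                with lock:
                    state["running"] -= 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(guarded, tasks))
    return results, state["peak"]


# Twenty stand-in tasks, at most three running at any one time.
results, peak = run_throttled([lambda i=i: i * i for i in range(20)], max_running=3)
```

The cap protects the shared service at the cost of leaving the remaining workers idle, which is the utilization trade-off discussed in the comment above.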

This message was sent by Atlassian JIRA