hadoop-mapreduce-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2905) Allow mapred.fairscheduler.assignmultiple to be set per job
Date Mon, 29 Aug 2011 18:10:38 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093036#comment-13093036 ]

Todd Lipcon commented on MAPREDUCE-2905:
----------------------------------------

The fair scheduler already has code that tries to avoid this clumping. I think that code is
currently broken, though, since I've heard many people report the same behavior. Look at
CapBasedLoadManager.java.

Either way, I don't think doing this per-job is the right solution - better to avoid exposing
the config, and just Do The Right Thing.

> Allow mapred.fairscheduler.assignmultiple to be set per job
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-2905
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2905
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>    Affects Versions: 0.20.2
>            Reporter: Jeff Bean
>
> We encountered a situation where in the same cluster, large jobs benefit from
> mapred.fairscheduler.assignmultiple, but small jobs with small numbers of mappers do not:
> the mappers all clump to fully occupy just a few nodes, which causes those nodes to saturate
> and bottleneck. The desired behavior is to spread the job across more nodes so that a
> relatively small job doesn't saturate any node in the cluster.
> Testing has shown that setting mapred.fairscheduler.assignmultiple to false gives the
> desired behavior for small jobs, but is unnecessary for large jobs. However, since this is
> a cluster-wide setting, we can't properly tune.
> It'd be nice if jobs could set a parameter similar to mapred.fairscheduler.assignmultiple
> on submission to better control the task distribution of a particular job.
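
The clumping described in the report can be illustrated with a toy model. This is only a
sketch, not the fair scheduler's code: it assumes nodes heartbeat in a fixed round-robin
order and that, as the report implies, enabling the setting lets a heartbeating node
receive as many map tasks as it has free slots. The function name `assign_tasks` and all
of its parameters are hypothetical.

```python
def assign_tasks(num_nodes, slots_per_node, num_tasks, assign_multiple):
    """Toy heartbeat loop; returns how many tasks end up on each node.

    Assumption (not taken from Hadoop source): nodes heartbeat in index
    order, and each heartbeat hands out either every free slot
    (assign_multiple=True) or at most one task (assign_multiple=False).
    """
    placed = [0] * num_nodes
    remaining = num_tasks
    while remaining > 0:
        progress = False
        for node in range(num_nodes):  # one heartbeat per node, in turn
            free = slots_per_node - placed[node]
            if free <= 0 or remaining == 0:
                continue
            batch = min(free, remaining) if assign_multiple else 1
            placed[node] += batch
            remaining -= batch
            progress = True
        if not progress:  # every slot in the cluster is occupied
            break
    return placed


if __name__ == "__main__":
    # A small job: 8 map tasks on a 10-node cluster with 4 slots per node.
    print(assign_tasks(10, 4, 8, assign_multiple=True))
    print(assign_tasks(10, 4, 8, assign_multiple=False))
```

Under these assumptions the 8-task job fills two nodes completely when multiple
assignment is on, and lands one task on each of eight nodes when it is off. For a job
with more tasks than the cluster has slots, both settings end up filling every node,
which matches the report that the setting only hurts small jobs.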

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
