hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks
Date Mon, 07 May 2012 13:50:49 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269628#comment-13269628 ]

Jason Lowe commented on MAPREDUCE-4228:
---------------------------------------

As I understand it, this functionality works properly in Hadoop 1.0: reducers do not
start until the requisite number of map tasks have completed.  In 0.23/2.0 that's not the
case, and we need to correct that behavior rather than rename existing properties to match
an undesired behavior.
                
> mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4228
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4228
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster, mrv2
>    Affects Versions: 0.23.1
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>
> If no more map tasks need to be scheduled but not all have completed, the ApplicationMaster
> will start scheduling reducers even if the number of completed maps has not met the
> mapreduce.job.reduce.slowstart.completedmaps threshold.  For example, if the property is set
> to 1.0, all maps should complete before any reducers are scheduled.  However, the reducers are
> scheduled as soon as the last map task is assigned to a container.  For a job with very
> long-running maps, a cluster with enough capacity to launch all map tasks could cause reducers
> to launch prematurely and waste cluster resources.
> Thanks to Phil Su for discovering this issue.
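
To make the reported difference concrete, here is a minimal sketch of the two behaviors described above. It is not the actual ApplicationMaster code: the class name, method names, and parameters are all hypothetical, and the rounding of the threshold is an assumption made for illustration only.

```java
// Hypothetical illustration only; none of these names come from Hadoop.
final class SlowstartSketch {

    // Intended behavior: reducers become eligible only once the fraction of
    // COMPLETED maps reaches the slowstart threshold (for example 1.0 means
    // every map must have finished).
    static boolean shouldScheduleReduces(int totalMaps, int completedMaps, float slowstart) {
        if (totalMaps == 0) {
            return true; // nothing to wait for
        }
        // Assumption: round the required count up to a whole map task.
        int requiredMaps = (int) Math.ceil(slowstart * totalMaps);
        return completedMaps >= requiredMaps;
    }

    // Behavior as reported in this issue: reducers are also scheduled as soon
    // as no map is left waiting for a container, even though assigned maps may
    // still be running and the threshold has not been met.
    static boolean reportedShouldScheduleReduces(int totalMaps, int completedMaps,
                                                 int pendingMaps, float slowstart) {
        return pendingMaps == 0
                || shouldScheduleReduces(totalMaps, completedMaps, slowstart);
    }
}
```

With 100 maps, a threshold of 1.0, all maps assigned to containers, and none yet complete, the first method keeps reducers waiting while the second lets them start, which is the premature launch described in the report.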

