hadoop-mapreduce-issues mailing list archives

From "Omkar Vinit Joshi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5507) MapReduce reducer ramp down is suboptimal with potential job-hanging issues
Date Mon, 23 Sep 2013 01:35:02 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774235#comment-13774235 ]

Omkar Vinit Joshi commented on MAPREDUCE-5507:
----------------------------------------------

Attaching a very basic patch. I tested it locally on my machine.
* When the cluster gets saturated, the AM waits for a map task to be assigned and then starts preempting reducers.
* Right now I am using a fixed interval of 2 min, but this will be replaced with the minimum of a multiple of the heartbeat interval and the average map task finish time.

Please let me know if the approach taken is correct.
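
For discussion, here is a minimal sketch of how the planned wait interval could be computed. It is not taken from the attached patch: the class name, method name, and the heartbeat multiple are hypothetical, and the fallback to the heartbeat-based bound when no map has finished yet is an assumption.

```java
import java.time.Duration;

/** Hypothetical helper for illustration only; not part of the attached patch. */
final class ReducerPreemptionDelay {

    /** Fixed interval used by the first version of the patch (2 min). */
    static final Duration FIXED_DELAY = Duration.ofMinutes(2);

    private ReducerPreemptionDelay() {}

    /**
     * Returns the smaller of (heartbeatMultiple * heartbeatInterval) and the
     * average map task finish time.
     *
     * Assumption: if no average is available yet (null, zero, or negative),
     * only the heartbeat-based bound is used.
     */
    static Duration planned(Duration heartbeatInterval,
                            int heartbeatMultiple,
                            Duration avgMapTaskTime) {
        if (heartbeatMultiple <= 0) {
            throw new IllegalArgumentException("heartbeatMultiple must be positive");
        }
        Duration byHeartbeat = heartbeatInterval.multipliedBy(heartbeatMultiple);
        if (avgMapTaskTime == null || avgMapTaskTime.isZero() || avgMapTaskTime.isNegative()) {
            return byHeartbeat;
        }
        // Pick whichever bound is shorter.
        return byHeartbeat.compareTo(avgMapTaskTime) <= 0 ? byHeartbeat : avgMapTaskTime;
    }
}
```

With a 1 s heartbeat, a multiple of 10, and a 45 s average map task time, planned(...) would return 10 s. With a multiple of 60 it would return 45 s, because the average map time becomes the tighter bound.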
                
> MapReduce reducer ramp down is suboptimal with potential job-hanging issues
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5507
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5507
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>         Attachments: MAPREDUCE-5507.20130922.1.patch
>
>
> Today, if "yarn.app.mapreduce.am.job.reduce.rampup.limit" and "mapreduce.job.reduce.slowstart.completedmaps"
> are set, reducers are launched more aggressively. However, the calculation that decides whether to
> ramp reducers up or down is not optimal.
> * Suppose the MR AM at some point sees a situation like this:
> ** scheduledMaps : 30
> ** scheduledReducers : 10
> ** assignedMaps : 0
> ** assignedReducers : 11
> ** finishedMaps : 120
> ** headroom : 756 (when a map/reduce task needs only 512 MB)
> * Today the job then simply hangs, because the AM thinks there is sufficient room to launch
> one more mapper and therefore sees no need to ramp down. If this state continues indefinitely,
> that behavior is neither correct nor optimal.
> * Ideally, when the MR AM sees that assignedMaps has dropped to 0 while reducers are still
> running, it should wait for a certain time (bounded above by the average map task completion
> time, as a heuristic). If it still has not received a new container for a map task after that,
> it should preempt reducers one by one, with some interval between preemptions, and
> ramp up slowly.
> ** Preemption of reducers can be done in a slightly smarter way:
> *** Preempt a reducer on a node manager for which there is a pending map request.
> *** Otherwise, preempt any other reducer. By releasing such a reducer's container, the MR AM
> helps itself get a new mapper, because it reduces its cluster consumption and
> may thereby become a candidate for an allocation.
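
The ramp-down behavior described above could be sketched roughly as follows. This is an illustration under stated assumptions, not code from the MR AM or from the attached patch: the class, the RunningReducer type, and both method names are hypothetical. The stall check deliberately ignores headroom, since in the scenario above the headroom (756) exceeds the task size (512 MB) and yet no map container arrives.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of the proposed ramp-down decision; all names are illustrative. */
final class ReducerRampDownSketch {

    /** A running reducer attempt and the node manager host it runs on. */
    static final class RunningReducer {
        final String attemptId;
        final String host;

        RunningReducer(String attemptId, String host) {
            this.attemptId = attemptId;
            this.host = host;
        }
    }

    private ReducerRampDownSketch() {}

    /**
     * True when maps are still waiting, none is assigned, reducers hold
     * containers, and the wait interval has elapsed without a new map
     * container being allocated.
     */
    static boolean mapsStalled(int scheduledMaps, int assignedMaps, int assignedReducers,
                               long millisSinceLastMapAssigned, long waitMillis) {
        return scheduledMaps > 0
                && assignedMaps == 0
                && assignedReducers > 0
                && millisSinceLastMapAssigned >= waitMillis;
    }

    /**
     * Chooses one reducer to preempt: first a reducer on a host that has a
     * pending map request, otherwise any running reducer (here, the first).
     */
    static Optional<RunningReducer> pickVictim(List<RunningReducer> running,
                                               Set<String> hostsWithPendingMaps) {
        for (RunningReducer r : running) {
            if (hostsWithPendingMaps.contains(r.host)) {
                return Optional.of(r);
            }
        }
        return running.isEmpty() ? Optional.empty() : Optional.of(running.get(0));
    }
}
```

A caller would evaluate mapsStalled on each heartbeat with the counters from the example (scheduledMaps 30, assignedMaps 0, assignedReducers 11) and, while it stays true, preempt a single pickVictim result per interval rather than all reducers at once.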

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
