hadoop-mapreduce-issues mailing list archives

From "Scott Chen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1463) Reducer should start faster for smaller jobs
Date Tue, 16 Feb 2010 18:45:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834371#action_12834371 ]

Scott Chen commented on MAPREDUCE-1463:

@Amar: Sorry for the late reply; I have just got back from vacation. I think you are right
about long-running mappers: task counts alone are not sufficient. We probably need more
information than task counts to decide when to delay the reducers. Can you give me some
suggestions? Setting mapreduce.job.reduce.slowstart.completedmaps to zero does reduce the
latency, but it hurts reducer utilization.

I think the trade-off is this: we want to delay the reducers to increase reducer utilization,
but we also want to minimize the impact of that delay on smaller jobs, because the delay is
significant for smaller jobs and acceptable for large ones. So the two cases should be treated
differently. There should be a way to balance reducer utilization against small-job latency.
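
To make the trade-off concrete, here is a minimal Java sketch of one possible policy. It is
not the logic in the attached patches. The class name, method name, and the
smallJobMapThreshold parameter are hypothetical, and the ceil(fraction * totalMaps) rule is
an assumed reading of how a slowstart fraction becomes a map count. It also relies only on
task counts, which, as noted above, is not enough when mappers run for a long time.

```java
// Illustrative only: decides how many map tasks must finish before
// reducers become eligible to start. All names here are hypothetical.
class ReducerStartSketch {

    /**
     * @param totalMaps            number of map tasks in the job
     * @param slowstartFraction    fraction of maps to wait for (0.0 to 1.0)
     * @param smallJobMapThreshold jobs with at most this many maps count as "small"
     * @return number of completed maps required before reducers may start
     */
    static int mapsRequiredBeforeReduce(int totalMaps,
                                        double slowstartFraction,
                                        int smallJobMapThreshold) {
        if (totalMaps <= smallJobMapThreshold) {
            // Small job: start reducers immediately to keep latency low.
            return 0;
        }
        // Large job: delay reducers so they do not sit idle holding slots.
        return (int) Math.ceil(slowstartFraction * totalMaps);
    }

    public static void main(String[] args) {
        // A 10-map job versus a 1000-map job under the same settings.
        System.out.println(mapsRequiredBeforeReduce(10, 0.5, 20));
        System.out.println(mapsRequiredBeforeReduce(1000, 0.5, 20));
    }
}
```

With a threshold of 20 maps, a 10-map job would start its reducers right away, while a
1000-map job would still wait for half of its maps. A policy that also handles long-running
mappers would need an extra signal, such as observed map run time.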

> Reducer should start faster for smaller jobs
> --------------------------------------------
>                 Key: MAPREDUCE-1463
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1463
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>         Attachments: MAPREDUCE-1463-v1.patch, MAPREDUCE-1463-v2.patch, MAPREDUCE-1463-v3.patch
> Our users often complain about the slowness of smaller ad-hoc jobs.
> The overhead to wait for the reducers to start in this case is significant.
> It will be good if we can start the reducer sooner in this case.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
