hadoop-mapreduce-issues mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1463) Reducer should start faster for smaller jobs
Date Wed, 10 Feb 2010 07:12:28 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831894#action_12831894 ]

Amar Kamat commented on MAPREDUCE-1463:

What should the behavior be when the total number of maps and reducers is small (i.e. a small
job by the current definition) but the job takes a huge amount of time to finish? For example,
the map takes a day to run and the reduces are also compute-intensive. In such a case, would we
still consider the job a small job? I think what we want to capture is the job's behavior
(fast-*finishing* jobs versus others). Using task counts might not be sufficient.

Scott, wouldn't this problem be solved if you set 'mapreduce.job.reduce.slowstart.completedmaps'
to a default value of 0 (instead of 0.5) for all your users? 
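
To make the effect of that setting concrete, here is a rough sketch of how a slowstart
fraction translates into a number of completed maps. It assumes the scheduler gates reducers
on ceil(fraction * total maps); the function names are hypothetical and this is not the actual
JobTracker code.

```python
import math


def maps_needed_before_reduces(total_maps: int, slowstart: float) -> int:
    """Number of maps that must finish before reducers may be scheduled.

    Assumption: the threshold is ceil(slowstart * total_maps), where
    slowstart stands in for the value of
    mapreduce.job.reduce.slowstart.completedmaps.
    """
    if total_maps < 0:
        raise ValueError("total_maps must be non-negative")
    if not 0.0 <= slowstart <= 1.0:
        raise ValueError("slowstart must be between 0 and 1")
    return math.ceil(slowstart * total_maps)


def can_schedule_reduces(completed_maps: int, total_maps: int, slowstart: float) -> bool:
    """True once enough maps have completed for reducers to start."""
    return completed_maps >= maps_needed_before_reduces(total_maps, slowstart)


if __name__ == "__main__":
    # A 10-map job: compare a fraction of 0.5 with a fraction of 0.
    for fraction in (0.5, 0.0):
        needed = maps_needed_before_reduces(10, fraction)
        print(f"slowstart={fraction}: reducers may start after {needed} completed maps")
```

Under this assumption, a value of 0 lets reducers be scheduled as soon as the job starts,
regardless of job size. Note that the threshold depends only on task counts, not on how long
the tasks run.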

> Reducer should start faster for smaller jobs
> --------------------------------------------
>                 Key: MAPREDUCE-1463
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1463
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>         Attachments: MAPREDUCE-1463-v1.patch, MAPREDUCE-1463-v2.patch, MAPREDUCE-1463-v3.patch
> Our users often complain about the slowness of smaller ad-hoc jobs.
> The overhead to wait for the reducers to start in this case is significant.
> It will be good if we can start the reducer sooner in this case.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
