hadoop-mapreduce-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1463) Reducer should start faster for smaller jobs
Date Sat, 06 Feb 2010 01:30:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830406#action_12830406 ]

Todd Lipcon commented on MAPREDUCE-1463:
----------------------------------------

I think the transferred logic is wrong. Shouldn't it be:
{code}
return numMapTasks <= reduceRushMapsThreshold ||
            numReduceTasks <= reduceRushReducesThreshold ||
            finishedMapTasks >= completedMapsForReduceSlowstart;
{code}

Also, I'm not sure that the design is quite right. If I have 1 map but 200 reduces, I don't
want to rush the reduces, do I? That is to say, should the condition be && between
the two rush parameters, or ||?
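
To make the question concrete, here is a minimal sketch comparing the two ways of combining the rush parameters. The class name, method names, and threshold values are hypothetical and chosen only for illustration; they are not taken from the patch. Only the variable names from the snippet above are reused.

```java
public class ReduceRushSketch {
    // Hypothetical threshold values; the real defaults in the patch may differ.
    static final int REDUCE_RUSH_MAPS_THRESHOLD = 5;
    static final int REDUCE_RUSH_REDUCES_THRESHOLD = 5;

    // Variant 1: rush if EITHER the map count or the reduce count is small.
    static boolean scheduleReducesOr(int numMapTasks, int numReduceTasks,
                                     int finishedMapTasks,
                                     int completedMapsForReduceSlowstart) {
        return numMapTasks <= REDUCE_RUSH_MAPS_THRESHOLD
            || numReduceTasks <= REDUCE_RUSH_REDUCES_THRESHOLD
            || finishedMapTasks >= completedMapsForReduceSlowstart;
    }

    // Variant 2: rush only if BOTH the map count and the reduce count are small.
    static boolean scheduleReducesAnd(int numMapTasks, int numReduceTasks,
                                      int finishedMapTasks,
                                      int completedMapsForReduceSlowstart) {
        return (numMapTasks <= REDUCE_RUSH_MAPS_THRESHOLD
                && numReduceTasks <= REDUCE_RUSH_REDUCES_THRESHOLD)
            || finishedMapTasks >= completedMapsForReduceSlowstart;
    }

    public static void main(String[] args) {
        // The case from the comment: 1 map, 200 reduces, no maps finished yet,
        // and slowstart requires at least 1 finished map.
        System.out.println("|| variant: " + scheduleReducesOr(1, 200, 0, 1));
        System.out.println("&& variant: " + scheduleReducesAnd(1, 200, 0, 1));
    }
}
```

With these assumed thresholds, the || variant starts all 200 reduces immediately because the map count alone is small, while the && variant waits for the slowstart condition.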

> Reducer should start faster for smaller jobs
> --------------------------------------------
>
>                 Key: MAPREDUCE-1463
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1463
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>         Attachments: MAPREDUCE-1463-v1.patch, MAPREDUCE-1463-v2.patch
>
>
> Our users often complain about the slowness of smaller ad-hoc jobs.
> The overhead to wait for the reducers to start in this case is significant.
> It will be good if we can start the reducer sooner in this case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

