hadoop-mapreduce-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-956) Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)
Date Thu, 10 Sep 2009 14:56:57 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753647#action_12753647 ]

Arun C Murthy commented on MAPREDUCE-956:

I can see the appeal of this, but we should remember that there are applications where merge
is a significant part of the reduce runtime, e.g. petasort's merge was _huge_.

> Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)
> --------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-956
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.21.0
>            Reporter: Jothi Padmanabhan
> For the progress calculations and display on the UI, shuffle, in its current form, is
> decomposed into three phases (copy/sort/reduce). Actually, the sort phase is no longer
> applicable. I think we should just reduce the number of phases to two and assign 50% weightage
> to each of the copy and reduce phases. Thoughts?
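
To make the weighting in the description concrete, here is a minimal Python sketch of how overall
reduce-task progress could be derived from per-phase progress. It is an illustration only, not
Hadoop's code: the function `task_progress` and the weight tables are hypothetical names, and the
equal one-third weights for the three-phase case are an assumption (the description only states
that there are three phases). The 50/50 split is the one proposed in the issue.

```python
def task_progress(phase_progress, weights):
    """Combine per-phase progress values (each 0.0..1.0) into one overall value.

    Both arguments are dicts keyed by phase name; weights are normalized,
    so only their ratios matter.
    """
    if set(phase_progress) != set(weights):
        raise ValueError("phase names must match the weight table")
    total = sum(weights.values())
    return sum(weights[p] * phase_progress[p] for p in weights) / total


# Assumed equal weights for the current three-phase scheme (hypothetical).
THREE_PHASE = {"copy": 1, "sort": 1, "reduce": 1}
# Proposed scheme from the issue: 50% each for copy and reduce.
TWO_PHASE = {"copy": 1, "reduce": 1}

# Same point in a task: copy finished, reduce half done.
three = task_progress({"copy": 1.0, "sort": 1.0, "reduce": 0.5}, THREE_PHASE)
two = task_progress({"copy": 1.0, "reduce": 0.5}, TWO_PHASE)
print(f"three-phase: {three:.2%}, two-phase: {two:.2%}")
```

With these assumed weights, the same task state is reported differently by the two schemes, which
is the UI-visible effect the proposal would have.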

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
