hadoop-mapreduce-issues mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-956) Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)
Date Wed, 09 Sep 2009 15:24:58 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753123#action_12753123 ]

Tom White commented on MAPREDUCE-956:

The sort phase is actually when the map outputs are being merged prior to being fed to the
reducer. Could you give a bit more detail about what has changed? Presumably the merging
still takes place, so perhaps "sort phase" should just be renamed to "merge phase".
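
For readers unfamiliar with the step being discussed, the merging of sorted map outputs can be pictured as a k-way merge. The following is only a minimal sketch under stated assumptions: the class and method names (MergeSketch, merge, Cursor) are hypothetical, keys are plain integers held in memory, and this is not Hadoop's actual merge code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of merging several sorted runs into one sorted stream.
public class MergeSketch {
    // One cursor per sorted run: the current head value plus the rest of the run.
    private static final class Cursor {
        final int head;
        final Iterator<Integer> rest;

        Cursor(int head, Iterator<Integer> rest) {
            this.head = head;
            this.rest = rest;
        }
    }

    // Merges already-sorted runs (assumed ascending) using a min-heap keyed on each run's head.
    static List<Integer> merge(List<List<Integer>> sortedRuns) {
        PriorityQueue<Cursor> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a.head, b.head));
        for (List<Integer> run : sortedRuns) {
            Iterator<Integer> it = run.iterator();
            if (it.hasNext()) {
                heap.add(new Cursor(it.next(), it));
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Take the smallest head, then re-insert that run's next element if any.
            Cursor c = heap.poll();
            out.add(c.head);
            if (c.rest.hasNext()) {
                heap.add(new Cursor(c.rest.next(), c.rest));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> runs = List.of(List.of(1, 4, 7), List.of(2, 5), List.of(3, 6));
        System.out.println(merge(runs));
    }
}
```

The point of the sketch is that no full re-sort is needed: each run is already sorted, so the work is merging, which is why "merge phase" may be the more accurate label.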

> Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)
> --------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-956
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.21.0
>            Reporter: Jothi Padmanabhan
> For the progress calculations and the display on the UI, shuffle, in its current form, is
> decomposed into three phases (copy/sort/reduce). Actually, the sort phase is no longer
> applicable. I think we should just reduce the number of phases to two and assign 50% weight
> to each of the copy and reduce phases. Thoughts?
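
To make the proposed weighting concrete, here is a minimal sketch of equal-weight phase progress. It assumes each phase reports its own progress as a fraction between 0 and 1 and that overall progress is the plain average of the phases; the class and method names (PhaseProgress, overall) are hypothetical and are not taken from the Hadoop code base.

```java
// Hypothetical illustration of equal-weight progress across task phases.
public class PhaseProgress {
    // Overall progress is the average of the per-phase fractions,
    // so each of n phases contributes a weight of 1/n.
    static double overall(double... phaseProgress) {
        if (phaseProgress.length == 0) {
            throw new IllegalArgumentException("at least one phase is required");
        }
        double sum = 0.0;
        for (double p : phaseProgress) {
            // Clamp each phase to [0, 1] so a bad report cannot skew the total.
            sum += Math.min(1.0, Math.max(0.0, p));
        }
        return sum / phaseProgress.length;
    }

    public static void main(String[] args) {
        // Three phases (copy/sort/reduce): copy and sort done, reduce half done.
        System.out.println(overall(1.0, 1.0, 0.5));
        // Two phases (copy/reduce) at 50% each: copy done, reduce half done.
        System.out.println(overall(1.0, 0.5)); // prints 0.75
    }
}
```

Under this assumption, dropping the sort phase changes what a finished copy means on the UI: one third of the total with three phases, one half with two.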

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
