hadoop-mapreduce-issues mailing list archives

From "Carlo Curino (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-4585) Checkpoint shuffle aggregation as map output
Date Mon, 27 Aug 2012 19:16:07 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carlo Curino updated MAPREDUCE-4585:
------------------------------------

    Attachment: shufflecheckpoint.pdf

Shuffle checkpoint-restart explained
                
> Checkpoint shuffle aggregation as map output
> --------------------------------------------
>
>                 Key: MAPREDUCE-4585
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4585
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: task
>            Reporter: Chris Douglas
>         Attachments: shufflecheckpoint.pdf
>
>
> Map output collected during the shuffle can be spilled and written as a composite of
> map outputs. Particularly if the job employs a combiner, this checkpoint can provide fault
> tolerance and improve job throughput by aggregating intermediate output. The latter is especially
> helpful for jobs with multiple waves of reduces.
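
The idea in the description above can be illustrated with a minimal sketch. This is
not the code from the attached patch or from Hadoop itself: it assumes the fetched
map outputs fit in memory, it uses a summing combiner, and every name in it
(checkpoint_shuffle, combine_sum, the map_N identifiers) is hypothetical.

```python
from collections import defaultdict


def combine_sum(values):
    """Hypothetical combiner: collapse all values for one key by summing them."""
    return sum(values)


def checkpoint_shuffle(segments, combiner=combine_sum):
    """Merge fetched map-output segments into one composite segment.

    segments maps a (hypothetical) map task id to that task's list of
    (key, value) records. The return value is a pair:
      - the composite, a key-sorted list of (key, combined_value) records
      - the sorted ids of the map outputs that the composite stands in for
    """
    grouped = defaultdict(list)
    for records in segments.values():
        for key, value in records:
            grouped[key].append(value)

    # Apply the combiner once per key so the checkpoint holds aggregated output.
    composite = [(key, combiner(values)) for key, values in sorted(grouped.items())]
    covered_maps = sorted(segments)
    return composite, covered_maps
```

With segments {"map_0": [("a", 1), ("b", 2)], "map_1": [("a", 3), ("c", 1)]} the
sketch returns the composite [("a", 4), ("b", 2), ("c", 1)] covering map_0 and map_1.
A restarted or later-wave reduce could then read that one composite in place of the
two original map outputs, which is where the fault tolerance and reduced
intermediate data described above would come from.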

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
