hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-939) No-sort optimization
Date Tue, 29 May 2007 23:16:15 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499965 ]

Owen O'Malley commented on HADOOP-939:
--------------------------------------

Doug Judd,
    Has the recent change to support reduces = 0 addressed your need? If you set the number
of reduces to 0, the output collector is fed directly from the Mapper output. If the map output
is already sorted, this saves all of the costs associated with the shuffle and the distributed
sort.

> No-sort optimization
> --------------------
>
>                 Key: HADOOP-939
>                 URL: https://issues.apache.org/jira/browse/HADOOP-939
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>         Environment: all
>            Reporter: Doug Judd
>
> There should be a way to tell the mapred framework that the output of the map() phase
> will already be sorted.  The Reduce phase can just merge the intermediate files together without
> sorting.
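
To illustrate the request above, here is a minimal sketch of merging inputs that are each already sorted, without re-sorting them. It is not Hadoop code: the class SortedRunMerger and its merge method are hypothetical names, and in-memory lists of strings stand in for the intermediate map output files. It assumes every run is already in ascending order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper, not part of Hadoop: k-way merge of pre-sorted runs. */
final class SortedRunMerger {

    /** Cursor into one run: the run's index and the next unread position. */
    private static final class Cursor {
        final int run;
        int pos;
        Cursor(int run) { this.run = run; }
    }

    /**
     * Merges runs that are each already sorted in ascending order.
     * Assumption: the caller guarantees each run is sorted; nothing is re-sorted here.
     */
    static List<String> merge(List<List<String>> runs) {
        // The heap orders cursors by the record each one currently points at.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
            Comparator.comparing((Cursor c) -> runs.get(c.run).get(c.pos)));

        // Seed the heap with one cursor per non-empty run.
        for (int i = 0; i < runs.size(); i++) {
            if (!runs.get(i).isEmpty()) {
                heap.add(new Cursor(i));
            }
        }

        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Take the smallest pending record across all runs.
            Cursor c = heap.poll();
            List<String> run = runs.get(c.run);
            out.add(run.get(c.pos));

            // Advance the cursor while it is outside the heap, then re-insert it
            // if its run still has records left.
            c.pos++;
            if (c.pos < run.size()) {
                heap.add(c);
            }
        }
        return out;
    }
}
```

For example, merging the runs [apple, kiwi, pear] and [banana, cherry] yields [apple, banana, cherry, kiwi, pear]. Each record passes through the heap once, so the cost is O(n log k) for n records in k runs, rather than a full sort of all n records.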

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

