hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-591) Reducer sort should even out the pass factors in merging different pass
Date Tue, 10 Oct 2006 17:16:19 GMT
Reducer sort should even out the pass factors in merging different pass
-----------------------------------------------------------------------

                 Key: HADOOP-591
                 URL: http://issues.apache.org/jira/browse/HADOOP-591
             Project: Hadoop
          Issue Type: Improvement
          Components: mapred
            Reporter: Runping Qi



When multi-pass merging is needed during a sort, the current sort implementation in the SequenceFile
class uses a simple "greedy" way to select pass factors, which results in uneven pass factors across
passes. For example, suppose the pass factor is 100 (the default) and there are 101 segments to
be merged. The current implementation will first merge the first 100 segments into one and
then merge that big output file with the last segment, using pass factor 2. It would be better
to use pass factor 11 in the first pass and pass factor 10 in the second pass.
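
The arithmetic in the example above can be sketched as follows. This is only an illustrative
model, not the SequenceFile code: the function names (passes_needed, even_factor, merge_widths)
are hypothetical, and the rule used to even out the factors (keep the same number of passes, then
take the smallest factor that still finishes in that many passes) is one assumed reading of the
proposal.

```python
def passes_needed(num_segments, max_factor):
    """Number of merge passes needed when at most max_factor segments merge at once."""
    passes, capacity = 1, max_factor
    while capacity < num_segments:
        capacity *= max_factor
        passes += 1
    return passes


def even_factor(num_segments, max_factor):
    """Smallest factor that still finishes in the same number of passes (assumed rule)."""
    passes = passes_needed(num_segments, max_factor)
    factor = 2
    while factor ** passes < num_segments:
        factor += 1
    return factor


def merge_widths(num_segments, factor):
    """Largest merge width used in each pass when segments are merged in groups of factor."""
    widths = []
    while num_segments > 1:
        widths.append(min(factor, num_segments))
        num_segments = -(-num_segments // factor)  # ceiling division: groups left after this pass
    return widths


if __name__ == "__main__":
    greedy = merge_widths(101, 100)                  # always use the configured factor
    even = merge_widths(101, even_factor(101, 100))  # evened-out factor
    print("greedy:", greedy)
    print("even:  ", even)
```

With 101 segments and a configured factor of 100, the model gives widths 100 and 2 for the greedy
choice and 11 and 10 for the evened-out choice, matching the numbers in the description.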



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
