hadoop-mapreduce-dev mailing list archives

From "Binglin Chang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-2910) Allow empty MapOutputFile segments
Date Tue, 30 Aug 2011 03:48:38 GMT
Allow empty MapOutputFile segments

                 Key: MAPREDUCE-2910
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2910
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: task, tasktracker
    Affects Versions: 0.20.2, 0.23.0
            Reporter: Binglin Chang
            Priority: Minor
             Fix For: 0.23.0

As clusters and jobs grow larger, we see many empty partitions in MapOutputFile,
caused by large reducer counts or partition skew. When map output compression is enabled,
each empty map output partition takes more space on disk and incurs extra
compressor/decompressor initialization.
This can be optimized by allowing empty MapOutputFile segments, where the rawLength and
partLength of the IndexRecord are both 0. Corresponding support needs to be added to the
IFile reader, the IFile writer, and the reduce shuffle copier.
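
A minimal sketch of the idea, not the actual patch. EmptySegmentSketch,
writeSegments, and readSegment are hypothetical names, the IndexRecord here is a
simplified stand-in for Hadoop's class, and gzip from java.util.zip stands in for
whatever map output codec is configured. The writer records a zero-length segment
without creating a compressor, and the reader returns early without creating a
decompressor:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; none of this is Hadoop's real code.
class EmptySegmentSketch {

    /** Simplified index entry for one partition's segment in the output file. */
    static final class IndexRecord {
        final long startOffset; // where the segment begins in the file
        final long rawLength;   // uncompressed length
        final long partLength;  // on-disk (compressed) length

        IndexRecord(long startOffset, long rawLength, long partLength) {
            this.startOffset = startOffset;
            this.rawLength = rawLength;
            this.partLength = partLength;
        }

        /** An empty segment is marked by both lengths being 0. */
        boolean isEmpty() {
            return rawLength == 0 && partLength == 0;
        }
    }

    /** Writer side: one segment per partition, nothing written for empty ones. */
    static List<IndexRecord> writeSegments(List<byte[]> partitions,
                                           ByteArrayOutputStream out) throws IOException {
        List<IndexRecord> index = new ArrayList<>();
        for (byte[] data : partitions) {
            long start = out.size();
            if (data.length == 0) {
                // No compressor is created and no bytes reach the file.
                index.add(new IndexRecord(start, 0, 0));
                continue;
            }
            // Closing the gzip stream also closes 'out', which is a no-op
            // for ByteArrayOutputStream, so later segments can still append.
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(data);
            }
            index.add(new IndexRecord(start, data.length, out.size() - start));
        }
        return index;
    }

    /** Reader/copier side: skip decompressor setup for empty segments. */
    static byte[] readSegment(byte[] file, IndexRecord rec) throws IOException {
        if (rec.isEmpty()) {
            return new byte[0];
        }
        ByteArrayInputStream slice =
            new ByteArrayInputStream(file, (int) rec.startOffset, (int) rec.partLength);
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(slice)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                result.write(buf, 0, n);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> partitions = Arrays.asList(
            "key1\tvalue1".getBytes(StandardCharsets.UTF_8),
            new byte[0], // a partition that received no records
            "key2\tvalue2".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<IndexRecord> index = writeSegments(partitions, out);
        for (IndexRecord rec : index) {
            System.out.println("start=" + rec.startOffset + " raw=" + rec.rawLength
                + " part=" + rec.partLength + " empty=" + rec.isEmpty());
        }
    }
}
```

Without the empty-segment check, the writer in this sketch would still emit a gzip
header and trailer for a partition holding no records, which is the overhead the
issue describes.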

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira