hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-647) Map outputs can't have a different type of compression from the reduce outputs
Date Tue, 31 Oct 2006 20:42:19 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-647?page=comments#action_12446051 ] 
Doug Cutting commented on HADOOP-647:

If map.output.compression.type is not specified, shouldn't it default to the job's compression
type?
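
A minimal sketch of the fallback suggested above, using plain java.util.Properties rather than Hadoop's own configuration classes. Only map.output.compression.type comes from this thread; the job-wide key name, the RECORD default, and the class and method names are assumptions made for illustration.

```java
import java.util.Properties;

public class MapOutputCompressionDefault {
    public enum CompressionType { NONE, RECORD, BLOCK }

    // Key named in the comment above.
    static final String MAP_OUTPUT_KEY = "map.output.compression.type";
    // Assumed name for the job-wide setting; it is not given in this thread.
    static final String JOB_KEY = "io.seqfile.compression.type";

    /** Job-wide compression type; falling back to RECORD is an assumption. */
    public static CompressionType jobCompressionType(Properties conf) {
        return CompressionType.valueOf(conf.getProperty(JOB_KEY, "RECORD"));
    }

    /** Map output type: the explicit setting if present, else the job's type. */
    public static CompressionType mapOutputCompressionType(Properties conf) {
        String value = conf.getProperty(MAP_OUTPUT_KEY);
        return value == null ? jobCompressionType(conf) : CompressionType.valueOf(value);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(JOB_KEY, "BLOCK");
        // No map-specific setting yet, so the job's type is used.
        System.out.println(mapOutputCompressionType(conf)); // BLOCK

        conf.setProperty(MAP_OUTPUT_KEY, "RECORD");
        // The map-specific setting now overrides the job's type.
        System.out.println(mapOutputCompressionType(conf)); // RECORD
    }
}
```

With this shape, a job that sets only the job-wide type behaves as before, and the map-specific key acts purely as an override.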

Also, if we have a codec that can keep up with disk I/O (like LZO), then block compression
should be faster for sorting and merging, since it will reduce the amount of I/O.

> Map outputs can't have a different type of compression from the reduce outputs
> ------------------------------------------------------------------------------
>                 Key: HADOOP-647
>                 URL: http://issues.apache.org/jira/browse/HADOOP-647
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.7.2
>            Reporter: Owen O'Malley
>         Assigned To: Owen O'Malley
>             Fix For: 0.8.0
>         Attachments: map-out-compress-type.patch
> Right now there is only a single knob to control the compression type for sequence files.
> Sorting and merging is faster with record compression, but the files are smaller with block
> compression. I'd like to introduce a mapOutputCompressionType that lets the application control
> how the map outputs are compressed.
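
The size half of that trade-off can be illustrated with a rough sketch that uses java.util.zip.Deflater as a stand-in codec. This is not Hadoop's sequence file code: the class name, the helper methods, and the synthetic records are all hypothetical, and the sketch says nothing about sort or merge speed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

public class RecordVsBlockCompression {
    /** Deflate one buffer and return the compressed size in bytes. */
    public static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    /** "Record" style: every record is compressed on its own. */
    public static int recordCompressedSize(String[] records) {
        int total = 0;
        for (String record : records) {
            total += compressedSize(record.getBytes(StandardCharsets.UTF_8));
        }
        return total;
    }

    /** "Block" style: records are concatenated, then compressed together. */
    public static int blockCompressedSize(String[] records) {
        ByteArrayOutputStream block = new ByteArrayOutputStream();
        for (String record : records) {
            byte[] bytes = record.getBytes(StandardCharsets.UTF_8);
            block.write(bytes, 0, bytes.length);
        }
        return compressedSize(block.toByteArray());
    }

    /** Synthetic, repetitive records (made up for this sketch). */
    public static String[] sampleRecords(int count) {
        String[] records = new String[count];
        for (int i = 0; i < count; i++) {
            records[i] = "user-" + i + "\tGET /index.html\t200\n";
        }
        return records;
    }

    public static void main(String[] args) {
        String[] records = sampleRecords(1000);
        System.out.println("record-compressed bytes: " + recordCompressedSize(records));
        System.out.println("block-compressed bytes:  " + blockCompressedSize(records));
    }
}
```

Compressing records together lets the codec exploit redundancy across records and pay its per-stream overhead once, which is why the block total comes out smaller for repetitive data like this.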

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

