hadoop-common-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8462) Native-code implementation of bzip2 codec
Date Wed, 27 Feb 2013 15:43:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588446#comment-13588446 ]

Jason Lowe commented on HADOOP-8462:

bq. Using the native bzip2 codec for the output format codec was around 6.5x the job runtime
performance of the pure Java codec

Sorry, I realized this was a bit unclear.  It might sound like the job runtime increased by 6.5x,
but with the native codec the job took less than 1/6th of the runtime it took with the pure Java
codec.  I also tested interoperability between the native and pure Java codecs, and each was
able to decompress the other's output.

> Native-code implementation of bzip2 codec
> -----------------------------------------
>                 Key: HADOOP-8462
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8462
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.23.1
>            Reporter: Govind Kamat
>            Assignee: Govind Kamat
>         Attachments: HADOOP-8462-2.0.2a.1.patch, HADOOP-8462-2.0.2a.patch, HADOOP-8462-trunk.1.patch, HADOOP-8462-trunk.1.patch, HADOOP-8462-trunk.patch, HADOOP-8462-trunk.patch, HADOOP-8462-trunk.patch
>   Original Estimate: 672h
>  Remaining Estimate: 672h
> The bzip2 codec supplied with Hadoop is currently available only as a Java implementation.
> A version that uses the system bzip2 library can provide improved performance and a better
> memory footprint.  This will also make it feasible to utilize alternative bzip2 libraries
> that may perform better for specific jobs.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
