hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4354) Performance improvement with compressor object reinit restriction
Date Wed, 18 Jul 2012 18:21:36 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417337#comment-13417337 ]

Robert Joseph Evans commented on MAPREDUCE-4354:
------------------------------------------------

The test results look great to me, but my comment about contributing this to trunk is off
base.  My ignorance is showing :).  The LZO compression libraries that you modified are not
hosted here.

You need to look at 

http://code.google.com/a/apache-extras.org/p/hadoop-gpl-compression/?redir=1

or

https://github.com/omalley/hadoop-gpl-compression

And email the dev list there.  Owen O'Malley is probably the right person to talk to
about getting this patch in.  Once it is in, it should work on both trunk and 0.20.205.
                
> Performance improvement with compressor object reinit restriction
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-4354
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4354
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance
>    Affects Versions: 0.20.205.0
>            Reporter: Ankit Kamboj
>            Priority: Minor
>              Labels: performance
>             Fix For: 0.20.205.0
>
>         Attachments: codec_reinit_diff, modify_lzo_codec_reinit
>
>
> The HADOOP-5879 patch aimed at picking up the conf (instead of default) settings for GzipCodec.
> It also involved re-initializing the recycled compressor object.
> In our performance tests, this re-initialization led to a performance degradation of 15%
> for LzoCodec, because re-initialization for Lzo involves reallocation of buffers. LzoCodec
> takes its initial settings from the config, so it is not necessary to re-initialize it. This patch
> checks the codec class and calls reinit only if the codec class is Gzip. This led to a significant
> performance improvement of 15% for LzoCodec.
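
For readers who do not have the attachments at hand, the idea in the description can be
sketched roughly as below. This is only an illustration and not the attached patch: every
class and method name here (CodecPoolSketch, reuse, the stand-in codec and compressor
classes) is hypothetical and does not depend on Hadoop. It assumes that reinit on the
Lzo side reallocates buffers, as the description states, and models that with a counter.

```java
public class CodecPoolSketch {
    /** Hypothetical stand-in for a pooled compressor; not Hadoop's real interface. */
    interface Compressor {
        void reinit(String confLevel);
    }

    /** Hypothetical marker classes standing in for the codec types. */
    static class GzipCodec {}
    static class LzoCodec {}

    /** Gzip stand-in: reinit picks up the configured setting, which is cheap. */
    static class GzipCompressor implements Compressor {
        String level = "default";
        int reinitCount = 0;

        @Override
        public void reinit(String confLevel) {
            this.level = confLevel;
            reinitCount++;
        }
    }

    /** Lzo stand-in: reinit is modeled as a costly buffer reallocation. */
    static class LzoCompressor implements Compressor {
        int bufferAllocations = 1; // buffers are allocated once at construction

        @Override
        public void reinit(String confLevel) {
            bufferAllocations++; // assumption: each reinit reallocates buffers
        }
    }

    /**
     * Hands a recycled compressor back to the caller. Only a Gzip codec
     * triggers reinit; any other codec keeps the state it was created with.
     */
    static Compressor reuse(Class<?> codecClass, Compressor recycled, String confLevel) {
        if (codecClass == GzipCodec.class) {
            recycled.reinit(confLevel);
        }
        return recycled;
    }

    public static void main(String[] args) {
        GzipCompressor gzip = new GzipCompressor();
        LzoCompressor lzo = new LzoCompressor();

        reuse(GzipCodec.class, gzip, "best");
        reuse(LzoCodec.class, lzo, "best");

        System.out.println("gzip reinit calls: " + gzip.reinitCount);
        System.out.println("lzo buffer allocations: " + lzo.bufferAllocations);
    }
}
```

A check on the exact class is the narrowest form of the restriction described above; whether
the real patch compares classes, names, or something else is not stated in this message.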

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
