hadoop-common-dev mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-4162) CodecPool.getDecompressor(LzopCodec) always creates a brand-new decompressor.
Date Thu, 11 Sep 2008 23:15:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630413#action_12630413 ]

chris.douglas edited comment on HADOOP-4162 at 9/11/08 4:13 PM:
----------------------------------------------------------------

bq. Using LzopCodec as anything but a stream doesn't make sense.

I should probably be clearer. Reusing the decompressor between streams makes sense, but using
LzopDecompressor like LzoDecompressor or ZlibDecompressor to effect block compression for
a structured file format is not going to work, or at least is unlikely to match the intent.
I'm assuming this is related to HADOOP-3315, which, like SequenceFile, shouldn't use LzopCodec.

As written, an LzopDecompressor instance can't be reused between streams because the checksums
aren't reset. LzopDecompressor should clear its checksum maps in initHeaderFlags before adding
new ones.
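
To illustrate the reset, here is a minimal sketch of the idea, not the actual LzopDecompressor code. The class name HeaderAwareDecompressor, the ChecksumKind enum, the two map fields, and the initHeaderFlags signature are all hypothetical stand-ins. The only point is that both maps are cleared before the checksums for the new stream's header are registered.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Set;
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical checksum kinds a stream header might request.
enum ChecksumKind {
    ADLER32, CRC32;

    Checksum newChecksum() {
        return this == ADLER32 ? new Adler32() : new CRC32();
    }
}

// Hypothetical stand-in for a decompressor that is reused between streams.
class HeaderAwareDecompressor {
    private final Map<ChecksumKind, Checksum> decompressedChecksums =
        new EnumMap<>(ChecksumKind.class);
    private final Map<ChecksumKind, Checksum> compressedChecksums =
        new EnumMap<>(ChecksumKind.class);

    // Called once per stream header. Clearing first ensures that checksums
    // requested by a previous stream do not leak into the next one.
    void initHeaderFlags(Set<ChecksumKind> decompressed, Set<ChecksumKind> compressed) {
        decompressedChecksums.clear();
        compressedChecksums.clear();
        for (ChecksumKind k : decompressed) {
            decompressedChecksums.put(k, k.newChecksum());
        }
        for (ChecksumKind k : compressed) {
            compressedChecksums.put(k, k.newChecksum());
        }
    }

    Set<ChecksumKind> decompressedKinds() {
        return decompressedChecksums.keySet();
    }

    Set<ChecksumKind> compressedKinds() {
        return compressedChecksums.keySet();
    }
}
```

Without the two clear() calls, a second stream that requests fewer checksums than the first would still be verified against the stale entries.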

      was (Author: chris.douglas):
    bq. Using LzopCodec as anything but a stream doesn't make sense.

I should probably be clearer. Reusing the decompressor between streams makes sense, but using
LzopDecompressor like LzoDecompressor or ZlibDecompressor to effect block compression for
a structured file format is not going to work, or at least is unlikely to match the intent.
I'm assuming this is related to HADOOP-3315, which, like SequenceFile, shouldn't use LzopCodec.

The patch is good, but I'm concerned about possible (mis)uses.
  
> CodecPool.getDecompressor(LzopCodec) always creates a brand-new decompressor.
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-4162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4162
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.18.0
>            Reporter: Hong Tang
>            Assignee: Arun C Murthy
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4162_0_20080911.patch
>
>
> CodecPool.getDecompressor(LzopCodec) always creates a brand-new decompressor. I investigated
> the code, and the reason seems to be the following:
> LzopCodec inherits from LzoCodec. The getDecompressorType() method is supposed to return
> the concrete Decompressor class type the specific Codec class creates. In this case, LzopCodec
> creates LzopDecompressors and should return LzopDecompressor.class. But instead, it uses the
> getDecompressorType() method defined in the parent and returns LzoDecompressor.class.
> This leaves CodecPool unable to properly recycle the decompressors.
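
The mismatch described in the issue can be reproduced with a small self-contained sketch. None of the classes below are the Hadoop classes; they are hypothetical stand-ins that share only the names from the report. The sketch assumes the pool looks up idle instances by the type the codec reports and files returned instances under their runtime class, which is the kind of arrangement that would produce the reported behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; not the Hadoop classes.
interface Decompressor {}
class LzoDecompressor implements Decompressor {}
class LzopDecompressor extends LzoDecompressor {}

class LzoCodec {
    Class<? extends Decompressor> getDecompressorType() {
        return LzoDecompressor.class;
    }

    Decompressor createDecompressor() {
        return new LzoDecompressor();
    }
}

// Reproduces the reported bug: creates LzopDecompressor instances but
// inherits getDecompressorType(), so it still reports LzoDecompressor.class.
class BuggyLzopCodec extends LzoCodec {
    @Override
    Decompressor createDecompressor() {
        return new LzopDecompressor();
    }
}

// Sketch of the fix: report the concrete type that is actually created.
class FixedLzopCodec extends BuggyLzopCodec {
    @Override
    Class<? extends Decompressor> getDecompressorType() {
        return LzopDecompressor.class;
    }
}

// Assumed pool design: lookup by the codec's reported type, return by runtime class.
class SimplePool {
    private final Map<Class<?>, List<Decompressor>> idle = new HashMap<>();

    Decompressor borrow(LzoCodec codec) {
        List<Decompressor> list = idle.get(codec.getDecompressorType());
        if (list == null || list.isEmpty()) {
            return codec.createDecompressor(); // nothing pooled under that key
        }
        return list.remove(list.size() - 1);
    }

    void giveBack(Decompressor d) {
        idle.computeIfAbsent(d.getClass(), k -> new ArrayList<>()).add(d);
    }
}
```

With BuggyLzopCodec, a returned instance is filed under LzopDecompressor.class while the next borrow looks under LzoDecompressor.class, so every borrow allocates a new object. With FixedLzopCodec the two keys agree and the pooled instance is handed back out.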

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

