hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-441) SequenceFile should support 'custom compressors'
Date Wed, 06 Sep 2006 22:21:23 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-441?page=comments#action_12432982 ] 
            
Doug Cutting commented on HADOOP-441:
-------------------------------------

This is looking good.  Let's just commit it with the strange default codec performance and
work on that later, as we add more codecs.

One minor problem: we should not add any new public Writer constructor signatures.  We should
instead deprecate all existing public constructors and refer folks to the static factory methods.
 Some of the newly added constructors are also bogus, accepting a codec parameter that is
ignored.

> SequenceFile should support 'custom compressors'
> ------------------------------------------------
>
>                 Key: HADOOP-441
>                 URL: http://issues.apache.org/jira/browse/HADOOP-441
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: io
>            Reporter: Arun C Murthy
>         Assigned To: Arun C Murthy
>             Fix For: 0.6.0
>
>         Attachments: codec.patch, codec.patch, codec20060831.patch, codec_updated_interfaces_20060830.patch,
reports.tgz
>
>
> SequenceFiles should support 'custom compressors' which can be specified by the user
on creation of the file. 
> Readily available packages for gzip and zip (java.util.zip) are among obvious choices
to support. Of course there will be hooks so that other compressors can be added in future
as long as there is a way to construct (input/output) streams on top of the compressor/decompressor.
> The 'classname' of the 'custom compressor/decompressor' could be stored in the header
of the SequenceFile which can then be used by SequenceFile.Reader to figure out the appropriate
'decompressor'. Thus I propose we add constructors to SequenceFile.Writer which take in the
'classname' of the compressor's input/output stream classes (e.g. DeflaterOutputStream/InflaterInputStream
or GZIPOutputStream/GZIPInputStream), which acts as the hook for future compressors/decompressors.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message