hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Work started: (HADOOP-441) SequenceFile should support 'custom compressors'
Date Thu, 17 Aug 2006 10:56:14 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-441?page=all ]

Work on HADOOP-441 started by Arun C Murthy.

> SequenceFile should support 'custom compressors'
> ------------------------------------------------
>                 Key: HADOOP-441
>                 URL: http://issues.apache.org/jira/browse/HADOOP-441
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: io
>            Reporter: Arun C Murthy
>         Assigned To: Arun C Murthy
>             Fix For: 0.6.0
> SequenceFiles should support 'custom compressors', which the user can specify when creating the file.
> Readily available packages for gzip and zip (java.util.zip) are among the obvious choices to support. 'bmdiff' also seems a good candidate. Of course, there will be hooks so that other compressors can be added in the future, as long as there is a way to construct (input/output) streams on top of the compressor/decompressor.
> The 'classname' of the 'custom compressor/decompressor' could be stored in the header of the SequenceFile, where SequenceFile.Reader can then use it to figure out the appropriate 'decompressor'. Thus I propose we add constructors to SequenceFile.Writer which take the 'classname' of the compressor's input/output stream classes (e.g. DeflaterOutputStream/InflaterInputStream or GZIPOutputStream/GZIPInputStream), which acts as the hook for future compressors/decompressors.
> Looks like there isn't a Java library for bmdiff (I'd love to be corrected on this)... thoughts on how to go about this? A JNI wrapper on top of a C API? If so, how difficult does hadoop-dev think it is to implement an input/output stream on top of this? Alternatives?
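
The class-name hook described in the issue can be illustrated with a rough sketch. This is not the SequenceFile implementation. It only assumes that each stream class has a public one-argument constructor taking the underlying stream, which holds for the java.util.zip classes named above. CodecHook, wrapOutput, wrapInput and roundTrip are hypothetical names made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Hadoop: builds compressor and
// decompressor streams from class names such as those a file header
// might record.
final class CodecHook {

    // Assumption: the named class has a public (OutputStream) constructor.
    static OutputStream wrapOutput(String className, OutputStream raw)
            throws ReflectiveOperationException {
        return Class.forName(className)
                .asSubclass(OutputStream.class)
                .getConstructor(OutputStream.class)
                .newInstance(raw);
    }

    // Assumption: the named class has a public (InputStream) constructor.
    static InputStream wrapInput(String className, InputStream raw)
            throws ReflectiveOperationException {
        return Class.forName(className)
                .asSubclass(InputStream.class)
                .getConstructor(InputStream.class)
                .newInstance(raw);
    }

    // Compresses 'data' with outClass, then decompresses it with inClass.
    static byte[] roundTrip(String outClass, String inClass, byte[] data)
            throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = wrapOutput(outClass, sink)) {
            out.write(data);
        } // closing the stream flushes any compressor trailer

        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = wrapInput(inClass,
                new ByteArrayInputStream(sink.toByteArray()))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
        }
        return plain.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "some record bytes".getBytes(StandardCharsets.UTF_8);
        byte[] back = roundTrip("java.util.zip.GZIPOutputStream",
                                "java.util.zip.GZIPInputStream", data);
        System.out.println(Arrays.equals(data, back));
    }
}
```

A compressor whose streams need extra constructor arguments would not fit this one-argument convention, so a real design would probably want a small factory interface rather than bare stream class names.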

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

