hadoop-common-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4918) Make bzip2 work with SequenceFile
Date Tue, 30 Dec 2008 02:11:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659730#action_12659730 ]

Zheng Shao commented on HADOOP-4918:
------------------------------------

I saw the piece of code below in TestCodec.java.

Unfortunately, SequenceFile.BlockCompressWriter does not call close() on deflateOut
for each block. As a result, the BZip2 codec never writes its final output and does not work with block compression.

{code}
    // Necessary to close the stream for BZip2 Codec to write its final output. Flush is not enough.
    deflateOut.close();
{code}

We will probably need to modify BZip2 Codec to make this work.
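
To make the "flush is not enough" point concrete, here is a minimal sketch. It is an analogy only: the JDK has no bzip2 stream, so it uses java.util.zip.DeflaterOutputStream as a stand-in rather than Hadoop's BZip2Codec. The class and method names (FlushVersusFinish, compress, decompress) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical demo class; a stand-in for the BZip2 case, not Hadoop code.
class FlushVersusFinish {

    /** Writes data through a compressing stream, flushes it, and optionally finishes it. */
    static byte[] compress(byte[] data, boolean finish) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        Deflater deflater = new Deflater();
        try {
            DeflaterOutputStream deflateOut = new DeflaterOutputStream(sink, deflater);
            deflateOut.write(data);
            // With the default settings, flush() only flushes the underlying sink;
            // it does not make the compressor emit the end of the stream.
            deflateOut.flush();
            if (finish) {
                // finish() writes the remaining compressed data and the trailer.
                deflateOut.finish();
            }
            return sink.toByteArray();
        } finally {
            // Release the compressor explicitly, since the stream is never closed here.
            deflater.end();
        }
    }

    /** Inflates a complete compressed stream back into the original bytes. */
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] block = "one SequenceFile block worth of records".getBytes(StandardCharsets.UTF_8);
        byte[] flushedOnly = compress(block, false);
        byte[] finished = compress(block, true);
        System.out.println("bytes after flush():  " + flushedOnly.length);
        System.out.println("bytes after finish(): " + finished.length);
        System.out.println("round trip: "
                + new String(decompress(finished), StandardCharsets.UTF_8));
    }
}
```

Only the finished output is a complete stream that a reader can decode. A writer that flushes per block without finishing or closing the stream leaves each block incomplete, which seems to be the situation described above.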


> Make bzip2 work with SequenceFile
> ---------------------------------
>
>                 Key: HADOOP-4918
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4918
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Zheng Shao
>         Attachments: TestSequenceFileBZip.java
>
>
> Somehow bzip2 does not work with SequenceFile:
> {code}
>     String codec = "org.apache.hadoop.io.compress.BZip2Codec";
>     SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, new Path(output),
>         reader.getKeyClass(), reader.getValueClass(), CompressionType.BLOCK,
>         (CompressionCodec) Class.forName(codec).newInstance());
> {code}
> The stack trace is here:
> {noformat}
> java.lang.UnsupportedOperationException
>         at org.apache.hadoop.io.compress.BZip2Codec.getCompressorType(BZip2Codec.java:80)
>         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:98)
>         at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:914)
>         at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
>         at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
>         at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:329)
>         at org.apache.hadoop.mapred.TestSequenceFileBZip.main(TestSequenceFileBZip.java:43)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

