hadoop-common-issues mailing list archives

From "Xiao Kang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6663) BlockDecompressorStream get EOF exception when decompressing the file compressed from empty file
Date Mon, 29 Mar 2010 08:41:27 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850835#action_12850835

Xiao Kang commented on HADOOP-6663:

The EOF exception is caused as follows:

BlockCompressorStream compresses the input data block by block. For each block, the uncompressed
block length is first written to the underlying output stream, followed by the compressed chunks,
each consisting of a chunk length and the compressed chunk data.

So when compressing an empty file, BlockCompressorStream writes a single int of 0 to the
underlying output stream, with no chunks following it.
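The layout above can be sketched with a minimal Java snippet (an illustration of the described on-wire format, not the actual BlockCompressorStream code): for an empty input, only the 4-byte uncompressed block length is emitted.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class EmptyBlockLayout {
    // Illustrative sketch of the layout described above: for an empty
    // input, only the uncompressed block length (0) is written --
    // no chunk length or chunk data follows.
    static byte[] compressEmpty() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0);  // uncompressed block length = 0
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = compressEmpty();
        System.out.println(data.length);  // 4 bytes: just the length header
    }
}
```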

BlockDecompressorStream decompresses the underlying compressed input stream block by block.
For each block, it reads the uncompressed block length and then reads the chunk length and
compressed chunk. 

So BlockDecompressorStream reads the 0 and then gets an EOF exception when it tries to read the
chunk length of a previously compressed empty file.
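The failing read path, and one possible guard against it, can be sketched as follows. This is a hypothetical simplification of the logic described above (not the actual Hadoop code): the naive reader always reads a chunk length after the block length, so a zero-length block hits end of stream, while a guarded reader checks for a zero-length block first.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class EmptyBlockRead {
    // Naive reader: always tries to read a chunk length after the block
    // length, so a zero-length block (just the int 0) raises EOFException.
    static boolean naiveReadHitsEof(byte[] stream) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream));
        int uncompressedLen = in.readInt();  // reads 0 for an empty block
        try {
            in.readInt();                    // chunk length -- nothing left
            return false;
        } catch (EOFException e) {
            return true;                     // the reported failure
        }
    }

    // Guarded reader: treats a zero-length block as end of data instead of
    // reading further (one possible fix).
    static boolean guardedReadHitsEof(byte[] stream) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream));
        int uncompressedLen = in.readInt();
        if (uncompressedLen == 0) {
            return false;                    // empty block: stop cleanly
        }
        in.readInt();                        // chunk length for a real block
        return false;
    }

    public static void main(String[] args) throws IOException {
        byte[] emptyBlock = {0, 0, 0, 0};    // what the compressor writes
        System.out.println(naiveReadHitsEof(emptyBlock));
        System.out.println(guardedReadHitsEof(emptyBlock));
    }
}
```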

> BlockDecompressorStream get EOF exception when decompressing the file compressed from
empty file
> ------------------------------------------------------------------------------------------------
>                 Key: HADOOP-6663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6663
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.2
>            Reporter: Xiao Kang
> An empty file can be compressed using BlockCompressorStream, which is for block-based
compression algorithms such as LZO. However, when decompressing the compressed file, BlockDecompressorStream
gets an EOF exception.
> Here is a typical exception stack:
> java.io.EOFException
> at org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:125)
> at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:96)
> at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:82)
> at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
> at java.io.InputStream.read(InputStream.java:85)
> at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
> at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
> at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:186)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:170)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:18)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
> at org.apache.hadoop.mapred.Child.main(Child.java:196)

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
