hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-8423) MapFile.Reader.get() crashes jvm or throws EOFException on Snappy or LZO block-compressed data
Date Wed, 06 Jun 2012 20:13:23 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-8423:
--------------------------------

    Attachment: hadoop-8423.txt

The attached patch should fix the issue. I only tested with Snappy, not LZO, so please let me
know if LZO still doesn't work.

The issue was that BlockDecompressorStream wasn't resetting its own state when resetState()
was called. So, when the SequenceFile.Reader re-seeked, the stream got "out of sync": it was
positioned at the beginning of a block but thought it was still in the middle of one, and the
codec was fed invalid data.
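
For illustration, here is a minimal sketch of that kind of override, assuming the stream tracks
its position in the current block with a pair of counters. The class and field names below are
illustrative, not necessarily those in hadoop-8423.txt.

    // Illustrative sketch only (not the attached patch): the point is that
    // resetState() must also clear the stream's block-local bookkeeping.
    // Field and class names here are assumptions for illustration.
    package example; // hypothetical package, so this does not clash with Hadoop's own class

    import java.io.IOException;
    import java.io.InputStream;

    import org.apache.hadoop.io.compress.Decompressor;
    import org.apache.hadoop.io.compress.DecompressorStream;

    public class ResettableBlockDecompressorStream extends DecompressorStream {
      // Uncompressed size of the current block / how much of it has been consumed.
      private int originalBlockSize = 0;
      private int noUncompressedBytes = 0;

      public ResettableBlockDecompressorStream(InputStream in,
          Decompressor decompressor, int bufferSize) throws IOException {
        super(in, decompressor, bufferSize);
      }

      // ... block-aware decompress()/getCompressedData() omitted ...

      @Override
      public void resetState() throws IOException {
        // Forget any partially consumed block so that, after the
        // SequenceFile.Reader re-seeks to a sync point, the stream reads a
        // fresh block header instead of assuming it is still mid-block.
        originalBlockSize = 0;
        noUncompressedBytes = 0;
        super.resetState();
      }
    }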
                
> MapFile.Reader.get() crashes jvm or throws EOFException on Snappy or LZO block-compressed
data
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8423
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8423
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.2
>         Environment: Linux 2.6.32.23-0.3-default #1 SMP 2010-10-07 14:57:45 +0200 x86_64
x86_64 x86_64 GNU/Linux
>            Reporter: Jason B
>            Assignee: Todd Lipcon
>         Attachments: MapFileCodecTest.java, hadoop-8423.txt
>
>
> I am using the Cloudera distribution cdh3u1.
> While checking native codecs such as Snappy or LZO for better decompression
> performance, I ran into issues with random access via the
> MapFile.Reader.get(key, value) method.
> The first call of MapFile.Reader.get() works, but a second call fails.
> I am also getting different exceptions depending on the number of entries
> in the map file.
> With LzoCodec and a 10-record file, the JVM aborts.
> At the same time, DefaultCodec works fine in all cases, as does record
> compression for the native codecs.
> I created a simple test program (attached) that creates map files locally
> with 10 and 100 records for three codecs: Default, Snappy, and LZO.
> (The test requires the corresponding native libraries to be available.)
> A summary of the problems is given below (a minimal reproduction sketch
> also follows this report):
> Map Size: 100
> Compression: RECORD
> ==================
> DefaultCodec:  OK
> SnappyCodec: OK
> LzoCodec: OK
> Map Size: 10
> Compression: RECORD
> ==================
> DefaultCodec:  OK
> SnappyCodec: OK
> LzoCodec: OK
> Map Size: 100
> Compression: BLOCK
> ==================
> DefaultCodec:  OK
> SnappyCodec: java.io.EOFException  at
> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
> LzoCodec: java.io.EOFException at
> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
> Map Size: 10
> Compression: BLOCK
> ==================
> DefaultCodec:  OK
> SnappyCodec: java.lang.NoClassDefFoundError: Ljava/lang/InternalError
> at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native
> Method)
> LzoCodec:
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00002b068ffcbc00, pid=6385, tid=47304763508496
> #
> # JRE version: 6.0_21-b07
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0-b17 mixed mode linux-amd64 )
> # Problematic frame:
> # C  [liblzo2.so.2+0x13c00]  lzo1x_decompress+0x1a0
> #
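
For readers without the attached MapFileCodecTest.java, the following is a minimal sketch of a
reproduction along the lines described above, using the 0.20-era MapFile API with a
BLOCK-compressed SnappyCodec file. The output path, key layout, and class name are
illustrative, and the Snappy native library must be available on the library path.

    // Sketch of a reproduction in the spirit of the report above; this is
    // not the attached MapFileCodecTest.java. Path and keys are illustrative.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.io.MapFile;
    import org.apache.hadoop.io.SequenceFile.CompressionType;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.util.Progressable;
    import org.apache.hadoop.util.ReflectionUtils;

    public class MapFileBlockCompressionRepro {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.getLocal(conf);
        String dir = "/tmp/mapfile-snappy-block";  // illustrative local path

        // Instantiate the codec via ReflectionUtils so it picks up the conf.
        CompressionCodec codec =
            ReflectionUtils.newInstance(SnappyCodec.class, conf);

        // Write a small BLOCK-compressed MapFile; keys must be appended in
        // sorted order.
        MapFile.Writer writer = new MapFile.Writer(conf, fs, dir,
            Text.class, Text.class, CompressionType.BLOCK, codec,
            new Progressable() { public void progress() {} });
        for (int i = 0; i < 100; i++) {
          writer.append(new Text(String.format("key-%03d", i)),
                        new Text("value-" + i));
        }
        writer.close();

        // Random access: the first get() works; the second re-seek is where
        // the EOFException (or, with LZO, the JVM crash) shows up before the fix.
        MapFile.Reader reader = new MapFile.Reader(fs, dir, conf);
        Text value = new Text();
        System.out.println(reader.get(new Text("key-042"), value));
        System.out.println(reader.get(new Text("key-007"), value));
        reader.close();
      }
    }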

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
