hadoop-common-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8582) Improve error reporting for GZIP-compressed SequenceFiles with missing native libraries.
Date Tue, 10 Jul 2012 12:54:34 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410312#comment-13410312 ]

Harsh J commented on HADOOP-8582:

Hi Paul, thanks for filing this and the patch! I've run into this as well.

The patch looks good, but could you also add a test, if possible, so that we don't regress
on this? You can selectively disable loading of native libs via configuration if need be.

Also, perhaps we can be more specific in the error message instead, saying "SequenceFile.Reader
can't read Gzip compressed files without native-hadoop libraries" or so? Feel free to improve
the Writer while you're at it too, if needed.
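
To make the suggestion concrete, here is a minimal, JDK-only sketch of such a fail-fast check. The class name, the helper `checkGzipReadable`, its parameters, and the choice of `IllegalArgumentException` are all hypothetical and are not taken from the attached patch. In Hadoop itself the boolean would presumably come from the existing native-library checks (for example `NativeCodeLoader.isNativeCodeLoaded()`) rather than being passed in.

```java
public class GzipReaderGuard {
    // Wording follows the message suggested in the comment above.
    static final String MESSAGE =
        "SequenceFile.Reader can't read Gzip compressed files without native-hadoop libraries";

    /**
     * Hypothetical helper: fail fast when a gzip codec is requested but
     * native zlib support is unavailable. Both parameters are stand-ins
     * for what a real reader would look up itself.
     */
    static void checkGzipReadable(String codecClassName, boolean nativeZlibLoaded) {
        boolean isGzip = codecClassName.endsWith("GzipCodec");
        if (isGzip && !nativeZlibLoaded) {
            // Exception type is an assumption made for this sketch.
            throw new IllegalArgumentException(MESSAGE);
        }
    }

    public static void main(String[] args) {
        try {
            checkGzipReadable("org.apache.hadoop.io.compress.GzipCodec", false);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A regression test could then assert on the message text with native loading disabled, rather than on a bare EOFException.
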
> Improve error reporting for GZIP-compressed SequenceFiles with missing native libraries.
> ----------------------------------------------------------------------------------------
>                 Key: HADOOP-8582
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8582
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 2.0.0-alpha
>         Environment: Centos 5.8, Java 6 Update 26
>            Reporter: Paul Wilkinson
>            Priority: Minor
>         Attachments: HADOOP-8582-1.diff
> At present it is not possible to write or read block-compressed SequenceFiles using the
> GZIP codec without the native libraries being available.
> The SequenceFile.Writer code checks for the availability of native libraries and throws
> a useful exception, but the SequenceFile.Reader doesn't do the same:
> {noformat}
> Exception in thread "main" java.io.EOFException
> 	at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:249)
> 	at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:239)
> 	at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:142)
> 	at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:58)
> 	at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:67)
> 	at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:95)
> 	at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:104)
> 	at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:173)
> 	at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:183)
> 	at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1591)
> 	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1493)
> 	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1480)
> 	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
> 	at test.SequenceReader.read(SequenceReader.java:23)
> {noformat}
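
The trace above ends in the java.util.zip.GZIPInputStream constructor, which reads the gzip header as soon as it is created. The following JDK-only sketch reproduces the same exception type, assuming the wrapped stream has no bytes available at that moment. That is an assumption made for illustration; the issue does not state why the header read reaches end-of-file. `EmptyGzipDemo` and `openEmpty` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

public class EmptyGzipDemo {
    /** Returns the class name of the exception raised when opening a gzip stream over zero bytes. */
    static String openEmpty() {
        // The constructor tries to read the gzip header immediately.
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(new byte[0]))) {
            return "no exception";
        } catch (EOFException e) {
            // Same exception type as in the reported stack trace.
            return e.getClass().getName();
        } catch (IOException e) {
            return "other: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(openEmpty());
    }
}
```

An exception like this carries no hint about native libraries, which is why a more descriptive check before this point is being requested.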
