hive-dev mailing list archives

From "David Phillips (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-3569) RCFile requires native Hadoop library
Date Thu, 11 Oct 2012 18:07:03 GMT

     [ https://issues.apache.org/jira/browse/HIVE-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Phillips updated HIVE-3569:
---------------------------------

    Description: 
RCFile requires the native Hadoop library. It does not work when using the Java {{GzipCodec}}.

The root cause is that the two versions of {{GzipCodec.createInputStream()}} work differently.
The native version simply saves a reference to the supplied input stream. The Java version
wraps the stream in a Java {{GZIPInputStream}}, which immediately tries to read the header.
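
The eager header read is easy to reproduce outside Hadoop. The following is a minimal sketch that uses only the JDK's {{java.util.zip.GZIPInputStream}}; the names {{EagerGzipHeaderDemo}} and {{failsOnConstruction}} are made up for illustration and are not part of Hive or Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical demo class, not part of Hive or Hadoop.
class EagerGzipHeaderDemo {

    /**
     * Returns true if merely constructing a GZIPInputStream over the given
     * stream fails with EOFException, i.e. before any read() call is made.
     */
    public static boolean failsOnConstruction(InputStream in) throws IOException {
        try {
            // The constructor reads the gzip header immediately.
            new GZIPInputStream(in);
            return false;
        } catch (EOFException e) {
            // Thrown when the stream ends before the header is complete.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // An empty stream stands in for a buffer that has not been filled yet.
        InputStream empty = new ByteArrayInputStream(new byte[0]);
        System.out.println(failsOnConstruction(empty)); // prints "true"
    }
}
```

A stream that only stores a reference at construction time, as the native codec path is described as doing, would not touch the empty stream at this point.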

The problem occurs because the stream that the {{RCFile.ValueBuffer}} constructor passes to
{{createInputStream()}} is empty at that point: the buffer backing the stream has not been
filled yet, so the header read fails with an {{EOFException}}.
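
For contrast, deferring construction of the {{GZIPInputStream}} until the first read avoids touching the stream while its backing buffer is still empty. The sketch below is a hypothetical illustration of that idea only, and is not the fix applied in Hive or Hadoop; {{LazyGzipInputStream}}, its {{Supplier}} parameter, and the helper methods are assumptions introduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical wrapper, not part of Hive or Hadoop.
class LazyGzipInputStream extends InputStream {
    private final Supplier<InputStream> source; // yields the raw stream on demand
    private InputStream gzip;                   // created on first read, not in the constructor

    public LazyGzipInputStream(Supplier<InputStream> source) {
        this.source = source; // only a reference is saved here; nothing is read
    }

    private InputStream delegate() throws IOException {
        if (gzip == null) {
            // The gzip header is read here, once data is expected to be present.
            gzip = new GZIPInputStream(source.get());
        }
        return gzip;
    }

    @Override
    public int read() throws IOException {
        return delegate().read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return delegate().read(b, off, len);
    }

    @Override
    public void close() throws IOException {
        if (gzip != null) {
            gzip.close();
        }
    }

    /** Helper for the demo: gzip-compresses a byte array. */
    public static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    /** Builds the stream over an empty buffer, fills the buffer, then reads. */
    public static String roundTrip(String text) throws IOException {
        byte[][] buffer = { new byte[0] }; // empty at construction time
        try (LazyGzipInputStream in =
                 new LazyGzipInputStream(() -> new ByteArrayInputStream(buffer[0]))) {
            buffer[0] = gzip(text.getBytes(StandardCharsets.UTF_8)); // filled later
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            int n;
            while ((n = in.read(chunk, 0, chunk.length)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

Constructing the wrapper over the empty buffer succeeds because no header is read until the first {{read()}} call, by which time the buffer holds compressed data.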



{noformat}
12/10/11 10:37:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/10/11 10:37:25 INFO io.CodecPool: Got brand-new decompressor
12/10/11 10:37:25 INFO io.CodecPool: Got brand-new decompressor
Exception in thread "main" java.io.EOFException
	at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:264)
	at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:254)
	at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:163)
	at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:78)
	at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:90)
	at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:92)
	at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:101)
	at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:169)
	at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:179)
	at org.apache.hadoop.hive.ql.io.RCFile$ValueBuffer.<init>(RCFile.java:451)
	at org.apache.hadoop.hive.ql.io.RCFile$Reader.<init>(RCFile.java:1205)
	at org.apache.hadoop.hive.ql.io.RCFile$Reader.<init>(RCFile.java:1111)
	at org.apache.hadoop.hive.ql.io.RCFileRecordReader.<init>(RCFileRecordReader.java:52)
{noformat}


    
> RCFile requires native Hadoop library
> -------------------------------------
>
>                 Key: HIVE-3569
>                 URL: https://issues.apache.org/jira/browse/HIVE-3569
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: David Phillips

