hbase-user mailing list archives

From Nanheng Wu <nanhen...@gmail.com>
Subject Re: Use loadtable.rb with compressed data?
Date Fri, 28 Jan 2011 17:58:49 GMT
Awesome. I ran it on one of the hfiles and got this:
11/01/28 09:57:15 INFO compress.CodecPool: Got brand-new decompressor
java.io.IOException: Not in GZIP format
	at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:137)
	at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:58)
	at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:68)
	at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:92)
	at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:101)
	at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:169)
	at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:179)
	at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createDecompressionStream(Compression.java:168)
	at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1013)
	at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966)
	at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1291)
	at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:1740)

So could the problem be that the HFile writer is not writing properly
gzipped output?
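For what it's worth, the "Not in GZIP format" IOException above comes from GZIPInputStream.readHeader, which fails when a stream does not begin with the two GZIP magic bytes 0x1f 0x8b. A quick standalone triage (this is a hypothetical helper, not part of HBase) is to look at the first two bytes of the data you suspect should be gzipped:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical helper, not part of HBase: checks whether a byte buffer
// begins with the GZIP magic bytes (0x1f 0x8b) that
// java.util.zip.GZIPInputStream.readHeader requires.
public class GzipMagicCheck {

    // Same constant GZIPInputStream uses: 0x1f 0x8b read little-endian.
    static final int GZIP_MAGIC = 0x8b1f;

    // True if the buffer starts with the GZIP magic bytes.
    public static boolean hasGzipMagic(byte[] header) {
        if (header == null || header.length < 2) {
            return false;
        }
        int magic = (header[0] & 0xff) | ((header[1] & 0xff) << 8);
        return magic == GZIP_MAGIC;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a file whose leading bytes you want to inspect.
        DataInputStream in = new DataInputStream(new FileInputStream(args[0]));
        try {
            byte[] header = new byte[2];
            in.readFully(header);
            System.out.println(hasGzipMagic(header)
                ? "starts with GZIP magic (1f 8b)"
                : "does NOT start with GZIP magic");
        } finally {
            in.close();
        }
    }
}
```

One caveat: HFile compresses per block, not the whole file, so the start of the HFile on disk will be HFile's own block header rather than a gzip header. This check is only meaningful on the compressed block bytes themselves, i.e. the region the reader was decompressing when it threw.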

On Fri, Jan 28, 2011 at 9:41 AM, Stack <stack@duboce.net> wrote:
> The section in 0.90 book on hfile tool should apply to 0.20.6:
> http://hbase.apache.org/ch08s02.html#hfile_tool  It might help you w/
> your explorations.
> St.Ack
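For reference, the hfile tool in the book section Stack links is just the HFile class run directly; per that section the invocation is roughly the following (the HDFS path is a placeholder for one of your actual store files, and this of course needs a running cluster/HDFS to point at):

```shell
# Dump an hfile verbosely (-v) from the store file given with -f.
# Replace the path with a real store file under your /hbase root, e.g.
# /hbase/<table>/<region-encoded-name>/<column-family>/<storefile>.
${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile \
    -v -f hdfs:///hbase/mytable/REGION/FAMILY/STOREFILE
```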
> On Fri, Jan 28, 2011 at 9:38 AM, Nanheng Wu <nanhengwu@gmail.com> wrote:
>> Hi Stack,
>>  Get doesn't work either. It was a fresh table created by
>> loadtable.rb. Finally, the uncompressed version had the same number of
>> regions (8 total). I totally understand you guys shouldn't be patching
>> the older version; upgrading is an option for me but will be pretty
>> painful. I wonder if I can figure something out by comparing the two
>> versions' HFiles. Thanks again!
>> On Fri, Jan 28, 2011 at 9:14 AM, Stack <stack@duboce.net> wrote:
>>> On Thu, Jan 27, 2011 at 9:35 PM, Nanheng Wu <nanhengwu@gmail.com> wrote:
>>>> In the compressed case, there are 8 regions and the region start/end
>>>> keys do line up. That actually confuses me: how can HBase read
>>>> the files if they are compressed? Does each HFile have some metadata
>>>> in it with compression info?
>>> You got it.
>>>> Anyway, the regions are the same
>>>> (numbers and boundaries are same) in both compressed and uncompressed
>>>> version. So what else should I look into to fix this? Thanks again!
>>> You can't scan. Can you Get from the table at all?  Try getting start
>>> key from a few of the regions you see in .META.
>>> Did this table preexist or was this a fresh creation?
>>> When you created this table uncompressed, how many regions was it?
>>> How about just running uncompressed while you are on 0.20.6?  We'd
>>> rather be fixing bugs in the new stuff, not the version that we are
>>> leaving behind.
>>> Thanks,
>>> St.Ack
